We should forget about small efficiencies, say about 97% of the time: premature optimization is the root of all evil. (Knuth, Donald. Structured Programming with go to Statements, ACM Journal Computing Surveys, Vol 6, No. 4, Dec. 1974. p.268.)
The message here is that getting good performance is about picking the right data structures and using profiling tools to locate and fix remaining performance bottlenecks when they occur. I recently encountered just such a bottleneck in my own application and wanted to get to the bottom of it; in order to do so I wanted to attach a remote profiler to my application. Unlike remote debugging, remote profiling is painful to configure for Java and this is compounded by some confusing documentation.
I am not an expert in the various Java tooling and instrumentation APIs but after hacking my way through to a working profiling environment, I have a general idea of what's what. For now, I just want to get this write-up out of my queue so that nobody else has to go through this particular install hell and so that I can get on to the follow-up CSS article that I've been promising for a few weeks. If you want to correct any mistakes please drop me an email and I'll amend this document accordingly.
Choosing a profiling application
Initially I had planned to buy a profiling tool and use that, but the commercial offerings all suffered from some of the following:
- Required arcane registration processes
- Not compatible with Linux
- They used evil license management tools
- Couldn't run remotely
- Didn't support Java 6
- Bad documentation
In view of these problems I decided my time was better invested in getting up to speed with one of the open source tools. Eclipse is my preferred development environment, and some cursory research suggested that my best option would be to figure out how to use the Eclipse Test and Performance Tools Platform (TPTP), which has all the features that I wanted.
The Eclipse side projects suffer from acronym explosions, most of the discussions seem to take place in mailing lists[1], and the documentation is often out of date. Despite this, they are often distinctly superior to commercial solutions, particularly when the commercial solutions are lacking in those areas too! So, after thrashing around for a bit, I managed to get things working. It was a combination of Could Do Better and Shows Promise on the report card for Eclipse TPTP today. Better than the commercial tools but still not great. Mind you, it's free software, so complaining is ungracious: don't complain; fix it. Hence my rough and ready walkthrough to help you out when the existing documentation is perhaps a little vague.
Technology
This document does not describe the use of the Eclipse TPTP project once you have everything configured - the existing documentation for that is pretty good.
This document does not describe the configuration of Eclipse TPTP for profiling local applications or profiling applications running from within the Eclipse environment. I haven't tried this but as I understand it the configuration is a far simpler matter - trivial even.
The following table summarises the hardware and software that I started out with. I specify the hardware - normally irrelevant for Java development - because the profiling configuration includes components implemented in native code. When I initially tried an install to an underpowered machine with a VIA processor instead of Intel I saw some crashes that I suspect were incompatibilities between the VIA processor and an Intel-specific binary that I was using.
| Server | |
| Hardware: | Intel Core 2 Duo CPU E4500 @ 2.20GHz |
| OS: | Debian Etch (4.0) |
| JVM: | Sun Java SE 1.6.0_06 |
| App Server: | Apache Tomcat/5.5 |
| Client | |
| Hardware: | Intel Pentium M @ 1.60GHz |
| OS: | Kubuntu Hardy (8.04) |
| JVM: | Sun Java SE 1.6.0_06 |
| IDE: | Eclipse |
In addition to the existing software, I installed version 4.5.1 of the Agent Controller for Linux on IA32 to the server and the same version (4.5.1) of the TPTP plugins via the Eclipse update manager to my Eclipse client.
Background and Concepts
So why is the profiling configuration so complex when setting the JVM up for remote debugging is a matter of adding a few command line flags[2] to the JVM invocation? In this section I'll outline some of the causes of this complexity and list the APIs and components required to carry out profiling with Eclipse TPTP in Java 6.
Causes of Complexity
The old profiling API (JVM PI) and debugging API (JVM DI) were deprecated as of Java 5 and removed as of Java 6. So if you have an old profiling tool you simply can't use it with the latest versions of the JVM. This makes a lot of older documentation misleading.
With Java 5 the new JVM Tool Interface (JVM TI) was introduced as part of the Java Platform Debugger Architecture (JPDA). The advantage that this has over the older APIs is that it is extensible - plugins can be added to JVMs that support it without requiring intrusive changes to the JVM's implementation. The disadvantage is that by loosening the coupling between the JVM implementation and the instrumentation support a host of new APIs is introduced.
Profiling should not unduly influence the behaviour of the program being profiled - otherwise the developer will optimise their application for the special case of running under the profiler, perhaps missing pathological cases that do not arise in these circumstances. To avoid this problem, the profiler hooks should run as native code rather than running as Java code within the JVM itself - this adds another set of tools to configure; Java developers who are normally insulated from raw operating system differences by the JVM - or an application server environment built upon it - suddenly have to deal with installing native tools and libraries.
JPDA and TPTP
The basic technologies involved in our profiling scenario are therefore the standard JPDA APIs and the Eclipse and TPTP specific support hooking into them. On the server side, the TPTP support consists of a native code shared library plugin to the JVM and a server process that communicates with it. On the client side, the TPTP support consists of a Java library to talk to the TPTP server process over TCP/IP and a set of normal Eclipse plugins that communicate with this library. The relationship between these various components is illustrated in the following figure:

The Native Back End is the native code plugin to the JVM that gathers the performance information from the JVM and passes it to the Agent Controller Server. The Native Back End is implemented as a shared library - a .so file on Unix systems or a .dll file on Windows systems.
The Java Virtual Machine Tool Interface (JVM TI) is the standard API built into the JVM that allows the Native Back End to instrument the JVM and acquire data from it.
The Agent Controller Server is a stand-alone process that communicates (usually via shared memory) with the Native Back End. It moves such concerns as managing network connections and marshalling instructions from the profiling client out of the process space of the JVM. This decouples the metric gathering from the other concerns and improves the resilience of the Native Back End (fatal errors in the Agent Controller Server will not normally crash the JVM).
The Java Debug Wire Protocol (JDWP) is a standard protocol for communicating instructions from clients, such as debuggers and profilers, to their implementations in a JVM. The communications channel used to convey the JDWP commands is not specified, but it is typically TCP/IP and that is the case with the Eclipse TPTP implementation. In principle a client other than Eclipse TPTP could control the Agent Controller Server without needing to be specially written for it.
The Front End is a Java client application that communicates with the native back end. In this scenario it is an Eclipse TPTP plugin component and communicates with the Agent Controller server over TCP/IP using the JDWP. Its purpose is to provide an implementation of the standard API that profiling and debugging tools use to communicate with the back end.
The Java Debug Interface (JDI) is the standard API provided by the Front End for use by client profiling and debugging tools.
The TPTP Client shown in the diagram is in fact the suite of graphical Eclipse plugins that hook into the JDI API provided by the Front End. They are somewhat independent of the rest of the configuration as an appropriate Front End implementation would allow them to communicate with profiling implementations from other providers (such as commercial vendors or future standardized parts of the JVM). With a few byzantine configuration tweaks it ought to be possible to persuade the TPTP Client to talk to any other vendor's Front End.
Of these, the standard APIs and protocols are JVM TI, JDWP, and JDI, all of which are part of the JPDA. The Eclipse TPTP components that hook into or provide concrete implementations of them are the Native Back End, the Agent Controller Server, the Front End, and the TPTP Client (in fact this last consists of several Eclipse plugins).
The server installation
This is essentially a view over my shoulder as I carried out the installation, with some minor extra commentary. You are expected to understand what my command prompt is, what is the output of the commands issued, and how to adjust the commands issued for your own system's configuration. If you're not comfortable with that, then I'm sorry, but you're going to need some help with this - find a colleague or friend who is familiar with Unix and get them to lend a hand. This is not a nice clean automated install process. I wish it was.
If the remote application that you will be testing doesn't run on Tomcat on a Debian-derived Linux (this includes the various Ubuntu projects) then you will have some major changes to make and won't be able to follow the script particularly closely. If it's a Windows platform then you will have a lot of them! Sorry, but again, you're on your own here - however you should definitely skim through this, as some of the environment variables that need to be configured are required and were not conspicuously documented.
Get the agent controller
From the download site http://www.eclipse.org/tptp/home/downloads/?ver=4.5.1 pull down a copy of the Agent Controller runtime for your platform (LINUX-IA32): http://www.eclipse.org/downloads/download.php?file=/tptp/4.5.1/TPTP-4.5.1/agntctrl.linux_ia32-TPTP-4.5.1.zip
dcminter@IS-5285:~$ wget http://www.mirrorservice.org/sites/download.eclipse.org/eclipseMirror/tptp/4.5.1/TPTP-4.5.1/agntctrl.linux_ia32-TPTP-4.5.1.zip
Unpack the Zip to a suitable directory - yes, it's one of those annoying Zip files that has lots of files and directories in the root of the archive instead of stashing everything in a nice versioned subdirectory.
dcminter@IS-5285:~$ mkdir agntctrl
dcminter@IS-5285:~$ cd agntctrl/
dcminter@IS-5285:~/agntctrl$ unzip ../agntctrl.linux_ia32-TPTP-4.5.1.zip
Archive: ../agntctrl.linux_ia32-TPTP-4.5.1.zip
creating: Resources/
inflating: Resources/filters.txt
inflating: Resources/jvmpi.pro
inflating: about.html
...
lib/libjavaBaseAgent.so -> libjavaBaseAgent.so.4
lib/libhcthread.so -> libhcthread.so.4
lib/libnamedPipeTL.so.4 -> libnamedPipeTL.so.4.5.0
lib/libtptpClient.so.4 -> libtptpClient.so.4.5.0
lib/libprocessControlUtil.so -> libprocessControlUtil.so.4
lib/libtptpConfig.so.4 -> libtptpConfig.so.4.5.0
lib/libhcbnd.so -> libhcbnd.so.4
lib/libtptpLogUtils.so.4 -> libtptpLogUtils.so.4.5.0
dcminter@IS-5285:~/agntctrl$
Configure the agent controller
The configuration of the agent controller is specified in the file config/serviceconfig.xml relative to the root of the unpacked zip file. You may want to change the port numbers that the agent controller will listen on, but even if you don't you will almost certainly want to change the hosts that are allowed to connect. By default, only local hosts are permitted. For the sake of simplicity I have changed this to "all" in the (abbreviated) configuration file shown, but for a more secure solution[3] you should use specific host names or IP addresses instead.
<AgentControllerConfiguration>
    ...
    <Connection>
        ...
        <TransportLayer loadlib="tptpCCTL" type="TPTP_CCTL">
            <Configuration>
                <!-- OPTIONALLY CHANGE THESE PORTS! -->
                <Port>10002</Port>
                <SecuredPort>10003</SecuredPort>
                <FilePort>10005</FilePort>
                <IsDataMultiplexed>false</IsDataMultiplexed>
                <ProcessPolling>true</ProcessPolling>
                <Version>4.4.1</Version>
                <SecurityEnabled>false</SecurityEnabled>
                <Hosts configuration="default">
                    <!-- Changed from "LOCAL" to "ALL" : -->
                    <Allow host="ALL"/>
                </Hosts>
            </Configuration>
            ...
        </TransportLayer>
        ...
    </Connection>
    ...
</AgentControllerConfiguration>
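Before starting the agent controller it can be handy to confirm what the edited file actually says. The following is only a minimal sketch and not part of the TPTP distribution: show_ac_access is a hypothetical helper name, and it assumes that each Port and Allow element sits on a line of its own, as in the listing above.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with TPTP): print the listening port and
# the allowed hosts found in an agent controller serviceconfig.xml.
# Assumes one element per line, as in the abbreviated listing.
show_ac_access() {
    config_file="$1"
    # Print the number between <Port> and </Port>. SecuredPort and FilePort
    # do not match because their tags do not contain the text "<Port>".
    sed -n 's/.*<Port>\([0-9][0-9]*\)<\/Port>.*/port=\1/p' "$config_file"
    # Print the host attribute of every <Allow .../> element.
    sed -n 's/.*<Allow host="\([^"]*\)".*/allow=\1/p' "$config_file"
}

# Usage (the path is an assumption; adjust it to where you unpacked the zip):
# show_ac_access "$HOME/agntctrl/config/serviceconfig.xml"
```

If the second line of output is not the host or "ALL" value you expect, the client will be refused no matter how the rest of the setup looks.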
With the appropriate configuration file you might expect to be able to start the agent controller, but it turns out that it has a dependency upon a specific old libstdc++ version, as the error in the following attempt to start it shows:
dcminter@IS-5285:~/agntctrl/bin$ ./RAStart.sh
Starting Agent Controller.
ACServer: error while loading shared libraries: libstdc++-libc6.2-2.so.3: cannot open shared object file: No such file or directory
ACServer started successfully. This is a lie!
dcminter@IS-5285:~$
D'oh! So, we need to install the library. You can't easily install just this shared library, but you can pull in the libstdc++2.10-glibc2.2 package, which seems to supply it. I'm not completely sure if this is a sensible solution, but it does work. (The origin of this tip is http://ubuntuforums.org/archive/index.php/t-1879.html)
dcminter@IS-5285:~$ sudo apt-get install libstdc++2.10-glibc2.2
Password:
Reading package lists... Done
Building dependency tree... Done
The following NEW packages will be installed
libstdc++2.10-glibc2.2
0 upgraded, 1 newly installed, 0 to remove and 16 not upgraded.
Need to get 330kB of archives.
After unpacking 1356kB of additional disk space will be used.
Get: 1 http://ftp.uk.debian.org etch/main libstdc++2.10-glibc2.2 1:2.95.4-27 [330kB]
Fetched 330kB in 0s (5685kB/s)
Selecting previously deselected package libstdc++2.10-glibc2.2.
(Reading database ... 19093 files and directories currently installed.)
Unpacking libstdc++2.10-glibc2.2 (from .../libstdc++2.10-glibc2.2_1%3a2.95.4-27_i386.deb) ...
Setting up libstdc++2.10-glibc2.2 (2.95.4-27) ...
dcminter@IS-5285:~$
...and now you should get a clean start:
dcminter@IS-5285:~/agntctrl/lib$ cd ..
dcminter@IS-5285:~/agntctrl$ cd bin
dcminter@IS-5285:~/agntctrl/bin$ ./RAStart.sh
Starting Agent Controller.
Creating default Agent Controller configuration file.
Security is turned off. Access is set to Local.
Run the SetConfig script to change the default settings.
ACServer started successfully. This time it's not a lie!
dcminter@IS-5285:~/agntctrl/bin$
Check your process list to make sure that it really started up and didn't fail silently for some reason:
dcminter@IS-5285:~/agntctrl/bin$ ps -Af
UID PID PPID C STIME TTY TIME CMD
...
dcminter 6143 1 0 00:00 pts/1 00:00:00 ACServer
dcminter 6174 6143 0 00:00 pts/1 00:00:00 /home/dcminter/agntctrl/bin/tptpProcessController
...
dcminter@IS-5285:~/agntctrl/bin$
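Since the start script reports success even when the server binary fails to load, two small checks may save some head-scratching. This is a rough sketch under my own assumptions: the function names are hypothetical, and the location of the ACServer binary under bin/ is inferred from the transcripts rather than documented.

```shell
#!/bin/sh
# Hypothetical helpers (names are illustrative, not part of TPTP).

# Read `ldd` output on stdin and print the name of each unresolved library.
missing_libs() {
    awk '/not found/ { print $1 }'
}

# Read `ps -Af` output on stdin; succeed only if an ACServer process is listed.
# The [A] bracket stops this grep from matching its own entry in the listing.
ac_running() {
    grep -q '[A]CServer'
}

# Usage (the binary path is an assumption based on the transcripts):
# ldd "$HOME/agntctrl/bin/ACServer" | missing_libs
# ps -Af | ac_running && echo "agent controller is up"
```

Running the first check before the start script lists every unresolved library in one go, rather than one failed start at a time.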
Configure Tomcat to run with the agent controller
Now you need to edit Tomcat's configuration to allow the JVM to talk to the agent controller. Fire up your preferred editor; on Debian Etch you can make a system-wide change to Tomcat's environment by editing the /etc/default/tomcat5.5 configuration file.
Amend the CATALINA_OPTS environment variable to load the Native Back End (known here as the JPI agent) by adding the following flag: -agentlib:JPIBootLoader=JPIAgent:server=enabled;CGProf
In my configuration I have headless mode set, a 256M PermSize (to avoid PermGen errors after numerous webapp reloads - a memory leak but a fairly benign one), and a 512M heap, so the resulting configuration line reads:
CATALINA_OPTS="-Djava.awt.headless=true -XX:MaxPermSize=256m -Xmx512M -agentlib:JPIBootLoader=JPIAgent:server=enabled;CGProf"
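One detail in that line is easy to trip over: the semicolon in the agent flag is a command separator as far as the shell is concerned, so the value has to stay inside quotes. A small sketch of assembling the same value piece by piece, relying only on standard shell quoting (AGENT_FLAG and BASE_OPTS are hypothetical variable names of my own):

```shell
#!/bin/sh
# The agent flag from the article. Single quotes keep the ';' literal.
AGENT_FLAG='-agentlib:JPIBootLoader=JPIAgent:server=enabled;CGProf'

# The other JVM settings from the article's configuration.
BASE_OPTS='-Djava.awt.headless=true -XX:MaxPermSize=256m -Xmx512M'

# Double quotes protect the ';' when the variables are expanded.
CATALINA_OPTS="$BASE_OPTS $AGENT_FLAG"

# Print the result so it can be compared with the configuration line.
printf '%s\n' "$CATALINA_OPTS"
```

Written without quotes, the shell would end the assignment at the semicolon and then try to run CGProf as a command.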
Add also the following entries to locate and configure various libraries needed by the JPIAgent. This is probably the missing step for a lot of people because it's required but doesn't seem to be in the Eclipse documentation. I found it lurking in a mailing list discussion.
# Make the native code TPTP libraries available to Tomcat
export LD_LIBRARY_PATH=/home/dcminter/agntctrl/lib:/home/dcminter/agntctrl/plugins/org.eclipse.tptp.javaprofiler/
# Specify a directory for logging output. This IS required - I write it
# to the normal Tomcat log directory on Etch. Martini is an Eclipse technology name.
export MARTINI_LOGGER_DIRECTORY=/var/log/tomcat5.5
# But unless I'm trying to debug the configuration I turn off most profiler debugging output.
export MARTINI_LOGGER_LOG_LEVEL=0
# Make the Java code TPTP libraries available to the profiler backend
export JAVA_PROFILER_HOME=/home/dcminter/agntctrl/plugins/org.eclipse.tptp.javaprofiler
As always, note that /home/dcminter should be replaced with your real install directory. This may be your own home directory (in which case $HOME can generally be used) or perhaps somewhere below /usr/share or /opt that will be accessible to multiple users on a shared machine.
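A mistyped path in any of these variables tends to fail quietly, so a quick pre-flight check may be worthwhile before restarting Tomcat. This is a sketch only: check_tptp_env is a hypothetical helper, and it checks nothing beyond the existence of the directories named by the variables above.

```shell
#!/bin/sh
# Hypothetical pre-flight check (not part of TPTP): report any directory named
# in the profiler environment variables that does not exist.
# The body runs in a subshell so that the IFS change stays local.
check_tptp_env() (
    status=0
    # Split LD_LIBRARY_PATH on ':' and test each non-empty entry.
    IFS=:
    for dir in $LD_LIBRARY_PATH; do
        [ -n "$dir" ] || continue
        [ -d "$dir" ] || { echo "missing library dir: $dir"; status=1; }
    done
    [ -d "$JAVA_PROFILER_HOME" ] || { echo "missing JAVA_PROFILER_HOME: $JAVA_PROFILER_HOME"; status=1; }
    [ -d "$MARTINI_LOGGER_DIRECTORY" ] || { echo "missing log dir: $MARTINI_LOGGER_DIRECTORY"; status=1; }
    exit "$status"
)

# Usage: source the same file Tomcat reads, then run the check.
# . /etc/default/tomcat5.5 && check_tptp_env && echo "profiler paths look sane"
```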
Restart Tomcat to re-read the settings from the default config.
dcminter@IS-5285:~/agntctrl/bin$ sudo /etc/init.d/tomcat5.5 stop
Password:
Stopping Tomcat servlet engine: tomcat5.5 . . . .
dcminter@IS-5285:~/agntctrl/bin$ sudo /etc/init.d/tomcat5.5 start
Starting Tomcat servlet engine: tomcat5.5.
dcminter@IS-5285:~/agntctrl/bin$
That's all for the server-side components; at this point you should be able to connect once you have configured the Eclipse TPTP client machine.
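Before moving on to the client it may be worth confirming that the restarted JVM really picked up the agent flag. A minimal sketch, assuming that the flag shows up on the Tomcat process command line (has_profiler_agent is a hypothetical helper name):

```shell
#!/bin/sh
# Hypothetical helper: read process command lines on stdin and succeed only
# if one of them carries the TPTP agent flag added to CATALINA_OPTS.
# The -e is needed because the pattern itself starts with a dash.
has_profiler_agent() {
    grep -q -e '-agentlib:JPIBootLoader=JPIAgent'
}

# Usage: the 'catalina' match is an assumption about how your Tomcat command
# line looks; the [c] bracket keeps grep from matching itself.
# ps -Af | grep '[c]atalina' | has_profiler_agent && echo "agent flag present"
```

If the flag is missing, the init script is probably not reading the file you edited.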
The TPTP Profiler Client
The client installation process is a great deal simpler than that for the server.
Installing the client
Load the workbench project for the application that you wish to profile. You will then need to go to the Software Updates/Find and Install dialog on Eclipse's Help menu. Select Search for new features to install and then click the New Remote Site button. Add a Test and Performance Tools Platform (TPTP) Updates site with URL http://eclipse.org/tptp/updates/
Check the resulting entry in the list of update sites and click the Finish button. You will be presented with a list of components to install (possibly after clicking through a few dialogs). I'm hazy about the exact meaning of the various component names, but I have the following:
- TPTP Monitoring Tools Project
- TPTP Platform Project
- TPTP Profiling for Web Applications
- TPTP Reporting with BIRT
- TPTP Testing Tools Project
- TPTP Tracing and Profiling Tools Project
It's also possible to download and install the components manually from the website but I don't normally do this. Even without adding components manually, the petulant update manager seems to encounter lots of dependency problems - but that's a rant for another time.
Configuring the client
Since this part will be somewhat unfamiliar I'll provide it as a screenshot walkthrough, but probably this is overkill - the configuration process is very simple as long as the agent controller server has been correctly configured.
Once the TPTP tools have been installed you should see the profiler icon on the Eclipse workbench toolbar.

Right click the icon, and from the resulting context menu select the Open Profile Dialog option.

You should now see the profile launch-configuration dialog shown below. This provides options for connection to various server types - although it is possible to set up connections to servers where Eclipse is managing the server deployment process, my assumption is that like me you are connecting to a completely "external" server such as a production environment.
Right click on the Attach To Agent option from the list on the left and select "New" from the popup context menu.

This will create a new set of named profiler configuration parameters. The details will be displayed on the right hand side with a name such as "New_configuration" at the top. To change this name edit the text at the top of the page and select apply.
By default the configuration parameters assume that you will be connecting to a profiling agent on the local machine (localhost) on port 10002. This cannot be removed, but remote hosts can be added to the list of places that agents can be found. Select the Add button to do so.

Selecting the Add button pops up a dialog (shown below) prompting for a server name and port. Put in the connection details of the server running the agent controller - the port is the one specified in the value of the Port element from the serviceconfig.xml file - unless changed it will be port 10002.

Once you click OK you should be returned to the profile dialog with the new server's details in the list.

Select the newly created connection details and click the Test Connection button. If the agent controller is correctly configured and running on the remote server you should see the dialog below confirming that a connection was successfully established to the listening agent controller.
If for some reason the configuration test cannot connect to the agent controller an error message will appear with some limited explanatory text. If this is the case, check the connectivity to the server (including any firewall/port settings and other network infrastructure) and that the agent controller is correctly configured.
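When the connection test fails, it can help to rule out plain network reachability from the client machine first. The following sketch assumes bash and the coreutils timeout command are available on the client; port_open is a hypothetical helper, and it only proves that something accepts TCP connections on the port, not that it is the agent controller.

```shell
#!/bin/bash
# Hypothetical helper: succeed if a TCP connection to host:port can be opened.
# Relies on bash's built-in /dev/tcp redirection and the coreutils `timeout`
# command, so it needs bash rather than a plain POSIX sh.
port_open() {
    local host="$1" port="$2"
    # In the inner shell $0 is the host and $1 is the port.
    timeout 3 bash -c 'exec 3<>"/dev/tcp/$0/$1"' "$host" "$port" 2>/dev/null
}

# Usage (the host name is a placeholder; 10002 is the Port value from
# serviceconfig.xml unless you changed it):
# port_open profiler-host.example 10002 && echo reachable || echo "not reachable"
```

If the port is not reachable, look at firewalls and the Allow host setting before suspecting Eclipse.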

Once you have a successful connection to the agent controller, switch to the Agents tab (shown below) to select the items that will be profiled. Press F5 to refresh the list if the list has not yet been populated (the details of the available agents are retrieved by Eclipse from the remote server).

Drill down the list of available profiling agents and select any and all that are of interest - in our case we have only configured the Call Graph agent, so only one option is listed. Others such as the heap profiler (to check for memory leaks) can be added during the server configuration.

With the specific agent(s) selected you have completed the necessary details of the launch configuration and should apply all the changes. Click the "Profile" button and off you go!
Footnotes.
1. I hate mailing lists. Email in my opinion is the wrong tool for this sort of thing. It should be managed using NNTP or forum software. Just my personal gripe, but searching online-archived mailing lists for archived conversations usually makes me want to pluck my eyes out.
2. Search engines being what they are, I think it's only right to supply the debug flags just in case you got here looking for that and don't give a fig about profiling. For a typical Tomcat server, with the debugger listening on port 4142, you would add the following to the CATALINA_OPTS environment variable:
-Xdebug -Xrunjdwp:server=y,transport=dt_socket,address=4142,suspend=n
3. For a truly secure solution you would want to configure the agent controller and client to use SSL and certificate exchange, but I haven't investigated that as yet so you're on your own there!
I spent a whole day figuring this out!
Notes:
- Windows users: Instead of LD_LIBRARY_PATH use just the plain old PATH variable.
- Append also "c:\agntctrl\bin" (or whatever your directory is) to the PATH variable.