When java object count becomes a bottleneck CPU starts screaming
There is no typical distribution graph, and there is no typical explanation for bad performance. It is so much application and technology related, that you could probably write a small encyclopedia on the subject. One thing though that is available, are the starting points. And those are pretty much the same, giving you the hints where to look into for your performance problem.
I would organize them as following, descending by their priority:
- JVM Configuration and Garbage Collection Strategy
- Thread Pool Configuration
- JDBC Pool Configuration
- Database Server
- I/O usage
The list can of course continue, but it would be detailing of the above points.
As most of the performance experts in the market keep saying, JVM is responsable for 80 % of the performance problem of an application. Now, it is not called a Machine (java virtual machine) for nothing…This thing is so complex, so powerful, and, with the right tools and knowledge, so tunable that can REALLY make the difference between those maximum 30 Transactions and 300 Transactions / second.
Let me get slowly to the point.
While running a load test the previous days, i have noticed that my users were sucessfully calling a method, with good response times to the limit of ~40 transactions / second. Everything above this limit was having longer response times, and by 55 Transactions per second, my server would start refusing further incoming requests.
First thought, after fighting for some time with tuning the Oracle Server, was the database. Each transaction was querying the database for a specific object, and then once more for the children of that object, and that 10 times in a request. That would translate into 100 querys / transaction…so i thought maybe i had a bottleneck on my DB. After monitoring the CPU Usage and the I/O usage, i did not notice any waiting times in servicing requests, so that took Oracle out of the discussion.
Ok, what could be next?
I took a look at my jdbc connection pool, which was proportionally configured to my http thread pool on glassfish. I ran the test once again, by closely monitoring the jdbc pool usage, and the thread pool usage. The peak usage was somewhere about 40 threads used at once, (and 40 jdbc connections), without having any qued request in the connector que. That was clearly telling me that the problem was not in handling and processing the requests.
Therefore, contrary to the usual approach (where i start monitoring the jvm first), i took a closer look to the garbage collection. I noticed that on top of my test (when all users were logged in and firing requests), the Heap was filling up amazingly fast, causing a lot of young garbage collections. Causing young garbage collections was not necessarily the problem, but i was pretty worried about the rate that my Young Generation was filling.
I decided to increase my Young Generation, so i allocated 3 GB to my JVM, and 1 GB to my new generation, in order to see what exactly was happening.
I was little surprised to see that on top of the test (peak), my 1GB young generation was filling up in less than 3 seconds!!! That was an allocation rate of 200 MB / Second. Taking a parallel look to the CPU, i noticed that it was constantly hitting the 80 % line, with a http thread usage of less than 10 %
This machine was so busy allocating objects…
I decided to analyse the garbage collection behaviour using HP Garbage Collection Tool, in order to have a closer, detailed look to the allocation rate of objects. And this is how it looked like:
This test has been ran for 10 times. You can see how it gracefully starts and ends, and the ~ 200 seconds peak period. Well, take any peak period and look on the left. The allocation rate is bigger than 200 MB ! That means that my CPU was busy allocating 200 MB / second, only to clear them right away in a Garbage collection that was started at most 4 seconds later.
Monitoring the Glassfish Server returned this report:
You can see how the CPU was 80 % busy at the peak in the left panel. The lower points, the blue ones, are the garbage collections. Lots of them…You can see how the heap behaved in the right panel.
And for those of you interested, this is how the response times graph looks like:
Once again:
- the left y scale represents the average transactions / second
- first right scale is the average response times / request / second
- second right scale represents the maximum response time of a request in a second
- the x scale is the time scale
This is a sample taken from the main report, and shows the peak usage – this returned the following results:
And this is how the main report looks like, where you can see how at 55 Transactions / second the system hits the margin:
This is what Java world calls object cycling, and is defined as follows :
“objects are rapidly created and destroyed”
Object cycling can be detected in the heap either through a visual tool that samples memory very frequently (once a second or less) you will observe very narrow choppy marks in the heap or through garbage collection logs. The key indicator to look for in reading garbage collection logs is frequent minor collections that reclaim small amounts of memory. Consider that the performance impact of garbage collection on your application is the result of two factors:
* The frequency of garbage collection
* The duration of each garbage collection
Object cycling impacts the frequency of garbage collection and can be therefore be detected by calculating the frequency of garbage collection.
Next step? Identifying the cycling objects, and analysing them for improvement (maybe caching? ) But that is another story.
So there it was, explained, how crazy allocation of object can have a huge impact on Java performance, regardless of how powerfull your server is.
Hope you enjoyed my post,
Cheers,
Alex





wow amazing stuff bro.
Hi,
I’ve seen a few of your posts regarding glassfish and monitoring and figured I’d ask this as I’ve been out of the optimization field for a while and figured i could use some help. We have a glassfish v2.1 app containter with portlets running. We found that upon login on the portal with no extra portlet added, a login will take about 0.7 secs. If we add a portlet, a login with 1 user will take around 3 secs. CPU, memory and I/O are all the same (which would be expected i suppose with that small a load). However, I’d like to be able to explain the time difference. Any tips on how i could do this?
Cheers!
Hi Richard,
if you say that the CPU and Memory is the same, i would expect you have a thread contention problem. If you have left your thread pool with the default configuration of minimum 2 threads and maximum 5 threads, it may be very well possible that the business case of the portlet could be the one occupying the resources. Now if i think about it real quickly, i would also bet on the jdbc connection pool, in the case that the portlet is accessing the database. A poorly programmed portlet could be occupying all 5 jdbc threads at once, putting your main application in a wait state.
You should be able to see this by following one of the two (or both) test scenarios. What you first need is Visual VM to have a feeling of the usage of your threads.
1. Run your test with a single user, checking on the thread usage in Visual VM. If you see all 5 of the threads being occupied at once, than you know where your performance bottleneck is, and you can switch to profiling.
2. Profile directly, using a profiler like JProbe or YourKit, with Thread Analysis enabled. YourKit also has JSP Monitoring and SQL Performance Tracing. You could for example see how many queries are being run against the database by your portlet (and application)
3. Run your test with increased number of users, checking the usage of the resources in VisualVM
Please let me know if this helps.
Alex
Hi,
The information definitely helps! A few updates and a couple of questions. After using mpstat instead of iostat for CPU, there’s is a difference in CPU between the 2 scenarios. With the extra portlet, the server has 16 CPUs and one of them goes down to around 15% idle.
To monitor the threads in VisualVM, what threads should i be looking for? I’m running the application on port 8080. I see 5 HttpSSLWorkerThread-8080 threads.
I had ran YourKit profile against the app already but I’ll have to retry as i don’t think i had thread Analysis enabled. The profile seemed to have a fair amout of overhead as the execution time increased. I noticed there was a query that had 1,000 invocations and jdbc2 connections being closed almost 3,000 times. Seems high but i don’t really know what the portlets are doing. They are definitely connection to the database. This is for one single login for a user.
Thanks for the suggestions! I had only did optimizations on asp and windows/linux before so I’m learning as I go.
Cheers,
Charles
Hi Charles,
happy to see you’re getting to it and that it helps.
A query of 1000 invocations is really, but really bad. I just can’t see why one query would be running 1000 invocations in order to probably retrieve 1000 resultsets. If you use YourKit with SQL Profiling enabled, you will also see what the queries are. You can also set a driver between your application and the server, something like the p6spy. It’s a little bit of work to configure it, but i’m sure you’ll get it.
You really need to see what that query is.
Second, one thing that denotes poor programming is the use of straight sql statements. Try to make sure that you are using prepared statements, therefore taking advantage of the databases’s caching of queries.
Let me now if i can help further on
Cheers,
Alex
Hi Alex,
I’ve seen what the worst query is through the profiler (YourKit) but when i do an explain plan on it, it seems ok (DB is MySQL). The profiler says the query took 1.2 secs each time but I’m wondering if that’s not because of table locking. I’ve looked into the JDBC Connection pool but that seems ok as well (through asadmin get command).
I’ve also noticed the Http service property CountTotalQueued gets incremented by almost 500 on every single login i do which seems odd.
At this point, it looks like the HTTP service is waiting on something, quite probably the database but not sure how to prove that at the time being. I tried doing a kill -3 on the glassfish thread to see what threads were waiting but i don’t know how to distinguish which threads i need to analyze. My analysis might not be correct either.
Thanks on all the help so far, I might not have the problem solved yet but I’m learning things as I go and feel that I’m starting to understand the methodology i would take to solve such problems.
Cheers,
Charles
Hi Charles,
if the query is taking 1,2 seconds, and is called only once, than there is no problem. If on the other hand, your query is being run repeatedly, then you might have a problem.
Try the following. Set the http thread pool and jdbc connection pool to an equal size of maximum 2. Run your login scenario, and do a thread dump. The threads you want to check are “http-thread-pool-1 and http-thread-pool-2″ You can then follow the stack trace to see who is blocking.
I can’t believe that mysql is blocking on “read” queries. I cannot imagine any database blocking on read, not these days. So the problem might be that each thread has to retrieve the information, but is being blocked by the low number of jdbc connections. I mean, if you have 500 requests waiting in the queue, they can be waiting on a thread, or a database connection.
The other thing you might want to try to prove this theory, is the other way around, increasing the http thread pool and jdbc pool to 200 for example. You can enable monitoring of http thread usage and jdbc connection pool usage via mbeans in visualgc (amx -> module monitoring levels -> http service(high), thread-pool(high), jdbc-connection-pool(high)) Once you enable monitoring, you will have at the end a peak value that you can check against your scenario.
Now, on the other hand, why would you send tenth, hundreds of queries to the database for retrieving a piece of information? I mean, conceptually this is wrong. Why not try to retrieve the data via one query, and extract the information in your portlet based on your business needs?
And to answer your question regarding what to monitor – if you run your app directly over glassfish (no web server between), then you want to monitor the threads called “http-thread-x”
Other things to monitor are the ejb cache and ejb pool. Are you using entity beans for your application?
Let me know how this works,
Cheers,
Alex
Hi,
Sorry but my priorities got realigned. Still curious to find out what was going on so I’m going i can get back to it in a couple of weeks.
Cheers and thank you,
Charles