Command Line Glassfish Monitoring in Jasper Reports using Glassfish Rest Monitoring
Recently I have spent some time thinking about how I can integrate active monitoring of critical Glassfish resources into my reactive performance reports. I had a performance report showing me both response times and throughput, but I needed to know why, at some points in time, my throughput was decreasing. There could have been several reasons, like JMS, JDBC, too many open connections, etc. I needed to have those values in my report, so I could easily identify problems without having to watch the server myself while testing. So I started thinking about ways to do it.
At first, I wanted to monitor the following resources (there are more to follow, but for now I’ll stick to these, as they were the ones I needed in the beginning):
- jdbc usage – the number of connections used at runtime
- threads busy – the number of busy threads at runtime
- open http connections – the number of open connections at runtime
- keep alive connections – the number of connections in keep-alive mode at runtime
- http peak queued requests – the peak number of requests that had to wait in a queue before being processed
- count of specific beans in our software
I also needed to develop a solution that would allow me to add new monitoring statistics later, in a “plug-in” fashion.
Basically, there are three ways (that I am aware of, at least) to do it:
- Using JMX programmatically
Basically, you use the JMX API, by connecting to the MBean server, and then interrogating each managed bean according to your needs. As I prefer shell scripting to developing (sic!), I turned my back on this solution, and started looking into the next one. Those who want to see this in action can check it out here:
- Using a custom built JMX command line monitor
I needed a tool that could interrogate the MBean server and write the result to a file at regular intervals. It had to be controllable through my shell script, which would decide what to interrogate and when. There are several solutions on the market (I am talking about the open source ones), like:
- jmxterm – http://wiki.cyclopsgroup.org/jmxterm – I think of all that I have tried, this one was the easiest and most straightforward to use. The best thing about it is that you can put the commands to be sent into a file, and use the file as command input. That way, if your target is to interrogate the server at long intervals (let’s say every 5 minutes), then jmxterm is a pretty good solution.
- command line jmx client – http://crawler.archive.org/cmdline-jmxclient/ (quite old, not very flexible, only one command at a time)
- Using the REST Management Interface provided by Glassfish
Basically, Glassfish provides two interfaces, one for managing resources, the other one for monitoring.
- Glassfish Management Interface – http://yourserver:4848/management/domain
- Glassfish Monitoring Interface – http://yourserver:4848/monitoring/domain
The fact that every attribute can be monitored via an HTTP request makes this the best candidate for my purposes. Let me go into the details.
Glassfish comes preconfigured with the monitoring level of every module set to OFF. This means that in order to start monitoring, we need to enable monitoring for specific modules. Let’s do that, one by one, for the modules that we need. You need to call the management interface of Glassfish:
http://yourserver:4848/management/domain (or go straight to the monitoring levels page at http://yourserver:4848/management/domain/configs/config/server-config/monitoring-service/module-monitoring-levels). Set the desired modules to HIGH.
The changes are dynamic; you do not need to restart the server. Let’s check if this worked. I am going to request the keep-alive statistics by calling the monitoring interface of the http listener in a browser:
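If you would rather stay on the command line for this step too, the same levels can be set with asadmin. The lines below are only a sketch: the helper name gf_monitoring_level_cmd is made up for illustration, the list of modules is my assumption of what the statistics above need, and the commands are printed rather than executed so that you can review them first (depending on your setup, asadmin may also need --user and --passwordfile).

```shell
# Print the asadmin command that sets the monitoring level of one module.
# Assumption: the default "server" target is used.
gf_monitoring_level_cmd() {
    # $1 = module name (e.g. jdbc-connection-pool), $2 = level (OFF, LOW or HIGH)
    echo "asadmin set server.monitoring-service.module-monitoring-levels.$1=$2"
}

# Assumed list of modules; adapt it to what you want to monitor.
for module in jdbc-connection-pool thread-pool http-service ejb-container; do
    gf_monitoring_level_cmd "$module" HIGH
done
```

Once the printed commands look right, they can be pasted into a shell or piped to sh.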
http://yourserver:4848/monitoring/domain/server/network/http-listener-1/keep-alive
The result looks like this:
This works for every attribute that you would normally query through the MBeans browser in JConsole or VisualVM. Now, I was speaking about needing a way to control the requests and the delays from the command line (shell script), and a way to import the data and present it in the same report as my performance data. I needed some way to make the request from the command line, so I first turned to wget. Unfortunately, I could not get wget to append the result to a file the way I wanted, so I switched to curl. The request that I just sent above, using a browser, can now be sent as a curl request:
curl -s -u admin:adminadmin http://yourserver:4848/monitoring/domain/server/network/http-listener-1/keep-alive
Now, this is the last level of granularity that you will obtain. This request will provide you with all keep alive monitoring statistics:
- countconnections
- counttimeouts
- secondstimeouts
- maxrequests
- countflushes
- counthits
- countrefusals
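Depending on the Glassfish version and on the client, the monitoring interface may answer with an HTML page rather than JSON. Asking for JSON explicitly with an Accept header takes the guesswork out of it. The sketch below wraps that in a small function; gf_monitor_json, GF_CURL, GF_USER and GF_PASSWORD are names I made up for illustration, and the credentials are the same placeholders as above.

```shell
# GF_CURL can be pointed at a stub to dry-run the function without a server.
GF_CURL=${GF_CURL:-curl}

# Request one monitoring page as JSON.
# $1 = base URL (e.g. http://yourserver:4848)
# $2 = path below /monitoring/domain/server (e.g. network/http-listener-1/keep-alive)
gf_monitor_json() {
    $GF_CURL -s -u "${GF_USER:-admin}:${GF_PASSWORD:-adminadmin}" \
        -H "Accept: application/json" \
        "$1/monitoring/domain/server/$2"
}

# Example call (needs a running server):
# gf_monitor_json http://yourserver:4848 network/http-listener-1/keep-alive
```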
Now, if we are only interested in the number of keep alive connections, we need to extract that from the answer. Nothing easier, when using regular expressions. Let’s do that for “countconnections”. We will store the result in a variable, using shell scripting:
HTTP_KEEP_ALIVE_CONNECTIONS=$(curl -s -u admin:adminadmin http://yourserver:4848/monitoring/domain/server/network/http-listener-1/keep-alive | grep countconnections | grep -o -E '"count":[0-9]*' | sed 's/["]*[a-z]*["][:]*//')
This will return the value of the keep alive connections parameter alone. If we wanted to check on the number of open http connections, we would use:
HTTP_CONNECTIONS_OPEN=$(curl -s -u admin:adminadmin http://yourserver:4848/monitoring/domain/server/network/http-listener-1/connection-queue | grep countopenconnections | grep -o -E '"count":[0-9]*' | sed 's/["]*[a-z]*["][:]*//')
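The grep chain above relies on each statistic sitting on its own line of the response. If the server sends the whole JSON answer on a single line, a slightly different approach is to first cut out the block of the statistic and only then pick the field. The following is a minimal sketch of that idea; the function name gf_extract_stat and the sample response are my own assumptions (the real response carries more fields), and it will not cope with statistics that contain nested objects.

```shell
# Read a monitoring response on stdin and print one numeric field of one
# statistic, e.g. the "count" of "countconnections".
gf_extract_stat() {
    local stat=$1 field=$2
    # 1. isolate the {...} block of the statistic, 2. isolate the field,
    # 3. keep the first hit, 4. strip everything up to the colon
    grep -o "\"$stat\":{[^}]*}" |
        grep -o -E "\"$field\":-?[0-9]+" |
        head -n 1 |
        sed 's/.*://'
}

# Assumed sample response, shortened and deliberately on a single line
SAMPLE='{"extraProperties":{"entity":{"countconnections":{"count":2,"lastsampletime":1365495480248,"unit":"count"},"counthits":{"count":7,"lastsampletime":1365495480248,"unit":"count"}}}}'

printf '%s\n' "$SAMPLE" | gf_extract_stat countconnections count
printf '%s\n' "$SAMPLE" | gf_extract_stat counthits count
```

With the sample above, the two calls print 2 and 7. In a real script the input would come from the curl request instead of the SAMPLE variable.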
That gives you so much flexibility, doesn’t it? Let’s collect all this information at regular intervals, using a function. We will collect statistics every x seconds, as long as the monitoring process is enabled. We enable the monitoring process by creating a temporary file called “/tmp/glassfish_stats”. The idea behind this is to start monitoring when we start the load test, and stop monitoring once the last request of the load test has been sent (at which point we remove the /tmp/glassfish_stats file, thereby stopping the monitoring process).
function trace_gf_statistics
{
    CMD_PARAM_PROTOCOL=$1
    CMD_PARAM_SERVER=$2
    CMD_PARAM_PORT=$3
    CMD_PARAM_INTERVAL=$4
    #Base URL of the monitoring interface, built from the parameters
    GF_URL="$CMD_PARAM_PROTOCOL://$CMD_PARAM_SERVER:$CMD_PARAM_PORT/monitoring/domain/server"
    #Check on temporary file. As long as it exists, monitor every x defined seconds. Once it is removed, stop monitoring
    while [ -e /tmp/glassfish_stats ];
    do
        MONITOR_TIMESTAMP=$(date +%H-%M-%S)
        #JDBC Monitoring
        JDBC_CONN_USED=$(curl -s -u admin:adminadmin "$GF_URL/resources/mypool" | grep numconnused |
            grep -o -E '"current":[0-9]*' | sed 's/["]*[a-z]*["][:]*//')
        #Thread Pool Monitoring
        HTTP_THREAD_POOL_THREAD_COUNT=$(curl -s -u admin:adminadmin "$GF_URL/network/http-listener-1/thread-pool" | grep currentthreadsbusy |
            grep -o -E '"count":[0-9]*' | sed 's/["]*[a-z]*["][:]*//')
        #Keep Alive Connections
        HTTP_KEEP_ALIVE_CONNECTIONS=$(curl -s -u admin:adminadmin "$GF_URL/network/http-listener-1/keep-alive" | grep countconnections |
            grep -o -E '"count":[0-9]*' | sed 's/["]*[a-z]*["][:]*//')
        #Open connections
        HTTP_CONNECTIONS_OPEN=$(curl -s -u admin:adminadmin "$GF_URL/network/http-listener-1/connection-queue" | grep countopenconnections |
            grep -o -E '"count":[0-9]*' | sed 's/["]*[a-z]*["][:]*//')
        #Peak queued connections
        HTTP_CONNECTIONS_PEAK_QUEUED=$(curl -s -u admin:adminadmin "$GF_URL/network/http-listener-1/connection-queue" | grep peakqueued |
            grep -o -E '"count":[0-9]*' | sed 's/["]*[a-z]*["][:]*//')
        #Bean Monitoring
        BEANMON_MYTESTBEAN_CURRENT=$(curl -s -u admin:adminadmin "$GF_URL/applications/myapplication/mytestbean/bean-cache" | grep numbeansincache |
            grep -o -E '"current":[0-9]*' | sed 's/["]*[a-z]*["][:]*//')
        BEANMON_MYTESTBEAN_PEAK=$(curl -s -u admin:adminadmin "$GF_URL/applications/myapplication/mytestbean/bean-cache" | grep numbeansincache |
            grep -o -E '"highwatermark":[0-9]*' | sed 's/["]*[a-z]*["][:]*//')
        #Append one "timestamp:label:value" line per resource to the export file
        echo "$MONITOR_TIMESTAMP:JDBC - Connections used:$JDBC_CONN_USED" >> "${JMETER_RESULTS}/glassfish_stats.log"
        echo "$MONITOR_TIMESTAMP:HTTP - Thread Usage:$HTTP_THREAD_POOL_THREAD_COUNT" >> "${JMETER_RESULTS}/glassfish_stats.log"
        echo "$MONITOR_TIMESTAMP:HTTP - Keep Alive Connections:$HTTP_KEEP_ALIVE_CONNECTIONS" >> "${JMETER_RESULTS}/glassfish_stats.log"
        echo "$MONITOR_TIMESTAMP:HTTP - Open Connections:$HTTP_CONNECTIONS_OPEN" >> "${JMETER_RESULTS}/glassfish_stats.log"
        echo "$MONITOR_TIMESTAMP:HTTP - Peak Queued Connections:$HTTP_CONNECTIONS_PEAK_QUEUED" >> "${JMETER_RESULTS}/glassfish_stats.log"
        echo "$MONITOR_TIMESTAMP:Beans - Mytestbean Current:$BEANMON_MYTESTBEAN_CURRENT" >> "${JMETER_RESULTS}/glassfish_stats.log"
        echo "$MONITOR_TIMESTAMP:Beans - Mytestbean Peak:$BEANMON_MYTESTBEAN_PEAK" >> "${JMETER_RESULTS}/glassfish_stats.log"
        sleep "$CMD_PARAM_INTERVAL"
    done
    #Monitoring has stopped: hand the export file over for transformation
    cp "${JMETER_RESULTS}/glassfish_stats.log" "${JMETER_TRANSFORMATION}/glassfish_stats.log"
}
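The start/stop mechanism is worth a closer look, since it is independent of Glassfish: a loop keeps running a command for as long as a flag file exists, and removing the file ends the loop after the current round. Here is the pattern on its own, as a sketch with a hypothetical helper called run_while_flag_exists; the usage in the comments assumes the monitoring and the load test are started from the same script.

```shell
# Run a command every $2 seconds for as long as the flag file $1 exists.
# Everything after the first two arguments is the command to run.
run_while_flag_exists() {
    local flag_file=$1 interval=$2
    shift 2
    while [ -e "$flag_file" ]; do
        "$@"
        sleep "$interval"
    done
}

# Typical use around a load test (collect_sample is a placeholder):
#   touch /tmp/glassfish_stats                                    # enable monitoring
#   run_while_flag_exists /tmp/glassfish_stats 5 collect_sample & # monitor in the background
#   ... run the load test ...
#   rm /tmp/glassfish_stats                                       # disable monitoring
#   wait                                                          # let the last round finish
```

Running the loop in the background is what allows the same script to go on and start the load test; the final wait makes sure the export file is complete before it is processed.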
So basically, I am querying the resources every x seconds and appending them to an export file, which I will use in the end for importing and transforming. As you can see, I use a “:” delimited file, to which I export the following:
- timestamp
- monitored resource
- value
In the end, it looks something like this:
16-07-05:JDBC - Connections used:0
16-07-05:HTTP - Thread Usage:0
16-07-05:HTTP - Keep Alive Connections:2
16-07-05:HTTP - Open Connections:16783
16-07-05:HTTP - Peak Queued Connections:51
16-07-05:Beans - Mytestbean Current:0
16-07-05:Beans - Mytestbean Peak:177
Now, all I have to do is import the results into the database and prepare the report. And this is how it looks in the end:
And now bean monitoring:
And this is what the final report looks like (sorry, I will get back with a clearer view of it. I haven’t had the time to adapt this one to paper size yet, so I only have it in extended format):
The best part of this is that if you need to monitor a new resource, it all comes down to two things:
- Setting the variable and the curl request
- Adding the result to the export file
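If the list of resources keeps growing, those two steps can be folded into one by describing each resource as a line in a small table and letting a loop do the rest. This is only a sketch of that idea, not what my script does today: the table format, GF_BASE_URL, gf_fetch and gf_sample_all are all names I made up, and the extraction assumes a JSON response in which each statistic is a flat {...} block.

```shell
# Assumed base URL of the monitoring interface
GF_BASE_URL=${GF_BASE_URL:-http://yourserver:4848/monitoring/domain/server}

# Hypothetical resource table, one line per resource:
# label|monitoring path|statistic|field
GF_RESOURCES='JDBC - Connections used|resources/mypool|numconnused|current
HTTP - Thread Usage|network/http-listener-1/thread-pool|currentthreadsbusy|count
HTTP - Keep Alive Connections|network/http-listener-1/keep-alive|countconnections|count'

# Fetch one monitoring page as JSON; can be replaced by a stub for testing.
gf_fetch() {
    curl -s -u admin:adminadmin -H "Accept: application/json" "$GF_BASE_URL/$1"
}

# Print one "timestamp:label:value" line per entry in GF_RESOURCES.
gf_sample_all() {
    local ts label path stat field value
    ts=$(date +%H-%M-%S)
    while IFS='|' read -r label path stat field; do
        value=$(gf_fetch "$path" |
            grep -o "\"$stat\":{[^}]*}" |
            grep -o -E "\"$field\":-?[0-9]+" |
            head -n 1 | sed 's/.*://')
        echo "$ts:$label:$value"
    done <<EOF
$GF_RESOURCES
EOF
}
```

Monitoring a new resource then means adding one line to GF_RESOURCES, and the output of gf_sample_all can be appended to the export file as before.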
The structure of the database, organised as (test_id,timestamp,label,value), will take this on the fly, regardless of the number of monitored resources. You can add as many subreports as you want, monitoring pretty much everything that you need. The timestamps will help you check what happened at a specific point in time (for example when the throughput decreased).
This kicks the hell out of commercial tools, doesn’t it?
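To give an idea of how the export file maps onto that structure, the lines below turn the “:” delimited file into comma separated rows, one per (test_id,timestamp,label,value). Treat it as a sketch: the function name stats_to_csv is invented for this example, the test id is passed in by hand, and how the rows are then loaded depends on your database.

```shell
# Convert "timestamp:label:value" lines on stdin into
# "test_id,timestamp,label,value" rows. $1 = test id.
# Lines that do not have exactly three ":" separated fields are skipped.
stats_to_csv() {
    awk -F: -v id="$1" 'NF == 3 { print id "," $1 "," $2 "," $3 }'
}

# Example with two lines in the format of the export file
printf '%s\n' \
    '16-07-05:JDBC - Connections used:0' \
    '16-07-05:HTTP - Open Connections:16783' | stats_to_csv 42
```

For the two sample lines and test id 42 this prints 42,16-07-05,JDBC - Connections used,0 and 42,16-07-05,HTTP - Open Connections,16783.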
Good luck with setting your monitoring environment, and let me know if you encounter any problems.
cheers,
Alex





Hi, very cool work! I’m sticking around.
Hi, I’ve tried it, but I just can’t grep the output of curl, since it is all on a single line, and I always get null for each resource.
Returning nulls looks like your monitoring is not turned on. Could that be the case?
Could you send me the curl statement and the result?
Thank you very much for your response! The monitoring is on, since the response of curl is OK, but as I said it is on a single line, so grep fails.
here is the curl statement
curl -s -k -u admin:glassfish https://xxx.kkk.uuu.yyy:4848/monitoring/domain/server/resources/KKKK | grep -o 'numconnused' | grep -o -E '"current":[0-9]*' | sed 's/["]*[a-z]*["][:]*//'
and this the result
echo ‘Oracle GlassFish Server 3.1.2 REST Interface Oracle GlassFish Server 3.1.2 REST Interfaceaverageconnwaittimecount : 6lastsampletime : 1365082035194description : Average+wait-time-duration+per+successful+connection+requestunit : millisecondname : AverageConnWaitTimestarttime : 1365010472285connrequestwaittimehighwatermark : 1355lastsampletime : 1365070557193description : The+longest+and+shortest+wait+times+of+connection+requests.+The+current+value+indicates+the+wait+time+of+the+last+request+that+was+serviced+by+the+pool.unit : millisecondname : ConnRequestWaitTimestarttime : 1365010472285current : 1lowwatermark : 0frequsedsqlquerieslastsampletime : -1description : Most+frequently+used+sql+queriesunit : Listname : FreqUsedSqlQueriesstarttime : 1365010472347current : numconnacquiredcount : 228lastsampletime : 1365070557193description : Number+of+logical+connections+acquired+from+the+pool.unit : countname : NumConnAcquiredstarttime : 1365010472285numconncreatedcount : 526lastsampletime : 1365081870707description : The+number+of+physical+connections+that+were+created+since+the+last+reset.unit : countname : NumConnCreatedstarttime : 1365010472285numconndestroyedcount : 518lastsampletime : 1365081870107description : Number+of+physical+connections+that+were+destroyed+since+the+last+reset.unit : countname : NumConnDestroyedstarttime : 1365010472285numconnfailedvalidationcount : 0lastsampletime : -1description : The+total+number+of+connections+in+the+connection+pool+that+failed+validation+from+the+start+time+until+the+last+sample+time.unit : countname : NumConnFailedValidationstarttime : 1365010472285numconnfreehighwatermark : 8lastsampletime : 1365081870707description : The+total+number+of+free+connections+in+the+pool+as+of+the+last+sampling.unit : countname : NumConnFreestarttime : 1365010472282current : 8lowwatermark : 0numconnnotsuccessfullymatchedcount : 0lastsampletime : -1description : Number+of+connections+rejected+during+matchingunit : countname : 
NumConnNotSuccessfullyMatchedstarttime : 1365010472285numconnreleasedcount : 228lastsampletime : 1365070557202description : Number+of+logical+connections+released+to+the+pool.unit : countname : NumConnReleasedstarttime : 1365010472285numconnsuccessfullymatchedcount : 0lastsampletime : -1description : Number+of+connections+succesfully+matchedunit : countname : NumConnSuccessfullyMatchedstarttime : 1365010472285numconntimedoutcount : 0lastsampletime : -1description : The+total+number+of+connections+in+the+pool+that+timed+out+between+the+start+time+and+the+last+sample+time.unit : countname : NumConnTimedOutstarttime : 1365010472285numconnusedhighwatermark : 2lastsampletime : 1365070557202description : Provides+connection+usage+statistics.+The+total+number+of+connections+that+are+currently+being+used%2C+as+well+as+information+about+the+maximum+number+of+connections+that+were+used+%28the+high+water+mark%29.unit : countname : NumConnUsedstarttime : 1365010472282current : 0lowwatermark : 0numpotentialconnleakcount : 0lastsampletime : -1description : Number+of+potential+connection+leaksunit : countname : NumPotentialConnLeakstarttime : 1365010472285numpotentialstatementleakcount : 11lastsampletime : 1365043843550description : The+total+number+of+potential+Statement+leaksunit : countname : NumPotentialStatementLeakstarttime : 1365010472347numstatementcachehitcount : 0lastsampletime : -1description : The+total+number+of+Statement+Cache+hits.unit : countname : NumStatementCacheHitstarttime : 1365010472347numstatementcachemisscount : 0lastsampletime : -1description : The+total+number+of+Statement+Cache+misses.unit : countname : NumStatementCacheMissstarttime : 1365010472347waitqueuelengthcount : 0lastsampletime : -1description : Number+of+connection+requests+in+the+queue+waiting+to+be+serviced.unit : countname : WaitQueueLengthstarttime : 1365010472285Child ResourcesArmItemGAPromotionHk‘ | grep numconnused
I’ve tried to overcome this using
asadmin --user admin --passwordfile pwdf get --monitor server.resources.KKKK.* | grep numconnused-current | grep -o -E '"numconnused-current"=[0-9]*' | sed 's/["]*[a-z]*["][:]*//'
The result has newlines, so it should be OK. Unfortunately I have very little time to experiment with it right now; the only thing missing is the time information.
Thanks again !
Hi Dario,
you could query the attribute directly, for example the number of connections used:
curl -s -k -u admin:adminadmin https://myserver:11048/monitoring/domain/server/resources/mypool/numconnused | grep -e 'numconnused' | grep -Eo 'lastsampletime : [0-9]*|current : [0-9]*'
This will return:
lastsampletime : 1365495480248
current : 0
Note that the last grep statement returns the timestamp too. That is the quickest solution I can think of. When I have some more time, I will try to find a way to have it all in one.
Hope that helps for now, otherwise let me know.
Alex
Thank you again! When I have some time, I’ll try to stick with it.
Dario