Classic Java OutOfMemoryError

Few months back, one of the performance testing team in the hospitality industry encountered OutOfMemoryError during the endurance runs at the end of 2 hours. They repeated the endurance run again, but the end result was same.  Without a second thought, the performance testing team increased the heap to 8 GB from the existing 4 GB. This time the OutOfMemoryError had occurred at the end of 4 hours.

About the application: The AUT (Application under test) was very simple, users login into the system and does 3000 iterations of invoking a static page and then logs out.

Requirements: At any given point of time 8000 users have to be logged into the system. There are around 100K users available in the system

Analysis: Ideally there should not be any problem with such simple applications. Knowing this team for a while, asked them few questions:

  • What does the application perform?
  • How the users login and what does the user do?
  • Will the same users logins into the system or a new user logins every time?

Based on the responses provided by the team, came to a conclusion that the session is not getting in-validated when the user logs out.

Looked at the Session Keep-Alive time and found that this value has been configured to 24 hours.

After reducing the Session Keep-Alive time to 30 minutes, the OutOfMemoryError has been resolved even without looking at the heapdump (lots of time has been saved!!).

Conclusion: As a performance tester, very first thing is understand the application behavior.  Never change the values without fully understanding the impact of the change.


Commands to execute SQL tuning advisor based on SQL Text

l_sql VARCHAR2(32766);
l_sql_tune_task_id VARCHAR2(1000);
l_sql := ‘SELECT c1, c2 from Emp where c3=”Something”;
l_sql_tune_task_id := DBMS_SQLTUNE.create_tuning_task (
sql_text => l_sql,
user_name => ‘scott’,
scope => DBMS_SQLTUNE.scope_comprehensive,
time_limit => 1800,
task_name => ‘Task_r10’,
description => ‘Task 10.’);
DBMS_OUTPUT.put_line(‘l_sql_tune_task_id: ‘ || l_sql_tune_task_id);


EXEC DBMS_SQLTUNE.execute_tuning_task(task_name => ‘Task_r10’);


SET LONG 1000000;
SELECT DBMS_SQLTUNE.report_tuning_task(‘Task_r10’) AS recommendations FROM dual;

For executing SQL Tuning Advisor based on SQL_ID Click Here

Hope it helps!!!

Identifying Incompetent Managers…

How do we identify incompetent managers or credit stealing managers?

Watch out for the word ‘I’. The way these people communicate to their managers or in the team meetings or in the emails is “I have done this”, even though the team has done it. They take the credit for the teams work. For them, they come first instead of the team.

When there is a SUCCESS, the manager takes the credit, if there is a FAILURE, the individual gets the credit.

One more pointer in identifying these persons is, they don’t take the ownership, but expects the others to take ownership.

How  to deal with such people is posted here.


Dealing with Incompetent Managers…

Just dealing with credit stealing manager should not be much  of problem (even though it’s a headache) .. One way to overcome this issue to provide minimum information and when needed provide all information in front of larger audience. This is  one of way this situation can be dealt with.

What if this credit stealing manager is also an egoistic? Two options, 1. to have patience. 2. Leave the current job

What if this credit stealing manager is egoistic, without communication skills like phone etiquette? In addition to the above mentioned approaches, bear with tantrums.

What if this credit stealing manager is egoistic, without communication skills like phone etiquette, lack of stakeholder management skills? In addition to the above approaches, for the sake of project keep on begging with the stakeholders for support. Ignore the manager and have direct contacts with the stakeholders when support is required

What if this credit stealing manager is egoistic, without communication skills like phone etiquette, lack of stakeholder management skills and cribbing for everything with higher management, about the colleagues? Mostly depends on the higher management how they approach this situation.

What if this credit stealing manager is non-technical and believes to be technical (and claims as a Architect) is egoistic, without communication skills like phone etiquette, lack of stakeholder management skills, cribbing for everything with higher management, about the colleagues, team  & on the stakeholders for not supporting?

There is no point in talking to the higher management as they themselves are incompetent. Had they been competent enough, this situation would not have arrived. It’s just the persons hard luck.

On top of this, information hiding … What these managers don’t understand is, they are making the team incompetent without sharing the information. At the end of  the day, the team would  lose interest. It’s not loss to the team, its loss to them only.  Manager’s role is to enable the team members. Provide comfortable environment for the team to work.

I have been thinking of posting this for long time … more than 3 month. But somehow could  post now only.

How to identify such person  is posted here.

Suggestions are most welcome.

A potential memory leak – Metaspace leak

For the last few weeks, I have been working on Benchmarking a Product that was devloped using JDK 1.8u65 and deployed on Weblogic 12C.

Started the initial load tests  just to see how the application is behaving and during that process was observing DB and JVM performance. Found few issues during the load testing process.

  1. Inserts in the DB was happening 1 at a time
  2. Complete Refresh was happening every 30 seconds.
  3. Full GC’s were observed

The issue with point 1 is, too many network round trips to the database and also the log file sync wait event is the top wait event. Informed the dev team to make changes in the code so that the records get commited to the database in batches, thereby reducing the no. of network round trips.

The issue with point 2 is, MV refresh happens for the  all the records in the database every time the refresh happens. We dont see much impact when the number of records in the database size is in few MBs. The real problem would arise when the database grows to few GB/TBs. Asked the Dev team to implement incremental refresh.  Only for the first time complete refresh happens there onwards refresh happens for the delta records.

In the meantime, started analyzing  the GC logs and the reason for the Full GCs. At first glance, full gcs seems to be okay. On further analyzing the full gcs, the metaspace was also getting resized (space getting reclaimed) which is not normal. We know that young and old region gets resized (space getting reclaimed) during minor and major gcs. But why is metaspace is getting resized that too in a steady state.

There might be two reasons for this.

  1. The load testing process (the way the requests are sent). This can be ruled out as I do the inilization once and then the main transactions run.
  2. New instances are getting created again and again. This is the only option left to be explored. At this point, I cannot involve the developers as they would ask me what are those instances?

As a next step, took 3 complete heap dumps when the heap occupancy was at different stages (immediately after the full gc, when heap is half full and when the heap is about get full gc’ed).

Analyzed all the heap dumps, and found out a particular class instance was responsible for this. I could see the  no. of instance of this particular class increasing in all those three dumps.

Now, I had the required data, checked with the developer if the particular class instance is being used in the code, he confirmed that his part of the code is creating the class instances.

Oooola.. got the fix.  Again repeated the same process. This time I could see only a few instances of those class. The leak is fixed.

Hope it helps.

Debugging JavaDB Query Performance

— turn on RUNTIMESTATISTICS for connection:

— Indicate that statistics information should be captured into
— database tables in the SomeSchema schema:

—execute  the queries:
SELECT * FROM <SomeTable> WHERE someName = ‘XYZ’ ;

–turn off runtime statistics:
–Retrieve the text of statements which were captured, in order by the time when the statistics were captured:
select stmt_text, xplain_time from SomeSchema.sysxplain_statements    order by xplain_time;

–Retrieve the text of statements which were captured, showing the statements which took the longest time to execute first:
select s.stmt_text, st.execute_time from SomeSchema.sysxplain_statements s, SomeSchema.sysxplain_statement_timings st where s.timing_id = st.timing_id    order by st.execute_time desc

–Show the statements that were executed, together with the result sets that each statement required:
select st.stmt_text, rs.op_identifier from SomeSchema.sysxplain_statements st  join SomeSchema.sysxplain_resultsets rs  on st.stmt_id = rs.stmt_id

–Find statements which resulted in an external sort being performed:
select s.stmt_text, s.stmt_id, rs.op_identifier, srt.no_input_rows from SomeSchema.sysxplain_sort_props srt, SomeSchema.sysxplain_resultsets rs, SomeSchema.sysxplain_statements s where rs.stmt_id = s.stmt_id and rs.sort_rs_id = srt.sort_rs_id and srt.sort_type = ‘EX’

–Find statements which resulted in a tablescan:
select st.stmt_text, sp.no_visited_pages, sp.no_visited_rows from SomeSchema.sysxplain_scan_props sp, SomeSchema.sysxplain_resultsets rs, SomeSchema.sysxplain_statements st where st.stmt_id = rs.stmt_id and  rs.scan_rs_id = sp.scan_rs_id and  rs.op_identifier = ‘TABLESCAN’ and sp.scan_object_name = ‘<SomeTable>’

Tracing Derby SQL statements

I am working on a project which has a derby db alias apache derby alias javadb as one of the components which stores the configurations (javadb is suitable for in-memory operations for small no. of clients). The application which accesses the derby db has high response times.
To troubleshoot the high response times, I have enabled tracing of sql statements that are fired from the application.

  • Created file and added a property “derby.language.logStatementText=true”
  • Restarted the application server including the derby database.

After that, all the sql statements will be logged into derby.log file.

Hope it helps

Note: tracing enables lots of logging of sql statements. Be careful while  dealing with the logs that are generated due to tracing

Configuring GC log Issue

I am trying to configure GC logs in weblogic startup scripts on one of the linux servers (oel7)

JDK: 1.7u80

MEM_ARGS=”-Xms2048m -Xmx2048m -XX:PermSize=512m -XX:MaxPermSize=512m -Xloggc:/scratch/oracle/middleware/user_projects/domains/oamdomain/servers/oam_server2/logs/oamserver2$$-gc.log -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution”

With the above parameters, I am unable to start weblogic servers… getting the below error.

Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Invalid file name for use with -Xloggc: Filename can only contain the characters [A-Z][a-z][0-9]-_.%[p|t] but it has been oamserver2-gc.log -XX:+PrintGC
Note %p or %t can only be used once

MEM_ARGS has been split into two parts:

export MEM_ARGS=” -Xms2048m -Xmx2048m -XX:PermSize=512m -XX:MaxPermSize=512m -Xloggc:/scratch/oracle/middleware/user_projects/domains/oamdomain/servers/oam_server2/logs/oamserver2$$-gc.log”

export JAVA_OPTIONS=”-XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution $JAVA_OPTIONS”

The weblogic startup script was creating problems while starting the server when we have everything part of mem_args. May be some sort of validation was happening somewhere 😉

Hope it helps.

IDM servers going down

Problem Statement:
All java process pertaining to IDM stack were getting killed very periodically. And the servers had to be restarted very often.

Weblogic 11g, IDM stack  java 1.70.79


  • There were no thread dumps or core dumps.
  • System logs were clean.
  • Greped the  weblogic logs for errors, could not find anything
  • After carefully going through each and every line of the weblogic logs, I could find below messages:
    ####<Jul 26, 2017 6:36:47 PM IST> <Notice> <WebLogicServer> <> <AdminServer> <Thread-1> <<WLS Kernel>> <> <ae8121cbb691c6b3:-fc73acf:15d7e34bd34:-8000-00000000000002ff> <1501074407979> <BEA-000388> <JVM called WLS shutdown hook. The server will force shutdown now>
    ####<Jul 26, 2017 6:36:47 PM IST> <Alert> <WebLogicServer> <> <AdminServer> <Thread-1> <<WLS Kernel>> <> <ae8121cbb691c6b3:-fc73acf:15d7e34bd34:-8000-00000000000002ff> <1501074407980> <BEA-000396> <Server shutdown has been requested by <WLS Kernel>>

The logs clearly indicates that the JVM itself initiated the shutdown of the servers and OS has  nothing to do with it.

The problem might occur due to incompatibility of softwares. In our case there was no customization or any of our own developed softwares deployed. Ours was clean install.
This points me to JDK version. After pointing to the correct java, the issue has been resolved.

Hope it helps.