Debugging JavaDB Query Performance

-- Turn on RUNTIMESTATISTICS for the connection:
CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1);
CALL SYSCS_UTIL.SYSCS_SET_STATISTICS_TIMING(1);

-- Indicate that statistics information should be captured into
-- database tables in the SomeSchema schema:
CALL SYSCS_UTIL.SYSCS_SET_XPLAIN_SCHEMA('SomeSchema');

-- Execute the queries:
SELECT * FROM <SomeTable> WHERE someName = 'XYZ';

-- Retrieve the runtime statistics for the last statement, then turn runtime statistics off:
VALUES SYSCS_UTIL.SYSCS_GET_RUNTIMESTATISTICS();
CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(0);
-- Retrieve the text of the captured statements, ordered by the time the statistics were captured:
select stmt_text, xplain_time
from SomeSchema.sysxplain_statements
order by xplain_time;

-- Retrieve the text of the captured statements, showing the statements that took the longest time to execute first:
select s.stmt_text, st.execute_time
from SomeSchema.sysxplain_statements s, SomeSchema.sysxplain_statement_timings st
where s.timing_id = st.timing_id
order by st.execute_time desc;

-- Show the statements that were executed, together with the result sets each statement required:
select st.stmt_text, rs.op_identifier
from SomeSchema.sysxplain_statements st
join SomeSchema.sysxplain_resultsets rs on st.stmt_id = rs.stmt_id;

-- Find statements that resulted in an external sort being performed:
select s.stmt_text, s.stmt_id, rs.op_identifier, srt.no_input_rows
from SomeSchema.sysxplain_sort_props srt, SomeSchema.sysxplain_resultsets rs, SomeSchema.sysxplain_statements s
where rs.stmt_id = s.stmt_id
and rs.sort_rs_id = srt.sort_rs_id
and srt.sort_type = 'EX';

-- Find statements that resulted in a table scan:
select st.stmt_text, sp.no_visited_pages, sp.no_visited_rows
from SomeSchema.sysxplain_scan_props sp, SomeSchema.sysxplain_resultsets rs, SomeSchema.sysxplain_statements st
where st.stmt_id = rs.stmt_id
and rs.scan_rs_id = sp.scan_rs_id
and rs.op_identifier = 'TABLESCAN'
and sp.scan_object_name = '<SomeTable>';
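
For reference, the statements above can be run through Derby's ij tool. A minimal sketch, assuming a Derby 10.x install under $DERBY_HOME and the Derby Network Server listening on its default port 1527 (the database name is a placeholder):

java -cp $DERBY_HOME/lib/derbytools.jar:$DERBY_HOME/lib/derbyclient.jar org.apache.derby.tools.ij
ij> connect 'jdbc:derby://localhost:1527/myConfigDb';
ij> CALL SYSCS_UTIL.SYSCS_SET_RUNTIMESTATISTICS(1);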

Tracing Derby SQL statements

I am working on a project that uses Derby DB (also known as Apache Derby, or JavaDB) as one of the components that store configuration data; JavaDB is well suited to in-memory operation for a small number of clients. The application that accesses the Derby DB was showing high response times.
To troubleshoot the high response times, I enabled tracing of the SQL statements fired by the application.

  • Created a derby.properties file and added the property "derby.language.logStatementText=true"
  • Restarted the application server, including the Derby database.

After that, all SQL statements are logged to the derby.log file.
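
For reference, a minimal sketch of the change, assuming derby.properties lives in the Derby system directory (derby.system.home, which defaults to the directory the server was started from):

# add the property to derby.properties in the Derby system directory
echo "derby.language.logStatementText=true" >> derby.properties

# after restarting the server, watch the statements being logged
tail -f derby.log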

Hope it helps

Note: tracing generates a large volume of SQL statement logging. Be careful when dealing with the logs produced while tracing is enabled.

GC Log Configuration Issue

I am trying to configure GC logs in the WebLogic startup scripts on one of our Linux servers (OEL 7).

JDK: 1.7u80

MEM_ARGS="-Xms2048m -Xmx2048m -XX:PermSize=512m -XX:MaxPermSize=512m -Xloggc:/scratch/oracle/middleware/user_projects/domains/oamdomain/servers/oam_server2/logs/oamserver2$$-gc.log -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution"

With the above parameters, I am unable to start the WebLogic servers and get the error below.

Error: Could not create the Java Virtual Machine.
Error: A fatal exception has occurred. Program will exit.
Invalid file name for use with -Xloggc: Filename can only contain the characters [A-Z][a-z][0-9]-_.%[p|t] but it has been oamserver2-gc.log -XX:+PrintGC
Note %p or %t can only be used once

Solution:
MEM_ARGS has been split into two parts:

export MEM_ARGS="-Xms2048m -Xmx2048m -XX:PermSize=512m -XX:MaxPermSize=512m -Xloggc:/scratch/oracle/middleware/user_projects/domains/oamdomain/servers/oam_server2/logs/oamserver2$$-gc.log"

export JAVA_OPTIONS="-XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution $JAVA_OPTIONS"

The WebLogic startup script was having trouble starting the server when everything was part of MEM_ARGS; the error shows the flags after -Xloggc being treated as part of the log file name, so some validation or argument parsing was probably happening somewhere along the way 😉
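
To verify the split worked, a quick check, assuming the server name and GC log path used above:

# confirm the running JVM picked up the -Xloggc flag
ps -ef | grep oam_server2 | grep -o "Xloggc:[^ ]*"

# confirm the GC log is being written
tail -f /scratch/oracle/middleware/user_projects/domains/oamdomain/servers/oam_server2/logs/oamserver2*-gc.log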

Hope it helps.

IDM servers going down

Problem Statement:
All Java processes belonging to the IDM stack were being killed periodically, and the servers had to be restarted very often.

Software:
WebLogic 11g, IDM stack, Java 1.7.0_79

Analysis:

  • There were no thread dumps or core dumps.
  • System logs were clean.
  • Grepped the WebLogic logs for errors; could not find anything.
  • After carefully going through each and every line of the WebLogic logs, I found the messages below:
    ####<Jul 26, 2017 6:36:47 PM IST> <Notice> <WebLogicServer> <myhost.me.me.com> <AdminServer> <Thread-1> <<WLS Kernel>> <> <ae8121cbb691c6b3:-fc73acf:15d7e34bd34:-8000-00000000000002ff> <1501074407979> <BEA-000388> <JVM called WLS shutdown hook. The server will force shutdown now>
    ####<Jul 26, 2017 6:36:47 PM IST> <Alert> <WebLogicServer> <myhost.me.me.com> <AdminServer> <Thread-1> <<WLS Kernel>> <> <ae8121cbb691c6b3:-fc73acf:15d7e34bd34:-8000-00000000000002ff> <1501074407980> <BEA-000396> <Server shutdown has been requested by <WLS Kernel>>

The logs clearly indicate that the JVM itself initiated the shutdown of the servers and that the OS had nothing to do with it.

Solution:
The problem can occur due to incompatible software versions. In our case there was no customization, and none of our own software was deployed; ours was a clean install.
That pointed me to the JDK version. After pointing the servers to the correct Java installation, the issue was resolved.
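
To confirm which JDK the servers are actually picking up, a quick check; a sketch assuming a standard WebLogic domain layout (the domain path is a placeholder):

# the Java on the PATH for the user starting the servers
which java && java -version

# the JAVA_HOME the domain startup scripts will use
grep "JAVA_HOME=" /path/to/your_domain/bin/setDomainEnv.sh | head -5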

Hope it helps.

Installing Python on Linux systems when the /usr/local/bin directory is read-only

We were given Linux systems that have Python 2.7.5 installed, and our application requires a Python version greater than 2.7.9. When we try to upgrade Python or install the latest version, it fails because the default installation needs to write to /usr/local/bin, which sits on a read-only file system. Even with root access we cannot modify the /usr/local/bin directory.

So how do we upgrade or install a new version of Python? Below are the steps (a consolidated command sketch follows the list):

  • Download the source tarball for the Python version we are interested in.
  • Untar it to a directory.
  • cd to the directory where it was untarred.
  • Run the configure command: ./configure --prefix=/myworks/softwares/pythonV2.7.13 --enable-shared --with-ensurepip=yes && make
  • Run the make command: make altinstall. This installs Python into the directory given by --prefix.
  • Add the following to .bash_profile: export LD_LIBRARY_PATH=/myworks/softwares/pythonV2.7.13/lib:$LD_LIBRARY_PATH
  • Reload .bash_profile. That's it.
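
The same steps as a consolidated command sketch, assuming Python 2.7.13 and the /myworks/softwares prefix used above (adjust the version, URL and paths to your environment):

cd /myworks/softwares
wget https://www.python.org/ftp/python/2.7.13/Python-2.7.13.tgz
tar -xzf Python-2.7.13.tgz
cd Python-2.7.13

# build and install into a writable prefix instead of /usr/local
./configure --prefix=/myworks/softwares/pythonV2.7.13 --enable-shared --with-ensurepip=yes && make
make altinstall

# make the shared libraries resolvable (add to ~/.bash_profile, then reload it)
export LD_LIBRARY_PATH=/myworks/softwares/pythonV2.7.13/lib:$LD_LIBRARY_PATH
source ~/.bash_profile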

Hope it helps

Connecting to remote Derby Database from Windows

One of my applications uses a Derby database to store configuration information, and I just wanted to see what tables it contains. As I was using Derby for the first time, I did not know the basic commands to list the tables from the SQL prompt on the Linux system where the Derby DB is installed. I wanted a SQL client that could connect to the remote DB and list the available tables and the data in them, so I started googling for one and found a suitable installer (squirrel-sql-3.7.1-standard.jar).

Installing and configuring the SQL Client:

  • Run the command "java -jar squirrel-sql-3.7.1-standard.jar"; a GUI opens, click Next.
  • Download Apache Derby (db-derby-10.10.2.0-bin.zip).
  • Unzip the Derby zip file to a folder.
  • Run the "squirrel-sql.bat" file from the location where the client was installed; it opens the "SQuirreL SQL Client Version 3.7.1" GUI editor.
  • Under the Drivers tab at the top left of the editor, double-click on Apache Derby Client.
  • Click on the "Extra Class Path" tab in the new window and add derbyclient.jar from the folder where the Derby package was unzipped. Click OK; the driver is now registered.
  • Click on Aliases, then "Add Alias", and enter the required details (see the example settings after this list). Test the connection and click OK.
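
For the alias, the Derby network client settings look roughly like this (host, port and database path are placeholders; 1527 is the Derby Network Server's default port):

Driver class: org.apache.derby.jdbc.ClientDriver
JDBC URL:     jdbc:derby://<linux-host>:1527//path/to/configdb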

Hope it helps.

Commands to run the SQL Tuning Advisor (DBMS_SQLTUNE)

set serveroutput on

DECLARE
  stmt_task VARCHAR2(64);
BEGIN
  stmt_task := DBMS_SQLTUNE.CREATE_TUNING_TASK(sql_id => '8bk0dw24d58jg');
  dbms_output.put_line(stmt_task);
END;
/

--------

EXEC DBMS_SQLTUNE.execute_tuning_task(task_name => 'TASK_1342');

--------

SET LONG 9999999
SET PAGESIZE 1000
SET LINESIZE 32767
SELECT DBMS_SQLTUNE.report_tuning_task('TASK_1342') AS recommendations FROM dual;

Important Configuration Parameters for Tuning Apache Spark Job

Below are a few of the Spark configuration settings that I used while tuning an Apache Spark job (a spark-submit sketch showing where they go follows the list).

  • "-XX:MetaspaceSize=100M": Before adding this parameter, full GCs were observed due to metaspace resizing. After adding it, no full GCs on account of metaspace resizing were observed.
  • "-XX:+DisableExplicitGC": In standalone mode, System.gc() was being invoked by the code every 30 minutes, causing a full GC (not a good practice). After adding this parameter, no full GCs on account of System.gc() were observed.
  • "-Xmx2G": An OOM was observed with the default heap size (1G) when executing a run with more than 140K messages in the aggregation window. After the heap was increased to 2GB (the maximum allowed on this box), I was able to process 221K messages successfully in the aggregation window. At 250K messages we still get an OOM.
  • spark.memory.fraction = 0.85: In Spark 1.6.0 the default value is 0.75. This value was increased to give more memory to storage/execution, in order to avoid OOM.
  • Storage level changed to DISK_ONLY: Before the change, we were getting an OOM when processing 250K messages during the 300-second aggregation window. After the change, we could process 540K messages in the aggregation window without an OOM. Even though in-memory gives better performance, hardware limitations forced me to use DISK_ONLY.
  • spark.serializer set to KryoSerializer: the Java serializer has a bigger memory footprint; to avoid that footprint and for better performance we used the Kryo serializer.
  • "-Xloggc:~/etl-gc.log -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCCause": These parameters should be added as part of good performance practice; the resulting GC logs also help diagnose problems, and their overhead in production is minimal.
  • spark.streaming.backpressure.enabled = true: This lets Spark Streaming control the receive rate based on current batch scheduling delays and processing times, so the system only receives data as fast as it can process it.
  • spark.io.compression.codec set to org.apache.spark.io.LZFCompressionCodec: the codec used to compress internal data such as RDD partitions, broadcast variables, and shuffle outputs.
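
Putting the settings together, a spark-submit sketch showing where each one goes (the class name, jar name and GC log path are placeholders, not taken from the original job):

spark-submit \
  --class com.example.EtlStreamingJob \
  --driver-memory 2G --executor-memory 2G \
  --conf spark.memory.fraction=0.85 \
  --conf spark.serializer=org.apache.spark.serializer.KryoSerializer \
  --conf spark.streaming.backpressure.enabled=true \
  --conf spark.io.compression.codec=org.apache.spark.io.LZFCompressionCodec \
  --conf "spark.executor.extraJavaOptions=-XX:MetaspaceSize=100M -XX:+DisableExplicitGC -Xloggc:/tmp/etl-gc.log -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCCause" \
  my-etl-job.jar

The DISK_ONLY storage level is set in the application code (for example rdd.persist(StorageLevel.DISK_ONLY)) rather than on the spark-submit command line.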

Hope it helps.

Setting Timeout in Elastic Load Balancer

AWS Elastic Load Balancer has an idle timeout that defaults to 60 seconds: if there is no activity for 60 seconds, the connection is torn down and an HTTP 504 error is returned to the customer. Here are the steps to change the idle timeout value on an AWS Elastic Load Balancer (an AWS CLI equivalent follows the list):

  1. Sign in to the AWS Console.
  2. Go to EC2 Services.
  3. In the left panel, click on Load Balancing > Load Balancers.
  4. In the top panel, select the load balancer whose idle timeout you want to change.
  5. In the bottom panel, under the 'Attributes' section, click the 'Edit idle timeout' button. The default value is 60 seconds; change it to the value you would like (say, 180 seconds).
  6. Click the 'Save' button.
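
The same change can be scripted with the AWS CLI; a sketch for a Classic Load Balancer (the load balancer name is a placeholder):

aws elb modify-load-balancer-attributes \
  --load-balancer-name my-load-balancer \
  --load-balancer-attributes "{\"ConnectionSettings\":{\"IdleTimeout\":180}}"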