SET SERVEROUTPUT ON
VARIABLE stmt_task VARCHAR2(64)
EXEC :stmt_task := DBMS_SQLTUNE.CREATE_TUNING_TASK(sql_id => '8bk0dw24d58jg');
PRINT stmt_task
-- Use the task name printed above (TASK_1342 in this run)
EXEC DBMS_SQLTUNE.EXECUTE_TUNING_TASK(task_name => 'TASK_1342');
SET LONG 9999999
SET PAGESIZE 1000
SET LINESIZE 32767
SELECT DBMS_SQLTUNE.REPORT_TUNING_TASK('TASK_1342') AS recommendations FROM dual;
Below are a few of the Spark configurations that I used while tuning an Apache Spark job.
- "-XX:MetaspaceSize=100M": Before adding this parameter, full GCs were observed due to metaspace resizing. After adding it, no full GCs caused by metaspace resizing were observed.
- "-XX:+DisableExplicitGC": In standalone mode, the code invokes System.gc() every 30 minutes, which triggers a full GC (not a good practice). After adding this parameter, no full GCs caused by System.gc() were observed.
- "-Xmx2G": OOM was observed with the default heap size (1 GB) when running with more than 140K messages in the aggregation window. After increasing the heap to 2 GB (the maximum allowed on this box), I was able to process 221K messages in the aggregation window. At 250K messages we still get OOM.
- spark.memory.fraction – 0.85: In Spark 1.6.0 the default value is 0.75; this is the fraction of the heap shared by storage and execution memory. The value was increased to give more memory to storage and execution, in order to avoid OOM.
- Storage level changed to DISK_ONLY: Before the change, we were getting OOM when processing 250K messages during the aggregation window of 300 seconds. After the change, we could process 540K messages in the aggregation window without OOM. Even though in-memory storage gives better performance, I had to use disk-only storage because of the limited hardware available.
- spark.serializer set to KryoSerializer: The Java serializer has a bigger memory footprint. We used Kryo to avoid the high memory footprint and to get better performance.
- "-Xloggc:~/etl-gc.log -XX:+PrintGC -XX:+PrintGCDetails -XX:+PrintGCDateStamps -XX:+PrintTenuringDistribution -XX:+PrintGCCause": These parameters should be added as a good performance practice. The GC logs they produce also help in diagnosing problems. The overhead of these parameters is minimal in production.
- spark.streaming.backpressure.enabled – true: This lets Spark Streaming control the receiving rate based on the current batch scheduling delays and processing times, so that the system receives data only as fast as it can process it.
- spark.io.compression.codec set to org.apache.spark.io.LZFCompressionCodec: This is the codec used to compress internal data such as RDD partitions, broadcast variables and shuffle outputs.
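The settings above can be gathered in one place and turned into spark-submit arguments. The following is only a sketch: the class name SparkTuningArgs is made up for illustration, and putting the JVM flags into spark.executor.extraJavaOptions is an assumption, since the post does not say whether they were applied to the driver, the executors, or both. The heap size and the GC log path are left out on purpose, because Spark does not accept -Xmx inside extraJavaOptions (the heap is set through the memory settings instead).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SparkTuningArgs {
    // JVM flags taken from the list above (heap size and GC log path omitted).
    static final String JVM_FLAGS = String.join(" ",
        "-XX:MetaspaceSize=100M",
        "-XX:+DisableExplicitGC",
        "-XX:+PrintGC", "-XX:+PrintGCDetails", "-XX:+PrintGCDateStamps",
        "-XX:+PrintTenuringDistribution", "-XX:+PrintGCCause");

    // Spark properties and values as described in the list above.
    static Map<String, String> settings() {
        Map<String, String> conf = new LinkedHashMap<>();
        conf.put("spark.memory.fraction", "0.85");
        conf.put("spark.serializer", "org.apache.spark.serializer.KryoSerializer");
        conf.put("spark.streaming.backpressure.enabled", "true");
        conf.put("spark.io.compression.codec", "org.apache.spark.io.LZFCompressionCodec");
        // Assumption: the flags are applied to the executors.
        conf.put("spark.executor.extraJavaOptions", JVM_FLAGS);
        return conf;
    }

    // Turns each property into a "--conf key=value" pair for spark-submit.
    static List<String> toSubmitArgs(Map<String, String> conf) {
        List<String> args = new ArrayList<>();
        for (Map.Entry<String, String> e : conf.entrySet()) {
            args.add("--conf");
            args.add(e.getKey() + "=" + e.getValue());
        }
        return args;
    }

    public static void main(String[] unused) {
        List<String> args = toSubmitArgs(settings());
        // Print one --conf per line, quoted so that values with spaces survive a shell.
        for (int i = 0; i < args.size(); i += 2) {
            System.out.println(args.get(i) + " \"" + args.get(i + 1) + "\"");
        }
    }
}
```

Keeping the values in one map makes it easy to review them next to the measurements they were tuned against.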
Hope it helps.
AWS Elastic Load Balancer has an idle timeout that defaults to 60 seconds. If there is no activity on a connection for 60 seconds, the connection is torn down and HTTP error code 504 is returned to the customer. Here are the steps to change the timeout value in the AWS Elastic Load Balancer:
- Sign in to AWS Console
- Go to EC2 Services
- On the left panel, click on the Load Balancing > Load Balancers
- In the top panel, select the Load Balancer for which you want to change the idle timeout
- In the bottom panel, under the 'Attributes' section, click the 'Edit idle timeout' button. The default value is 60 seconds. Change it to the value you want (say, 180 seconds)
- Click the 'Save' button
Here is the use case: I have 3 scenarios named A, B and C which are to be load tested with 6, 3 and 1 threads respectively.
These 3 scenarios have 7 use cases (T1 to T7) which are to be executed using defined percentages as shown below:
How do we configure this in JMeter? Had it been just the users for the 3 scenarios, we would have configured them in Thread Groups. But how about T1 to T7?
First create 3 Thread Groups with the desired number of users as shown below:
Then, under each Thread Group, add a Throughput Controller (found under Logic Controllers). Configure the percentage and add the request to the Throughput Controller as shown below:
Hope this helps.
- Excessive Layering – Most underlying performance problems start with the excessive layering antipattern. The application design has grown through heavy use of controllers, commands and facades. To decouple each layer, designers add facades at each tier. As a result, every request at the web tier passes through multiple layers just to fetch its results. Imagine doing this for thousands of incoming requests, and the load the JVM needs to handle to process them. The objects created and destroyed while making these calls add to the memory overhead, which further limits the number of requests each server node can handle. Based on the size of the application, the deployment model and the number of users, an appropriate decision needs to be taken to reduce the number of layers. E.g. if the entire application is deployed in the same container, there is no need to create multiple layers of process beans, service beans (business beans), data access objects etc. Similarly, when developing an internet-scale application, a large number of layers adds overhead to request processing. Remember, a large number of layers means a large number of classes, which starts to hurt overall application maintainability.
- Round Tripping – With the advent of ORM mappings and Session/DAO objects, programmers start making calls to beans for every piece of data. This leads to excessive calls between the layers. A side issue is the number of methods each layer has to expose to support this model. The worst case is when the beans are web service based. A client tier making multiple web service calls within a single user request has a direct impact on application performance. To reduce round tripping, the application needs to handle or combine multiple requests at the business tier.
- Overstuffed Session – The session object is a feature provided by the JEE container to track a user's session during a web site visit. The application starts with the promise of putting very little information in the session, but over time the session object keeps growing. Too much data, or the wrong kind of data, is stuffed into it. Large data objects placed in the session linger until the session object is destroyed. This reduces the number of users that can be served by each application server node. Further, I have seen applications use session clustering to support availability requirements, which adds significant overhead to network traffic and reduces the application's ability to handle a higher number of users. To unstuff the session object, take an inventory of everything that goes into it, see what is necessary, and decide which objects can be moved to request scope. For the rest, remove the objects from the session when they are no longer needed.
- Golden Hammer (Everything is a Service) – With the advent of SOA, there is a tendency to expose business services which can be orchestrated into process services. In older applications, one can observe a similar pattern implemented with EJBs. This pattern, at times coupled with a bottom-up design approach, means exposing each and every data entity as a business service. This kind of design might work correctly from a functional point of view, but from the performance and maintenance point of view it soon becomes a nightmare. Every web service call adds overhead in terms of data serialization and deserialization. At times, the data (XML) passed with web service calls is also huge, leading to performance issues. The usage of services or EJBs should be evaluated from the application usage perspective. Attention needs to be paid to the contract design.
- Chatty Services – Another pattern observed is a service implemented via multiple web service calls, each of which communicates a small piece of data. This results in an explosion of web services, which leads to degraded performance and unmaintainable code. The application also starts running into problems from the deployment perspective. I have come across projects with a hundred-plus services all crammed into a single deployment unit. When the application comes up, the base heap requirement is already in the 2 GB range, leaving not much space for the application to run. If the application has too many fine-grained services, that is an indication of this antipattern.
Refer to : https://www.linkedin.com/pulse/application-performance-antipatterns-munish-kumar-gupta
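The Round Tripping and Chatty Services antipatterns can be illustrated with a small sketch that counts calls. The CustomerService class below is hypothetical and runs in memory; each method call stands in for one remote round trip. The point is only that a coarse-grained call fetching a whole list replaces N fine-grained calls with one.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class RoundTripSketch {
    /** Hypothetical remote service; every method call stands for one network round trip. */
    static class CustomerService {
        int roundTrips = 0;
        private final Map<Integer, String> names = new LinkedHashMap<>();

        CustomerService() {
            for (int id = 1; id <= 100; id++) names.put(id, "customer-" + id);
        }

        // Fine-grained call: one round trip per customer.
        String findName(int id) {
            roundTrips++;
            return names.get(id);
        }

        // Coarse-grained call: one round trip for the whole list of ids.
        List<String> findNames(List<Integer> ids) {
            roundTrips++;
            List<String> result = new ArrayList<>();
            for (int id : ids) result.add(names.get(id));
            return result;
        }
    }

    // Chatty client: calls the service once per id.
    static int chattyRoundTrips(List<Integer> ids) {
        CustomerService svc = new CustomerService();
        for (int id : ids) svc.findName(id);
        return svc.roundTrips;
    }

    // Batched client: combines the requests into a single call.
    static int batchedRoundTrips(List<Integer> ids) {
        CustomerService svc = new CustomerService();
        svc.findNames(ids);
        return svc.roundTrips;
    }

    public static void main(String[] args) {
        List<Integer> ids = new ArrayList<>();
        for (int id = 1; id <= 50; id++) ids.add(id);
        System.out.println("chatty:  " + chattyRoundTrips(ids) + " round trips");
        System.out.println("batched: " + batchedRoundTrips(ids) + " round trip");
    }
}
```

With real web services, each of those round trips also pays for serialization, deserialization and network latency, so the gap is larger than a simple call count suggests.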
How do you troubleshoot and monitor a Java and JEE application that has performance and scalability problems? Here are the techniques used for production systems.
- Perform a series of JDK thread dumps to locate the following possible problems:
  - Application bottleneck: Identify application bottlenecks by locating the most common stack trace. Optimize the requests that appear most often in the stack traces.
  - Bad SQLs: If most threads are waiting on JDBC calls, trace the bad SQLs down to the DB.
  - Slow DB: If many SQLs are having problems, conduct DB profiling to locate the DB problem.
  - DB or external system outages: Check whether a lot of threads are waiting to make an external connection.
  - Concurrency issue: Check whether many stack traces are waiting in the same code for a lock.
  - Infinite loop: Verify whether threads keep running for minutes in a similar part of the source code.
  - Connectivity problem: An unexpectedly low number of busy threads (most threads idling) indicates that requests are not reaching the application server.
  - Thread count mis-configuration: Increase the thread count if CPU utilization is low yet most threads are in the runnable state.
- Monitor CPU utilization
  - High CPU utilization implies design or coding inefficiency. Execute a thread dump to locate the bottleneck. If no problems are found, the system may have reached full capacity.
  - Low CPU utilization with abnormally high response time implies many threads are blocked. Execute a thread dump to narrow down the problem.
- Monitor process health including the Java application server
  - Monitor whether all web servers, application servers, middle tier systems and DB servers are running. Configure each system as a service so it can be re-started automatically when the process dies suddenly.
- Monitor the Java Heap Utilization
  - Monitor the amount of Java heap memory that can be reclaimed after a major garbage collection. If the reclaimed amount keeps dropping consistently, the application is leaking memory. Perform memory profiling to locate the memory leak. If no memory is leaking yet major garbage collections are frequent, tune the Java heap accordingly.
- Monitor unusual exceptions in the application log & application server log
  - Monitor and resolve any exceptions detected in the application and server logs. Examine the source code to ensure all resources, in particular DB, file, socket and JMS resources, are properly closed when the application throws an exception.
- Monitor memory & paging activities
  - Growing resident (native) memory implies a memory leak in native code. The source of the leak may be the application's non-Java native code, C code in the JVM or third party libraries. Also monitor paging activity closely. Frequent paging means memory mis-configuration.
- Perform DB profiling
  - Monitor the following metrics closely
    - Identify the top SQLs in logical reads, latency and counts – Re-write or tune poorly performing SQLs or DB programming code.
    - Top DB waiting and latch events – Identify bad DB coding or bad DB instance or table configuration.
    - Amount of hard parses – Identify scalability problems caused by improper DB programming.
    - Hit ratio for different buffers and caches – Evidence of bad SQLs or improper buffer size configuration.
    - File I/O statistics – Evidence of bad SQLs, or disk mis-configuration or layout
    - Rollback ratio – Identify improper application logic
    - Sorting efficiency – Improper sorting buffer configuration
    - Undo log or rollback segment performance – Identify DB tuning problems
    - Amount of SQL statements and transactions per second – A sudden jump reveals bad application coding
- JMS Resources
  - Monitor the queue length and resource utilization
  - Poison messages: Check whether many messages remain unprocessed and stay in the queues for a long time.
  - JMS queue deadlocks: Check whether no messages can be de-queued and finished.
  - JMS listener problems: Check whether no messages are processed in a particular queue.
  - Memory consumption: Ensure queues with a large number of pending messages can be paged out of physical memory.
  - JMS retry: Ensure failed messages are not re-processed immediately. Otherwise, poison messages may consume most of the CPU.
- Monitor file I/O performance
  - Trend the I/O access and wait time. Re-design or re-configure the disk layout if necessary for better I/O performance, in particular for the DB server.
- Monitor resource utilization including file descriptors
  - Monitor resources closely to identify any application code that is depleting OS level resources.
- Monitor HTTP access
  - Monitor the top IP addresses accessing the system. Detect any intruder trying to steal content and data from the web site. Use the access log to trace any non-200 HTTP responses.
- Monitor security access log
  - Monitor the OS level security log and the web server log to detect hacker intrusion. They also give hints on how hackers are attacking the system.
- Monitor network connectivity and TCP status
  - Run netstat constantly to monitor the TCP socket states.
  - A high number of TCP connections in an idle wait state implies TCP mis-configuration.
  - A high number of TCP connections in SYN or FIN state implies a possible denial of service (DoS) attack.
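The thread dump checks above can also be done from inside the JVM with the standard java.lang.management API. The sketch below is a minimal illustration rather than a replacement for jstack or a profiler: it takes one dump, counts threads per state, counts the most common top stack frames, and asks the JVM whether it has detected deadlocked threads. The class and method names are made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.ThreadInfo;
import java.lang.management.ThreadMXBean;
import java.util.EnumMap;
import java.util.HashMap;
import java.util.Map;

class ThreadDumpSketch {
    // Counts threads per state (RUNNABLE, BLOCKED, WAITING, ...).
    static Map<Thread.State, Integer> countByState(ThreadInfo[] infos) {
        Map<Thread.State, Integer> counts = new EnumMap<>(Thread.State.class);
        for (ThreadInfo info : infos) {
            if (info == null) continue; // defensive: skip missing entries
            counts.merge(info.getThreadState(), 1, Integer::sum);
        }
        return counts;
    }

    // Counts how often each method appears at the top of a stack;
    // a method that dominates this map is a bottleneck candidate.
    static Map<String, Integer> countByTopFrame(ThreadInfo[] infos) {
        Map<String, Integer> counts = new HashMap<>();
        for (ThreadInfo info : infos) {
            if (info == null) continue;
            StackTraceElement[] stack = info.getStackTrace();
            if (stack.length == 0) continue;
            String frame = stack[0].getClassName() + "." + stack[0].getMethodName();
            counts.merge(frame, 1, Integer::sum);
        }
        return counts;
    }

    public static void main(String[] args) {
        ThreadMXBean threads = ManagementFactory.getThreadMXBean();
        // false, false: do not collect locked monitor or synchronizer details.
        ThreadInfo[] dump = threads.dumpAllThreads(false, false);
        System.out.println("Threads by state: " + countByState(dump));
        System.out.println("Top frames: " + countByTopFrame(dump));
        // findDeadlockedThreads() returns null when no deadlock is detected.
        long[] deadlocked = threads.findDeadlockedThreads();
        System.out.println("Deadlocked threads: " + (deadlocked == null ? 0 : deadlocked.length));
    }
}
```

A single dump is only a snapshot; as the list says, a series of dumps taken some seconds apart is what shows whether threads are stuck in the same place.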
What are the most common performance and scalability problems for a J2EE (Java EE) Web application? Here are the most common tips and problems found in real production systems.
- Bad Caching Strategy: Users rarely require absolutely real time information. Simply serving HTML content from a 60 second cache can dramatically reduce the load on the application server and, most importantly, on the DB of a high traffic web site. Cache HTML segments for the home page and the most visited pages. Implement other caching strategies in the business service layer or the DB layer. For example, use Spring AOP to cache data returned from a business service, or configure Hibernate to cache DB query results.
- Missing DB indexes: After a new code push, indexes may be missing for the new SQL statements. A query may be slow if the table is huge and the missing index forces a full table scan. Most development DBs have a very small data set, so the problem goes undetected. Check the DB log or profile in production for long running SQLs and add indexes where needed.
- Bad SQLs: The second most common DB performance problem is bad SQLs. Check the DB log or profile for long running queries. Most problems can be resolved by re-writing the SQLs. Pay attention to sub-queries and SQLs with complicated joins. Occasionally, DB table tuning may be required.
- Too many fine-grained calls to the service, data or DB layer: Developers may use a loop to retrieve a list of data. Each iteration may make a middle tier call which results in multiple SQL calls. If the list is long, the total number of DB requests can be huge. Developers should write a new service call that retrieves the list in a single DB call.
- All application server threads are waiting for the DB or an external system connection: A web server has a limited number of threads. When an HTTP request is processed, a thread is dedicated exclusively to that request until it completes. Hence, if an external system like the DB is very slow, all web server threads may be waiting. When this happens, the web server pauses all new incoming requests. From an end user's perspective, the system seems unresponsive. Add timeout logic when communicating with external systems. Increasing the thread count will only delay the problem and in some cases is counterproductive.
- SQLs retrieve too many rows of data: Do not retrieve hundreds of rows of data just to display a few of them. Check the DB log or profile constantly for unexpectedly high usage of SQLs that retrieve a lot of rows.
- Not using prepared statements for the DB: Always use prepared statements to avoid DB side SQL hard parsing. SQL hard parsing causes a lot of DB scalability problems as DB requests increase.
- Lack of or improper pagination of data: Implement pagination to display a long list of data. Do not retrieve all the data from the database and then filter it in Java code. Always use the database for data filtering and pagination.
- Non-optimized connection pool configuration: The maximum and minimum pool size and the retention policy for idle pooled connections can significantly impact application performance. The web server will sit idle waiting for a DB connection if the pool size is too low. The retention policy is important since most DB connection creation code has very low concurrency and cannot handle a sudden surge of concurrent requests.
- Frequent garbage collection caused by memory leak: When memory is leaking, the JVM performs frequent garbage collections (GC) even though they cannot reclaim much memory. Eventually, the web server spends most of its time executing GC rather than processing HTTP requests. Rebooting the server can temporarily relieve the problem, but only stopping the leak solves it.
- Do not process large amounts of data at once: For requests involving a large amount of data, in particular batch processes, sub-divide the large data set into chunks and process them separately. Otherwise, the request may deplete the Java heap or stack memory and crash the JVM.
- Concurrency problems in synchronization blocks: Code synchronization blocks carefully. Use established libraries to manage system and application resources such as DB connection pools. In a system with a concurrency problem, CPU utilization remains low even when traffic increases significantly.
- Bad DB tuning: If DB response is slow regardless of the SQLs, DB instance tuning is needed. Monitor memory paging activity closely to identify any memory mis-configuration. Also monitor file I/O wait time and DB memory usage closely.
- Process data in batch: To reduce DB requests, combine DB requests and process them in a single batch. Use SQL batching if necessary instead of a large volume of small SQL requests.
- JMS or application deadlock: Avoid cyclic loops when making JMS requests. A request may be sent to Queue A, which then sends a message to Queue B, which sends one back to Queue A. This cycle will trigger deadlock under high request volume.
- Bad Java heap configuration: Configure the maximum heap size, the minimum heap size, the young generation size and the garbage collection algorithm correctly. Bigger is not always better; the right values depend on the application.
- Bad application server thread configuration: Too high a thread count triggers high context switching overhead, while a low thread count causes low concurrency. Tune it according to the application's needs and behavior. Configure the connection pool size according to the thread count.
- Internal bugs in third party libraries or the application server: If new third party libraries are added to the application, monitor closely for concurrency and memory leak issues.
- Out of file descriptors: If the application does not close file or network resources correctly, in particular within exception handling, it may run out of file descriptors and stop processing new requests.
- Infinite loop in the application code: A loop may never terminate and trigger high CPU utilization. It can be data sensitive and happen only for a small set of traffic. If CPU utilization remains high during low traffic periods, monitor the threads closely.
- Wrong firewall configuration: Some firewall configurations limit the number of concurrent connections from a single IP. This can be problematic if a web server is connected to a DB server through a firewall. Verify the firewall configuration if the application achieves much higher concurrency when tested within a local network.
- Bad TCP tuning: Improper TCP tuning causes an unreasonably high number of sockets waiting to be closed (TIME_WAIT). New OS versions are usually tuned correctly for web servers. Change the default TCP tuning parameters only if needed. Direct TCP programming may sometimes need special socket parameters for short but frequent TCP messages.
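The advice above on not processing a large amount of data at once, and on batching DB requests, comes down to splitting the work into fixed-size chunks. The sketch below shows one way to do that with plain collections; the class name and the chunk size of 1000 are illustrative assumptions, and in a real batch job each chunk would be handed to a JDBC batch or a service call.

```java
import java.util.ArrayList;
import java.util.List;

class ChunkSketch {
    // Splits a list into consecutive chunks of at most chunkSize elements.
    static <T> List<List<T>> chunks(List<T> items, int chunkSize) {
        if (chunkSize <= 0) throw new IllegalArgumentException("chunkSize must be positive");
        List<List<T>> result = new ArrayList<>();
        for (int start = 0; start < items.size(); start += chunkSize) {
            int end = Math.min(start + chunkSize, items.size());
            // Copy the view so each chunk is independent of the source list.
            result.add(new ArrayList<>(items.subList(start, end)));
        }
        return result;
    }

    public static void main(String[] args) {
        List<Integer> rows = new ArrayList<>();
        for (int i = 0; i < 2500; i++) rows.add(i);
        // Process 2500 rows as chunks of 1000, 1000 and 500.
        for (List<Integer> chunk : chunks(rows, 1000)) {
            System.out.println("processing chunk of " + chunk.size() + " rows");
        }
    }
}
```

The right chunk size depends on row size and available heap, so it is worth measuring rather than guessing.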
Types of OOM:
- java.lang.OutOfMemoryError: Java heap space
- java.lang.OutOfMemoryError: PermGen space
- java.lang.OutOfMemoryError: GC overhead limit exceeded
- java.lang.OutOfMemoryError: unable to create new native thread
- java.lang.OutOfMemoryError: nativeGetNewTLA
- java.lang.OutOfMemoryError: Requested array size exceeds VM limit
- java.lang.OutOfMemoryError: request <size> bytes for <reason>. Out of swap
- java.lang.OutOfMemoryError: <reason> <stack trace> (Native method)
- java.lang.OutOfMemoryError: Metaspace
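Several of these errors map to specific JVM memory areas: "Java heap space" to the heap, "Metaspace" to the Metaspace pool, and "PermGen space" to the permanent generation on older JVMs. As a rough sketch, the standard java.lang.management API can list these areas and their current usage. Pool names differ between JVM versions and garbage collectors, so the output is not shown here, and the class name is made up for this example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

class MemoryPoolSketch {
    // Formats used and maximum size of one memory area in megabytes.
    static String describe(String name, MemoryUsage usage) {
        long usedMb = usage.getUsed() / (1024 * 1024);
        // getMax() returns -1 when the area has no defined limit.
        String max = usage.getMax() < 0 ? "unbounded" : (usage.getMax() / (1024 * 1024)) + " MB";
        return name + ": used " + usedMb + " MB, max " + max;
    }

    public static void main(String[] args) {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        System.out.println(describe("Heap", heap));
        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            MemoryUsage usage = pool.getUsage();
            if (usage == null) continue; // pool is no longer valid
            System.out.println(describe(pool.getName() + " (" + pool.getType() + ")", usage));
        }
    }
}
```

Knowing which area is close to its limit narrows down which of the errors above is likely and which setting to look at.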
Here are the typical causes of Java memory leaks:
- Not closing DB, file, socket, JMS and other external resources properly
- Not closing resources properly when an exception is thrown
- Adding objects to a cache, HashMap, Hashtable, Vector or ArrayList without ever expiring the old ones
- Not implementing the hashCode and equals methods correctly for the key of a cache
- Session data is too large
- Leak in a third party library or the application server
- An infinite loop in the application code (also a likely cause of high CPU)
- Leaking memory in the native code
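For the cache-related cause above, one simple guard is a cache with an upper bound that evicts old entries. The sketch below uses the standard LinkedHashMap in access order with removeEldestEntry; the class name BoundedCache and the limit of 2 entries are only for illustration. It is not thread-safe, so a real application would wrap it or use an established caching library.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class BoundedCache<K, V> extends LinkedHashMap<K, V> {
    private final int maxEntries;

    BoundedCache(int maxEntries) {
        // accessOrder = true: the least recently used entry becomes the eldest.
        super(16, 0.75f, true);
        this.maxEntries = maxEntries;
    }

    // Called by LinkedHashMap after each insertion; returning true drops the eldest entry.
    @Override
    protected boolean removeEldestEntry(Map.Entry<K, V> eldest) {
        return size() > maxEntries;
    }

    public static void main(String[] args) {
        BoundedCache<String, String> cache = new BoundedCache<>(2);
        cache.put("a", "1");
        cache.put("b", "2");
        cache.get("a");      // "a" is now the most recently used entry
        cache.put("c", "3"); // exceeds the limit, so "b" is evicted
        System.out.println(cache.keySet()); // prints [a, c]
    }
}
```

Because the size is capped, the cache can no longer grow until the heap is exhausted, whatever the traffic pattern.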
A few questions that can help in solving performance issues:
Load balancer issues
- What type of load balancing scheme is used? (Round robin, sticky IP, least connections, subnet based?)
- What is the timeout of the LB table?
- Does it do any connection pooling?
- Is it doing any content filtering?
- Is it checking for HTTP response status?
- Are there application dependencies associated with the LB timeout settings?
- What failover strategies are employed?
- What is the connection persistence timeout?
- What are the timeouts for critical functions?
- What is the throughput capacity?
- What is the connection capacity and rate?
- What is the DMZ operation?
- What are the throughput policies from a single IP?
- What are the connection policies from a single IP?
Firewalls and multiple DMZs
- Does the firewall do content filtering?
- Is it sensitive to inbound and/or outbound traffic?
- What is its upper connection limit?
- Are there policies associated with maximum connection or throughput per IP address?
- Are there multiple firewalls in the architecture (multiple DMZs)?
- If it has multiple DMZs, is it sensitive to data content?
Web server issues
- How many connections can the server handle?
- How many open file descriptors or handles is the server configured to handle?
- How many processes or threads is the server configured to handle?
- Does it release and renew threads and connections correctly?
- How large is the server’s listen queue?
- What is the server’s “page push” capacity?
- What type of caching is done?
- Is there any page construction done here?
- Is there dynamic browsing?
- Are there any SSL acceleration devices in front of the web server?
- Are there any content caching devices in front of the web server?
- Can server extensions and their functions be validated? (ASP, JSP, PHP, Perl, CGI, servlets, ISAPI filter/app, etc.)
- Monitoring (Pools: threads, processes, connections, etc. Queues: ASP, sessions, etc. General: CPU, memory, I/O, context switch rate, paging, etc.)
Application server issues
- Is there any page construction done here?
- How is session management done and what is the capacity?
- Are there any clustered configurations?
- Is there any load balancing done?
- If there is software load balancing, which one is the load balancer?
- What is the page construction capacity?
- Do components have a specific interface to peripheral and external systems?
Database server issues
- Have both small and large data sets been tested?
- What is the connection pooling configuration?
- What are its upper limits?
The experienced performance engineer asks questions such as:
- Why is the application updating all these tables on an order creation?
- Why is it calling the remote pricing call three times?
- Why are you creating a new object for the same customer or product?
- Why is the database connection handler making so many connections for a static number of users?
- Did you expect your users/customer to come from a slow wireless connection? Did you test for that?
- Did you realize the application servers were in one data center and the database was in another data center?
- Who set the JVM memory configuration?
- Why are the indexes on the same volumes as the files?
- Why was the performance testing database one quarter the size of the production database?
- How many physical CPU’s did you really allocate to the Database Server?
- How was the peak volume determined?