Performance antipatterns for cloud applications

A performance antipattern is a common practice that is likely to cause scalability problems when an application is under pressure.

Consider an application that behaves well during performance testing. When it's released to production and begins to handle real workloads, it may start to perform poorly, rejecting user requests, stalling, or throwing exceptions.

Busy Database antipattern
Offloading processing to a database server can cause it to spend a significant proportion of its time running code, rather than responding to requests to store and retrieve data.

Runtime costs may be excessive if the data store is metered. That’s particularly true of managed database services. Databases have finite capacity to scale up, and it’s not trivial to scale a database horizontally. Therefore, it may be better to move processing into a compute resource, such as a VM or App Service app, that can easily scale out.

This antipattern typically occurs because:
The database is viewed as a service rather than a repository. An application might use the database server to format data, manipulate string data, or perform complex calculations.
Developers try to write queries whose results can be displayed directly to users. For example, a query might combine fields, or format dates, times, and currency according to locale.
Developers are trying to correct the Extraneous Fetching antipattern by pushing computations to the database.
Stored procedures are used to encapsulate business logic, perhaps because they are considered easier to maintain and update.

How to fix the problem
Move processing from the database server into other application tiers. Ideally, you should limit the database to performing data access operations, using only the capabilities that the database is optimized for, such as aggregation in an RDBMS.
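
For example, instead of asking the database to format dates or currency for display, fetch the raw values and apply locale-specific formatting in application code. The following is a minimal sketch; the OrderSummary and OrderFormatter names are hypothetical and only illustrate the idea.

using System;
using System.Globalization;

public class OrderSummary
{
    public DateTime OrderDate { get; set; }
    public decimal Total { get; set; }
}

public static class OrderFormatter
{
    // Format raw values retrieved from the database in the application tier,
    // rather than asking the database server to build display strings.
    public static string ToDisplayString(OrderSummary order, CultureInfo locale)
    {
        string date = order.OrderDate.ToString("d", locale);   // locale-aware short date
        string amount = order.Total.ToString("C", locale);     // locale-aware currency
        return $"{date}: {amount}";
    }
}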

Considerations
Many database systems are highly optimized to perform certain types of data processing, such as calculating aggregate values over large datasets. Don’t move those types of processing out of the database.

Do not relocate processing if doing so causes the database to transfer far more data over the network. See the Extraneous Fetching antipattern.

If you move processing to an application tier, that tier may need to scale out to handle the additional work.

Busy Front End antipattern
Performing asynchronous work on a large number of background threads can starve other concurrent foreground tasks of resources, degrading response times to unacceptable levels.

Resource-intensive tasks can increase the response times for user requests and cause high latency. One way to improve response times is to offload a resource-intensive task to a separate thread. This approach lets the application stay responsive while processing happens in the background. However, tasks that run on a background thread still consume resources. If there are too many of them, they can starve the threads that are handling requests.

This problem typically occurs when an application is developed as a monolithic piece of code, with all of the business logic combined into a single tier shared with the presentation layer.

How to fix the problem
Move processes that consume significant resources to a separate back end.

With this approach, the front end puts resource-intensive tasks onto a message queue. The back end picks up the tasks for asynchronous processing. The queue also acts as a load leveler, buffering requests for the back end. If the queue length becomes too long, you can configure autoscaling to scale out the back end.
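
The sketch below shows how a front end might hand off work. The IMessageQueue and WorkItem types are hypothetical stand-ins for a real queue client (such as Azure Queue Storage or Service Bus) and your own message shape.

using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical abstraction over a durable message queue; a real client
// (Azure Queue Storage, Service Bus, and so on) has a different API.
public interface IMessageQueue
{
    Task EnqueueAsync(string message);
}

public class WorkItem
{
    public string Id { get; set; }
    public string Payload { get; set; }
}

public class WorkSubmissionService
{
    private readonly IMessageQueue _queue;

    public WorkSubmissionService(IMessageQueue queue)
    {
        _queue = queue;
    }

    // The front end only validates and enqueues the request. The resource-intensive
    // processing happens in a separate back-end worker that reads from the queue.
    public async Task<string> SubmitWorkAsync(WorkItem item)
    {
        await _queue.EnqueueAsync(JsonSerializer.Serialize(item));
        return item.Id; // return immediately; the client can poll for the result later
    }
}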

Considerations
This approach adds some additional complexity to the application. You must handle queuing and dequeuing safely to avoid losing requests in the event of a failure.
The application takes a dependency on an additional service for the message queue.
The processing environment must be sufficiently scalable to handle the expected workload and meet the required throughput targets.
While this approach should improve overall responsiveness, the tasks that are moved to the back end may take longer to complete.

Chatty I/O antipattern
The cumulative effect of a large number of I/O requests can have a significant impact on performance and responsiveness.

Network calls and other I/O operations are inherently slow compared to compute tasks. Each I/O request typically has significant overhead, and the cumulative effect of numerous I/O operations can slow down the system. Here are some common causes of chatty I/O:

Reading and writing individual records to a database as distinct requests
Implementing a single logical operation as a series of HTTP requests
Reading and writing to a file on disk
An application that continually reads and writes small amounts of information to a file will generate significant I/O overhead. Small write requests can also lead to file fragmentation, slowing subsequent I/O operations still further.

The following example uses a FileStream to write a Customer object to a file. Creating the FileStream opens the file, and disposing it closes the file. (The using statement automatically disposes the FileStream object.) If the application calls this method repeatedly as new customers are added, the I/O overhead can accumulate quickly.

private async Task SaveCustomerToFileAsync(Customer cust)
{
    using (Stream fileStream = new FileStream(CustomersFileName, FileMode.Append))
    {
        BinaryFormatter formatter = new BinaryFormatter();
        byte[] data = null;
        using (MemoryStream memStream = new MemoryStream())
        {
            formatter.Serialize(memStream, cust);
            data = memStream.ToArray();
        }
        await fileStream.WriteAsync(data, 0, data.Length);
    }
}

How to fix the problem
Reduce the number of I/O requests by packaging the data into larger, fewer requests.
Fetch data from a database as a single query, instead of several smaller queries.
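
As a sketch of the batching approach, the file example above could accumulate customers and persist the whole batch with a single open/write/close cycle. The Customer type is the one assumed by the earlier example; JSON serialization is used here purely for illustration.

using System.Collections.Generic;
using System.IO;
using System.Text.Json;
using System.Threading.Tasks;

public static class CustomerFileWriter
{
    // Persist a whole batch of customers with one open/write/close cycle,
    // instead of opening and closing the file once per customer.
    public static async Task SaveCustomersToFileAsync(
        IEnumerable<Customer> customers, string fileName)
    {
        using (var fileStream = new FileStream(fileName, FileMode.Append))
        {
            byte[] data = JsonSerializer.SerializeToUtf8Bytes(customers);
            await fileStream.WriteAsync(data, 0, data.Length);
        }
    }
}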

Considerations
These approaches make fewer I/O calls, but each call retrieves or writes more information. You must consider the trade-off between these two factors.
When reading data, do not make your I/O requests too large. An application should only retrieve the information that it is likely to use.
Sometimes it helps to partition the information for an object into two chunks: frequently accessed data that accounts for most requests, and less frequently accessed data that is used rarely. Often the most frequently accessed data is a relatively small portion of the total data for an object, so returning just that portion can save significant I/O overhead.
When writing data, avoid locking resources for longer than necessary, to reduce the chances of contention during a lengthy operation.
If you buffer data in memory before writing it, the data is vulnerable if the process crashes. If the data rate typically has bursts or is relatively sparse, it may be safer to buffer the data in an external durable queue.
Consider caching data that you retrieve from a service or a database. This can help to reduce the volume of I/O by avoiding repeated requests for the same data.

Symptoms of chatty I/O include high latency and low throughput. End users are likely to report extended response times or failures caused by services timing out, due to increased contention for I/O resources.
Look for any of these symptoms:
A large number of small I/O requests made to the same file.
A large number of small network requests made by an application instance to the same service.
A large number of small requests made by an application instance to the same data store.
Applications and services becoming I/O bound.

Extraneous Fetching antipattern
Retrieving more data than needed for a business operation can result in unnecessary I/O overhead and reduce responsiveness.
This antipattern can occur if the application tries to minimize I/O requests by retrieving all of the data that it might need. This is often a result of overcompensating for the Chatty I/O antipattern.

How to fix the problem
Avoid fetching large volumes of data that may quickly become outdated or might be discarded, and only fetch the data needed for the operation being performed.
Instead of getting every column from a table and then filtering them, select the columns that you need from the database.
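
As a minimal sketch, a LINQ projection over an Entity Framework-style IQueryable source pulls only the required columns. The Product and ProductSummary types here are hypothetical.

using System.Collections.Generic;
using System.Linq;

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }
    public string Description { get; set; }
    public decimal Price { get; set; }
}

public class ProductSummary
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class CatalogQueries
{
    // Project to just the columns the caller needs. Against an Entity Framework
    // DbSet this translates to SELECT Id, Name FROM Products rather than SELECT *.
    public static List<ProductSummary> GetProductSummaries(IQueryable<Product> products)
    {
        return products
            .Select(p => new ProductSummary { Id = p.Id, Name = p.Name })
            .ToList();
    }
}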

Considerations
In some cases, you can improve performance by partitioning data horizontally. If different operations access different attributes of the data, horizontal partitioning may reduce contention. Often, most operations are run against a small subset of the data, so spreading this load may improve performance.
For operations that have to support unbounded queries, implement pagination and only fetch a limited number of entities at a time. For example, if a customer is browsing a product catalog, you can show one page of results at a time (see the pagination sketch after this list).
When possible, take advantage of features built into the data store. For example, SQL databases typically provide aggregate functions.
If you’re using a data store that doesn’t support a particular function, such as aggregation, you could store the calculated result elsewhere, updating the value as records are added or updated, so the application doesn’t have to recalculate the value each time it’s needed.
If you see that requests are retrieving a large number of fields, examine the source code to determine whether all of these fields are actually necessary. Sometimes these requests are the result of a poorly designed SELECT * query.
Similarly, requests that retrieve a large number of entities may be a sign that the application is not filtering data correctly. Verify that all of these entities are actually needed. Use database-side filtering if possible, for example, by using WHERE clauses in SQL.
Offloading processing to the database is not always the best option. Only use this strategy when the database is designed or optimized to do so. Most database systems are highly optimized for certain functions, but are not designed to act as general-purpose application engines.
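
The pagination consideration above might look like the following sketch, reusing the hypothetical Product type from the earlier example. The page index and page size would typically come from the request.

using System.Collections.Generic;
using System.Linq;

public static class CatalogPaging
{
    // Materialize only one page of results per request instead of the whole catalog.
    public static List<Product> GetPage(IQueryable<Product> products, int pageIndex, int pageSize)
    {
        return products
            .OrderBy(p => p.Id)          // a stable ordering is required for paging
            .Skip(pageIndex * pageSize)  // translated to OFFSET in SQL
            .Take(pageSize)              // translated to FETCH NEXT in SQL
            .ToList();
    }
}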

Look for any of these symptoms:

Frequent, large I/O requests made to the same resource or data store.
Contention in a shared resource or data store.
An operation that frequently receives large volumes of data over the network.
Applications and services spending significant time waiting for I/O to complete.

Improper Instantiation antipattern
It can hurt performance to continually create new instances of an object that is meant to be created once and then shared.
Many libraries provide abstractions of external resources. Internally, these classes typically manage their own connections to the resource, acting as brokers that clients can use to access the resource.

These classes are intended to be instantiated once and reused throughout the lifetime of an application.
Continually creating and destroying instances of this class might adversely affect the scalability of the system.

How to fix the problem
If the class that wraps the external resource is shareable and thread-safe, create a shared singleton instance or a pool of reusable instances of the class.
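
For example, HttpClient is designed to be created once and reused. A minimal sketch follows; the service name and URL are placeholders.

using System.Net.Http;
using System.Threading.Tasks;

public class ProductLookupService
{
    // A single HttpClient instance, created once and reused for the lifetime of
    // the application, rather than constructed and disposed per request.
    private static readonly HttpClient httpClient = new HttpClient();

    public Task<string> GetProductAsync(string id)
    {
        // Reusing the shared instance avoids exhausting sockets under load.
        return httpClient.GetStringAsync($"https://example.com/api/products/{id}");
    }
}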

Considerations
The key element of this antipattern is repeatedly creating and destroying instances of a shareable object. If a class is not shareable (not thread-safe), then this antipattern does not apply.
The type of shared resource might dictate whether you should use a singleton or create a pool. The HttpClient class is designed to be shared rather than pooled. Other objects might support pooling, enabling the system to spread the workload across multiple instances.
Objects that you share across multiple requests must be thread-safe. The HttpClient class is designed to be used in this manner, but other classes might not support concurrent requests, so check the available documentation.
Be careful about setting properties on shared objects, as this can lead to race conditions. For example, setting DefaultRequestHeaders on the HttpClient class before each request can create a race condition. Set such properties once (for example, during startup), and create separate instances if you need to configure different settings (see the sketch after this list).
Some resource types are scarce and should not be held onto. Database connections are an example. Holding an open database connection that is not required may prevent other concurrent users from gaining access to the database.
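
One common way to avoid mutating a shared HttpClient per request is to attach per-request settings to an HttpRequestMessage instead. A minimal sketch:

using System.Net.Http;
using System.Net.Http.Headers;
using System.Threading.Tasks;

public class ApiClient
{
    private static readonly HttpClient httpClient = new HttpClient();

    // Per-request settings belong on the HttpRequestMessage, not on the shared
    // client's DefaultRequestHeaders, so concurrent callers cannot race on them.
    public async Task<string> GetWithTokenAsync(string url, string bearerToken)
    {
        using (var request = new HttpRequestMessage(HttpMethod.Get, url))
        {
            request.Headers.Authorization =
                new AuthenticationHeaderValue("Bearer", bearerToken);

            HttpResponseMessage response = await httpClient.SendAsync(request);
            response.EnsureSuccessStatusCode();
            return await response.Content.ReadAsStringAsync();
        }
    }
}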

Symptoms of this problem include a drop in throughput or an increased error rate, along with one or more of the following:
An increase in exceptions that indicate exhaustion of resources such as sockets, database connections, file handles, and so on.
Increased memory use and garbage collection.
An increase in network, disk, or database activity.

Monolithic Persistence antipattern
Putting all of an application’s data into a single data store can hurt performance, either because it leads to resource contention, or because the data store is not a good fit for some of the data.

No Caching antipattern
In a cloud application that handles many concurrent requests, repeatedly fetching the same data can reduce performance and scalability.
When data is not cached, it can cause a number of undesirable behaviors, including:

Repeatedly fetching the same information from a resource that is expensive to access, in terms of I/O overhead or latency.
Repeatedly constructing the same objects or data structures for multiple requests.
Making excessive calls to a remote service that has a service quota and throttles clients past a certain limit.
In turn, these problems can lead to poor response times, increased contention in the data store, and poor scalability.

How to fix the problem
The most popular caching strategy is the on-demand or cache-aside strategy.

On read, the application tries to read the data from the cache. If the data isn’t in the cache, the application retrieves it from the data source and adds it to the cache.
On write, the application writes the change directly to the data source and removes the old value from the cache. It will be retrieved and added to the cache the next time it is required.
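
A minimal cache-aside sketch is shown below. ICache and ICustomerStore are hypothetical abstractions standing in for a real cache client (such as a Redis-backed cache) and the underlying data store, and the Customer type is assumed to have a string Id property.

using System;
using System.Threading.Tasks;

// Hypothetical abstractions; real cache and data-store client APIs will differ.
public interface ICache
{
    Task<Customer> GetAsync(string key);
    Task SetAsync(string key, Customer value, TimeSpan expiry);
    Task RemoveAsync(string key);
}

public interface ICustomerStore
{
    Task<Customer> GetByIdAsync(string id);
    Task UpdateAsync(Customer customer);
}

public class CustomerRepository
{
    private static readonly TimeSpan CacheExpiry = TimeSpan.FromMinutes(5);
    private readonly ICache _cache;
    private readonly ICustomerStore _store;

    public CustomerRepository(ICache cache, ICustomerStore store)
    {
        _cache = cache;
        _store = store;
    }

    // Read path: try the cache first, fall back to the data store on a miss,
    // then add the value to the cache for subsequent requests.
    public async Task<Customer> GetCustomerAsync(string id)
    {
        Customer customer = await _cache.GetAsync(id);
        if (customer == null)
        {
            customer = await _store.GetByIdAsync(id);
            await _cache.SetAsync(id, customer, CacheExpiry);
        }
        return customer;
    }

    // Write path: update the data store, then remove the stale cached copy so the
    // next read repopulates the cache with the current value.
    public async Task UpdateCustomerAsync(Customer customer)
    {
        await _store.UpdateAsync(customer);
        await _cache.RemoveAsync(customer.Id);
    }
}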

Considerations
If the cache is unavailable, perhaps because of a transient failure, don’t return an error to the client. Instead, fetch the data from the original data source. However, be aware that while the cache is being recovered, the original data store could be swamped with requests, resulting in timeouts and failed connections.
For web APIs, you can support client-side caching by including a Cache-Control header in request and response messages, and using ETags to identify versions of objects.
You don’t have to cache entire entities. If most of an entity is static but only a small piece changes frequently, cache the static elements and retrieve the dynamic elements from the data source. This approach can help to reduce the volume of I/O being performed against the data source.
In some cases, if volatile data is short-lived, it can be useful to cache it. For example, consider a device that continually sends status updates. It might make sense to cache this information as it arrives, and not write it to a persistent store at all.
To prevent data from becoming stale, many caching solutions support configurable expiration periods, so that data is automatically removed from the cache after a specified interval. You may need to tune the expiration time for your scenario. Data that is highly static can stay in the cache for longer periods than volatile data that may become stale quickly.
If the caching solution doesn’t provide built-in expiration, you may need to implement a background process that occasionally sweeps the cache, to prevent it from growing without limits.
It might be useful to prime the cache when the application starts. Populate the cache with the data that is most likely to be used.
Always include instrumentation that detects cache hits and cache misses. Use this information to tune caching policies, such as what data to cache, and how long to hold data in the cache before it expires.

Synchronous I/O antipattern
Blocking the calling thread while I/O completes can reduce performance and affect vertical scalability.

A synchronous I/O operation blocks the calling thread while the I/O completes. The calling thread enters a wait state and is unable to perform useful work during this interval, wasting processing resources.

Common examples of I/O include:
Retrieving or persisting data to a database or any type of persistent storage.
Sending a request to a web service.
Posting a message or retrieving a message from a queue.
Writing to or reading from a local file.

This antipattern typically occurs because:

It appears to be the most intuitive way to perform an operation.
The application requires a response from a request.
The application uses a library that only provides synchronous methods for I/O.
An external library performs synchronous I/O operations internally. A single synchronous I/O call can block an entire call chain.

How to fix the problem
Replace synchronous I/O operations with asynchronous operations. This frees the current thread to continue performing meaningful work rather than blocking, and helps improve the utilization of compute resources. Performing I/O asynchronously is particularly efficient for handling an unexpected surge in requests from client applications.
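
As a simple illustration, the pair of methods below contrasts a blocking file read with its asynchronous equivalent; the method names and the idea of reading a settings file are placeholders.

using System.IO;
using System.Threading.Tasks;

public static class FileOperations
{
    // Synchronous version: the calling thread blocks until the read completes.
    public static string ReadSettings(string path)
    {
        return File.ReadAllText(path);
    }

    // Asynchronous version: the thread is released while the operating system
    // completes the I/O, and execution resumes when the data is available.
    public static async Task<string> ReadSettingsAsync(string path)
    {
        using (var reader = new StreamReader(path))
        {
            return await reader.ReadToEndAsync();
        }
    }
}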

For libraries that don’t provide asynchronous versions of operations, it may be possible to create asynchronous wrappers around selected synchronous methods. Follow this approach with caution. While it may improve responsiveness on the thread that invokes the asynchronous wrapper, it actually consumes more resources. An extra thread may be created, and there is overhead associated with synchronizing the work done by this thread.

Considerations
I/O operations that are expected to be very short lived and are unlikely to cause contention might be more performant as synchronous operations. An example might be reading small files on an SSD drive. The overhead of dispatching a task to another thread, and synchronizing with that thread when the task completes, might outweigh the benefits of asynchronous I/O. However, these cases are relatively rare, and most I/O operations should be done asynchronously.
Improving I/O performance may cause other parts of the system to become bottlenecks. For example, unblocking threads might result in a higher volume of concurrent requests to shared resources, leading in turn to resource starvation or throttling. If that becomes a problem, you might need to scale out the number of web servers or partition data stores to reduce contention.

Other performance antipatterns:
– Fixing Performance at the End of the Project
– Measuring and Comparing the Wrong Things
– Algorithmic Antipathy
– Reusing Software
– Iterating Because That’s What Computers Do Well
– Premature Optimization
– Focusing on What You Can See Rather Than on the Problem
– Software Layering
– Excessive Numbers of Threads
– Asymmetric Hardware Utilization
– Not Optimizing for the Common Case
– Needless Swapping of Cache Lines Between CPUs