A story about performance issue identification and solution only on the developer End.

Sajidur Rahman
4 min readMay 11, 2023

--

Stories: Recently we are facing a problem when high traffic in our application shows the following scenario

  1. Showing Execution Timeout expired with the SQL server database. But there are no expensive/queries in the database but have a huge process with AWAITING COMMAND by following the command
  2. If the database server is restarted or all services are shut down for a while and the start problem is gone for a few days.

Platform:

Infrastructure: Docker in Redhat Enterprise

Language & Framework: .Net 6 with Entity framework

Datbase: SQL Server

So what is going on here? there could be lots of factors causing the issue. Here are some potential reasons and troubleshooting steps you can follow:

  1. Locking and Blocking: Check if there are any locks or blocking processes occurring in the database. Use SQL Server tools such as Activity Monitor or dynamic management views (DMVs) like sys.dm_tran_locks and sys.dm_os_waiting_tasks to identify any blocking processes. Resolve any blocking issues to allow the affected processes to proceed.
  2. Resource Contention: Evaluate the server’s resources, such as CPU, memory, and disk I/O. High resource utilization can lead to queries being delayed in the “AWAITING COMMAND” state. Monitor the server’s performance using tools like SQL Server Profiler, Extended Events, or third-party monitoring tools. Identify any resource bottlenecks and consider scaling up your infrastructure if necessary.
  3. Network Latency: Check for any network issues that could be causing delays in the communication between your application and the SQL Server. Ensure that the network connection is stable and there are no latency issues affecting the queries’ execution. It is one of the best tools to find out the problems related to SQL server network issues https://github.com/microsoft/CSS_SQL_Networking_Tools/wiki/0300-Intermittent-or-Periodic-Network-Issue
  4. Connection Pooling and Connection Limits: Review your connection pooling configuration in the application code. Verify that you have configured appropriate connection pool settings such as maximum pool size and connection timeout. If the connection pool is exhausted, it can result in delays in acquiring a connection, leading to the “AWAITING COMMAND” state.
  5. Application Code or Infrastructure: Examine your application code and infrastructure for any bottlenecks or performance issues that could be causing delays. Consider profiling your application code and optimizing any areas that might be causing the delays.
  6. SQL Server Configuration: Review the SQL Server configuration settings, such as the maximum worker threads, the maximum degree of parallelism (MAXDOP), or other relevant settings. Ensure that the configuration is suitable for your workload and hardware resources. Consult with database administrators or performance experts to fine-tune the SQL Server configuration if needed.

There are lots of issues we identify by investigating the above steps. Here I am sharing only application end changes that improve our timeout expiration issue. another end like database, infrastructure/network needs to be optimized to solve the exact problem.

By investigating these potential causes, we are able to identify the underlying issue leading to the “AWAITING COMMAND” status in SQL Server. Addressing these factors can help improve the performance and responsiveness of your application.

  1. Cacheing: We implemented Redis and .Net 6 Memory cache. As already we notice that in high traffic application show expire timeout error so first introduce caching in those API endpoints. you can check details on how will you implement cache in your application https://learn.microsoft.com/en-us/aspnet/core/performance/caching/overview?view=aspnetcore-7.0
  2. We implement Non-Blocking Asynchronous Operations: Maybe you can think about why and how async help here. By using asynchronous methods, you can perform non-blocking database operations. When a request is made to your API, it can initiate the database query and return a Task or a Task<T> immediately without waiting for the database to respond. This allows the API to handle other incoming requests, and when the database operation completes, it can resume processing the response.

A web server has a limited number of threads available, and in high-load situations, all of the available threads might be in use. When that happens, the server can’t process new requests until the threads are freed up. With synchronous code, many threads may be tied up while they aren’t actually doing any work because they’re waiting for I/O to complete. With asynchronous code, when a process is waiting for I/O to complete, its thread is freed up for the server to use for processing other requests. As a result, asynchronous code enables server resources to be used more efficiently, and the server is enabled to handle more traffic without delays.

Asynchronous code does introduce a small amount of overhead, but for low-traffic situations, the performance hit is negligible, while for high-traffic situations, the potential performance improvement is substantial.

3. Connection Pooling: Connection pooling allows the reuse of existing database connections instead of creating new ones for each request, reducing the overhead involved in establishing connections. So we set maximum and minimum Connection Pooling.

Connection Timeout=180;Command Timeout=120;Max Pool Size=6000;Min Pool Size=500;Connection Lifetime=300;

And finally, after doing some load tests we found that it’s working better than the previews. and the error is almost gone with a large number of requests that are generated from JMeter.

Do you have any other way? If I miss anything pls comments here.

You can learn also

--

--

No responses yet