I recently had to brush up on my WinDbg knowledge because of a performance issue that occurred in a production environment.
Normally you don’t have to go down the memory dump route to find out what’s causing a performance bottleneck in your application. If you have an APM tool such as New Relic, it will show you the hotspots in your application. If you don’t have an APM tool, then at a minimum you need to use Windows performance counters to gather metrics on hardware utilization (CPU, memory, disk I/O), along with the ASP.NET performance counters.
I especially like the New Relic Thread Profiler feature, which lets you run a profiler against your application, even in a production environment. No, I didn’t make a mistake there. You can read more about it at https://docs.newrelic.com/docs/apm/applications-menu/events/thread-profiler-tool
Oftentimes the New Relic Thread Profiler analysis gives me the information I need to pinpoint the application bottleneck and work with the team on an optimization plan.
But there are times when you need a lot more detail about what’s going on in your application, the extra information that helps you validate your hypothesis. Enter memory dump analysis.
Capturing memory dumps
There’s an existing Sitecore KB article that explains how to capture memory dumps using different tools, which you can refer to at https://kb.sitecore.net/articles/488758
Memory dumps analysis
There are a couple of tools that you can use to analyze your memory dump files. I’ll go through each one and explain what I use it for.
Debug Diag is very useful because it finds problematic areas of your application based on the memory dump file that you provide. You can also supply multiple memory dump files taken consecutively from the same server to get a better performance analysis result.
Once you have the memory dump file available, open it in Debug Diag, select the analysis rules, and run the analysis.
It will then generate an .mhtml file that can be opened with Internet Explorer. When you open it, you will be able to see a summary of the analysis outcome and the stack traces of the offending threads.
WinDbg is the tool where you will spend most of your time doing memory dump analysis, as it’s very powerful and has plenty of useful commands and extensions that can help you narrow down the cause of the problem.
There’s a new version of WinDbg called WinDbg Preview, available through the Windows Store, which has a more modern look, but the commands that I reference in this section should apply to the older version as well.
The first thing that you want to do is load the SOS extension:
.loadby sos clr
Then load the Mex extension, which you can download at https://www.microsoft.com/en-us/download/details.aspx?id=53304
And the Sosex extension, which you can download at http://www.stevestechspot.com/
The !threadpool command (from SOS) lets you check the CPU utilization at the time the memory dump was taken, along with the number of active worker threads and IOCP threads.
The !runaway command displays the threads and orders them by the execution time each one had consumed up to the point the memory dump was taken.
The ~Ns command (where N is the thread number) sets the context to the specified thread, so that the commands you run afterwards execute against that thread.
If the current thread is executing .NET managed code, the !clrstack command dumps its full managed stack trace, which helps identify problematic code in your application.
The !dumpobj command (!do for short) dumps the information of the object at the specified memory address.
The !syncblk command displays all the threads that own a lock. If an entry has a high MonitorHeld value, it suggests that you might have a deadlock or a lock contention issue. The value is either 0 or an odd number. For example, if the value is 89, then the owning thread accounts for 1, and the remaining 88 means that 44 other threads are waiting on the lock, since each waiting thread adds 2 to the count.
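The MonitorHeld arithmetic above can be sketched in a few lines of Python. This is only an illustration of the counting rule described here (1 for the owner, 2 per waiting thread); the function name decode_monitor_held is hypothetical and not part of WinDbg or SOS.

```python
from typing import Tuple


def decode_monitor_held(monitor_held: int) -> Tuple[bool, int]:
    """Split a MonitorHeld value into (has_owner, waiting_threads).

    Assumes the rule described in the text: the owning thread
    contributes 1 and every waiting thread contributes 2.
    """
    if monitor_held < 0:
        raise ValueError("MonitorHeld cannot be negative")
    has_owner = monitor_held % 2 == 1      # an odd value means the lock is owned
    waiting_threads = monitor_held // 2    # each waiter adds 2 to the count
    return has_owner, waiting_threads


if __name__ == "__main__":
    owned, waiters = decode_monitor_held(89)
    print(f"owned={owned}, waiting threads={waiters}")
```

For the value 89 from the example, this gives one owner and 44 waiting threads.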
The !dlk command (from Sosex) analyzes whether a deadlock has occurred between threads.
Similar to the !clrstack command above but better.
This command displays information about the .NET SQL connection pool objects, which is useful for finding out how many SQL connections were open.
This command displays all the running and completed ASPX HTTP requests that were processed up to the time of the snapshot. You can sort the results by how long each request took, so that you can isolate the area of your performance problem.
You can open both your application source code and the memory dump file in Visual Studio, which lets you switch between threads and see what line of code each one is executing. I normally combine this with WinDbg: I use WinDbg to isolate the problematic threads, then use Visual Studio for the convenience of exploring the source code.
You can also open the memory dump with JetBrains dotMemory to get an idea of what objects were created by your application. You want a good sense of what’s normal for your application here: how many custom objects you expect, and whether any anomalies occur.
If you’re investigating a memory leak, you will spend more time in this area.