Getting memory dump from .Net core Linux container on AKS

Guide to get a memory dump from a .Net core conainter on AKS Linux and analyze it with Visual Studio

When running .Net core on Linux containers it should be easy to get an memory dump. However it is challenging to find a end to end guide on how to do this. We, Oscar Obeso and me, give you the steps in this blog post from indenting your issue, get a dump and than analyze the dump to get the root cause.

Monitoring

We are running our .Net core containers (3.x) on an AKS Linux cluster (1.18.x). To monitor the containers we have monitoring in place. On a dashboard (powerbi and azure alerts) we monitor the trends in calls, memory and cpu. Keeping track of the health enables us to identify issues before they are crashing your containers. When looking at our memory profile we noticed a gradual grow between new deployments. This could indicate a memory issue:

Memory monitoring on max memory for service pod in AKS cluster

You have a number of option on how to approach the issue. Check all your source code in review, try to reproduce locally or get a dump from a container where you know the issue is happening. In this guide we choose to go for getting a dump. This has the most change of identifying the issue correctly in a limited amount of time.

Get a dump

By following these steps you can get a dump from a .Net core container:

  • Log on to azure
  • Log on to the AKS cluster
  • Get the running pods for finding something to dump
  • Log on the pod
  • Install tools
  • Make dump (gc dump)
  • Download dump

Log on to azure

az logon
az account set --subscription <subscription_guid>

log on to AKS cluster

az aks get-credentials --resource-group <resource_group_name> --name <cluster_name>

Get the running pods for finding something to dump

kubectl get pods --namespace <namespace>

Log on to the pod

kubectl exec -it <pod_name> -c <container_name> --namespace <namespace> /bin/sh

Install tools (gc dump)

# Install bash
apk add bash

# Update wget. the pod already has it but it has an outdated version, and needs to be updated to install dotnet sdk
apk add wget

# Download DotnetSDK Installer
wget -O sdk_install.sh https://dot.net/v1/dotnet-install.sh

# Add permissions to file
chmod 777 sdk_install.sh

# Install sdk. -c argument also takes a 'Current' but it points to dotnet 5
./sdk_install.sh -c 3.1

# Go to the dotnet folder. In order to use SDK the process needs to be run directly
cd /root/.dotnet

# install dotnet gcdump or install dotnet-dump to get a full dump
./dotnet tool install --global dotnet-gcdump 

# Go to the tools folder
cd tools

Create dump

# This will list all the current processes where a garbage collector dump can be obtained from
./dotnet-gcdump ps 

# This will create the garbage collector dump for the given PID, in this case 1
./dotnet-gcdump collect -p 1 

# Exit the container, we are ready here
exit

Download dump

# Download the dump file from your pod to you local machine
kubectl exec -n <namespace> <pod_name> -- cat /root/.dotnet/tools/dumpresult1.gcdump > dumpresult1.gcdump

Analyze the dump file

When you downloaded the dump file you are can open it in Visual Studio. Then sort on memory usage, the root cause probably goes to the top of your list:

Visual Studio opens gcdump file

Conclusion

Getting started to get a dump file is far more intimating than actual getting one. Having the step to do so make it actually very easy. If you know the root cause, just fix you code and wait till the next time something gets out of control!

2 thoughts on “Getting memory dump from .Net core Linux container on AKS”

  1. Thanks for your post. May I confirm if it is just for management memory? Are there any tools/articles for dumping unmanaged memory?

    Like

  2. Hi! this was really helpful. Just a comment I had to use kubectl cp instead of your cat command otherwise my file was getting corrupted.

    Like

Leave a comment

This site uses Akismet to reduce spam. Learn how your comment data is processed.