We are currently migrating from Java 8 to Java 17, but with ZGC enabled most of our microservices use more memory, leading to OOM kills by Docker.
I investigated this and noticed some differences:
- With ZGC, the Java process tends to use a lot more shared memory, because ZGC backs the heap with tmpfs when the Linux kernel version is below 3.17.
- The tmpfs-backed heap only shows up under the cache entry of memory.stat, so rss is mostly composed of off-heap memory.
- Some microservices under heavy load show constantly growing rss, while the NMT result shows committed memory barely grows (NMT committed memory excluding the heap is only about 500MB, but rss grows to 783MB); see the commands after this list for how I compare the two.
- There is a similar problem like this one.
- Our Java version is 17.0.7, and I didn't find any related bug fix in the release notes of newer Java 17 versions.
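For reference, this is roughly how the numbers below were collected (a minimal sketch; the jar name and <pid> are placeholders, and -Xmx1024m matches the 1024MB heap mentioned above):

# start the service with ZGC and Native Memory Tracking enabled
java -XX:+UseZGC -XX:NativeMemoryTracking=summary -Xmx1024m -jar app.jar

# JVM-side view of native memory (summary shown at the bottom of this post)
jcmd <pid> VM.native_memory summary

The cgroup-side view from inside the container: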
cat /sys/fs/cgroup/memory/memory.stat
cache 1764929536
rss 821440512 // Don't know why it is so large and keeps growing
rss_huge 92274688
shmem 1073741824 // exactly matches the 1024MB heap
mapped_file 1014423552
dirty 38055936
writeback 0
swap 0
pgpgin 9505349
pgpgout 8896394
pgfault 9380913
pgmajfault 105
inactive_anon 994127872
active_anon 901025792
inactive_file 606601216
active_file 84586496
unevictable 0
hierarchical_memory_limit 2621440000
hierarchical_memsw_limit 2621440000
total_cache 1764929536
total_rss 821440512
total_rss_huge 92274688
total_shmem 1073741824
total_mapped_file 1014423552
total_dirty 38055936
total_writeback 0
total_swap 0
total_pgpgin 9505349
total_pgpgout 8896394
total_pgfault 9380913
total_pgmajfault 105
total_inactive_anon 994127872
total_active_anon 901025792
total_inactive_file 606601216
total_active_file 84586496
total_unevictable 0
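Putting those numbers against the container limit (my own arithmetic, assuming the usual cgroup v1 accounting where usage is roughly rss + cache, with the shared-memory heap counted inside cache):

# rss (off-heap) + cache (page cache + the 1024MB shmem heap)
echo $(( 821440512 + 1764929536 ))   # 2586370048 bytes ≈ 2466 MiB
# hierarchical_memory_limit above is 2621440000 bytes = 2500 MiB

So we are only about 34MB below the limit, and as far as I understand the shmem part of cache cannot be reclaimed because swap is 0, so any further growth of rss ends in the OOM kill.

And the NMT summary from jcmd for the same process: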
Native Memory Tracking:
(Omitting categories weighting less than 1KB)
Total: reserved=59804324KB, committed=1561672KB
malloc: 202620KB #928892
mmap: reserved=59601704KB, committed=1359052KB
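For completeness, this is one way to check whether the rss growth is visible to NMT at all or lives outside it (glibc malloc arenas, direct byte buffers, memory-mapped files, etc.); <pid> is a placeholder and the pmap line is just a generic way to spot large resident mappings:

jcmd <pid> VM.native_memory baseline       # snapshot the current NMT numbers
# ... run the service under load for a while ...
jcmd <pid> VM.native_memory summary.diff   # growth tracked by NMT shows up here
pmap -x <pid> | sort -k3 -nr | head -20    # largest resident mappings, including memory NMT does not track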