I tried to get a pprof profile of a process running on my Ubuntu 22.04 machine with the following command:
go tool pprof http://localhost:9091/debug/pprof/profile
When the target process is under a light workload, this command returns a valid profile, but a profile taken under light load is meaningless to me.
When the target process is under heavy load, consuming about 200% CPU as I expected, the command above gets stuck and never finishes.
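To check whether the hang is in the pprof tool or in the HTTP endpoint itself, the profile can also be fetched directly with a bounded sampling window and a client-side timeout. The following is only a minimal sketch, not a confirmed fix. It assumes port 9091 is served by the standard net/http/pprof handler, which accepts a "seconds" query parameter. The helper names, the 5-second window, the 10-second grace period, and the output file name are made up for illustration.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"net/url"
	"os"
	"strconv"
	"time"
)

// profileURL builds the CPU profile URL with an explicit sampling window.
// Assumption: the target serves the standard net/http/pprof handlers.
func profileURL(base string, seconds int) (string, error) {
	u, err := url.Parse(base)
	if err != nil {
		return "", err
	}
	u.Path = "/debug/pprof/profile"
	q := url.Values{}
	q.Set("seconds", strconv.Itoa(seconds))
	u.RawQuery = q.Encode()
	return u.String(), nil
}

// fetchProfile (hypothetical helper) saves the profile to outPath and gives
// up after the sampling window plus a grace period, so it cannot hang forever.
func fetchProfile(base string, seconds int, grace time.Duration, outPath string) error {
	target, err := profileURL(base, seconds)
	if err != nil {
		return err
	}
	deadline := time.Duration(seconds)*time.Second + grace
	ctx, cancel := context.WithTimeout(context.Background(), deadline)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, target, nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}

	f, err := os.Create(outPath)
	if err != nil {
		return err
	}
	defer f.Close()
	_, err = io.Copy(f, resp.Body)
	return err
}

func main() {
	// Address taken from the command above; the other values are arbitrary.
	err := fetchProfile("http://localhost:9091", 5, 10*time.Second, "cpu.pprof")
	if err != nil {
		fmt.Fprintln(os.Stderr, "fetch failed:", err)
		return
	}
	fmt.Println("wrote cpu.pprof; inspect it with: go tool pprof cpu.pprof")
}
```

If this request also times out under heavy load, the problem is more likely on the server side than in the pprof tool.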
ENV:
Ubuntu 22.04
go: 1.22.3
pprof: latest (installed with go install github.com/google/pprof@latest)
I captured the stack of the stuck pprof process as follows:
#0 runtime.futex () at /usr/local/go/src/runtime/sys_linux_amd64.s:558
#1 0x0000000000439bf0 in runtime.futexsleep (addr=0xfffffffffffffe00, val=0, ns=4671651) at /usr/local/go/src/runtime/os_linux.go:69
#2 0x0000000000411ac7 in runtime.notesleep (n=0xd2c5c0 <runtime.m0+320>) at /usr/local/go/src/runtime/lock_futex.go:170
#3 0x0000000000445133 in runtime.mPark () at /usr/local/go/src/runtime/proc.go:1761
#4 runtime.stoplockedm () at /usr/local/go/src/runtime/proc.go:3026
#5 0x000000000044745a in runtime.schedule () at /usr/local/go/src/runtime/proc.go:3847
#6 0x0000000000447aac in runtime.park_m (gp=0xc0001be380) at /usr/local/go/src/runtime/proc.go:4036
#7 0x0000000000470a6e in runtime.mcall () at /usr/local/go/src/runtime/asm_amd64.s:458
#8 0x00007fff5315c688 in ?? ()
#9 0x00000000004753ff in runtime.newproc (fn=0x47096f <runtime.rt0_go+303>) at <autogenerated>:1
#10 0x00000000004709e5 in runtime.mstart () at /usr/local/go/src/runtime/asm_amd64.s:394
#11 0x000000000047096f in runtime.rt0_go () at /usr/local/go/src/runtime/asm_amd64.s:358
#12 0x0000000000000002 in ?? ()
#13 0x00007fff5315c6d8 in ?? ()
#14 0x00007fff5315c6d0 in ?? ()
#15 0x0000000000000002 in ?? ()
#16 0x00007fff5315c6d8 in ?? ()
#17 0x0000777de028b2ca in _dl_start_user () from /lib64/ld-linux-x86-64.so.2
#18 0x0000000000000002 in ?? ()
#19 0x00007fff5315da7b in ?? ()
#20 0x00007fff5315daa4 in ?? ()
#21 0x0000000000000000 in ?? ()
It seems that pprof is sleeping in the scheduler, and it is hard to tell why.
I'm sure the hang is related to the workload: when I reduce the workload by adding a sleep on the client side, such as 'sleep(10)', pprof still hangs for a while but eventually returns a result.
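The backtrace above appears to show only a parked runtime thread in the pprof client, so the goroutine stacks of the target process may say more about what is happening during the hang. A minimal sketch, again assuming the target serves the standard net/http/pprof handlers on port 9091: the goroutine endpoint with debug=2 returns a plain-text dump of all goroutine stacks. The function names and the 10-second timeout are hypothetical.

```go
package main

import (
	"context"
	"fmt"
	"io"
	"net/http"
	"os"
	"strings"
	"time"
)

// goroutineDumpURL returns the URL of the full text goroutine dump.
// Assumption: the standard net/http/pprof handlers are registered.
func goroutineDumpURL(base string) string {
	return strings.TrimRight(base, "/") + "/debug/pprof/goroutine?debug=2"
}

// dumpGoroutines (hypothetical helper) copies the dump to w, or fails
// once the timeout expires instead of blocking indefinitely.
func dumpGoroutines(base string, timeout time.Duration, w io.Writer) error {
	ctx, cancel := context.WithTimeout(context.Background(), timeout)
	defer cancel()

	req, err := http.NewRequestWithContext(ctx, http.MethodGet, goroutineDumpURL(base), nil)
	if err != nil {
		return err
	}
	resp, err := http.DefaultClient.Do(req)
	if err != nil {
		return err
	}
	defer resp.Body.Close()
	if resp.StatusCode != http.StatusOK {
		return fmt.Errorf("unexpected status %s", resp.Status)
	}
	_, err = io.Copy(w, resp.Body)
	return err
}

func main() {
	// Run this while the CPU profile request is hanging.
	err := dumpGoroutines("http://localhost:9091", 10*time.Second, os.Stdout)
	if err != nil {
		fmt.Fprintln(os.Stderr, "dump failed:", err)
	}
}
```

If even this request does not return under load, that would suggest the HTTP server in the target is not getting scheduled or is blocked, rather than the profiler alone.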
Has anyone experienced something like this?