so we've got a memory leak in a yada api process... i suspect it's in direct memory rather than heap: we get no OOMEs logged, and heap telemetry seems well within limits, but our process gets oom-killed by k8s despite the sum of -XX:MaxDirectMemorySize and -Xmx being somewhat less than the cgroups limit
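worth keeping in mind for the arithmetic above: -Xmx and -XX:MaxDirectMemorySize only cap two of the JVM's memory regions, while the cgroup limit counts the whole process (metaspace, code cache, thread stacks, GC structures and native allocations as well). a rough sketch, Java stdlib only, that prints what the JVM itself reports for some of those regions. the class name JvmFootprint is made up for illustration, and these numbers are not expected to add up to the process RSS:

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryMXBean;
import java.lang.management.MemoryUsage;

public class JvmFootprint {

    // Maximum heap the JVM will attempt to use (roughly the -Xmx setting).
    static long heapMaxBytes() {
        return Runtime.getRuntime().maxMemory();
    }

    // Memory currently committed for non-heap areas such as metaspace and code cache.
    static long nonHeapCommittedBytes() {
        MemoryMXBean memory = ManagementFactory.getMemoryMXBean();
        MemoryUsage nonHeap = memory.getNonHeapMemoryUsage();
        return nonHeap.getCommitted();
    }

    // Live thread count; each thread also reserves a native stack outside the heap.
    static int liveThreads() {
        return ManagementFactory.getThreadMXBean().getThreadCount();
    }

    public static void main(String[] args) {
        long mib = 1024L * 1024L;
        System.out.println("heap max (MiB):           " + heapMaxBytes() / mib);
        System.out.println("non-heap committed (MiB): " + nonHeapCommittedBytes() / mib);
        System.out.println("live threads:             " + liveThreads());
    }
}
```

if non-heap committed plus thread stacks already eat most of the gap between the two flags and the cgroup limit, an oom-kill can happen without any single JVM limit being hit, which would also explain the missing OOME.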
i note that we are using :raw-stream? true on our aleph server ('cos streaming uploads don't work without it), but i also note that yada doesn't seem to do any releasing of netty ByteBufs anywhere i can find, so i'm starting to suspect our yada handler is leaking ByteBufs
anyone else noticed anything similar?
hmm. maybe it gets buffer-releasing behaviour from ztellman/byte-streams
yeah, looks like byte-streams/to-byte-array will use the transform defined in aleph.netty which releases ByteBufs
ok, time to get some allocation instrumentation going then
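a minimal sketch of one cheap kind of instrumentation, assuming only the JDK: poll the JVM's "direct" buffer pool via BufferPoolMXBean and watch whether it climbs over time. the class name DirectMemoryProbe is hypothetical. caveat: depending on version and settings, netty can allocate direct memory in a way these beans do not count, so a flat reading here would not rule out a ByteBuf leak:

```java
import java.lang.management.BufferPoolMXBean;
import java.lang.management.ManagementFactory;
import java.nio.ByteBuffer;
import java.util.List;

public class DirectMemoryProbe {

    // Bytes the JVM reports as used by the named NIO buffer pool ("direct" or "mapped").
    // Returns -1 if this JVM does not report a pool with that name.
    static long poolBytesUsed(String poolName) {
        List<BufferPoolMXBean> pools =
            ManagementFactory.getPlatformMXBeans(BufferPoolMXBean.class);
        for (BufferPoolMXBean pool : pools) {
            if (pool.getName().equals(poolName)) {
                return pool.getMemoryUsed();
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        long before = poolBytesUsed("direct");
        // Allocate 1 MiB off-heap so the change is visible in the pool reading.
        ByteBuffer buf = ByteBuffer.allocateDirect(1 << 20);
        long after = poolBytesUsed("direct");
        System.out.println("direct before: " + before);
        System.out.println("direct after:  " + after);
        System.out.println("buffer capacity: " + buf.capacity());
    }
}
```

in a real service the poolBytesUsed reading would go out on a timer to whatever metrics system is already in place. for netty specifically, running with -Dio.netty.leakDetection.level=paranoid makes its leak detector log unreleased ByteBufs, at a noticeable performance cost.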
It's quite a while since I wrote the byte buffer streaming code (and some versions of aleph have gone by, which may have changed behaviour), but I do remember double-checking that all buffers were deallocated.
i don't think i'll get any further without some instrumentation - it seems likely it's something in aleph or yada or our usage thereof, since we have no memory leaks in our kafka-streams apps, and they use largely the same model codebase
it's very annoying that we get oom-killed by cgroups rather than getting an OOME though. no clues whatsoever to follow