#1315 Memory leak in persistent worker threads under load

๐Ÿ› bug โš ๏ธ high priority ๐Ÿ”ง core
alex.dev opened this issue 2 hours ago

We've identified a memory leak occurring in the worker thread pool when handling async callbacks. The leak appears after ~4 hours of runtime under heavy load, eventually triggering the OOM killer.

Reproduction

Steps to reproduce the issue:

  1. Start worker with --pool-size=16
  2. Run stress test suite for 4 hours
  3. Monitor memory usage via /metrics endpoint

Logs

[INFO] Memory: 245MB | Threads: 16/16
[INFO] Memory: 312MB | Threads: 16/16
[WARN] Memory: 512MB | Threads: 16/16
[WARN] Memory: 890MB | Threads: 16/16
[CRIT] Memory: 1.2GB | Threads: 16/16
[CRIT] OOM Killer invoked
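
For anyone who wants to check the trend in their own logs, here is a rough sketch that pulls the memory readings out of lines in the format shown above and checks whether they only ever go up. The function names are made up for illustration, and treating 1GB as 1024MB is an assumption about how the logger reports units.

```python
import re

# Matches lines such as "[WARN] Memory: 512MB | Threads: 16/16".
MEMORY_RE = re.compile(r"Memory:\s*([\d.]+)\s*(MB|GB)")


def parse_memory_mb(line):
    """Return the memory reading in MB, or None if the line has none."""
    match = MEMORY_RE.search(line)
    if match is None:
        return None
    value, unit = float(match.group(1)), match.group(2)
    # Assumption: the logger uses binary units, so 1GB == 1024MB.
    return value * 1024 if unit == "GB" else value


def memory_series(lines):
    """Collect every memory reading, in order of appearance."""
    readings = (parse_memory_mb(line) for line in lines)
    return [r for r in readings if r is not None]


def grows_monotonically(series):
    """True if there are at least two readings and each exceeds the last."""
    return len(series) > 1 and all(a < b for a, b in zip(series, series[1:]))


LOG = """\
[INFO] Memory: 245MB | Threads: 16/16
[INFO] Memory: 312MB | Threads: 16/16
[WARN] Memory: 512MB | Threads: 16/16
[WARN] Memory: 890MB | Threads: 16/16
[CRIT] Memory: 1.2GB | Threads: 16/16
[CRIT] OOM Killer invoked
""".splitlines()

series = memory_series(LOG)
leaking = grows_monotonically(series)
```

Lines without a memory reading, such as the OOM killer message, are skipped. Memory that never drops while the thread count stays at 16/16 is what you would expect from retained objects rather than from pool growth.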

Initial investigation suggests closure retention in the callback queue is preventing garbage collection of completed request objects.
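
To illustrate what closure retention means here, below is a minimal sketch in Python (not the project's actual code). `Request`, `callback_queue`, and `enqueue` are hypothetical stand-ins. It shows that a callback closing over a request keeps that request alive for as long as the callback sits in a queue.

```python
import gc
import weakref


class Request:
    """Hypothetical stand-in for a completed request object."""

    def __init__(self, payload):
        self.payload = payload


# Hypothetical callback queue; entries stay here until someone removes them.
callback_queue = []


def enqueue(request):
    # The lambda closes over `request`, so the queue entry keeps it alive.
    callback_queue.append(lambda: len(request.payload))


def make_and_enqueue():
    request = Request(b"x" * 1024)
    enqueue(request)
    # Return only a weak reference so this function does not pin the object.
    return weakref.ref(request)


ref = make_and_enqueue()
gc.collect()
retained_while_queued = ref() is not None

# Run the callback and drop it from the queue in one step.
callback_queue.pop()()
gc.collect()
released_after_pop = ref() is None
```

The weak reference stays live while the closure is queued and goes dead once the closure is removed. If completed callbacks are never removed from the queue, every request they captured stays reachable.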

maintainer_mia commented 45 minutes ago

Thanks for the detailed report, alex.dev. I can reproduce this on v2.4.1.

Looking at the heap dump, this appears to come from a reference cycle in the callback queue: the worker keeps a strong reference to the context object even after the callback completes, so the context is never collected. I'll draft a PR to fix the reference cycle and add a memory cap to the queue.
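
A rough sketch of the direction I have in mind, written in Python for illustration only. `Context`, `Worker`, `submit`, `run_one`, and `max_pending` are hypothetical names, not the real API, and the cap here limits the number of pending entries as a stand-in for a true memory cap.

```python
import gc
import queue
import weakref


class Context:
    """Hypothetical per-request context object."""

    def __init__(self, request_id):
        self.request_id = request_id


class Worker:
    """Hypothetical worker that releases its context after each callback."""

    def __init__(self, max_pending):
        # Bounded queue: put_nowait raises queue.Full instead of growing.
        self._pending = queue.Queue(maxsize=max_pending)
        self._current_context = None

    def submit(self, callback, context):
        self._pending.put_nowait((callback, context))

    def run_one(self):
        callback, context = self._pending.get_nowait()
        self._current_context = context
        try:
            return callback(context)
        finally:
            # Drop the strong reference once the callback has completed.
            self._current_context = None


worker = Worker(max_pending=2)
ctx = Context(1)
ctx_ref = weakref.ref(ctx)
worker.submit(lambda c: c.request_id, ctx)
del ctx

result = worker.run_one()
gc.collect()
context_released = ctx_ref() is None

# The cap rejects new work once two entries are pending.
worker.submit(lambda c: None, Context(2))
worker.submit(lambda c: None, Context(3))
try:
    worker.submit(lambda c: None, Context(4))
    rejected_when_full = False
except queue.Full:
    rejected_when_full = True
```

Clearing the reference in `finally` means the context is released even when the callback raises. Rejecting submissions at the cap turns unbounded growth into backpressure the caller can see.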
