Race condition in concurrent merge resolver on high-load clusters
alex_kim added labels
bug
performance
backend
2 hours ago
system automatically assigned to
sarah_ops
2 hours ago
marcus_dev commented
I can repro this locally on `main`. Looks like the new async queue implementation isn't releasing locks on timeout. I'll spin up a PR shortly.
45 minutes ago
Leave a comment
Markdown supported
Description
When processing concurrent merges on nodes with >64GB RAM, the
merge_resolverservice occasionally throws a deadlock exception. This seems to happen specifically whenbinary_blob_size > 500MB.Steps to Reproduce:
DEADLOCK_DETECTED.Stack Trace:
This is blocking our v3.2 release. Tagging @backend-team for immediate review.