Came across this while I was working on something similar. Good work.
I built Inferential to handle multi-robot inference orchestration on Ray Serve. Deadline-aware scheduling scores requests across cadence urgency (EMA-tracked), robot-reported priority, anti-starvation age, trajectory progress, and static priority. ZMQ ROUTER/DEALER transport with protobuf+numpy wire format -- under 1ms serialize-transmit-deserialize vs ~100ms over gRPC. Request TTL, overflow policies, automatic disconnect detection, dispatch retry.
Will try to benchmark Inferential as well.
Open source. SDKs in Python, C++, and Rust.
github.com/nalinraut/inferential
Full writeup: nalinraut.github.io/blog/2026/inferential/