Continuous batching just landed in TRL for GRPO!
At 64 generations, it runs faster and uses less VRAM than plain generate, with no vLLM needed.
How it works and when to reach for it, below:
https://huggingface.co/blog/sergiopaniego/cb-trl-grpo