pittawat/qwen2.5-7b-instruct-new-math-1k-grpo-with-length-0.1-cot-prompt-v6-planning 8B • Updated 4 days ago • 9
pittawat/qwen2.5-7b-instruct-new-math-1k-grpo-with-length-0.1-cot-prompt-v6-reasoning-strategies 8B • Updated 5 days ago • 18
pittawat/qwen2.5-7b-instruct-new-math-1k-grpo-with-length-0.1-cot-prompt-v6-self-correct 8B • Updated 5 days ago • 8
Nemotron-Cascade Collection Scaling Cascaded Reinforcement Learning for General-Purpose Reasoning Models • 18 items • Updated 9 days ago • 44
pittawat/qwen2.5-7b-instruct-new-math-1k-grpo-with-length-0.1-cot-prompt-v6-from-rl 8B • Updated 26 days ago • 13
pittawat/qwen2.5-14b-instruct-still-3-1k-grpo-with-length-0.1-cot-prompt-v6 15B • Updated Dec 3, 2025 • 1
pittawat/qwen2.5-7b-instruct-math-1k-grpo-cot-prompt-new-intermediate-ckpt-124 8B • Updated Dec 3, 2025 • 1
pittawat/qwen2.5-7b-instruct-math-1k-grpo-cot-prompt-new-intermediate-ckpt-93 8B • Updated Dec 3, 2025 • 4
pittawat/qwen2.5-7b-instruct-math-1k-grpo-cot-prompt-new-intermediate-ckpt-62 8B • Updated Dec 3, 2025 • 1