3PO Models
3PO family methods trained on DapoMath-17k using Olmo3-IVON-SFT-7B and Qwen2.5Math-IVON-SFT-7B
BayesRL/Olmo3-B3PO-7B • Text Generation • 7B • Updated 7 days ago • 12
BayesRL/Olmo3-M3PO-7B • Text Generation • 7B • Updated 7 days ago • 14
BayesRL/Olmo3-C3PO-7B • Text Generation • 7B • Updated 7 days ago • 15
BayesRL/Olmo3-M3POPlus-7B • Text Generation • 7B • Updated 7 days ago • 15
Warm-started Checkpoints
A collection of three models trained on the Nemotron Post Training Dataset for reasoning tasks with IVON
BayesRL/Llama3.1-IVON-SFT-8B • Text Generation • 8B • Updated 7 days ago • 71
BayesRL/Qwen2.5Math-IVON-SFT-7B • 8B • Updated Apr 7 • 2.97k
BayesRL/Olmo3-IVON-SFT-7B • Text Generation • 7B • Updated 7 days ago • 1.79k