datasets
datasets of interest
bigcode/the-stack-dedup • Viewer • Updated Aug 17, 2023 • 237M • 16.4k • 392
liwu/MNBVC • Updated about 1 month ago • 96.3k • 608
code-search-net/code_search_net • Viewer • Updated Feb 23 • 4.14M • 18.9k • 327
HuggingFaceH4/ultrachat_200k • Viewer • Updated Oct 16, 2024 • 515k • 47.2k • 692
paper reading
LLM Pruning and Distillation in Practice: The Minitron Approach Paper • 2408.11796 • Published Aug 21, 2024 • 61
RL's Razor: Why Online Reinforcement Learning Forgets Less Paper • 2509.04259 • Published Sep 4, 2025 • 7