udif
AI & ML interests
None yet
Organizations
None yet
papers
- Adding NVMe SSDs to Enable and Accelerate 100B Model Fine-tuning on a Single GPU
  Paper • 2403.06504 • Published • 56
- Can 1B LLM Surpass 405B LLM? Rethinking Compute-Optimal Test-Time Scaling
  Paper • 2502.06703 • Published • 152
- Slamming: Training a Speech Language Model on One GPU in a Day
  Paper • 2502.15814 • Published • 69
models 0
None public yet
datasets 0
None public yet