Fully open Whisper-style speech foundation models developed by CMU WAVLab: https://www.wavlab.org/activities/2024/owsm/
Yifan Peng
pyf98
AI & ML interests
Multimodal LLMs, Speech-to-Speech, Speech Recognition
Recent Activity
liked a model about 17 hours ago: nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-FP8
upvoted a collection about 17 hours ago: NVIDIA Nemotron v3
liked a model about 17 hours ago: nvidia/Nemotron-3-Nano-Omni-30B-A3B-Reasoning-NVFP4