Learning Video LLM with Streaming Speech Transcription at Scale (CVPR 2025)
Joya Chen
chenjoya
AI & ML interests
Video LLM
Recent Activity
upvoted a paper about 11 hours ago
Dream.exe: Can Video Generation Models Dream Executable Robot Manipulation?
upvoted a paper 21 days ago
AnyFlow: Any-Step Video Diffusion Model with On-Policy Flow Map Distillation
liked a model 23 days ago
nvidia/AnyFlow-FAR-Wan2.1-1.3B-Diffusers