OpenGVLab/VideoChat-Flash-Qwen2_5-7B_InternVideo2-1B
Video-Text-to-Text • 9B • Updated • 1.13k • 7
Computer Vision
RIVER: A Real-Time Interaction Benchmark for Video LLMs
InternVideo-Next: Towards General Video Foundation Models without Video-Text Supervision