malaysia-ai/Qwen3-1.7B-Multilingual-TTS Text Generation • 2B • Updated Nov 27, 2025 • 460 • 8
Apollo: An Exploration of Video Understanding in Large Multimodal Models Paper • 2412.10360 • Published Dec 13, 2024 • 147 • 13
Vidore Leaderboard 🥇 • Running • 196 • Compare and rank visual document retrieval models across different benchmarks
mikeyandfriends/PixelWave_FLUX.1-dev_03 Text-to-Image • 12B • Updated Nov 5, 2024 • 637 • 194