Music Flamingo
Answer questions about any song from audio or YouTube link
The secrets to building world-class LLMs
Multimodal OCR model for complex document understanding
Generate natural-sounding speech in European languages with voice cloning
Create 3D models from images using depth estimation
Decompose a 3D model into its individual parts
Generate realistic dialogue from a script, using Dia!