Melvin Vivas PRO
- Runtime error998
Open VLM Leaderboard
🌎998VLMEvalKit Evaluation Results Collection
- Running on ZeroFeatured425
DeepSeek OCR 2 Demo
🚀425Try out DeepSeek-OCR-2 on your PDFs or images
- Running on ZeroMCP64
Multimodal OCR3
🌖64demo of a collection of impressive ocr models on the hub
-
Qwen/Qwen3-VL-30B-A3B-Instruct
Image-Text-to-Text • Updated • 2.01M • • 540
-
openai/whisper-large-v3
Automatic Speech Recognition • Updated • 6.05M • • 5.43k
-
openai/whisper-large-v3-turbo
Automatic Speech Recognition • Updated • 3.93M • • 2.84k
- Running on ZeroMCPFeatured816
Whisper Large V3
🤫816Transcribe or translate audio and YouTube videos to text
- Running on ZeroFeatured85
Kugel Audio
👀85Generate natural-sounding speech in European languages with voice cloning
- Running on ZeroFeatured231
Omni Video Factory
🏆231text to video, image to video, video extend
- Running on ZeroMCP1.01k
Wan2.2 14B Preview
🐌1.01kgenerate a video from an image with a text prompt
- Running on ZeroMCPFeatured2.8k
Wan2.2 14B Fast
🎥2.8kgenerate a video from an image with a text prompt
- Running1
Qwen-3-VL-8B OCR Receipts
🚀1structured data parser from receipt images
- RunningFeatured255
Qwen3 Omni Demo
⚡255Chat with multimodal AI using text, audio, images, and video
- Running on ZeroFeatured113
VLM Object Understanding
🦀113Explore object detection, visual grounding, keypoint Detecti
- Sleeping2
Dataset Card Drafter
😻2Create dataset descriptions and open PRs automatically
- Running on ZeroFeatured171
VibeVoice-Realtime-0.5B
🐨171Generate natural speech from text with selectable voices
-
microsoft/VibeVoice-1.5B
Text-to-Speech • 3B • Updated • 210k • 2.23k
- RunningFeatured382
Qwen3 TTS Demo
🚀382Generate speech from text with selectable voices
-
mradermacher/Qwen3-1.7B-Multilingual-TTS-GGUF
2B • Updated • 3.84k • 5