Article Welcome Gemma 4: Frontier multimodal intelligence on device 3 days ago • 504
Article Introducing Cohere-transcribe: state-of-the-art speech recognition 9 days ago • 34
IndicConformer Collection A collection of ASR models for 22 scheduled languages of India • 23 items • Updated Mar 2 • 29
OWLS: Scaling Laws for Speech Recognition and Translation Collection 🦉 A suite of Whisper-style models from 250M to 18B parameters. Trained on up to 360K hours of data. 16 kHz sampling rate. • 8 items • Updated May 3, 2025 • 7
Article ColPali: Efficient Document Retrieval with Vision Language Models 👀 Jul 5, 2024 • 317
LLM Compiler Collection Meta LLM Compiler is a state-of-the-art LLM that builds upon Code Llama with improved performance for code optimization and compiler reasoning. • 4 items • Updated Jun 27, 2024 • 157
impira/layoutlm-document-qa Document Question Answering • 0.1B • Updated Mar 18, 2023 • 25.3k • 1.17k
google/pix2struct-docvqa-base Visual Question Answering • 0.3B • Updated Dec 24, 2023 • 2.86k • 44