Real-ESRGAN: Training Real-World Blind Super-Resolution with Pure Synthetic Data Paper • 2107.10833 • Published Jul 22, 2021 • 2
GLiNER Guard: Unified Encoder Family for Production LLM Safety and Privacy Paper • 2605.05277 • Published 24 days ago • 4
OneFormer: One Transformer to Rule Universal Image Segmentation Paper • 2211.06220 • Published Nov 10, 2022 • 1
Towards Universal Video MLLMs with Attribute-Structured and Quality-Verified Instructions Paper • 2602.13013 • Published Feb 13 • 55
DocLayout-YOLO: Enhancing Document Layout Analysis through Diverse Synthetic Data and Global-to-Local Adaptive Perception Paper • 2410.12628 • Published Oct 16, 2024 • 42
YOLO-World: Real-Time Open-Vocabulary Object Detection Paper • 2401.17270 • Published Jan 30, 2024 • 44
VideoTemp-o3: Harmonizing Temporal Grounding and Video Understanding in Agentic Thinking-with-Videos Paper • 2602.07801 • Published Mar 13 • 1
Qwen-Image-Bench: From Generation to Creation in Text-to-Image Evaluation Paper • 2605.28091 • Published 3 days ago • 3
Recurrent Drafter for Fast Speculative Decoding in Large Language Models Paper • 2403.09919 • Published Mar 14, 2024 • 22
VideoAgent: Long-form Video Understanding with Large Language Model as Agent Paper • 2403.10517 • Published Mar 15, 2024 • 37
Gemini 1.5: Unlocking multimodal understanding across millions of tokens of context Paper • 2403.05530 • Published Mar 8, 2024 • 65
MM1: Methods, Analysis & Insights from Multimodal LLM Pre-training Paper • 2403.09611 • Published Mar 14, 2024 • 130
Finetuned Multimodal Language Models Are High-Quality Image-Text Data Filters Paper • 2403.02677 • Published Mar 5, 2024 • 19
InfiMM-HD: A Leap Forward in High-Resolution Multimodal Understanding Paper • 2403.01487 • Published Mar 3, 2024 • 17
When Scaling Meets LLM Finetuning: The Effect of Data, Model and Finetuning Method Paper • 2402.17193 • Published Feb 27, 2024 • 27
Sora: A Review on Background, Technology, Limitations, and Opportunities of Large Vision Models Paper • 2402.17177 • Published Feb 27, 2024 • 88