Document Understanding - a CIMAI Collection

Document Understanding

Updated Feb 20

https://www.2077ai.com/dataset/dataset-omnidocbench

zai-org/GLM-OCR

Image-to-Text • Updated 19 days ago • 8.36M • 1.69k

deepseek-ai/DeepSeek-OCR-2

Image-Text-to-Text • 3B • Updated Feb 3 • 1.52M • 938

PaddlePaddle/PaddleOCR-VL-1.5

Image-Text-to-Text • 1.0B • Updated 3 days ago • 101k • 605

lightonai/LightOnOCR-2-1B-base

Image-Text-to-Text • 1B • Updated Jan 21 • 7.06k • 11

opendatalab/MinerU2.5-2509-1.2B

Image-Text-to-Text • 1B • Updated 25 days ago • 1.49M • 356

Note: this model is under the AGPL-3.0 license: "If you use AGPL-3.0 licensed software in a network-accessible application, you must make the entire source code of your application available to users of that application." :(

rednote-hilab/dots.ocr

Image-Text-to-Text • 3B • Updated Oct 31, 2025 • 185k • 1.3k