dots.ocr GGUF

GGUF conversions of rednote-hilab/dots.ocr for use with llama.cpp.

Files

File                         Size     Description
Dots.Ocr-1.8B-Q8_0.gguf      1.8 GB   Text model, 8-bit quantized
Dots.Ocr-1.8B-F16.gguf       3.4 GB   Text model, float16
mmproj-Dots.Ocr-F16.gguf     2.4 GB   Vision encoder (mmproj), float16

Architecture

dots.ocr = Qwen2 text backbone (1.7B params, 28 layers) + modified Qwen2-VL vision encoder (1.2B params, 42 layers).

Key differences from Qwen2-VL:

  • The text model is standard Qwen2 with 1D RoPE (not M-RoPE)
  • The vision encoder uses RMSNorm, a SiLU-gated MLP, Conv2D patch embedding, and no attention bias
  • 2D M-RoPE is applied inside the vision encoder only
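
The parameter counts above line up with the sizes listed under Files. The following is a rough back-of-the-envelope check, not a description of how the files were produced. It assumes every tensor is stored in a single type and ignores GGUF metadata; the function name `estimate_size_gb` is made up for this illustration. Q8_0 is taken as 8.5 bits per weight, since each block holds 32 int8 weights plus one float16 scale (34 bytes per 32 weights).

```python
# Rough GGUF size estimate from parameter count and tensor type.
# Assumption: all tensors share one type; metadata overhead is ignored.

# Q8_0 block = 32 int8 weights + one float16 scale = 34 bytes per 32 weights.
BITS_PER_WEIGHT = {
    "F16": 16.0,
    "Q8_0": 34 * 8 / 32,  # 8.5 bits per weight
}


def estimate_size_gb(n_params: float, tensor_type: str) -> float:
    """Return the approximate file size in GB (10^9 bytes)."""
    return n_params * BITS_PER_WEIGHT[tensor_type] / 8 / 1e9


if __name__ == "__main__":
    files = [
        ("Dots.Ocr-1.8B-Q8_0.gguf", 1.7e9, "Q8_0"),
        ("Dots.Ocr-1.8B-F16.gguf", 1.7e9, "F16"),
        ("mmproj-Dots.Ocr-F16.gguf", 1.2e9, "F16"),
    ]
    for name, n_params, tensor_type in files:
        print(f"{name}: ~{estimate_size_gb(n_params, tensor_type):.1f} GB")
```

Under these assumptions the estimates come out close to the listed 1.8 GB, 3.4 GB, and 2.4 GB.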

Usage with llama.cpp

Note: Requires a llama.cpp build with dots.ocr support, which is pending upstream merge: https://github.com/ggml-org/llama.cpp/pull/19882/changes
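
As a minimal sketch of how the text model and the mmproj file are passed together, the helper below assembles a command line for llama.cpp's multimodal CLI, `llama-mtmd-cli`. It assumes that a build including the dots.ocr changes exposes the model through that tool with the usual `-m`, `--mmproj`, `--image`, and `-p` options; this has not been confirmed against the pending PR. The helper name `build_ocr_command`, the image path `page.png`, and the prompt text are placeholders invented for this example.

```python
import shlex
import subprocess  # only needed if you uncomment the run() call below


def build_ocr_command(
    image: str,
    prompt: str = "Extract the text from this image.",  # placeholder prompt
    model: str = "Dots.Ocr-1.8B-Q8_0.gguf",
    mmproj: str = "mmproj-Dots.Ocr-F16.gguf",
    binary: str = "llama-mtmd-cli",  # assumed to be on PATH
) -> list[str]:
    """Build the argument list: text model, vision projector, image, prompt."""
    return [
        binary,
        "-m", model,         # text model GGUF
        "--mmproj", mmproj,  # vision encoder GGUF
        "--image", image,    # input image
        "-p", prompt,        # instruction for the model
    ]


if __name__ == "__main__":
    cmd = build_ocr_command("page.png")  # hypothetical input file
    print(shlex.join(cmd))
    # Uncomment to actually run it once a compatible build is installed:
    # subprocess.run(cmd, check=True)
```

Both files are required: the text model alone cannot read images, and the mmproj file is not usable on its own.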
