dots.ocr GGUF
GGUF conversions of rednote-hilab/dots.ocr for use with llama.cpp.
Files
| File | Size | Description |
|---|---|---|
| Dots.Ocr-1.8B-Q8_0.gguf | 1.8 GB | Text model, 8-bit quantized |
| Dots.Ocr-1.8B-F16.gguf | 3.4 GB | Text model, float16 |
| mmproj-Dots.Ocr-F16.gguf | 2.4 GB | Vision encoder (mmproj), float16 |
Architecture
dots.ocr combines a Qwen2 text backbone (1.7B params, 28 layers) with a modified Qwen2-VL vision encoder (1.2B params, 42 layers).
Key differences from Qwen2-VL:
- Text model is standard Qwen2 with 1D RoPE (not M-RoPE)
- Vision uses RMSNorm, SiLU gated MLP, Conv2D patches, no attention bias
- 2D M-RoPE internal to vision encoder only
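As a rough sanity check, the file sizes in the table can be estimated from the parameter counts above. This is a minimal sketch, not a measurement: it assumes F16 costs 2 bytes per weight and that Q8_0 stores blocks of 32 int8 weights plus one float16 scale (34 bytes per 32 weights). It ignores GGUF metadata and any tensors kept at a different precision. The helper name `estimated_size_gb` is made up for illustration.

```python
# Rough size estimate: parameters (in billions) x bytes per weight.
# Assumption: Q8_0 uses 34 bytes per block of 32 weights (8.5 bits/weight);
# F16 uses 2 bytes per weight. Metadata and mixed-precision tensors are ignored.
BYTES_PER_WEIGHT = {"F16": 2.0, "Q8_0": 34 / 32}


def estimated_size_gb(params_billions: float, fmt: str) -> float:
    """Return an approximate file size in GB (10^9 bytes), rounded to 0.1."""
    return round(params_billions * BYTES_PER_WEIGHT[fmt], 1)


print(estimated_size_gb(1.7, "Q8_0"))  # text model, 8-bit   -> 1.8
print(estimated_size_gb(1.7, "F16"))   # text model, float16 -> 3.4
print(estimated_size_gb(1.2, "F16"))   # vision encoder      -> 2.4
```

Under these assumptions the estimates line up with the listed 1.8 GB, 3.4 GB, and 2.4 GB.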
Usage with llama.cpp
Note: Requires a llama.cpp build with dots.ocr support, which is still pending upstream merge: https://github.com/ggml-org/llama.cpp/pull/19882/changes
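Once such a build is available, an invocation might look like the sketch below. It assumes the build exposes llama.cpp's usual multimodal CLI, `llama-mtmd-cli`, with its `-m`, `--mmproj`, `--image`, and `-p` options. Whether the dots.ocr branch behaves exactly this way is not confirmed here. The image path `page.png`, the prompt text, and the helper `build_mtmd_command` are placeholders for illustration.

```python
import shlex


def build_mtmd_command(model, mmproj, image, prompt, binary="llama-mtmd-cli"):
    """Build an argument list pairing the text model with its vision encoder."""
    return [
        binary,
        "-m", model,         # text model GGUF (Q8_0 or F16)
        "--mmproj", mmproj,  # vision encoder GGUF
        "--image", image,    # input image (hypothetical path)
        "-p", prompt,        # instruction for the model (hypothetical prompt)
    ]


cmd = build_mtmd_command(
    "Dots.Ocr-1.8B-Q8_0.gguf",
    "mmproj-Dots.Ocr-F16.gguf",
    "page.png",
    "Extract the text from this image.",
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` when the binary is on `PATH`. Building the command as a list avoids shell-quoting problems with prompts that contain spaces.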
Model tree for anthonym21/dots.ocr-GGUF
Base model: rednote-hilab/dots.ocr