How to use PaddlePaddle/PP-DocLayoutV3 with PaddleOCR:
```python
# 1. See https://www.paddlepaddle.org.cn/en/install to install paddlepaddle
# 2. pip install paddleocr
from paddleocr import LayoutDetection

model = LayoutDetection(model_name="PP-DocLayoutV3")
output = model.predict(input="path/to/image.png", batch_size=1)
for res in output:
    res.print()
    res.save_to_img(save_path="./output/")
    res.save_to_json(save_path="./output/res.json")
```
PP-DocLayoutV3 Inference Benchmark: SafeTensor vs ONNX vs PaddlePaddle
#8
by andynoodles - opened
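I benchmarked all three inference backends for PP-DocLayoutV3 (SafeTensor/PyTorch, ONNX Runtime, and PaddlePaddle) on an NVIDIA RTX 5060 Ti (16 GB) under Linux with CUDA 13.
Each backend got 5 warmup runs followed by 50 timed runs, with preprocessing excluded from the timing and a GPU sync applied in every framework.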
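For readers who want to reproduce the protocol, here is a minimal sketch of a warmup-then-time loop. It is not the code from the linked repository; `benchmark`, `run_once`, and `sync` are hypothetical names, and the stand-in workload at the bottom only exists so the snippet runs without a GPU.

```python
import statistics
import time
from typing import Callable, Optional


def benchmark(
    run_once: Callable[[], object],
    sync: Optional[Callable[[], None]] = None,
    warmup: int = 5,
    runs: int = 50,
) -> dict:
    """Time `run_once` after a warmup phase; return summary statistics in ms."""

    def synced_call() -> None:
        run_once()
        if sync is not None:
            sync()  # wait for queued GPU work before the clock is read

    # Warmup runs are executed but never timed.
    for _ in range(warmup):
        synced_call()

    latencies_ms = []
    for _ in range(runs):
        if sync is not None:
            sync()  # make sure no earlier work is still in flight
        start = time.perf_counter()
        synced_call()
        latencies_ms.append((time.perf_counter() - start) * 1000.0)

    mean_ms = statistics.mean(latencies_ms)
    return {
        "runs": len(latencies_ms),
        "mean_ms": mean_ms,
        "stdev_ms": statistics.stdev(latencies_ms),
        "fps": 1000.0 / mean_ms if mean_ms > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Stand-in CPU workload; replace with a call into the backend under test.
    print(benchmark(lambda: sum(i * i for i in range(10_000))))
```

With PyTorch, `sync` would typically be `torch.cuda.synchronize`; the other backends need their own equivalent. Without a sync, the timer can stop before asynchronous GPU kernels have finished, which makes latencies look better than they are.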
| Metric | SafeTensor (PyTorch) | ONNX Runtime | PaddlePaddle |
|---|---|---|---|
| End-to-end (mean) | 41.7 ms | 55.4 ms | 64.3 ms |
| Throughput | 24.0 FPS | 18.1 FPS | 15.6 FPS |
| Latency stdev | 0.7 ms | 0.2 ms | 1.2 ms |
| RAM (total) | 2,634 MB | 3,213 MB | 3,844 MB |
| GPU memory (peak) | 1,534 MB | 2,062 MB | 1,658 MB |
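As a sanity check on the table above, the throughput row can be derived from the mean latency, and the ratio between columns gives the relative speed-up. The sketch below assumes one image per run processed sequentially (so FPS = 1000 / latency in ms); the function names are illustrative only.

```python
# Mean end-to-end latencies in milliseconds, copied from the table above.
mean_latency_ms = {
    "SafeTensor (PyTorch)": 41.7,
    "ONNX Runtime": 55.4,
    "PaddlePaddle": 64.3,
}


def throughput_fps(latency_ms: float) -> float:
    """Frames per second, assuming one image per run and no overlap between runs."""
    return 1000.0 / latency_ms


def slowdown(latency_ms: float, fastest_ms: float) -> float:
    """How many times slower a backend is than the fastest one."""
    return latency_ms / fastest_ms


fastest = min(mean_latency_ms, key=mean_latency_ms.get)
for name, ms in mean_latency_ms.items():
    ratio = slowdown(ms, mean_latency_ms[fastest])
    print(f"{name}: {throughput_fps(ms):.1f} FPS, {ratio:.2f}x the latency of {fastest}")
```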
All three backends produce 13 detections with matching labels and bounding boxes (scores differ by < 0.01).
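A cross-backend agreement check like the one described above could look roughly like this. It is a sketch under assumptions: the `Detection` type, the `(x1, y1, x2, y2)` box format, and the 1-pixel box tolerance are hypothetical choices of mine; only the 0.01 score tolerance comes from the numbers in this post.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Detection:
    label: str
    score: float
    box: Tuple[float, float, float, float]  # assumed (x1, y1, x2, y2) in pixels


def detections_agree(
    a: List[Detection],
    b: List[Detection],
    score_tol: float = 0.01,
    box_tol: float = 1.0,  # assumption: boxes may differ by up to one pixel
) -> bool:
    """True if every detection in `a` pairs with a distinct detection in `b`."""
    if len(a) != len(b):
        return False
    unmatched = list(b)
    for det in a:
        match = next(
            (
                cand
                for cand in unmatched
                if cand.label == det.label
                and abs(cand.score - det.score) < score_tol
                and all(abs(p - q) <= box_tol for p, q in zip(cand.box, det.box))
            ),
            None,
        )
        if match is None:
            return False
        unmatched.remove(match)  # each detection in `b` may be used only once
    return True


if __name__ == "__main__":
    torch_out = [Detection("text", 0.950, (10, 20, 200, 80))]
    onnx_out = [Detection("text", 0.945, (10, 20, 200, 80))]
    print(detections_agree(torch_out, onnx_out))
```

Matching by label and box rather than by list position keeps the check independent of the order in which each backend returns its detections.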
Key findings:
- SafeTensor/PyTorch is 1.3–1.5x faster end-to-end, even with post-processing outside the graph
- ONNX has the most consistent latency (0.2 ms stdev)
- Important: the ONNX model expects mean=[0,0,0], std=[1,1,1] (rescale by 1/255 only), NOT ImageNet normalization. Using ImageNet norm drops detections from 13 to 12
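To make the normalization point concrete, here is a small NumPy sketch contrasting the two preprocessing variants. It assumes the image is already resized to the model's input size and that the model takes a float32 NCHW tensor; both are assumptions to check against the repository, and the function names are mine.

```python
import numpy as np

# Standard ImageNet statistics, shown only for contrast.
IMAGENET_MEAN = np.array([0.485, 0.456, 0.406], dtype=np.float32)
IMAGENET_STD = np.array([0.229, 0.224, 0.225], dtype=np.float32)


def to_nchw(x_hwc: np.ndarray) -> np.ndarray:
    """Convert an HWC float array to a contiguous 1xCxHxW batch (assumed layout)."""
    return np.ascontiguousarray(np.transpose(x_hwc, (2, 0, 1))[np.newaxis, ...])


def preprocess_rescale_only(image_hwc_uint8: np.ndarray) -> np.ndarray:
    """Rescale by 1/255 only, i.e. mean=[0,0,0] and std=[1,1,1]."""
    return to_nchw(image_hwc_uint8.astype(np.float32) / 255.0)


def preprocess_imagenet(image_hwc_uint8: np.ndarray) -> np.ndarray:
    """ImageNet normalization; per the finding above, NOT what the ONNX model expects."""
    x = image_hwc_uint8.astype(np.float32) / 255.0
    return to_nchw((x - IMAGENET_MEAN) / IMAGENET_STD)


if __name__ == "__main__":
    dummy = np.full((4, 4, 3), 255, dtype=np.uint8)  # all-white placeholder image
    print(preprocess_rescale_only(dummy).max())  # stays within [0, 1]
    print(preprocess_imagenet(dummy).max())      # shifted and scaled past 1
```

The two variants feed the model values in different ranges, which is consistent with a detection disappearing when the wrong one is used.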
Full code, scripts, and methodology:
https://github.com/andynoodles/PPDocLayout-V3-Benchmark