Instructions to use echo840/MonkeyOCR with libraries, inference providers, notebooks, and local apps. Follow these links to get started.
- Libraries
- MonkeyOCR
How to use echo840/MonkeyOCR with MonkeyOCR:
```python
# No code snippets available yet for this library.
# To use this model, check the repository files and the library's documentation.
# Want to help? PRs adding snippets are welcome at:
# https://github.com/huggingface/huggingface.js
```
- Notebooks
- Google Colab
- Kaggle
Problems
It works fine, eventually, and the output quality is above average! A few suggestions, however...
You have an enormous number of dependencies. While this is partly unavoidable due to relying on magic, etc., try to reduce them drastically...
The program as a whole requires numpy no greater than 1.26.4. My traceback indicates this pin comes from the fast_langdetect library, which was archived last year and obviously won't be updated anymore. Try to work around it by vendoring their code into yours, or in some other creative way. You need numpy 2+ support.
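One possible interim workaround (a sketch, not the project's actual code) is to make the archived library optional, so its numpy<2 pin only matters when it is actually installed. The function name and the fallback language here are illustrative assumptions:

```python
# Hedged sketch: import fast_langdetect only if available, so the rest
# of the pipeline can run on NumPy 2+ without the archived dependency.
try:
    from fast_langdetect import detect  # archived upstream; pins numpy <= 1.26.4
except ImportError:
    detect = None

def detect_language(text: str, default: str = "en") -> str:
    """Return a language code for `text`, or `default` when the
    optional detector is not installed (illustrative helper)."""
    if detect is None:
        return default
    result = detect(text)  # fast_langdetect returns a dict with a 'lang' key
    return result.get("lang", default)
```

This degrades gracefully rather than fixing detection outright; fully vendoring the needed parts of fast_langdetect would be the more complete route.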
Recent warnings from transformers state that a video processor should now be saved to "video_preprocessor.json", but you're still using "preprocessor.json".
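A minimal migration sketch for repository maintainers, assuming the fix is simply renaming the saved config file (the helper name is hypothetical; the two file names come from the transformers warning):

```python
from pathlib import Path
import shutil

def migrate_video_processor_config(model_dir: str) -> Path:
    """Copy the legacy preprocessor.json to the video_preprocessor.json
    name that recent transformers versions expect (hypothetical helper)."""
    src = Path(model_dir) / "preprocessor.json"
    dst = Path(model_dir) / "video_preprocessor.json"
    if src.exists() and not dst.exists():
        shutil.copy(src, dst)  # keep the old file for backward compatibility
    return dst
```

Keeping the legacy file in place avoids breaking users on older transformers releases while silencing the warning on newer ones.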
The image processor within transformers is being called without the "use_fast" parameter. Best practice is to set use_fast=True explicitly.
I'll post more issues as I continue to experiment with this, but overall nice job.
Hello, thank you very much for your feedback on our model. We’ll definitely take these issues into consideration. However, our current focus is on releasing a better version of the model. You're very welcome to submit a PR to help address these problems in the meantime!