Excited to share that I've joined the Hugging Face Fellows program!
Looking forward to contributing to & working more closely with the open-source ecosystem - huge thanks to everyone who's supported me on this journey!
deepseek-ai/DeepSeek-OCR is out! 🔥 my take ⤵️
> pretty insane it can parse and re-render charts in HTML
> it uses CLIP and SAM features concatenated, so better grounding
> very good performance for the number of vision tokens it uses
> covers 100 languages
The purpose here is to get an idea of the profile of the models with the greatest impact in open source (we are not interested in closed models here!).
IBM just released a small Swiss Army knife for document models: granite-docling-258M on Hugging Face 🔥
> not only a document converter: it can also do document question answering and understands multiple languages 🤯
> best part: released with Apache 2.0 license, use it with your commercial projects!
> it supports transformers, vLLM and MLX from the get-go!
> built on SigLIP2 & granite-165M