ByT5 Dutch OCR Correction
This model is a fine-tuned ByT5 model that corrects OCR mistakes in Dutch sentences. It was obtained by fine-tuning google/byt5-base on the Dutch section of the OSCAR dataset.
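Because ByT5 operates on raw UTF-8 bytes rather than on a learned subword vocabulary, OCR noise such as a digit in place of a letter changes only a single input position. The sketch below imitates that byte-level encoding in plain Python. It assumes the standard ByT5 convention of three reserved ids (0 for padding, 1 for end of sequence, 2 for unknown) followed by the 256 byte values; `byt5_style_ids` is a hypothetical helper for illustration, not part of the model's code.

```python
from typing import List

# Assumed ByT5 convention: ids 0..2 are reserved (pad, eos, unk),
# so each UTF-8 byte value b is mapped to id b + 3.
SPECIAL_TOKEN_OFFSET = 3
EOS_ID = 1


def byt5_style_ids(text: str) -> List[int]:
    """Hypothetical helper: mimic byte-level ids for `text`, ending with EOS."""
    return [b + SPECIAL_TOKEN_OFFSET for b in text.encode("utf-8")] + [EOS_ID]


clean = byt5_style_ids("basis")
noisy = byt5_style_ids("ba8i8")

# Count the positions where the OCR-damaged word differs from the clean one.
differing = sum(1 for a, b in zip(clean, noisy) if a != b)
print(len(clean), len(noisy), differing)
```

Both words have the same length in bytes, and only the two substituted characters produce different ids.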
Usage
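from transformers import AutoTokenizer, T5ForConditionalGeneration

# A Dutch sentence containing typical OCR errors
example_sentence = "Ben algoritme dat op ba8i8 van kunstmatige inte11i9entie vkijwel geautomatiseerd een tekst herstelt met OCR fuuten."

# Load the tokenizer and the fine-tuned model from the Hugging Face Hub
tokenizer = AutoTokenizer.from_pretrained('ml6team/byt5-base-dutch-ocr-correction')
model = T5ForConditionalGeneration.from_pretrained('ml6team/byt5-base-dutch-ocr-correction')

# ByT5 tokenizes at the byte level, so max_length=128 caps the input at roughly 128 bytes
model_inputs = tokenizer(example_sentence, max_length=128, truncation=True, return_tensors="pt")
outputs = model.generate(**model_inputs, max_length=128)

# Decode the corrected sentence, dropping padding and end-of-sequence tokens
print(tokenizer.decode(outputs[0], skip_special_tokens=True))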
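The usage snippet truncates input at `max_length=128`, which for a byte-level tokenizer means roughly 128 bytes, so longer text would be cut off. One possible workaround, sketched here under stated assumptions, is to split the text into chunks before correction. `split_into_chunks` is a hypothetical helper, not part of this model or of Transformers, and the default of 127 bytes assumes the tokenizer reserves one position for the end-of-sequence token.

```python
from typing import List


def split_into_chunks(text: str, max_bytes: int = 127) -> List[str]:
    """Hypothetical helper: greedily pack whitespace-separated words into
    chunks whose UTF-8 encoding is at most `max_bytes` bytes long."""
    chunks: List[str] = []
    current = ""
    for word in text.split():
        candidate = f"{current} {word}" if current else word
        if len(candidate.encode("utf-8")) <= max_bytes:
            current = candidate
        else:
            if current:
                chunks.append(current)
            # A single word longer than max_bytes is kept whole here;
            # the tokenizer would still truncate it.
            current = word
    if current:
        chunks.append(current)
    return chunks


long_text = "Ben algoritme dat op ba8i8 van kunstmatige inte11i9entie werkt. " * 5
for chunk in split_into_chunks(long_text):
    print(len(chunk.encode("utf-8")), chunk)
```

Each chunk could then be passed through the tokenizer and `model.generate` call from the usage snippet, and the decoded outputs joined with spaces. Splitting on whitespace ignores sentence boundaries, so correction quality near chunk edges may differ from that on whole sentences.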