In a Training Loop 🔄
Stefan Schweter (stefan-it) · PRO
3,708 followers · 391 following
https://schweter.bayern · stefan-it
AI & ML interests
Flair Library 💕, NER & PoS Tagging, LM Pretraining (mostly encoder-only & encoder-decoder), Historical Language Models, German Language Models, Bavarian NLP 🥨
Recent Activity
upvoted a collection · 1 day ago: 🤏 Smol-Data
reacted to hannayukhymenko's post with 🔥 · 1 day ago:
Do you translate your benchmarks from English correctly? 🤔 Turns out, for many languages it is much harder than you might imagine! Introducing Recovered in Translation 🌍 together with @aalexandrov: ritranslation.insait.ai

Translating benchmarks is a painful process that requires a lot of manual inspection and adjustment. You start by setting up the whole pipeline and adapting it to every format type, including task specifics. Some massive translated benchmarks already exist, but they still contain simple (and sometimes silly) bugs that can hurt evaluations :( We present a novel automated translation framework to help with that!

Eastern and Southern European languages have richer linguistic structures than English, and for benchmarks that rely heavily on grammatical coherence, machine translation risks harming evaluations. We discovered potential answer leakage, as well as cases where the grammatical structure of the questions misleads models. Some benchmarks are also simply outdated and need to be retranslated with newer, better models.

Our framework includes novel test-time scaling methods that let you control time and cost investments while mitigating the need for human-in-the-loop verification. While working on the Ukrainian-focused MamayLM models, we had to translate 10+ benchmarks in a short span of time. Finding human evaluators is costly and time-consuming, and the same goes for professional translators. With our pipeline we were able to do it in 3 days 🏎️

We hope our findings will help enable stronger multilingual evaluations and development. We release all produced benchmarks on Hugging Face together with the source code and arXiv paper 🤗

Paper: https://huggingface.co/papers/2602.22207
Code: https://github.com/insait-institute/ritranslation
Benchmarks: https://huggingface.co/collections/INSAIT-Institute/multilingual-benchmarks
reacted to hannayukhymenko's post with ❤️ · 1 day ago
stefan-it's activity
liked 2 datasets · 6 days ago
- windprak/steuerllm_instruct_dataset · Preview · Updated 19 days ago · 41 · 1
- castorini/NanoKnow-Fineweb-Edu-Index · Updated 6 days ago · 1.37k · 2

liked a dataset · 7 days ago
- BabyLM-community/babylm-deu · Viewer · Updated Oct 15, 2025 · 36.6k · 53 · 2

liked a dataset · 11 days ago
- Eurolingua/HPLT3_DE_0.9_Quantile_Adult_Filtered · Viewer · Updated 11 days ago · 9.99M · 28 · 1

liked a dataset · 12 days ago
- turkish-nlp-suite/BellaTurca · Viewer · Updated 12 days ago · 53.9M · 1.04k · 10

liked 3 datasets · 13 days ago
- sentence-transformers/s2orc · Viewer · Updated May 6, 2024 · 132M · 1.52k · 16
- openeurollm/propella-annotations · Viewer · Updated about 2 hours ago · 5.85B · 9.59k · 13
- scrapegraphai/scrapegraphai-100k · Viewer · Updated Dec 21, 2025 · 93.7k · 71 · 23

liked a model · 19 days ago
- windprak/open_steuerllm · Text Generation · 28B · Updated 15 days ago · 33 · 2

liked a dataset · 25 days ago
- utter-project/EuroBlocks-SFT-2512 · Viewer · Updated 26 days ago · 1.09M · 706 · 17

liked a dataset · 26 days ago
- fineinstructions/fineinstructions_nemotron · Viewer · Updated Jan 30 · 1.23B · 2.71k · 4

liked a dataset · about 1 month ago
- fineinstructions/finetemplates · Viewer · Updated Jan 30 · 18.6M · 266 · 2

liked a Space · about 1 month ago
- OCR Dataset Generator 📝 · Running · Generate synthetic OCR datasets for low-resource languages

liked a model · about 2 months ago
- nvidia/Nemotron-Orchestrator-8B · Text Generation · Updated Dec 2, 2025 · 16k · 558

liked a dataset · about 2 months ago
- HuggingFaceFW/finetranslations · Viewer · Updated Jan 9 · 3.33B · 33.1k · 272

liked a dataset · 2 months ago
- bltlab/open-ner-standardized · Viewer · Updated Dec 19, 2025 · 831k · 357 · 2

liked a dataset · 3 months ago
- minilingua-ai/mcqa-minilingua-sft · Viewer · Updated Jul 27, 2025 · 17.2k · 86 · 1

liked 3 models · 3 months ago
- minilingua-ai/MiniLingua-1b · Updated Dec 27, 2025 · 108 · 2
- Cognitive-Lab/NetraEmbed · Visual Document Retrieval · 4B · Updated Dec 10, 2025 · 383 · 24
- Cognitive-Lab/ColNetraEmbed · Visual Document Retrieval · Updated Dec 10, 2025 · 384 · 4