Training Data
#1, opened by ayymen
Congratulations on this work 👏
Do you have any plans to expand the training data, for example by adding the HPLT v3 dataset?
Hello ayymen, thank you for your input! Definitely, expanding the training data is the main goal for v2. I'm looking into HPLT v3 and other crawled corpora to improve the model's coherence. If you have any suggestions or want to contribute, feel free to share. Well done to you too.
I would love to collaborate and contribute. I invite you to join us at Tamazight NLP! We would love to have more people working on language models for Tamazight; it really is the logical next step for the language.