Update README.md
README.md
CHANGED
@@ -13,7 +13,7 @@ base_model:
 
 Tiny Language Model For Japanese and English Bidirectional Translation
 
-- **Purrs on your lap** 🐱: Small and efficient! 0.8-
+- **Purrs on your lap** 🐱: Small and efficient! 0.8-7B models that run on edge devices.
 - **Swift and Feline Sharp** 🐾: Beats TranslateGemma-12B on text-to-text translation quality.
 - **Adopt and adapt** 🐈: Open source (MIT License) models you can customize and extend.
 
@@ -28,6 +28,7 @@ All models are available on Hugging Face:
 - [CAT-Translate-0.8B](https://huggingface.co/cyberagent/CAT-Translate-0.8b/)
 - [CAT-Translate-1.4B](https://huggingface.co/cyberagent/CAT-Translate-1.4b/)
 - [CAT-Translate-3.3B](https://huggingface.co/cyberagent/CAT-Translate-3.3b/)
+- [CAT-Translate-7B](https://huggingface.co/cyberagent/CAT-Translate-7b/)
 
 ## Evaluation
 
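As a quick-start reference, here is a minimal sketch of running one of the checkpoints above with Hugging Face `transformers`. It assumes the models load as standard causal LMs with a chat template; the prompt wording is hypothetical, so check each model card for the exact translation prompt format:

```python
# Minimal sketch, assuming the CAT-Translate checkpoints load as standard
# causal LMs with a chat template; the prompt wording below is hypothetical --
# see the model card of each release for the documented prompt format.
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "cyberagent/CAT-Translate-0.8b"  # any of the sizes listed above
tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(model_id)

messages = [
    {"role": "user", "content": "Translate the following English text to Japanese: The weather is nice today."}
]
input_ids = tokenizer.apply_chat_template(
    messages, add_generation_prompt=True, return_tensors="pt"
)
output_ids = model.generate(input_ids, max_new_tokens=128)

# Decode only the generated continuation, not the prompt.
print(tokenizer.decode(output_ids[0][input_ids.shape[-1]:], skip_special_tokens=True))
```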
@@ -44,22 +45,24 @@ We conducted evaluation on the translation subsets of the following benchmarks:
 We chose these tasks as benchmarks because (1) they are derived from real-world applications and (2) they are less overoptimized than popular datasets (e.g., WMT).
 
 The results are below.
-
-The 0.8B, 1.4B, and 3.3B-beta models achieved the best scores among all models (including closed source) within their respective sizes for both En-Ja and Ja-En translation tasks.
+All of our models achieved the best scores among all evaluated models (including closed-source ones) within their respective size classes for both En-Ja and Ja-En translation tasks.
 
 
 | Model | Avg. BLEU | Avg. BLEU Ja->En | Avg. BLEU En->Ja | BSD (Ja-En) | Court (Ja-En) | JMed (Ja-En) | PFMT (Ja-En) | wat-pat-2025 (Ja-En) | BSD (En-Ja) | JMed (En-Ja) | PFMT (En-Ja) | wat-pat-2025 (En-Ja) |
 |:-------------------------------------------------|----------:|-----------------:|-----------------:|------------:|--------------:|-------------:|-------------:|------------------:|------------:|-------------:|-------------:|------------------:|
+| CyberAgent/CAT-Translate-7B | 37.68 | 41.06 | 34.31 | 33.75 | 45.29 | 30.65 | 49.86 | 45.74 | 16.29 | 29.62 | 52.94 | 38.37 |
+| CyberAgent/CAT-Translate-3.3B | 36.16 | 37.51 | 34.80 | 26.51 | 42.44 | 24.47 | 49.93 | 44.23 | 17.21 | 28.67 | 53.88 | 39.44 |
 | CyberAgent/CAT-Translate-1.4B | 33.73 | 33.26 | 34.19 | 31.28 | 43.84 | 24.08 | 36.55 | 30.57 | 15.71 | 26.92 | 51.53 | 42.58 |
 | Unbabel/Tower-Plus-9B | 32.41 | 36.84 | 27.99 | 15.43 | 40.54 | 29.13 | 58.00 | 41.10 | 10.00 | 18.80 | 53.00 | 30.16 |
 | google/translategemma-12b-it | 32.24 | 35.81 | 28.68 | 31.58 | 34.30 | 23.46 | 48.75 | 40.97 | 15.92 | 21.79 | 52.53 | 24.47 |
-| CyberAgent/CAT-Translate-3.3B
+| CyberAgent/CAT-Translate-3.3B-beta | 30.60 | 30.32 | 30.88 | 17.20 | 38.65 | 23.96 | 40.58 | 31.22 | 16.63 | 26.68 | 53.40 | 26.80 |
 | CyberAgent/CAT-Translate-0.8B | 30.42 | 29.71 | 30.68 | 29.63 | 33.19 | 22.96 | 32.51 | 30.56 | 14.60 | 26.22 | 50.62 | 32.87 |
 | google/translategemma-4b-it | 28.09 | 29.41 | 26.76 | 28.86 | 25.89 | 21.50 | 42.65 | 28.16 | 14.14 | 20.68 | 51.99 | 20.23 |
 | LiquidAI/LFM2.5-1.2B-JP | 25.47 | 24.51 | 26.43 | 19.06 | 29.99 | 22.10 | 43.61 | 7.80 | 14.57 | 23.85 | 54.77 | 12.54 |
 | pfnet/plamo-2-translate | 25.24 | 25.92 | 24.57 | 25.55 | 28.63 | 22.90 | 29.02 | 23.48 | 17.35 | 24.98 | 32.04 | 23.89 |
 | LiquidAI/LFM2-350M-ENJP-MT | 24.95 | 24.91 | 25.00 | 10.94 | 29.56 | 21.48 | 41.40 | 21.17 | 8.11 | 22.84 | 47.53 | 21.52 |
 | mistralai/Ministral-8B-Instruct-2410 | 24.12 | 27.52 | 20.71 | 19.23 | 29.21 | 16.25 | 50.23 | 22.69 | 12.91 | 16.49 | 41.66 | 11.80 |
+| nvidia/NVIDIA-Nemotron-Nano-9B-v2-Japanese | 22.97 | 22.77 | 23.18 | 9.62 | 34.98 | 18.01 | 38.44 | 12.81 | 10.62 | 20.41 | 42.55 | 19.13 |
 | Rakuten/RakutenAI-2.0-mini-instruct | 18.43 | 17.24 | 19.62 | 0.11 | 30.62 | 18.21 | 29.34 | 7.90 | 5.19 | 20.36 | 45.70 | 7.23 |
 | SakanaAI/TinySwallow-1.5B-Instruct | 15.74 | 14.99 | 16.49 | 4.96 | 18.93 | 15.83 | 26.67 | 8.58 | 6.30 | 17.58 | 34.07 | 8.00 |
 | llm-jp/llm-jp-3.1-1.8b-instruct4 | 15.18 | 16.26 | 14.11 | 18.82 | 2.44 | 15.67 | 30.65 | 13.72 | 15.38 | 4.91 | 25.47 | 10.65 |