Synthetic baselines trained for our paper "Scaling Low-Resource MT via Synthetic Data Generation with LLMs", accepted to the main conference of EMNLP 2025.
AI & ML interests
At the University of Helsinki, we focus on:
- NLP for morphologically-rich languages
- Cross-lingual NLP
- NLP in the humanities
Organization Card
Helsinki-NLP is the language technology research group at the University of Helsinki. Here, we publish various resources related to multilingual NLP, machine translation, and text simplification, to name a few application areas. We focus on wide language coverage, open data sets, and public pre-trained models.
models 1,537
Helsinki-NLP/opus-mt-eo-caenes • Translation • 76.9M • 2 • 1
Helsinki-NLP/opus-mt-caenes-eo • Translation • 76.9M • 1
Helsinki-NLP/opus-mt-fr-en • Translation • 75.2M • 717k • 50
Helsinki-NLP/opus-mt-synthetic-en-eu • 62 • 1
Helsinki-NLP/opus-mt-synthetic-en-mk • 70
Helsinki-NLP/opus-mt-synthetic-en-ka • 110
Helsinki-NLP/opus-mt-synthetic-en-so • 77 • 1
Helsinki-NLP/opus-mt-synthetic-en-is • 73 • 1
Helsinki-NLP/opus-mt-synthetic-en-uk • 75
Helsinki-NLP/opus-mt-synthetic-en-gd • 82
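The model names listed above appear to share one pattern: `opus-mt-`, an optional variant tag such as `synthetic`, then a source code and a target code. The sketch below splits a repository ID along that pattern. The pattern is an assumption inferred only from the names shown here, and `OpusMtName` and `parse_opus_mt_id` are hypothetical helpers, not part of any Helsinki-NLP or Hugging Face library. IDs that do not fit the pattern raise a `ValueError` rather than being guessed at.

```python
from typing import NamedTuple, Optional


class OpusMtName(NamedTuple):
    """Parts of a repository ID, under the assumed naming pattern."""
    org: str
    variant: Optional[str]  # e.g. "synthetic"; None when no variant tag is present
    source: str             # source-side code, kept as an opaque string
    target: str             # target-side code, kept as an opaque string


def parse_opus_mt_id(repo_id: str) -> OpusMtName:
    """Split 'org/opus-mt-[variant-]src-tgt' into its parts.

    Assumes the pattern seen in the listing above; anything else is rejected.
    """
    org, _, name = repo_id.partition("/")
    prefix = "opus-mt-"
    if not org or not name.startswith(prefix):
        raise ValueError(f"not an opus-mt repository ID: {repo_id!r}")

    parts = name[len(prefix):].split("-")
    if len(parts) == 2:
        source, target = parts
        return OpusMtName(org, None, source, target)
    if len(parts) == 3:
        variant, source, target = parts
        return OpusMtName(org, variant, source, target)
    raise ValueError(f"unexpected name layout: {repo_id!r}")


if __name__ == "__main__":
    for repo in (
        "Helsinki-NLP/opus-mt-fr-en",
        "Helsinki-NLP/opus-mt-synthetic-en-eu",
    ):
        print(parse_opus_mt_id(repo))
```

The codes are returned as plain strings on purpose: a code such as `caenes` is passed through unchanged instead of being interpreted.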
datasets 52
Helsinki-NLP/nemotron-cc-translated • Viewer • 5.79B • 14k • 2
Helsinki-NLP/shroom-cap • Preview • 36 • 1
Helsinki-NLP/fineweb-edu-translated • Preview • 294k • 4
Helsinki-NLP/OpenSubtitles2024 • Viewer • 570M • 180 • 3
Helsinki-NLP/shroom • Preview • 10
Helsinki-NLP/mu-shroom • Viewer • 11.5k • 162 • 4
Helsinki-NLP/tatoeba_mt_train • Viewer • 13.7B • 112 • 5
Helsinki-NLP/tatoeba_mt • 2.74k • 61
Helsinki-NLP/un_pc • Viewer • 323M • 2.35k • 26
Helsinki-NLP/un_ga • Viewer • 1.11M • 362 • 3