Latvian DeBERTaV3 encoder model trained with a replaced token detection (RTD) objective, released with the paper "Pretraining and Benchmarking Modern Encoders for Latvian".
For evaluation code and benchmark results, see: https://github.com/LUMII-AILab/latvian-encoders
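As a rough illustration of the replaced token detection (RTD) objective named above: in ELECTRA-style training, some input tokens are swapped for plausible alternatives and the encoder learns to classify every position as original or replaced. The sketch below only shows how such per-token labels can be derived. The helper name `rtd_labels` and the toy sentence are assumptions for illustration, not code from the paper or its repository.

```python
from typing import List, Sequence


def rtd_labels(original: Sequence[str], corrupted: Sequence[str]) -> List[int]:
    """Return 1 where a token was replaced and 0 where it is unchanged.

    Hypothetical helper: a position whose replacement happens to equal the
    original token counts as "original" (label 0), as in ELECTRA-style RTD.
    """
    if len(original) != len(corrupted):
        raise ValueError("sequences must have the same length")
    return [int(o != c) for o, c in zip(original, corrupted)]


# Toy example; in real pretraining a small generator model proposes the
# replacements, here one token is swapped by hand.
original = ["Rīga", "ir", "Latvijas", "galvaspilsēta"]
corrupted = ["Rīga", "ir", "Igaunijas", "galvaspilsēta"]

labels = rtd_labels(original, corrupted)
print(labels)  # [0, 0, 1, 0]
```

The discriminator is then trained with a binary classification loss over these labels at every position, so every token contributes to the training signal rather than only the masked ones.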
@inproceedings{znotins-2026-pretraining,
    title = "Pretraining and Benchmarking Modern Encoders for {L}atvian",
    author = "Znotins, Arturs",
    editor = "Hettiarachchi, Hansi and
      Ranasinghe, Tharindu and
      Plum, Alistair and
      Rayson, Paul and
      Mitkov, Ruslan and
      Gaber, Mohamed and
      Premasiri, Damith and
      Tan, Fiona Anting and
      Uyangodage, Lasitha",
    booktitle = "Proceedings of the Second Workshop on Language Models for Low-Resource Languages ({L}o{R}es{LM} 2026)",
    month = mar,
    year = "2026",
    address = "Rabat, Morocco",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/2026.loreslm-1.40/",
    pages = "461--470",
    ISBN = "979-8-89176-377-7"
}