lightonai/modernbert-embed-large Sentence Similarity • 0.4B • Updated May 14, 2025 • 8.53k • 33
Article Provence: efficient and robust context pruning for retrieval-augmented generation • Jan 28, 2025 • 26
mirth/chonky_distilbert_base_uncased_1 Token Classification • 66.4M • Updated Jan 17 • 223k • 15
mirth/chonky_mmbert_small_multilingual_1 Token Classification • 0.1B • Updated Jan 17 • 186 • 23
google/embeddinggemma-300m Sentence Similarity • 0.3B • Updated Sep 25, 2025 • 1.31M • 1.62k
mamei16/chonky_distilbert-base-multilingual-cased Token Classification • 0.1B • Updated Nov 14, 2025 • 336 • 4
Text chunking / splitting models Collection Intelligently segments text into meaningful semantic chunks. Could be useful in RAG systems as a text-chunking module. • 4 items • Updated Jan 24 • 1