avsolatorio/GIST-small-Embedding-v0 • Sentence Similarity • 33.4M • Updated Apr 17, 2025 • 98.8k • 33
Article • Releasing Common Corpus: the largest public domain dataset for training LLMs • Pclanglais • Mar 20, 2024 • 32