AraMix is a SOTA Arabic pretraining dataset
Sultan Alrashed
SultanR
AI & ML interests
Smol language modelling and Arabic!
Recent Activity
upvoted a paper about 11 hours ago
SmolKalam: Ensemble Quality-Filtered Translation at Scale for High Quality Arabic Post-Training Data
updated a collection about 11 hours ago
SmolKalam
updated a collection about 11 hours ago
SmolKalam