NeoAraBERT: A Modern Foundation Model for Arabic Embeddings with Diacritics-Aware Tokenization and POS-Targeted Masking
AI & ML interests
ML, NLP, CL for Arabic
Organization Card
Unit For Research Studies in Arabic and Social Digital Spaces
models 11
U4RASD/NeoAraBERT_DA
Feature Extraction • 0.3B • Updated • 101 • 4
U4RASD/NeoAraBERT_MSA
Feature Extraction • 0.3B • Updated • 110 • 4
U4RASD/NeoAraBERT
Feature Extraction • 0.3B • Updated • 217 • 5
U4RASD/AREEj
0.6B • Updated • 18 • 1
U4RASD/ar-ms-baseline
Text Generation • Updated • 3 • 1
U4RASD/dalla-model-training
Updated
U4RASD/dalla-gemma-it
9B • Updated • 340
U4RASD/dalla-llama-it
8B • Updated
U4RASD/ArATTC
Text Classification • 0.1B • Updated • 12
U4RASD/ArGTC
Text Classification • Updated • 11
datasets 9
U4RASD/Muradif
Viewer • Updated • 38.6k • 57 • 4
U4RASD/omar-al-saleh-manuscripts-full
Viewer • Updated • 22 • 13 • 2
U4RASD/omar-al-saleh-manuscripts-segments
Viewer • Updated • 20.7k • 12
U4RASD/Masrad
Viewer • Updated • 19.4k • 9 • 1
U4RASD/curriculum_books_sft
Viewer • Updated • 1.36k • 6
U4RASD/curriculum_books_cpt
Viewer • Updated • 887 • 43
U4RASD/ArSRED
Viewer • Updated • 500k • 12 • 1
U4RASD/ArTopicDS-Books
Viewer • Updated • 21.2k • 11 • 2
U4RASD/ArBNTopic
Viewer • Updated • 19.8k • 10 • 3