Lusaka Language Analysis Model
The Lusaka Language Analysis model is a multilingual sentiment classification model fine-tuned from google-bert/bert-base-multilingual-cased (mBERT). It is built specifically for Zambian linguistic contexts, with a focus on:
- Zambian English (Lusaka variety)
- Bemba
- Nyanja (Chichewa)
The model is optimized to recognize mixed-language usage, local idioms, indirect expressions, and contextual sarcasm commonly found in everyday Zambian communication and social media discourse.
Task
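from transformers import pipeline

# Assumed setup: the original snippet does not show how `classifier` is created.
# A standard text-classification pipeline pointed at this model's Hub ID is used here.
classifier = pipeline(
    "text-classification",
    model="Kelvinmbewe/mbert_Lusaka_Language_Analysis",
)

def classify_text(text):
    """
    Run inference on a single text input using the fine-tuned LusakaLang model.
    Returns the predicted label and confidence score.
    """
    result = classifier(text)[0]  # the pipeline returns one dict per input
    label = result["label"]
    score = round(result["score"], 4)
    return label, score

samples = [
    "Muli shani bane, nalishiba bwino.",
    "How are you doing today?",
    "Tili bwino, zikomo kwambiri.",
]

for s in samples:
    label, score = classify_text(s)
    print(f"Text: {s}\nPrediction: {label} (confidence={score})\n")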
Sample Output
Text: Muli shani bane, nalishiba bwino.
Prediction: Bemba (confidence=0.9821)
Text: How are you doing today?
Prediction: English (confidence=0.9954)
Text: Tili bwino, zikomo kwambiri.
Prediction: Nyanja (confidence=0.9736)
Language Graph
Note: The "unknown" language label here represents mixed-language text that combines English, Bemba, and Nyanja to varying degrees, e.g. "GPS yenze nama issues so it made me delay my journey kwati nibamudala."
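The per-language shares behind a graph like this can be tallied from individual predictions. The following is a minimal sketch, not the author's plotting code: it assumes predictions are (label, score) tuples shaped like the return value of classify_text, and the helper name `language_distribution`, the example scores, and the exact label string "Unknown" are illustrative assumptions.

```python
from collections import Counter


def language_distribution(predictions):
    """Return {label: share of texts} for a list of (label, score) predictions."""
    counts = Counter(label for label, _ in predictions)
    total = sum(counts.values())
    if total == 0:
        return {}
    # most_common() orders labels from most to least frequent
    return {label: count / total for label, count in counts.most_common()}


# Hypothetical predictions, shaped like the output of classify_text.
example_predictions = [
    ("Bemba", 0.98),
    ("English", 0.99),
    ("Nyanja", 0.97),
    ("Unknown", 0.91),  # assumed label name for mixed-language text
]

for label, share in language_distribution(example_predictions).items():
    print(f"{label}: {share:.0%}")
```

The scores are ignored here on purpose: the tally counts only the predicted label per text, so low-confidence predictions weigh the same as confident ones.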
Classification Report
Confusion Matrix
Word Cloud
Evaluation results
- Accuracy on LusakaLang Test Set (self-reported): 0.997
- Precision on LusakaLang Test Set (self-reported): 0.997
- Recall on LusakaLang Test Set (self-reported): 0.997
- F1 on LusakaLang Test Set (self-reported): 0.998
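
Accuracy, precision, recall, and F1 of this kind are typically derived from a confusion matrix. The sketch below shows that derivation in plain Python. The function names and the toy matrix are hypothetical and are not the model's actual evaluation data; how the reported figures were averaged across classes is not stated in this card.

```python
def accuracy(matrix):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


def per_class_metrics(matrix, labels):
    """Precision, recall and F1 per class.

    matrix[i][j] is the number of samples whose true class is labels[i]
    and whose predicted class is labels[j].
    """
    metrics = {}
    for k, label in enumerate(labels):
        tp = matrix[k][k]
        predicted_k = sum(row[k] for row in matrix)  # column sum
        actual_k = sum(matrix[k])                    # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        metrics[label] = {"precision": precision, "recall": recall, "f1": f1}
    return metrics


# Toy counts for illustration only (NOT results from the LusakaLang test set).
labels = ["Bemba", "English", "Nyanja"]
toy_matrix = [
    [9, 1, 0],
    [0, 10, 0],
    [1, 0, 9],
]

print(f"accuracy: {accuracy(toy_matrix):.3f}")
for label, m in per_class_metrics(toy_matrix, labels).items():
    print(f"{label}: P={m['precision']:.3f} R={m['recall']:.3f} F1={m['f1']:.3f}")
```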



