Lusaka Language Analysis Model

Lusaka Language Analysis is a multilingual sentiment classification model fine-tuned from google-bert/bert-base-multilingual-cased (mBERT). It is built specifically for Zambian linguistic contexts, with a focus on:

Zambian English (Lusaka variety)
Bemba
Nyanja (Chichewa)

The model is optimized to recognize mixed-language usage, local idioms, indirect expressions, and contextual sarcasm commonly found in everyday Zambian communication and social media discourse.

Task

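from transformers import pipeline

# Placeholder, not the real identifier: replace with the model's
# Hugging Face repo ID or a local checkpoint path.
MODEL_ID = "path/to/lusaka-language-analysis"
classifier = pipeline("text-classification", model=MODEL_ID)

def classify_text(text):
    """
    Run inference on a single text input using the fine-tuned LusakaLang model.
    Returns the predicted label and confidence score.
    """
    result = classifier(text)[0]  # the pipeline returns one dict per input
    label = result["label"]
    score = round(result["score"], 4)
    return label, score

# Example inputs in Bemba, English, and Nyanja
samples = [
    "Muli shani bane, nalishiba bwino.",
    "How are you doing today?",
    "Tili bwino, zikomo kwambiri."
]

for s in samples:
    label, score = classify_text(s)
    print(f"Text: {s}\nPrediction: {label} (confidence={score})\n")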

Sample Output

Text: Muli shani bane, nalishiba bwino.
Prediction: Bemba (confidence=0.9821)

Text: How are you doing today?
Prediction: English (confidence=0.9954)

Text: Tili bwino, zikomo kwambiri.
Prediction: Nyanja (confidence=0.9736)
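
The confidence values above come from the score field of the pipeline result. The following is a minimal sketch of how such a score can arise, assuming the pipeline applies a softmax over per-label logits (the usual default for single-label text classification in transformers). The label order and logit values are made up for illustration and do not come from the model.

```python
import math

def softmax(logits):
    """Convert raw logits into probabilities that sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical label order and logits, for illustration only.
labels = ["Bemba", "English", "Nyanja"]
logits = [4.2, 0.1, -0.3]

probs = softmax(logits)
best = max(range(len(labels)), key=lambda i: probs[i])
print(f"Prediction: {labels[best]} (confidence={round(probs[best], 4)})")
```

The largest logit always wins, and the reported confidence grows as the gap between it and the other logits widens.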

Language Graph

Note: The "unknown" language here represents mixed-language text that combines English, Bemba, and Nyanja to varying degrees, e.g. "GPS yenze nama issues so it made me delay my journey kwati nibamudala."
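
One way a graph of predicted languages could be produced is by tallying labels over a set of predictions. The sketch below assumes predictions arrive as (label, score) pairs like those returned by classify_text. The label string "unknown" and the sample predictions are assumptions for illustration, not actual model output.

```python
from collections import Counter

def label_distribution(predictions):
    """Return each label's share of the predictions as a fraction of the total."""
    counts = Counter(label for label, _score in predictions)
    total = sum(counts.values())
    if total == 0:
        return {}  # no predictions, so there is nothing to summarize
    return {label: count / total for label, count in counts.items()}

# Made-up (label, score) pairs, for illustration only.
predictions = [
    ("Bemba", 0.98),
    ("English", 0.99),
    ("Nyanja", 0.97),
    ("unknown", 0.81),
]

distribution = label_distribution(predictions)
for label, share in sorted(distribution.items()):
    print(f"{label}: {share:.0%}")
```

The resulting dictionary can be passed to any plotting library to draw a bar chart of language shares.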