Lusaka Language Analysis Model

The Lusaka Language Analysis model is a multilingual sentiment classification model fine-tuned from google-bert/bert-base-multilingual-cased (mBERT). It is built specifically for Zambian linguistic contexts, with a focus on:

  • Zambian English (Lusaka variety)
  • Bemba
  • Nyanja (Chichewa)

The model is optimized to recognize mixed-language usage, local idioms, indirect expressions, and contextual sarcasm commonly found in everyday Zambian communication and social media discourse.


Task

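from transformers import pipeline

# Load the fine-tuned model from the Hugging Face Hub as a text-classification pipeline.
classifier = pipeline(
    "text-classification",
    model="Kelvinmbewe/mbert_Lusaka_Language_Analysis",
)


def classify_text(text):
    """
    Run inference on a single text input using the fine-tuned LusakaLang model.
    Returns the predicted label and confidence score.
    """
    result = classifier(text)[0]  # the pipeline returns one dict per input text
    label = result["label"]
    score = round(result["score"], 4)
    return label, score


samples = [
    "Muli shani bane, nalishiba bwino.",
    "How are you doing today?",
    "Tili bwino, zikomo kwambiri.",
]

for s in samples:
    label, score = classify_text(s)
    print(f"Text: {s}\nPrediction: {label} (confidence={score})\n")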

Sample Output

Text: Muli shani bane, nalishiba bwino.
Prediction: Bemba (confidence=0.9821)

Text: How are you doing today?
Prediction: English (confidence=0.9954)

Text: Tili bwino, zikomo kwambiri.
Prediction: Nyanja (confidence=0.9736)

Language Graph

[Figure: language graph]

Note: The "unknown" language here represents mixed-language text that combines English, Bemba, and Nyanja to varying degrees, e.g. "GPS yenze nama issues so it made me delay my journey kwati nibamudala."
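For mixed-language inputs like the one above, it can help to look at the full score distribution instead of only the top label. The sketch below is one possible approach, not part of the model itself. It assumes the per-class scores have already been obtained as a list of label/score dicts (for example by asking the pipeline for all class scores). The function name `summarize_scores`, the `margin` threshold, and the example scores are hypothetical and chosen only for illustration.

```python
def summarize_scores(scores, margin=0.2):
    """Return (top_label, top_score, is_ambiguous) for one text.

    `scores` is a list of {"label": str, "score": float} dicts.
    `margin` is a hypothetical threshold, not a value from the model card:
    if the top two scores are closer than `margin`, the text is flagged
    as ambiguous (possibly code-switched).
    """
    ranked = sorted(scores, key=lambda item: item["score"], reverse=True)
    top = ranked[0]
    runner_up_score = ranked[1]["score"] if len(ranked) > 1 else 0.0
    is_ambiguous = (top["score"] - runner_up_score) < margin
    return top["label"], round(top["score"], 4), is_ambiguous


# Made-up scores for illustration only; these are not real model outputs.
example_scores = [
    {"label": "English", "score": 0.41},
    {"label": "Nyanja", "score": 0.35},
    {"label": "Bemba", "score": 0.20},
    {"label": "unknown", "score": 0.04},
]

print(summarize_scores(example_scores))
```

A small gap between the two highest scores is only a heuristic signal of code-switching; a suitable margin would need to be tuned on labelled data.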

Classification Report

[Figure: classification report]
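As a rough illustration of what a classification report contains, the sketch below computes per-class precision, recall, F1, and support from two label lists. It is a minimal pure-Python version under stated assumptions: the helper name `per_class_report` is hypothetical, and the `y_true` and `y_pred` lists are toy data, not the evaluation results of this model.

```python
from collections import Counter


def per_class_report(y_true, y_pred):
    """Compute precision, recall, F1 and support for each label."""
    true_pos, false_pos, false_neg = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            true_pos[truth] += 1
        else:
            false_pos[pred] += 1   # predicted this label, but it was wrong
            false_neg[truth] += 1  # missed the true label

    report = {}
    for label in sorted(set(y_true) | set(y_pred)):
        predicted = true_pos[label] + false_pos[label]
        actual = true_pos[label] + false_neg[label]
        precision = true_pos[label] / predicted if predicted else 0.0
        recall = true_pos[label] / actual if actual else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        report[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "support": actual,
        }
    return report


# Toy labels for illustration only; not results from this model.
y_true = ["Bemba", "Bemba", "English", "Nyanja", "Nyanja", "English"]
y_pred = ["Bemba", "Nyanja", "English", "Nyanja", "Nyanja", "English"]

for label, metrics in per_class_report(y_true, y_pred).items():
    print(label, {name: round(value, 2) for name, value in metrics.items()})
```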

Confusion Matrix

[Figure: confusion matrix]
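A confusion matrix counts how often each true label was predicted as each label. The following sketch shows how such a matrix can be built, with rows as true labels and columns as predicted labels. The function name, the label order, and the toy label lists are assumptions made for illustration; they do not reproduce the matrix for this model.

```python
def confusion_matrix(y_true, y_pred, labels):
    """Build a matrix where rows are true labels and columns are predictions."""
    index = {label: i for i, label in enumerate(labels)}
    matrix = [[0] * len(labels) for _ in labels]
    for truth, pred in zip(y_true, y_pred):
        matrix[index[truth]][index[pred]] += 1
    return matrix


# Toy labels for illustration only; not results from this model.
labels = ["Bemba", "English", "Nyanja"]
y_true = ["Bemba", "Bemba", "English", "Nyanja", "Nyanja", "English"]
y_pred = ["Bemba", "Nyanja", "English", "Nyanja", "Nyanja", "English"]

matrix = confusion_matrix(y_true, y_pred, labels)
for label, row in zip(labels, matrix):
    print(f"{label:<8} {row}")
```

Off-diagonal cells show which pairs of labels get confused; in the toy data, one Bemba text is counted in the Nyanja column.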

Word Cloud

[Figure: word cloud]
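A word cloud is typically drawn from word frequencies. The sketch below shows one simple way to compute them, using the three sample sentences from the Task section. The tokenization (lowercasing and splitting on word characters) and the helper name `word_frequencies` are assumptions; the preprocessing behind the original figure is not described here.

```python
import re
from collections import Counter


def word_frequencies(texts):
    """Count lowercase word tokens across a list of texts."""
    counts = Counter()
    for text in texts:
        counts.update(re.findall(r"\w+", text.lower()))
    return counts


texts = [
    "Muli shani bane, nalishiba bwino.",
    "How are you doing today?",
    "Tili bwino, zikomo kwambiri.",
]

frequencies = word_frequencies(texts)
print(frequencies.most_common(3))
```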

Model size: 0.2B params (Safetensors, tensor type F32)

Model repository: Kelvinmbewe/mbert_Lusaka_Language_Analysis
