# Model Name
This model predicts whether a chat message should earn participation points. It was developed for the FEV Participation Points project, which studied an intervention in which elementary and middle school tutors received guidance on awarding participation points during chat-based math tutoring sessions.
## Training Details
### Base model
BERT base
### Datasets
The dataset consisted of a subset of 1,000 messages that include the word "point" in the utterance.
| Dataset | Split | Size | Source | Notes |
|---|---|---|---|---|
| Tutor math chats | train | 1,000 | Shared by tutoring provider | Contains only utterances with the word "point" |
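The keyword filter that produced this subset can be sketched as follows. This is a minimal illustration: the example messages and the case-insensitive substring rule are assumptions, not the project's actual preprocessing code.

```python
# Hypothetical sketch of the "point" keyword filter used to build the subset.
# The sample messages and the matching rule are assumptions for illustration.
messages = [
    "Great job! You earned a point.",
    "What is 3 + 4?",
    "I'll give you two points for that answer.",
]

def mentions_point(utterance: str) -> bool:
    """Return True if the utterance contains 'point' (case-insensitive)."""
    return "point" in utterance.lower()

subset = [m for m in messages if mentions_point(m)]
print(subset)  # keeps only the utterances that mention "point"
```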
### Hyperparameters
| Parameter | Value |
|---|---|
| Learning rate | 1e-5 |
| Batch size | 8 |
| Optimizer | AdamW (beta1=0.9, beta2=0.999, epsilon=1e-8) |
| Epochs / Steps | 20 epochs with early stopping (F1 on minority class) |
| Warmup | 0 |
| Weight decay | 0.01 |
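The early-stopping criterion (monitoring F1 on the minority class) can be sketched as a simple tracker. This is a minimal illustration; the patience value below is an assumption, since the card only states that training ran up to 20 epochs with early stopping on minority-class F1.

```python
class EarlyStopper:
    """Stop training when minority-class F1 has not improved for `patience`
    consecutive epochs. The patience value is an assumption for illustration."""

    def __init__(self, patience: int = 2):
        self.patience = patience
        self.best_f1 = 0.0
        self.epochs_without_improvement = 0

    def should_stop(self, minority_f1: float) -> bool:
        if minority_f1 > self.best_f1:
            self.best_f1 = minority_f1
            self.epochs_without_improvement = 0
        else:
            self.epochs_without_improvement += 1
        return self.epochs_without_improvement >= self.patience

# Made-up per-epoch minority-class F1 scores.
stopper = EarlyStopper(patience=2)
for epoch_f1 in [0.80, 0.85, 0.84, 0.83]:
    if stopper.should_stop(epoch_f1):
        break  # stops after two epochs without improvement
```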
## Evaluation
### Results
| Model | Dataset | Split | Metric | Score |
|---|---|---|---|---|
| This model | Subset of math messages with points awarded | test | F1 - Yes | 0.9943 |
| This model | Subset of math messages with points awarded | test | F1 - No | 0.9583 |
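Per-class F1 scores like those above can be computed from predictions as follows. The toy labels are illustrative only, not the project's test set; in practice a library such as scikit-learn's `f1_score` would typically be used.

```python
def f1_for_class(y_true, y_pred, positive_label):
    """Compute F1 treating `positive_label` as the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == positive_label and p == positive_label)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t != positive_label and p == positive_label)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == positive_label and p != positive_label)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)

# Toy example with four "yes" and two "no" gold labels.
y_true = ["yes", "yes", "yes", "yes", "no", "no"]
y_pred = ["yes", "yes", "yes", "no",  "no", "no"]
print(f1_for_class(y_true, y_pred, "yes"))  # F1 for the "yes" class
print(f1_for_class(y_true, y_pred, "no"))   # F1 for the "no" class
```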
## Limitations and Caveats
- The model is highly specific to tasks related to FEV Participation Points.
- The model was trained only on messages that include the word "point" in the utterance, so its behavior on other messages is untested.
## How to Use
### Message Structure
The classifier predicts on a single message in isolation, with no preceding context or following utterances.
### Running instructions
```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

model_dir = "model_outputs"  # or a specific checkpoint folder
tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForSequenceClassification.from_pretrained(model_dir)
model.eval()

text = "Your message here"
inputs = tokenizer(text, return_tensors="pt", truncation=True)
with torch.no_grad():
    logits = model(**inputs).logits
pred_id = logits.argmax(dim=-1).item()
label = {0: "no", 1: "yes"}[pred_id]  # "yes" = message should earn a point
print(label)
```
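To attach a confidence to the prediction, the logits can be converted to probabilities with a softmax. A minimal pure-Python sketch with made-up logit values is shown below; in practice `torch.softmax(logits, dim=-1)` would be applied to the model's output tensor.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of logit values."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

# Made-up logits for the ["no", "yes"] classes.
logits = [-1.2, 2.3]
probs = softmax(logits)
confidence = max(probs)
label = ["no", "yes"][probs.index(confidence)]
print(label, round(confidence, 3))
```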
## Code and Maintainers
Repository: https://github.com/scale-nssa/fev_partpoints_nlp
Maintainers / Contributors: FEV Participation Points team (lead: JP Martinez)
## Bias and Fairness
The dataset does not include demographic information about tutors or students, so fairness across demographic groups has not been evaluated.
## License
This model is released under License Name.