bge-m3

Multi-format version of BAAI/bge-m3, optimized for deployment.

Model Information

Property            Value
Base Model          BAAI/bge-m3
Task                feature-extraction
Type                Text Model
Trust Remote Code   False

Available Versions

Folder               Format         Description                    Size
safetensors-fp32/    PyTorch FP32   Baseline, highest accuracy     2187 MB
safetensors-fp16/    PyTorch FP16   GPU inference, ~50% smaller    1104 MB

Usage

PyTorch (GPU)

from transformers import AutoModel, AutoTokenizer
import torch

# GPU inference with FP16
model = AutoModel.from_pretrained(
    "n24q02m/bge-m3",
    subfolder="safetensors-fp16",
    torch_dtype=torch.float16
).cuda()
tokenizer = AutoTokenizer.from_pretrained(
    "n24q02m/bge-m3",
    subfolder="safetensors-fp16"
)

# Inference
inputs = tokenizer("Hello world", return_tensors="pt").to("cuda")
with torch.no_grad():
    outputs = model(**inputs)
    embeddings = outputs.last_hidden_state.mean(dim=1)  # Mean pooling
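
The example above uses mean pooling; the upstream BGE family typically derives its dense embedding from the [CLS] token and L2-normalizes it. A hedged variant of that recipe, reusing the model and tokenizer loaded above:

import torch.nn.functional as F

sentences = ["Hello world", "Greetings, planet"]
inputs = tokenizer(sentences, padding=True, return_tensors="pt").to("cuda")
with torch.no_grad():
    outputs = model(**inputs)
    # CLS-token embedding, L2-normalized (typical BGE dense-retrieval recipe)
    embeddings = F.normalize(outputs.last_hidden_state[:, 0], p=2, dim=1)

# Because the embeddings are unit-normalized, cosine similarity reduces to a dot product
similarity = embeddings[0] @ embeddings[1]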

Notes

  1. SafeTensors FP16 is the primary format for GPU inference; an FP32 loading sketch for CPU use follows this list
  2. Load the tokenizer from the same subfolder as the model
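
For the FP32 baseline (CPU inference or accuracy checks), the same loading pattern applies. A minimal sketch, assuming the safetensors-fp32/ subfolder listed above:

from transformers import AutoModel, AutoTokenizer

# FP32 baseline on CPU: omit the dtype override and the .cuda() call
model = AutoModel.from_pretrained(
    "n24q02m/bge-m3",
    subfolder="safetensors-fp32"
)
tokenizer = AutoTokenizer.from_pretrained(
    "n24q02m/bge-m3",
    subfolder="safetensors-fp32"
)

inputs = tokenizer("Hello world", return_tensors="pt")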

License

Apache 2.0 (following the base model's license)

Credits

  • Base Model: BAAI/bge-m3
  • Conversion: PyTorch + SafeTensors