# bge-m3

A multi-format version of BAAI/bge-m3, optimized for deployment.
## Model Information
| Property | Value |
|---|---|
| Base Model | BAAI/bge-m3 |
| Task | feature-extraction |
| Type | Text Model |
| Trust Remote Code | False |
## Available Versions

| Folder | Format | Description | Size |
|---|---|---|---|
| safetensors-fp32/ | PyTorch FP32 | Baseline, highest accuracy | 2187 MB |
| safetensors-fp16/ | PyTorch FP16 | GPU inference, ~50% smaller | 1104 MB |
## Usage

### PyTorch (GPU)
```python
from transformers import AutoModel, AutoTokenizer
import torch

# GPU inference with FP16
model = AutoModel.from_pretrained(
    "n24q02m/bge-m3",
    subfolder="safetensors-fp16",
    torch_dtype=torch.float16,
).cuda()
tokenizer = AutoTokenizer.from_pretrained(
    "n24q02m/bge-m3",
    subfolder="safetensors-fp16",
)

# Inference
inputs = tokenizer("Hello world", return_tensors="pt").to("cuda")
with torch.no_grad():
    outputs = model(**inputs)
embeddings = outputs.last_hidden_state.mean(dim=1)  # Mean pooling
```
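The pooled embeddings can be compared with cosine similarity. The snippet below is a minimal sketch that reuses the `model` and `tokenizer` objects from the example above; the sentence pair, the masked mean pooling, and the L2 normalization are illustrative assumptions rather than part of the original conversion.

```python
import torch
import torch.nn.functional as F

# Illustrative sentence pair (hypothetical example)
sentences = ["What is BGE-M3?", "BGE-M3 is a multilingual embedding model."]
batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt").to("cuda")

with torch.no_grad():
    out = model(**batch)

# Mean pooling that ignores padding tokens when sentences have different lengths
mask = batch["attention_mask"].unsqueeze(-1).to(out.last_hidden_state.dtype)
emb = (out.last_hidden_state * mask).sum(dim=1) / mask.sum(dim=1)

# L2-normalize so the dot product equals cosine similarity
emb = F.normalize(emb, p=2, dim=1)
similarity = (emb[0] @ emb[1]).item()
print(f"cosine similarity: {similarity:.4f}")
```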
## Notes
- SafeTensors FP16 is the primary format for GPU inference; the FP32 baseline can be used on CPU (see the sketch below)
- Load tokenizer from the same folder as the model
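For CPU-only environments, the FP32 baseline listed under Available Versions can be loaded with the same pattern. This is a minimal sketch that assumes the safetensors-fp32/ folder follows the same layout as the FP16 one:

```python
from transformers import AutoModel, AutoTokenizer
import torch

# CPU inference with the FP32 baseline (folder name taken from the table above)
model = AutoModel.from_pretrained(
    "n24q02m/bge-m3",
    subfolder="safetensors-fp32",
)
tokenizer = AutoTokenizer.from_pretrained(
    "n24q02m/bge-m3",
    subfolder="safetensors-fp32",
)

inputs = tokenizer("Hello world", return_tensors="pt")
with torch.no_grad():
    outputs = model(**inputs)
embeddings = outputs.last_hidden_state.mean(dim=1)  # Mean pooling, as in the GPU example
```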
## License
Apache 2.0 (following the base model's license)
## Credits
- Base Model: BAAI/bge-m3
- Conversion: PyTorch + SafeTensors