Status / Important

This is an MLX conversion artifact of TomoroAI/tomoro-colqwen3-embed-8b.

โš ๏ธ Note: Using this model currently requires a version of mlx-vlm that includes colqwen3 support (PR pending upstream). Until the PR is merged, you must use a fork/branch of mlx-vlm with the changes.

How to use (temporary)

pip uninstall -y mlx-vlm
pip install --upgrade "git+https://github.com/bulutyigit/mlx-vlm.git@add-colqwen3-embed"
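
To confirm the fork installed correctly, you can check that the import used in the example below resolves (load is the same function used in the snippet that follows):

python -c "from mlx_vlm import load; print('mlx-vlm import OK')"
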
from PIL import Image
from mlx_vlm import load

model, processor = load(
    "bulutyigit/tomoro-colqwen3-embed-8b-mlx-4bit",
    trust_remote_code=True,
)

queries = ["image of a dog", "image of Beijing"]
# Multi-vector embeddings for the text queries.
q_embs = model.encode_queries(processor, queries)

imgs = [
    Image.open("page_patch_0.png").convert("RGB"),
    Image.open("page_patch_1.png").convert("RGB"),
]
# Multi-vector embeddings for the document page images.
d_embs = model.encode_images(processor, imgs)

# Late-interaction (MaxSim) relevance score between the first query and the first page.
score = model.maxsim(q_embs[0], d_embs[0])
print("MaxSim:", float(score))

Tomoro ColQwen3 Embed 8B – MLX 4-bit

Converted from TomoroAI/tomoro-colqwen3-embed-8b to MLX and quantized to 4-bit using mlx-vlm.

Original Model Card: https://huggingface.co/TomoroAI/tomoro-colqwen3-embed-8b

Convert command

mlx_vlm.convert \
  --hf-path TomoroAI/tomoro-colqwen3-embed-8b \
  --mlx-path ./tomoro-colqwen3-embed-8b-4bit-mlx \
  --quantize --q-bits 4
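
After conversion, the local output directory can be loaded the same way as the Hub repo. A minimal smoke test, assuming the mlx-vlm fork from above is installed and that load accepts a local path:

from mlx_vlm import load

# Load the freshly converted 4-bit weights from the local output directory.
model, processor = load(
    "./tomoro-colqwen3-embed-8b-4bit-mlx",
    trust_remote_code=True,
)
print("Loaded:", type(model).__name__)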