Status / Important

This is an MLX conversion artifact of TomoroAI/tomoro-colqwen3-embed-8b.

โš ๏ธ Note: Using this model currently requires a version of mlx-vlm that includes colqwen3 support (PR pending upstream). Until the PR is merged, you must use a fork/branch of mlx-vlm with the changes.

How to use (temporary)

pip uninstall -y mlx-vlm
pip install --upgrade "git+https://github.com/bulutyigit/mlx-vlm.git@add-colqwen3-embed"
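
To confirm the fork installed correctly, you can check that the import used in the example below resolves (load is the same function used in the snippet that follows):

python -c "from mlx_vlm import load; print('mlx-vlm import OK')"
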
from PIL import Image
from mlx_vlm import load

model, processor = load(
    "bulutyigit/tomoro-colqwen3-embed-8b-mlx-4bit",
    trust_remote_code=True,
)

queries = ["image of a dog", "image of Beijing"]
# Multi-vector embeddings for the text queries.
q_embs = model.encode_queries(processor, queries)

imgs = [
    Image.open("page_patch_0.png").convert("RGB"),
    Image.open("page_patch_1.png").convert("RGB"),
]
# Multi-vector embeddings for the document page images.
d_embs = model.encode_images(processor, imgs)

# Late-interaction (MaxSim) relevance score between the first query and the first page.
score = model.maxsim(q_embs[0], d_embs[0])
print("MaxSim:", float(score))

Tomoro ColQwen3 Embed 8B – MLX 4-bit

Converted from TomoroAI/tomoro-colqwen3-embed-8b to MLX and quantized to 4-bit using mlx-vlm.

Original Model Card: https://huggingface.co/TomoroAI/tomoro-colqwen3-embed-8b

Convert command

mlx_vlm.convert \
  --hf-path TomoroAI/tomoro-colqwen3-embed-8b \
  --mlx-path ./tomoro-colqwen3-embed-8b-4bit-mlx \
  --quantize --q-bits 4
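
After conversion, the local output directory can be loaded the same way as the Hub repo. A minimal smoke test, assuming the mlx-vlm fork from above is installed and that load accepts a local path:

from mlx_vlm import load

# Load the freshly converted 4-bit weights from the local output directory.
model, processor = load(
    "./tomoro-colqwen3-embed-8b-4bit-mlx",
    trust_remote_code=True,
)
print("Loaded:", type(model).__name__)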