Aaron Sweeney

Haze123

AI & ML interests

None yet

Recent Activity

reacted to eaddario's post with 👍 1 day ago

Experimental global target bits‑per‑weight quantization of mistralai/Ministral-3-14B-Instruct-2512 and mistralai/Ministral-3-14B-Reasoning-2512

Unlike standard llama.cpp quantizations that rely on fixed type heuristics (e.g., Q4_K_M), the Target BPW approach optimizes per-tensor precision where it matters the most, and produces high quality models that meet a precise global file size target.

Key Advantages:
- VRAM Maximization: Can generate high quality models sized exactly to fit hardware constraints (e.g., fitting the model into exactly 24GB VRAM).
- Data-Driven Precision: Quantization mix is determined by actual weight error sensitivity rather than hardcoded rules, often yielding better PPL/KLD size trade-offs.

Full benchmarks (PPL, KLD, ARC, MMLU, etc.) and methodology in the models' cards:
https://huggingface.co/eaddario/Ministral-3-14B-Instruct-2512-GGUF
https://huggingface.co/eaddario/Ministral-3-14B-Reasoning-2512-GGUF
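The idea in the post above (pick per-tensor precision from measured sensitivity while meeting a global size budget) can be illustrated with a toy sketch. This is not the post author's method and not llama.cpp code: the function names, the candidate bit widths, the sensitivity scores, and the error model (error proportional to sensitivity times 4^-bits, the usual scaling for uniform quantization noise) are all assumptions made for illustration. The size calculation also ignores file metadata and any non-weight data.

```python
def target_bpw(target_bytes: float, n_weights: int) -> float:
    """Average bits per weight that fits n_weights into target_bytes."""
    return target_bytes * 8 / n_weights


def allocate_bits(tensors, budget_bpw, choices=(2, 3, 4, 5, 6, 8)):
    """Greedy bit allocation under a global bits-per-weight budget.

    tensors: dict mapping a (hypothetical) tensor name to
             (number_of_weights, sensitivity_score).
    Returns (bits_per_tensor, achieved_bpw).
    """
    total = sum(n for n, _ in tensors.values())
    budget_bits = budget_bpw * total
    # Start every tensor at the lowest candidate precision.
    level = {name: 0 for name in tensors}
    used = sum(n * choices[0] for n, _ in tensors.values())

    while True:
        best, best_gain = None, 0.0
        for name, (n, sens) in tensors.items():
            i = level[name]
            if i + 1 >= len(choices):
                continue  # already at the highest precision
            extra = n * (choices[i + 1] - choices[i])
            if used + extra > budget_bits:
                continue  # this upgrade would exceed the budget
            # Assumed error model: error ~ sens * 4**(-bits).
            gain = sens * (4.0 ** -choices[i] - 4.0 ** -choices[i + 1]) / extra
            if gain > best_gain:
                best, best_gain = name, gain
        if best is None:
            break  # no affordable upgrade is left
        n, _ = tensors[best]
        used += n * (choices[level[best] + 1] - choices[level[best]])
        level[best] += 1

    return {name: choices[i] for name, i in level.items()}, used / total


if __name__ == "__main__":
    # Made-up tensors: same size, different sensitivity.
    demo = {"sensitive": (1000, 10.0), "robust": (1000, 1.0)}
    bits, achieved = allocate_bits(demo, budget_bpw=4.0)
    print(bits, achieved)
```

In this sketch the more sensitive tensor ends up with more bits than the robust one while the average stays at or below the requested budget, which is the trade-off the post describes in contrast to a fixed per-type rule.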

liked a Space about 2 months ago

Tongyi-MAI/Z-Image-Turbo

liked a Space 2 months ago

black-forest-labs/FLUX.2-dev

Organizations

None yet

Haze123's models 2

Haze123/Palzani-4-IMG-LoRA

Text-to-Image • Updated Nov 2, 2024 • 1

Haze123/my-lora

Updated Oct 31, 2024