ManniX PRO

ManniX-ITA

32 6 28

https://github.com/mann1x

mann1x

AI & ML interests

None yet

Recent Activity

new activity 15 days ago

ManniX-ITA/Qwen3.6-27B-Omnimerge-v4-MTP-GGUF:My experience

new activity 16 days ago

ManniX-ITA/gemma-4-A4B-98e-v7-coder-it-GGUF:Gets stuck in loops using llama.cpp

posted an update 17 days ago

Organizations

None yet

Posts 10

Post

224

🚀 Gemma-4-A4B 98e v7-coder cohort: loop-fixed re-release. Two 20.8B MoE coders (4B-active), fresh-map prunes of Gemma 4 26B-A4B, 30/128 experts dropped per layer. The headline isn't a benchmark: the agentic loop is gone at the weights, not papered over by the sampler.

🔧 How: at prune time we force-keep the 46 agentic_eog experts that a loop-protection signal flags as load-bearing for clean multi-turn termination (plus shared-FFN α=1.2). Result: 0 loops across 48 seeds on every published tier.
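A minimal sketch of what force-keeping experts during a prune could look like. The function name, the per-expert scores, and the toy sizes are hypothetical illustrations, not the author's pipeline; the only idea carried over from the post is that a protected set is excluded before the lowest-scoring experts are dropped.

```python
from typing import Iterable


def select_experts_to_drop(
    scores: list[float], n_drop: int, force_keep: Iterable[int] = ()
) -> list[int]:
    """Return indices of the n_drop lowest-scoring experts, skipping force_keep.

    `scores` is a hypothetical per-expert importance score for one layer.
    """
    protected = set(force_keep)
    candidates = [i for i in range(len(scores)) if i not in protected]
    if n_drop > len(candidates):
        raise ValueError("not enough unprotected experts to drop")
    # Lowest score first; ties broken by index so the result is deterministic.
    candidates.sort(key=lambda i: (scores[i], i))
    return sorted(candidates[:n_drop])


# Toy layer: 8 experts, drop 3. Expert 1 has the lowest score but is protected.
toy_scores = [0.9, 0.05, 0.4, 0.1, 0.8, 0.2, 0.7, 0.6]
dropped = select_experts_to_drop(toy_scores, n_drop=3, force_keep={1})
kept = [i for i in range(len(toy_scores)) if i not in dropped]
print(dropped)  # [2, 3, 5]
print(kept)     # [0, 1, 4, 6, 7]
```

With the post's numbers the same call would drop 30 of 128 experts in a layer and leave 98, which matches the "98e" in the model names.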

📊 Q6_K · llama.cpp · greedy · same host (from summary.json):

⚖️ v7-coder (fkbroad code3/lcb2) — balanced coder: LCB-med-55 98.18, HumanEval 98.17, HE+ 92.07, AIME 80.0, MATH-500 95.0, GSM8K 91, IFEval 92, MultiPL-E 89.7, ARC 92.2.

⚡ v7-coderx (code4/lcb3) — code-maximal: all-hard LCB-77 85.71 (cohort-best; 128e 79.22, v7-coder 84.42), HE+ 93.29, GSM8K 93, MATH-500 95.0, AIME 76.67. Whole budget on code.

🎯 Both land near GPQA ~51: graduate science is the budget axis, and neither is a science model. Pick v7-coder for the broad LCB-medium + HumanEval lead; v7-coderx for the all-hard slice and HE+.

🧪 The harness we used to prove the fix is now an omk tool: agentic-loop-harness replays a frozen agentic conversation across a sampler×seed matrix and reports a fail-rate per chat-template, so you can isolate a loop to one variable. It is model-agnostic and works with any OpenAI-compatible server. The version we shared with Google: https://huggingface.co/google/gemma-4-12B-it/discussions/41#6a3926720abc934d03fd85c0
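The replay idea can be sketched roughly as follows. This is not the agentic-loop-harness code or its interface: `looks_like_loop`, `fail_rates`, and the repeated n-gram heuristic are assumptions made for illustration, and `fake_generate` stands in for a real call to an OpenAI-compatible server.

```python
import itertools
from typing import Callable, Iterable


def looks_like_loop(text: str, ngram: int = 8, max_repeats: int = 3) -> bool:
    """Heuristic (an assumption): flag output when any word n-gram
    occurs more than max_repeats times."""
    words = text.split()
    counts: dict[tuple[str, ...], int] = {}
    for i in range(len(words) - ngram + 1):
        key = tuple(words[i:i + ngram])
        counts[key] = counts.get(key, 0) + 1
        if counts[key] > max_repeats:
            return True
    return False


def fail_rates(
    templates: Iterable[str],
    samplers: Iterable[dict],
    seeds: Iterable[int],
    generate: Callable[[str, dict, int], str],
) -> dict[str, float]:
    """Replay the same conversation over every sampler x seed pair and
    report the fraction of looping runs per chat template."""
    samplers, seeds = list(samplers), list(seeds)
    rates = {}
    for template in templates:
        runs = list(itertools.product(samplers, seeds))
        fails = sum(looks_like_loop(generate(template, s, seed)) for s, seed in runs)
        rates[template] = fails / len(runs)
    return rates


def fake_generate(template: str, sampler: dict, seed: int) -> str:
    # Stand-in backend: only template_b under greedy sampling loops.
    if template == "template_b" and sampler["temperature"] == 0.0:
        return "call the tool again " * 50
    return f"done after seed {seed}"


rates = fail_rates(
    ["template_a", "template_b"],
    [{"temperature": 0.0}, {"temperature": 0.7}],
    range(4),
    fake_generate,
)
print(rates)  # {'template_a': 0.0, 'template_b': 0.5}
```

Keeping the conversation frozen and varying one axis at a time is what lets a non-zero fail-rate be attributed to a template, a sampler setting, or the weights.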

📦 Each ships bf16 · GGUF (+ CD-* + imatrix + mmproj vision) · NVFP4A16 (~13 GB) · Ollama.
🔗 ManniX-ITA/gemma-4-A4B-98e-v7-coder-it (+ -it-GGUF, -NVFP4A16) · https://ollama.com/mannix/gemma4-98e-v7-coder
🔗 ManniX-ITA/gemma-4-A4B-98e-v7-coderx-it (+ -it-GGUF, -NVFP4A16) · https://ollama.com/mannix/gemma4-98e-v7-coderx
🔧 https://github.com/mann1x/omnimergekit/tree/main/tools/agentic-loop-harness

Post

316

🚀 Gemma-4-A4B 98e v7-coder cohort — two 20.8B MoE coders (4B-active), fresh-map prunes of Gemma 4 26B-A4B. 30/128 experts dropped per layer from the rebuilt v7 competence maps (audited producers, 10 classes), generic-code 3× + LiveCodeBench 2× on a [24,40] floor, plus the mandatory shared-FFN α=1.2.

📊 Q6_K, llama.cpp, greedy, same host (from summary.json):

🧪 v7-coder — science-augmented (+ targeted_gpqa 1.5). A coder that kept ALL its science: GPQA-D 70.71 (+9.6pp over v6-coder, at PARITY with the unpruned 128e's 67.17), AIME 76.67, MATH-500 92.0, GSM8K 93.0, HE 98.78, HE+ 92.68, LCB-55 96.36, LCB-100 97.0, MultiPL-E 88.67, IFEval 95, ARC 94.8. Edges 128e on GPQA/AIME/GSM8K with no code regression — science recovered by a dedicated targeted_gpqa calibration class.

⚡ v7-coderx — code-maximal (no science term). The strongest coder in the cohort: LCB-med-55 98.18 + LCB-100 99.0 (highest of any Gemma-4 prune to date, +1.8/+2.0pp past the unpruned 128e), MultiPL-E 90.0, HE+ 92.68, HE 95.73, IFEval 95, GSM8K 91, MATH-500 89, AIME 70.0. Whole budget on code; the trade is graduate science (GPQA 48.48).

🎯 v7-coder for strong code + graduate science; v7-coderx for max code (v7-coderx's ~1.8pp LCB-55 edge costs ~22pp GPQA relative to v7-coder). Successor to v6-coder, whose code profile led the 14–22B band (+9pp HE over Qwen2.5-Coder-14B, same-rig).
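The trade-off figures can be checked directly from the scores quoted in this post. The dictionary layout is only for illustration; the numbers are the LCB-med-55 and GPQA values listed above for each model.

```python
# Scores as quoted in this post (Q6_K, llama.cpp, greedy).
v7_coder = {"LCB-55": 96.36, "GPQA": 70.71}
v7_coderx = {"LCB-55": 98.18, "GPQA": 48.48}

# Percentage-point differences between the two models.
lcb_gain = round(v7_coderx["LCB-55"] - v7_coder["LCB-55"], 2)
gpqa_cost = round(v7_coder["GPQA"] - v7_coderx["GPQA"], 2)

print(lcb_gain)   # 1.82
print(gpqa_cost)  # 22.23
```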

📦 Each ships bf16 · GGUF (29 tiers + ContribDynamic CD-* + F16 + imatrix + mmproj vision) · NVFP4A16 (native vLLM, ~13 GB) · Ollama (29 tiers + vision-* + :latest=Q4_K_M).

🔗 v7-coder: ManniX-ITA/gemma-4-A4B-98e-v7-coder-it (+ -it-GGUF, -NVFP4A16) · https://ollama.com/mannix/gemma4-98e-v7-coder
🔗 v7-coderx: ManniX-ITA/gemma-4-A4B-98e-v7-coderx-it (+ -it-GGUF, -NVFP4A16) · https://ollama.com/mannix/gemma4-98e-v7-coderx

datasets 1

ManniX-ITA/osync-code

Viewer • Updated Jan 12 • 1 • 14

Collections 1

ManniX-ITA/gemma-4-A4B-98e-v5-coder-it

models 53

ManniX-ITA/gemma-4-A4B-98e-v7-coder-NVFP4A16

ManniX-ITA/gemma-4-A4B-98e-v7-coder-it-GGUF

ManniX-ITA/gemma-4-A4B-98e-v7-coder-it

ManniX-ITA/gemma-4-A4B-98e-v7-coderx-NVFP4A16

ManniX-ITA/gemma-4-A4B-98e-v7-coderx-it-GGUF

ManniX-ITA/gemma-4-A4B-98e-v7-coderx-it

ManniX-ITA/gemma-4-A4B-98e-v6-coder-it-GGUF

ManniX-ITA/gemma-4-A4B-98e-v6-coder-it

ManniX-ITA/Qwen3.5-4B-MicroCoder-GGUF

ManniX-ITA/Qwen3.5-4B-MicroCoder
