We open-sourced our internal tooling at
Let me know what you think! :D
Why do things gotta fall in your lap 15 minutes after you need them, every time?
I started a discussion where we can talk about this on my LR scheduler benchmark Space. Just go to my Hugging Face Space and click Community in the top right corner.
What are you using to run your local models? llama.cpp, Ollama, vLLM?
Yo, I personally love the Qwen2.5-Coder line of models. I use it to adversarially review code from other models pretty frequently. With your setup you could grab Qwen/Qwen2.5-Coder-14B-Instruct-GGUF and run the q5_0 quantized version. As far as configs go, you could set the following (quick sketch of wiring it up after the list):
Temperature: 0.6
Top_P: 1.0
Min_P: 0
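Here's a minimal sketch of loading that model with those sampling settings via llama-cpp-python, assuming you go the llama.cpp route. The repo id is the one above; the exact GGUF filename inside the repo is my guess, so check the repo's file list:

```python
# Rough sketch using llama-cpp-python (pip install llama-cpp-python huggingface-hub).
from llama_cpp import Llama

llm = Llama.from_pretrained(
    repo_id="Qwen/Qwen2.5-Coder-14B-Instruct-GGUF",
    filename="*q5_0.gguf",  # glob for the q5_0 quant -- adjust if the repo names it differently
    n_ctx=8192,             # context window; tune to your VRAM
    n_gpu_layers=-1,        # offload all layers to GPU if they fit
)

out = llm.create_chat_completion(
    messages=[{"role": "user", "content": "Review this function for bugs:\n\ndef add(a, b): return a - b"}],
    temperature=0.6,
    top_p=1.0,
    min_p=0.0,
)
print(out["choices"][0]["message"]["content"])
```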
Alternatives you could use:
DeepSeek-Coder-V2-Lite-Instruct
Qwen2.5-Coder-7B Q8
For 32B-class models to fit on your hardware you would have to use q3 quants, and the quality is not going to be the greatest. Alternatively, you could look into a service like Modal. They offer free GPU credits monthly. You can run an app as a shell and use Ollama through their GPU-as-a-service (rough sketch below). That gives you access to a range of GPUs with enough VRAM for the specific models you're looking for. But if completely local is what you want, the models I've listed above should fit your needs.
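Here's a minimal sketch of the Modal route, assuming you've run `pip install modal` and `modal setup`. The app name, GPU type, and model tag are placeholders, so swap in whatever you actually want:

```python
# Rough sketch: run Ollama on a Modal GPU container.
import subprocess
import time

import modal

image = (
    modal.Image.debian_slim()
    .apt_install("curl")
    .run_commands("curl -fsSL https://ollama.com/install.sh | sh")  # Ollama's install script
)

app = modal.App("ollama-gpu", image=image)

@app.function(gpu="A10G", timeout=60 * 60)
def run(prompt: str) -> str:
    # Start the Ollama server in the background, give it a moment to come up,
    # then pull the model and do a one-shot generation against it.
    subprocess.Popen(["ollama", "serve"])
    time.sleep(5)
    subprocess.run(["ollama", "pull", "qwen2.5-coder:14b"], check=True)
    out = subprocess.run(
        ["ollama", "run", "qwen2.5-coder:14b", prompt],
        check=True, capture_output=True, text=True,
    )
    return out.stdout

@app.local_entrypoint()
def main():
    print(run.remote("Write a binary search in Python."))
```

For the "app as a shell" part, `modal shell your_file.py::run` should drop you into an interactive shell inside that same GPU container so you can poke at Ollama directly.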
Haha, just the tip of the iceberg, hey? I've been stuck in the library rabbit hole for a good while now and it honestly changes the game entirely.
I feel this mistake in all of my hidden dimensions.