YoungMin

Rickie77

20 56

AI & ML interests

None yet

Recent Activity

reacted to SeaWolf-AI's post with 🚀 about 20 hours ago

A small gift for anyone building or studying foundation models. Most "open" models hand you the weights and stop there. With Aether-7B-5Attn we wanted to hand over the whole thing — so you can actually learn from it, reproduce it, and build on it: the data recipe, the training code, every hyperparameter, the complete logs, and the intermediate checkpoints. All Apache-2.0, reproducible byte-for-byte. What you can do with it: 🔁 Rebuild it from scratch, or fork the recipe for your own model 🔬 Study a real heterogeneous-attention MoE — 49 layers place 5 attention mechanisms on a 7×7 Latin square, arranged as a clean, attributable ablation 📈 Trace training dynamics across the released checkpoints (110k / 115k / 162k) It's a modest 6.59B model, and an honest one — the limitations (no KV-cache in this build, small scale) are written right in the card. We're not claiming it's special. If any piece of it saves you time or teaches you something, that's exactly what we hoped for. 🤗 📖 Full write-up → [blog] · https://huggingface.co/blog/FINAL-Bench/opensource-llm 📦 5 Attention Base · https://huggingface.co/FINAL-Bench/Aether-7B-5Attn 🎯 5 Attention Instruct · https://huggingface.co/FINAL-Bench/Aether-7B-5Attn-it 🚀 5 Attention Live demo · https://huggingface.co/spaces/FINAL-Bench/Aether-Sovereign-AI 📦 7 Attention Base · https://huggingface.co/FINAL-Bench/Aether-7B-7Attn-base 📦 11 Attention Base · https://huggingface.co/FINAL-Bench/Aether-6B-11Attn-base 🧬 Collection · https://huggingface.co/collections/FINAL-Bench/aether-foundation-model #opensource #LLM #MoE #reproducibility #Apache2

reacted to SeaWolf-AI's post with 🔥 about 20 hours ago

reacted to SeaWolf-AI's post with ❤️ about 20 hours ago

View all activity

Organizations

None yet

reacted to SeaWolf-AI's post with 🚀🔥❤️ about 20 hours ago

Post

1601

A small gift for anyone building or studying foundation models.

Most "open" models hand you the weights and stop there. With Aether-7B-5Attn we wanted to hand over the whole thing — so you can actually learn from it, reproduce it, and build on it: the data recipe, the training code, every hyperparameter, the complete logs, and the intermediate checkpoints. All Apache-2.0, reproducible byte-for-byte.

What you can do with it:
🔁 Rebuild it from scratch, or fork the recipe for your own model
🔬 Study a real heterogeneous-attention MoE — 49 layers place 5 attention mechanisms on a 7×7 Latin square, arranged as a clean, attributable ablation
📈 Trace training dynamics across the released checkpoints (110k / 115k / 162k)

It's a modest 6.59B model, and an honest one — the limitations (no KV-cache in this build, small scale) are written right in the card. We're not claiming it's special. If any piece of it saves you time or teaches you something, that's exactly what we hoped for. 🤗

📖 Full write-up →
[blog] · https://huggingface.co/blog/FINAL-Bench/opensource-llm
📦 5 Attention Base · FINAL-Bench/Aether-7B-5Attn
🎯 5 Attention Instruct · FINAL-Bench/Aether-7B-5Attn-it
🚀 5 Attention Live demo · FINAL-Bench/Aether-Sovereign-AI
📦 7 Attention Base · https://huggingface.co/FINAL-Bench/Aether-7B-7Attn-base
📦 11 Attention Base · FINAL-Bench/Aether-6B-11Attn-base
🧬 Collection · https://huggingface.co/collections/FINAL-Bench/aether-foundation-model

#opensource #LLM #MoE #reproducibility #Apache2

2 replies

liked a dataset about 20 hours ago

FINAL-Bench/Aether-7B-5Attn-checkpoints

Updated about 3 hours ago • 15

liked a model about 20 hours ago

FINAL-Bench/Aether-7B-5Attn

Text Generation • 7B • Updated about 1 hour ago • 2 • 25

liked a Space about 20 hours ago

Aether-7B-5Attn — 100% Open-source Foundation Model: Sovereign AI

🌏

100% open-source sovereign foundation model (Apache-2.0)

upvoted a collection about 20 hours ago

'Aether' Foundation Model

Collection

5 items • Updated about 1 hour ago • 23

upvoted an article about 20 hours ago

Article

Aether-7B-5Attn: A 100% Open-Source Sovereign Foundation Model — and a Controlled Experiment in Heterogeneous Attention

FINAL-Bench

•

about 20 hours ago

• 14

reacted to SeaWolf-AI's post with 👍❤️🔥 8 days ago

Post

5372

🔵 VKUE — No GPU? Runs anyway.

"Frontier models need a datacenter GPU" rests on a hidden assumption: that the model reads ALL its parameters every token. Decode is memory-bandwidth bound — sweep 34B params/token and an 8 GB card dies at 1–2 tok/s.

So we ran ONE 34.7B reasoning model — Ourbox-35B-JGOS, a sparse Mixture-of-Experts — as the identical weights across the whole hardware spectrum. All measured:

• B200: 18,057 tok/s (aggregate)
• 1× A10G: 126 tok/s
• 8 GB laptop (RTX 5060): 20 tok/s
• GPU-less CPU: 17 tok/s

Why it works: Ourbox holds 34.7B params but only ~3B are active per token (256 experts, top-8). Since decode is bandwidth-bound, a dense 34B moves ~16.7 GB/token while Ourbox moves ~1.45 GB — ~11× less traffic. Put the experts in system RAM, keep attention/router/shared on the GPU, and a 34.7B reasoner runs on an 8 GB laptop — or no GPU at all.

Sparsity alone, proven (same laptop, same quant, ~same footprint): Ourbox-35B (A3B) 20.01 tok/s vs Qwen2.5-32B (dense) 5.36 → 3.7× from sparsity alone, ~2× the best dense-32B on any 8 GB machine. Not a toy: GPQA Diamond 86.4% (maj@8).

Try it live (same prompt, GPU vs GPU-less CPU, live tok/s). Honest scope: one machine's measurements; the CPU path proves it RUNS without a GPU, not that it beats one.

📝 Article: https://huggingface.co/blog/FINAL-Bench/vkue
🔵 GPU vs CPU demo: https://final-bench-ourbox-35b-vkue-demo.hf.space/
🔵 CPU-only demo: https://final-bench-ourbox-35b-vkue-cpu.hf.space
📊 VKUE leaderboard: FINAL-Bench/VKUE
🤗 Model: FINAL-Bench/Ourbox-35B-JGOS-GGUF
⚡ VKAE (speed): VIDraft/vkae

VKUE is the "runs anywhere" side of our serving line; VKAE the "fast on datacenter GPUs" side. VKAE is fast; VKUE is everywhere.

7 replies

upvoted an article 8 days ago

Article

VKUE: No GPU? Runs Anyway — a 34.7B Reasoner on a Laptop and on Bare CPU

FINAL-Bench

•

8 days ago

• 17

liked a model 8 days ago

FINAL-Bench/Ourbox-35B-JGOS-GGUF

Text Generation • 35B • Updated 6 days ago • 2.04k • 26

liked a Space 8 days ago

VKUE Efficiency Leaderboard

🌐

Serve frontier-class models on minimal hardware — VKUE

upvoted an article 15 days ago

Article

Quantum Cryptanalysis on Real Hardware: Pushing Symmetric-Structure Key Recovery Beyond the Published Frontier

FINAL-Bench

•

15 days ago

• 15

reacted to SeaWolf-AI's post with 😔😎👍❤️🔥 15 days ago

Post

5171

🔓 We ran genuine quantum key-recovery on 'real IBM quantum hardware' — and pushed the frontier well past the largest hardware demos we're aware of (which sat at N=4).

Using Simon's algorithm on ibm_kingston, we recovered the secret key of two symmetric-cipher structures:
• Even–Mansour — N=5 → N=10
• 3-round Feistel (DES-family) — block 6 → 8

Each verified against an 'independent control key', using error mitigation only (no QEC).

🧭 Honest scope: this is not a quantum speedup (the effective difficulty tracks the classical birthday bound ~2^{n/2}), not a break of real AES/RSA, and not 16-round DES (ours is 3-round). The recovery method is reserved for a forthcoming paper; formal record status is pending peer review.

📄 Write-up: https://huggingface.co/blog/FINAL-Bench/quantum
🕹️ Try it live in your browser: https://vidraft-quantumos.hf.space/crypto
🏆 Leaderboard: FINAL-Bench/quantum-bench-leaderboard

#quantum #cryptography #quantumcomputing

YoungMin

AI & ML interests

Recent Activity

Organizations

Rickie77's activity

Aether-7B-5Attn — 100% Open-source Foundation Model: Sovereign AI

Aether-7B-5Attn: A 100% Open-Source Sovereign Foundation Model — and a Controlled Experiment in Heterogeneous Attention

VKUE: No GPU? Runs Anyway — a 34.7B Reasoner on a Laptop and on Bare CPU

VKUE Efficiency Leaderboard

Quantum Cryptanalysis on Real Hardware: Pushing Symmetric-Structure Key Recovery Beyond the Published Frontier