My USB charger has a Blackwell GPU and 128GB RAM. What. A. Time. To. Be. Alive. People in Sofia: "It's freezing." Me: sitting next to 3 kW of AI space heaters on my desk 👀 1x GLM-5, 2x MiniMax-M2.5, 1x Qwen3 Coder Next; all on a single Aibrix/K8s cluster
I just pushed Claude Code Agent Swarm with 20 coding agents on my desktop GPU workstation.
With local AI, I don't have the /fast CC switch, but I have /absurdlyfast: - 100,499 tokens/second read, yeah 100k, not a typo | 811 tok/sec generation - KV cache: 707,200 tokens - Hardware: 5+ year old GPUs, 4x A6K gen 1. It's not the car. It's the driver.
Qwen3 Coder Next AWQ with the KV cache at BF16. Scores 82.1% in C# on a 29-years-in-dev codebase vs Opus 4.5 at only 57.5%. When your codebase predates Stack Overflow, you don't need the biggest model; you need the one that actually remembers Windows 95.
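For reference, a vLLM launch along these lines would pin the weights to AWQ while leaving the KV cache in BF16. The model name, tensor-parallel degree, and port here are my guesses, not details from the post:

```shell
# Hypothetical invocation -- model path, TP degree, and port are assumptions.
# --quantization awq loads AWQ-quantized weights;
# --kv-cache-dtype auto keeps the KV cache in the model's native dtype (BF16 here);
# --tensor-parallel-size 4 shards across the four GPUs.
vllm serve Qwen/Qwen3-Coder-Next-AWQ \
    --quantization awq \
    --kv-cache-dtype auto \
    --tensor-parallel-size 4 \
    --port 8000
```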
My current bottleneck is my 27" monitor. Can't fit all 20 Theos on screen without squinting.
PromptRL: Language Models as Co-Learners in Flow-Based Image Generation RL 🚀
We found two critical failure modes in flow-based RL: 1️⃣ Quality-Diversity Dilemma: High-quality models produce similar outputs, bottlenecking RL exploration 2️⃣ Prompt Linguistic Hacking: Models overfit to surface patterns—paraphrase the prompt and performance tanks
Solution: **Jointly train LM + FM** — the LM dynamically generates semantically-consistent but diverse prompt variants
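The joint-training idea can be sketched as a toy loop. Everything below is my own illustrative stand-in (paraphrase templates, hash-based "generation", toy reward), not the paper's code; only the shape of the loop reflects the post:

```python
import random

def lm_propose_variants(prompt, n=4, rng=None):
    """Stand-in for the LM co-learner: emit diverse, semantically-consistent
    paraphrases of `prompt` (templates are purely illustrative)."""
    rng = rng or random.Random(0)
    templates = ["{p}", "a photo of {p}", "{p}, detailed", "an image showing {p}"]
    return [t.format(p=prompt) for t in rng.sample(templates, n)]

def fm_generate(variant):
    """Stand-in for the flow model: map a prompt variant to a 'sample'."""
    return hash(variant) % 100  # toy output, not an actual image

def reward(sample):
    """Toy scalar reward; the real setup would use e.g. an aesthetic score."""
    return sample / 100.0

def joint_rl_step(prompt):
    """One joint step: the LM's variants drive FM exploration, and both
    policies would take a policy-gradient step on the shared advantages."""
    variants = lm_propose_variants(prompt)
    samples = [fm_generate(v) for v in variants]
    rewards = [reward(s) for s in samples]
    baseline = sum(rewards) / len(rewards)
    advantages = [r - baseline for r in rewards]
    return variants, advantages

variants, advantages = joint_rl_step("a red bicycle")
```

Because the reward is computed across paraphrases rather than one fixed prompt string, overfitting to surface patterns (failure mode 2) gets penalized, and the variant diversity widens exploration (failure mode 1).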
▐▛██▜▌ Claude Code v2.1.23
▝████▘ Kimi-K2.5 · API Usage Billing
 ▘▘ ▝▝  ~/dev/vllm

/model to try Opus 4.5

❯ hey
● Hello! How can I help you today?

❯ what model are you?
● I'm Claude Kimi-K2.5, running in a local environment on Linux.
Took some time to download, and some vLLM hybrid-inference magic, to get it running on my desktop workstation.