ZhangJin

eliebak, cmpatino, anton-l, edbeeching, m-ric, nouamanetazi, akseljoonas, guipenedo, hynky, clefourrier, SaylorTwift, kashif, qgallouedec, hlarcher, glutamatt, Xenova, reach-vb, ngxson, craffel, lewtun, loubnabnl, lvwerra, thomwolf • Jul 8, 2025 • 783

upvoted a paper about 1 year ago

Pre-Trained Policy Discriminators are General Reward Models

Paper • 2507.05197 • Published Jul 7, 2025 • 39

upvoted 3 articles about 1 year ago

Article

Open-source DeepResearch – Freeing our search agents

m-ric, albertvillanova, merve, thomwolf, clefourrier • Feb 4, 2025 • 1.32k

Article

The Common Pile v0.1

stellaathena • Jun 6, 2025 • 54

Article

PipelineRL

ServiceNow • Apr 25, 2025 • 46

upvoted a paper over 1 year ago

InternVL3: Exploring Advanced Training and Test-Time Recipes for Open-Source Multimodal Models

Paper • 2504.10479 • Published Apr 14, 2025 • 311

upvoted 3 articles over 1 year ago

Article

Topic 33: Slim Attention, KArAt, XAttention and Multi-Token Attention Explained – What’s Really Changing in Transformers?

Kseniase • Apr 4, 2025 • 16

Article

What changed in the Transformer architecture

rishiraj • Mar 8, 2025 • 18

Article

Common AI Model Formats

ngxson • Feb 27, 2025 • 73

upvoted a paper over 1 year ago

Thus Spake Long-Context Large Language Model

Paper • 2502.17129 • Published Feb 24, 2025 • 73

upvoted 2 articles over 1 year ago

Article

DeepSeek-R1 Dissection: Understanding PPO & GRPO Without Any Prior Reinforcement Learning Knowledge

NormalUhr • Feb 7, 2025 • 297

Article

Introducing multi-backends (TRT-LLM, vLLM) support for Text Generation Inference

mfuntowicz, hlarcher • Jan 16, 2025 • 76


Visualize and understand GPU memory in PyTorch

SmolLM3: smol, multilingual, long-context reasoner
