Article: Efficient LLM Pretraining: Packed Sequences and Masked Attention (Oct 7, 2024)
The Ultra-Scale Playbook 🌌: the ultimate guide to training LLMs on large GPU clusters