Zizhuo Zhang PRO

resistz

29 4 2

resistzzz

AI & ML interests

None yet

Recent Activity


upvoted a paper about 1 month ago

Your Agents Are Aging Too: Agent Lifespan Engineering for Deployed Systems

Paper • 2605.26302 • Published May 25 • 33

upvoted a paper about 2 months ago

Rethinking How to Remember: Beyond Atomic Facts in Lifelong LLM Agent Memory

Paper • 2605.19952 • Published May 19 • 10

updated a model 7 months ago

resistz/GT-GRPO_Llama-3.2-3B-Instruct_NQ-HotpotQA

Updated Dec 22, 2025

published a model 7 months ago

resistz/GT-GRPO_Llama-3.2-3B-Instruct_NQ-HotpotQA

Updated Dec 22, 2025

updated a model 7 months ago

TMLR-Group-HF/Co-rewarding-III-Llama-3.2-3B-Instruct-DAPO14k

4B • Updated Dec 21, 2025 • 5

updated a collection 7 months ago

Co-rewarding

Collection

Co-rewarding is a novel self-supervised RL framework that improves training stability by seeking complementary supervision from other views. • 75 items • Updated Dec 21, 2025 • 1

published a model 7 months ago

TMLR-Group-HF/Co-rewarding-III-Llama-3.2-3B-Instruct-DAPO14k

4B • Updated Dec 21, 2025 • 5

updated a model 7 months ago

TMLR-Group-HF/Co-rewarding-III-Qwen3-4B-Base-DAPO14k

4B • Updated Dec 21, 2025 • 4

published a model 7 months ago

TMLR-Group-HF/Co-rewarding-III-Qwen3-4B-Base-DAPO14k

4B • Updated Dec 21, 2025 • 4

updated a model 7 months ago

TMLR-Group-HF/Co-rewarding-III-Qwen3-8B-Base-DAPO14k

8B • Updated Dec 21, 2025 • 3

updated a collection 7 months ago

Co-rewarding

Collection

Co-rewarding is a novel self-supervised RL framework that improves training stability by seeking complementary supervision from other views. • 75 items • Updated Dec 21, 2025 • 1

published a model 7 months ago

TMLR-Group-HF/Co-rewarding-III-Qwen3-8B-Base-DAPO14k

8B • Updated Dec 21, 2025 • 3

updated a model 7 months ago

TMLR-Group-HF/Co-rewarding-III-Llama-3.2-3B-Instruct-MATH

4B • Updated Dec 21, 2025 • 7

published a model 7 months ago

TMLR-Group-HF/Co-rewarding-III-Llama-3.2-3B-Instruct-MATH

4B • Updated Dec 21, 2025 • 7

updated a model 7 months ago

TMLR-Group-HF/Co-rewarding-III-Qwen3-4B-Base-MATH

4B • Updated Dec 21, 2025 • 6 • 1

published a model 7 months ago

TMLR-Group-HF/Co-rewarding-III-Qwen3-4B-Base-MATH

4B • Updated Dec 21, 2025 • 6 • 1
