Taeho Hwang

doubleyyh

ThisIsHwang

AI & ML interests

None yet

Recent Activity

upvoted a paper 24 days ago

ELDR: Expert-Locality-Aware Decode Routing for PD-Disaggregated MoE Serving

new activity 5 months ago

Qwen/Qwen3.5-27B:Value error, Model architectures ['Qwen3_5ForConditionalGeneration'] are not supported for now. Transformers version 5.3.0.dev0

reacted to sergiopaniego's post with 🚀 5 months ago

TRL v0.27.0 is out!! 🥳 It includes GDPO, the latest variant of GRPO for multi-reward RL ✨ GDPO decouples reward normalization to avoid reward collapse and improve per-reward convergence — developed by @sliuau @SimonX et al. Explore the paper: https://huggingface.co/papers/2601.05242 Explore the full set of changes here: https://github.com/huggingface/trl/releases/tag/v0.27.0

doubleyyh's models (4)

doubleyyh/email-tuned-qwen2-lora

Text Generation • Updated Dec 26, 2024 • 1

doubleyyh/mixed-bge-m3-email

Sentence Similarity • 0.6B • Updated Dec 25, 2024 • 73

doubleyyh/exit-gemma-2b

Updated Dec 21, 2024 • 2

doubleyyh/exit-gemma-7b

Updated Dec 21, 2024