arxiv:2602.04705
Junyuan Shang
sjy1203
AI & ML interests
NLP
Recent Activity
authored a paper 1 day ago
DHA: Learning Decoupled-Head Attention from Transformer Checkpoints via Adaptive Heads Fusion
authored a paper 1 day ago
NACL: A General and Effective KV Cache Eviction Framework for LLMs at Inference Time
authored a paper 1 day ago
ERNIE 5.0 Technical Report
Organizations
None yet