jp1924

AI & ML interests

Audio, Image, Text

Recent Activity

New activity in naver-hyperclovax/HyperCLOVAX-SEED-Think-32B about 6 hours ago

Update chat_template.jinja

#12 opened about 2 months ago by jp1924

upvoted a paper 1 day ago

DVAO: Dynamic Variance-adaptive Advantage Optimization for Multi-reward Reinforcement Learning

Paper • 2605.25604 • Published 4 days ago • 128

upvoted a paper 3 days ago

SkillOpt: Executive Strategy for Self-Evolving Agent Skills

Paper • 2605.23904 • Published 7 days ago • 186

upvoted 2 papers 7 days ago

Anti-Self-Distillation for Reasoning RL via Pointwise Mutual Information

Paper • 2605.11609 • Published 17 days ago • 193

Process Rewards with Learned Reliability

Paper • 2605.15529 • Published 14 days ago • 53

liked a dataset 9 days ago

TeichAI/DeepSeek-v4-Pro-Agent

Traces • Updated 7 days ago • 4.01k • 3.75k • 68

upvoted a paper about 1 month ago

RLPR: Extrapolating RLVR to General Domains without Verifiers

Paper • 2506.18254 • Published Jun 23, 2025 • 35

liked 2 datasets about 1 month ago

nvidia/Nemotron-Personas-Korea

Viewer • Updated Apr 23 • 1M • 49.2k • 477

allenai/RLVR-IFeval

Viewer • Updated Nov 21, 2024 • 15k • 1.08k • 32

upvoted 2 papers about 1 month ago

Reinforcement-aware Knowledge Distillation for LLM Reasoning

Paper • 2602.22495 • Published Feb 26 • 5

Good SFT Optimizes for SFT, Better SFT Prepares for Reinforcement Learning

Paper • 2602.01058 • Published Feb 1 • 45

liked a Space about 1 month ago

LLM Embeddings Explained: A Visual and Intuitive Guide

🚀

345

How Language Models Turn Text into Meaning, From Traditional

liked a dataset about 1 month ago

llamaindex/ParseBench

Benchmark • Updated Apr 19 • 169k • 48.6k • 88

New activity in naver-hyperclovax/HyperCLOVAX-SEED-Think-32B about 2 months ago

Request to change the `model_type` of the HyperCLOVAX-SEED-32B model to `hyperclovax_vision_v2` (linked to a transformers PR)

#11 opened 2 months ago by jp1924

New activity in naver-hyperclovax/HyperCLOVAX-SEED-Think-32B 2 months ago

test

#10 opened 2 months ago by jp1924

liked a dataset 3 months ago

aiqwe/FinShibainu

Viewer • Updated Dec 18, 2024 • 87.4k • 193 • 7

liked a Space 3 months ago

CircleCI Test Collection Helper Space

📊

Query test results for a PR

updated a dataset 3 months ago

jp1924/PatternedUtteranceWithNumber

Preview • Updated Feb 25 • 20

liked a model 3 months ago

OpenMOSS-Team/MOSS-Audio-Tokenizer

Feature Extraction • 2B • Updated Feb 13 • 78.5k • 43
