Erik Thorelli
esthor
AI & ML interests
Quantifying Agent Experience
Organizations
None yet
Audio-to-Text
- Qwen/Qwen2-Audio-7B-Instruct
  Audio-Text-to-Text • 8B • Updated • 736k • 536
- openai/whisper-large-v3
  Automatic Speech Recognition • 2B • Updated • 5.4M • 5.79k
- openai/whisper-large-v3-turbo
  Automatic Speech Recognition • 0.8B • Updated • 8.57M • 3.07k
- ibm-granite/granite-speech-3.3-8b
  Automatic Speech Recognition • 9B • Updated • 61.2k • 171
papers-to-read
- Reflect, Retry, Reward: Self-Improving LLMs via Reinforcement Learning
  Paper • 2505.24726 • Published • 282
- Reinforcement Pre-Training
  Paper • 2506.08007 • Published • 265
- GLM-4.1V-Thinking: Towards Versatile Multimodal Reasoning with Scalable Reinforcement Learning
  Paper • 2507.01006 • Published • 256
- A Survey of Context Engineering for Large Language Models
  Paper • 2507.13334 • Published • 264
models 0
None public yet
datasets 0
None public yet