Jiarui Yao

FlippyDora

AI & ML interests

None yet

Recent Activity

upvoted a paper about 7 hours ago

Molt: A Scalable PyTorch-Native Training Framework for Agentic Reinforcement Learning

Paper • 2607.21653 • Published 6 days ago • 15

updated a dataset 2 days ago

jrtmp/flash_attn-2.8.3.post1-cuda12.9-torch2.10-cp312-cp312-linux_x86_64.whl

Updated 2 days ago • 17

published a dataset 2 days ago

jrtmp/flash_attn-2.8.3.post1-cuda12.9-torch2.10-cp312-cp312-linux_x86_64.whl

Updated 2 days ago • 17

upvoted a paper 3 days ago

Predictive Divergence Masks for LLM RL

Paper • 2607.10848 • Published 16 days ago • 9

submitted a paper to Daily Papers 3 days ago

Predictive Divergence Masks for LLM RL

Paper • 2607.10848 • Published 16 days ago • 9

upvoted a paper 10 days ago

MeanFlowNFT: Bringing Forward-Process RL to Average-Velocity Generators

Paper • 2607.15273 • Published 12 days ago • 17

updated a dataset 14 days ago

harnessRL/opengame-baselines

Updated 14 days ago • 18

upvoted a paper 18 days ago

Single-Rollout Asynchronous Optimization for Agentic Reinforcement Learning

Paper • 2607.07508 • Published 20 days ago • 26

published a dataset 19 days ago

harnessRL/opengame-baselines

Updated 14 days ago • 18

upvoted 3 papers 19 days ago

DSpark: Confidence-Scheduled Speculative Decoding with Semi-Autoregressive Generation

Paper • 2607.05147 • Published 22 days ago • 37

Hierarchical Sparse Attention Done Right: Toward Infinite Context Modeling

Paper • 2607.02980 • Published 25 days ago • 81

Nemotron-Labs-Diffusion: A Tri-Mode Language Model Unifying Autoregressive, Diffusion, and Self-Speculation Decoding

Paper • 2607.05722 • Published 21 days ago • 13

updated a dataset 26 days ago

jrtmp/seedsweep_taylor_scalar

Updated 26 days ago • 59

published a dataset 26 days ago

jrtmp/seedsweep_taylor_scalar

Updated 26 days ago • 59

upvoted a paper 27 days ago

Nemotron-Labs-Diffusion-Image: Advancing Masked Discrete Diffusion for High-Resolution Image Synthesis

Paper • 2606.29814 • Published 29 days ago • 15

upvoted 2 papers 28 days ago

PhysisForcing: Physics Reinforced World Simulator for Robotic Manipulation

Paper • 2606.28128 • Published Jun 26 • 53

GBC: Gradient-Based Connections for Optimizing Multi-Agent Systems

Paper • 2606.28187 • Published Jun 26 • 13

upvoted a paper about 1 month ago

Nemotron 3 Ultra: Open, Efficient Mixture-of-Experts Hybrid Mamba-Transformer Model for Agentic Reasoning

Paper • 2606.15007 • Published Jun 12 • 19

upvoted 2 papers about 2 months ago

Flow-DPPO: Divergence Proximal Policy Optimization for Flow Matching Models

Paper • 2606.11025 • Published Jun 9 • 41

Rethinking the Divergence Regularization in LLM RL

Paper • 2606.09821 • Published Jun 8 • 34