Lize Pirenne

Inversta · Pangasius

AI & ML interests

LLMs, RL

Organizations

None yet

Recent Activity

upvoted a paper 1 day ago

Representation Forcing for Bottleneck-Free Unified Multimodal Models

Paper • 2605.31604 • Published 26 days ago • 61

upvoted a paper 19 days ago

Hölder Policy Optimisation

Paper • 2605.12058 • Published May 12 • 21

upvoted 3 papers 20 days ago

Steered LLM Activations are Non-Surjective

Paper • 2604.09839 • Published May 7 • 14

Reinforcement Learning for LLM-based Multi-Agent Systems through Orchestration Traces

Paper • 2605.02801 • Published May 4 • 9

OpenSearch-VL: An Open Recipe for Frontier Multimodal Search Agents

Paper • 2605.05185 • Published May 6 • 106

upvoted 2 papers 21 days ago

Asymmetric Flow Models

Paper • 2605.12964 • Published May 13 • 22

Mean Mode Screaming: Mean–Variance Split Residuals for 1000-Layer Diffusion Transformers

Paper • 2605.06169 • Published May 7 • 237

upvoted 5 papers about 2 months ago

LLM Safety From Within: Detecting Harmful Content with Internal Representations

Paper • 2604.18519 • Published Apr 20 • 26

AgentSPEX: An Agent SPecification and EXecution Language

Paper • 2604.13346 • Published Apr 14 • 167

Elucidating the SNR-t Bias of Diffusion Probabilistic Models

Paper • 2604.16044 • Published Apr 17 • 73

LLaDA2.0-Uni: Unifying Multimodal Understanding and Generation with Diffusion Large Language Model

Paper • 2604.20796 • Published Apr 22 • 244

Attention Sink in Transformers: A Survey on Utilization, Interpretation, and Mitigation

Paper • 2604.10098 • Published Apr 11 • 82

upvoted 8 papers 2 months ago

The Past Is Not Past: Memory-Enhanced Dynamic Reward Shaping

Paper • 2604.11297 • Published Apr 13 • 144

Reinforcement Learning via Value Gradient Flow

Paper • 2604.14265 • Published Apr 15 • 7

Continuous Adversarial Flow Models

Paper • 2604.11521 • Published Apr 13 • 11

From P(y|x) to P(y): Investigating Reinforcement Learning in Pre-train Space

Paper • 2604.14142 • Published Apr 15 • 30

FP4 Explore, BF16 Train: Diffusion Reinforcement Learning via Efficient Rollout Scaling

Paper • 2604.06916 • Published Apr 8 • 34

INSPATIO-WORLD: A Real-Time 4D World Simulator via Spatiotemporal Autoregressive Modeling

Paper • 2604.07209 • Published Apr 8 • 38

DMax: Aggressive Parallel Decoding for dLLMs

Paper • 2604.08302 • Published Apr 9 • 54

TriAttention: Efficient Long Reasoning with Trigonometric KV Compression

Paper • 2604.04921 • Published Apr 6 • 116