ChengQ

0Cheng0

qiancheng0

AI & ML interests

Natural Language Processing

Authored papers (20)

Context Engineering for Trustworthiness: Rescorla Wagner Steering Under Mixed and Inappropriate Contexts

Paper • 2509.04500 • Published Sep 2, 2025 • 5

EscapeBench: Towards Advancing Creative Intelligence of Language Model Agents

Paper • 2412.13549 • Published Dec 18, 2024

The Right Time Matters: Data Arrangement Affects Zero-Shot Generalization in Instruction Tuning

Paper • 2406.11721 • Published Jun 17, 2024

Self-Improving LLM Agents at Test-Time

Paper • 2510.07841 • Published Oct 9, 2025 • 10

Atomic Reasoning for Scientific Table Claim Verification

Paper • 2506.06972 • Published Jun 8, 2025

ShortageSim: Simulating Drug Shortages under Information Asymmetry

Paper • 2509.01813 • Published Sep 1, 2025

xRouter: Training Cost-Aware LLMs Orchestration System via Reinforcement Learning

Paper • 2510.08439 • Published Oct 9, 2025 • 1

CostBench: Evaluating Multi-Turn Cost-Optimal Planning and Adaptation in Dynamic Environments for LLM Tool-Use Agents

Paper • 2511.02734 • Published Nov 4, 2025 • 22

LoCoBench-Agent: An Interactive Benchmark for LLM Agents in Long-Context Software Engineering

Paper • 2511.13998 • Published Nov 17, 2025 • 3

JustRL: Scaling a 1.5B LLM with a Simple RL Recipe

Paper • 2512.16649 • Published Dec 18, 2025 • 27

From Word to World: Can Large Language Models be Implicit Text-based World Models?

Paper • 2512.18832 • Published Dec 21, 2025 • 15

Geometric-Disentanglement Unlearning

Paper • 2511.17100 • Published Nov 21, 2025

ISACL: Internal State Analyzer for Copyrighted Training Data Leakage

Paper • 2508.17767 • Published Aug 25, 2025 • 1

Agentic Reasoning for Large Language Models

Paper • 2601.12538 • Published Jan 18, 2026 • 204

Steer2Adapt: Dynamically Composing Steering Vectors Elicits Efficient Adaptation of LLMs

Paper • 2602.07276 • Published Feb 7, 2026 • 11

Tool-R0: Self-Evolving LLM Agents for Tool-Learning from Zero Data

Paper • 2602.21320 • Published Feb 24, 2026 • 12

How Far Can Unsupervised RLVR Scale LLM Training?

Paper • 2603.08660 • Published Mar 9, 2026 • 59

Rethinking On-Policy Distillation of Large Language Models: Phenomenology, Mechanism, and Recipe

Paper • 2604.13016 • Published Apr 14, 2026 • 105

PEARL: Self-Evolving Assistant for Time Management with Reinforcement Learning

Paper • 2601.11957 • Published Jan 28, 2026 • 3

Veri-R1: Toward Precise and Faithful Claim Verification via Online Reinforcement Learning

Paper • 2510.01932 • Published Oct 4, 2025