arxiv:2602.22675

Search More, Think Less: Rethinking Long-Horizon Agentic Search for Efficiency and Generalization

Published on Feb 26 · Submitted by Chen on Feb 27 · OPPOer (OPPO)
Abstract

SMTL, a framework for efficient long-horizon agentic search, replaces sequential reasoning with parallel evidence acquisition, achieving state-of-the-art performance across multiple research benchmarks while reducing reasoning steps by 70.7%.

AI-generated summary

Recent deep research agents primarily improve performance by scaling reasoning depth, but this leads to high inference cost and latency in search-intensive scenarios. Moreover, generalization across heterogeneous research settings remains challenging. In this work, we propose Search More, Think Less (SMTL), a framework for long-horizon agentic search that targets both efficiency and generalization. SMTL replaces sequential reasoning with parallel evidence acquisition, enabling efficient context management under constrained context budgets. To support generalization across task types, we further introduce a unified data synthesis pipeline that constructs search tasks spanning both deterministic question answering and open-ended research scenarios, with task-appropriate evaluation metrics. We train an end-to-end agent using supervised fine-tuning and reinforcement learning, achieving strong and often state-of-the-art performance across benchmarks including BrowseComp (48.6%), GAIA (75.7%), Xbench (82.0%), and DeepResearch Bench (45.9%). Compared to Mirothinker-v1.0, SMTL with a maximum of 100 interaction steps reduces the average number of reasoning steps on BrowseComp by 70.7%, while improving accuracy.

Community

Paper submitter

Recent deep research agents often improve performance by scaling reasoning depth. While effective, this approach significantly increases inference cost and latency in search-intensive settings, and often struggles to generalize across heterogeneous research scenarios.

In this paper, we introduce Search More, Think Less (SMTL), a long-horizon agentic search framework designed to improve both efficiency and generalization. Instead of relying on sequential deep reasoning, SMTL adopts parallel evidence acquisition, allowing the agent to gather information more efficiently and manage context under constrained budgets.
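The core idea of replacing sequential deep reasoning with parallel evidence acquisition can be sketched as follows. This is a minimal illustrative sketch, not the paper's implementation: the `search` tool, query names, and the character-based context budget are all assumptions introduced here for illustration.

```python
import asyncio

# Hypothetical search tool; a real agent would call a web-search API here.
async def search(query: str) -> str:
    await asyncio.sleep(0)  # stand-in for network latency
    return f"evidence for: {query}"

async def parallel_evidence(queries: list[str], budget_chars: int = 2000) -> str:
    """Issue all sub-queries in one interaction round instead of one query
    per reasoning step, then truncate the merged evidence so it fits a
    constrained context budget."""
    results = await asyncio.gather(*(search(q) for q in queries))
    merged = "\n".join(results)
    return merged[:budget_chars]

# One round of parallel acquisition replaces several sequential think-search steps.
evidence = asyncio.run(parallel_evidence([
    "entity A founding year",
    "entity A headquarters city",
    "entity A flagship product",
]))
print(evidence)
```

The efficiency gain comes from amortization: several sub-queries share one reasoning step, so the number of think-act rounds shrinks even though the total amount of retrieved evidence stays the same or grows.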

To enhance cross-task generalization, we further propose a unified graph-driven data synthesis pipeline that generates training tasks spanning both:

deterministic multi-hop QA (Deep Search), and

open-ended research scenarios (Deep Research),

with task-appropriate evaluation metrics.
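A graph-driven pipeline for the deterministic (Deep Search) side can be sketched as sampling a relation path through an entity graph and flattening it into a multi-hop question whose answer is the path's final entity. The toy graph, entities, and question template below are illustrative assumptions, not the paper's actual synthesis procedure.

```python
import random

# Toy entity graph of (relation, tail) edges; contents are illustrative only.
GRAPH = {
    "Marie Curie": [("born_in", "Warsaw"), ("awarded", "Nobel Prize in Physics")],
    "Warsaw": [("capital_of", "Poland")],
    "Poland": [("joined", "European Union")],
}

def sample_path(start: str, hops: int, rng: random.Random):
    """Walk `hops` edges from `start`, collecting a relation chain."""
    path, node = [], start
    for _ in range(hops):
        edges = GRAPH.get(node)
        if not edges:
            break
        rel, tail = rng.choice(edges)
        path.append((node, rel, tail))
        node = tail
    return path

def to_qa(path):
    """Flatten a relation chain into one multi-hop question whose answer is
    the final tail entity, giving a deterministic, verifiable target."""
    head = path[0][0]
    rels = " -> ".join(rel for _, rel, _ in path)
    question = f"Starting from {head}, follow: {rels}. What entity do you reach?"
    return question, path[-1][2]

rng = random.Random(0)
question, answer = to_qa(sample_path("Warsaw", hops=2, rng=rng))
print(question)
print("answer:", answer)
```

Because the answer is read off the graph rather than generated, deterministic QA tasks can be scored by exact match, while open-ended Deep Research tasks would need rubric-style metrics instead.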

We train an end-to-end agent using supervised fine-tuning and reinforcement learning. SMTL achieves strong and often state-of-the-art performance across multiple benchmarks:

BrowseComp: 48.6%

GAIA: 75.7%

Xbench: 82.0%

DeepResearch Bench: 45.9%

Importantly, under a maximum 100-step interaction budget, SMTL reduces the average number of reasoning steps on BrowseComp by 70.7% compared to Mirothinker-v1.0 while simultaneously improving accuracy, demonstrating a significantly better accuracy-efficiency Pareto frontier.

SMTL suggests that, for long-horizon search agents, efficient evidence acquisition may be more impactful than simply scaling reasoning depth.

arXivLens breakdown of this paper: https://arxivlens.com/PaperView/Details/search-more-think-less-rethinking-long-horizon-agentic-search-for-efficiency-and-generalization-3002-4279a50a

  • Executive Summary
  • Detailed Breakdown
  • Practical Applications

