MachineLearningLM: Continued Pretraining Language Models on Millions of Synthetic Tabular Prediction Tasks Scales In-Context ML Paper • 2509.06806 • Published Sep 8, 2025 • 63
Finch: Benchmarking Finance & Accounting across Spreadsheet-Centric Enterprise Workflows Paper • 2512.13168 • Published Dec 15, 2025 • 52
LitVISTA: A Benchmark for Narrative Orchestration in Literary Text Paper • 2601.06445 • Published Jan 10
Beyond Accuracy: A Cognitive Load Framework for Mapping the Capability Boundaries of Tool-use Agents Paper • 2601.20412 • Published Jan 28
Pixel-wise Graph Attention Networks for Person Re-identification Paper • 2307.09183 • Published Jul 18, 2023