ImAge

ImAge is an implicit aggregation method that produces robust global image descriptors for visual place recognition (VPR). It neither modifies the backbone nor requires an extra aggregator, and it outperforms previous state-of-the-art methods on several VPR benchmarks.

Paper: Towards Implicit Aggregation: Robust Image Representation for Place Recognition in the Transformer Era (NeurIPS 2025)

GitHub: Lu-Feng/ImAge

Usage

import torch
model = torch.hub.load("Lu-Feng/ImAge", "ImAge")
model.eval()

# Extract a descriptor from an image (a random tensor stands in for a real image here)
image = torch.randn(1, 3, 322, 322)  # [B, 3, H, W]
with torch.no_grad():
    descriptor = model(image)  # [B, 6144] L2-normalized descriptor
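
Because the descriptors are L2-normalized, the dot product of two descriptors equals their cosine similarity, so place retrieval reduces to ranking database descriptors by dot product with the query. The following is a minimal sketch of that ranking step in plain Python, not the repository's evaluation code: the helper names `l2_normalize` and `rank_database` are hypothetical, and toy 3-D vectors stand in for the 6144-D descriptors.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length (an all-zero vector is returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def rank_database(query, database, top_k=3):
    """Return (index, similarity) pairs for the top_k most similar database entries.

    Assumes all vectors are unit length, so the dot product is the cosine similarity.
    """
    scores = [
        (i, sum(q * d for q, d in zip(query, ref)))
        for i, ref in enumerate(database)
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_k]


# Toy 3-D stand-ins for real descriptors (hypothetical data).
database = [
    l2_normalize(v)
    for v in ([1.0, 0.0, 0.0], [0.6, 0.8, 0.0], [0.0, 0.0, 1.0])
]
query = l2_normalize([0.9, 0.1, 0.0])

# Indices of the two database entries closest to the query, best first.
matches = rank_database(query, database, top_k=2)
```

With real descriptors stacked into a `[N, 6144]` tensor, the same scores could be computed in one step with a matrix product such as `torch.matmul(database, query.T)`, followed by `torch.topk`.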

Citation

@inproceedings{ImAge,
  title={Towards Implicit Aggregation: Robust Image Representation for Place Recognition in the Transformer Era},
  author={Feng Lu and Tong Jin and Canming Ye and Xiangyuan Lan and Yunpeng Liu and Chun Yuan},
  booktitle={The Annual Conference on Neural Information Processing Systems},
  year={2025}
}

Paper for fenglu96/ImAge4VPR: 2511.06024, published Nov 8, 2025