MultihopSpatial: Multi-hop Compositional Spatial Reasoning Benchmark for Vision-Language Model
- etri-vilab/MultiHopSpatial-Qwen3-VL-4B-Instruct
  Image-Text-to-Text • 4B • Updated • 22
- etri-vilab/MultihopSpatial
  Viewer • Updated • 11.3k • 1.42k • 2
- MultihopSpatial: Multi-hop Compositional Spatial Reasoning Benchmark for Vision-Language Model
  Paper • 2603.18892 • Published • 1