GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristic

Modi Jin1 · Yiming Zhang1 · Boyuan Sun1 · Dingwen Zhang2 · Mingming Cheng1 · Qibin Hou1†

1VCIP, Nankai University 2 School of Automation, Northwestern Polytechnical University

†Corresponding author

English | 简体中文

Paper PDF github Project Page Demo

GeoAgent is a vision-language model for image geolocation that reasons closely with humans and derives fine-grained address conclusions. Built upon Qwen2.5-VL, it achieves strong performance across multiple geographic grains (city, region, country, continent) while generating interpretable chain-of-thought reasoning.

GeoAgent introduces:

  1. Geo-similarity reward combining spatial and semantic similarity to handle the many-to-one mapping between natural language and geographic locations;
  2. Consistency reward assessed by a consistency agent to ensure the integrity and consistency of reasoning chains. The model is trained on GeoSeek, a novel geolocation dataset with human-annotated CoT and bias-reducing sampling.

We also introduce GeoSeek, which is a new geolocation dataset comprising:

  • GeoSeek-CoT (10k): High-quality chain-of-thought data labeled by geography experts and professional geolocation game players. Each entry includes street-view images, GPS coordinates, three-level location labels (country, city, precise location), and human reasoning processes—standardized into a unified CoT format.
  • GeoSeek-Loc (20k): Images for RL-based finetuning, sampled via a stratified strategy considering population, land area, and highway mileage to reduce geographic bias.
  • GeoSeek-Val (3k): Validation benchmark with locatability scores and scene categories (manmade structures, natural landscapes, etc.) for evaluation.

Installation

Requirements

  • Python>=3.9
  • torch==2.6.0
  • torchvision==0.21.0
  • torchaudio==2.6.0
  • ms-swift>=3.8.0
  • xformers==0.0.27.post2
  • deepspeed==0.15.0
  • cuda==12.4

Setup

git clone https://github.com/HVision-NKU/GeoAgent.git
cd GeoAgent

conda create -n GeoAgent python=3.9
conda activate GeoAgent
pip install -r requirements.txt

Usage

Get GeoAgent Model

Download the pre-trained checkpoints from Hugging Face:

mkdir checkpoints
cd checkpoints

# (Optional) Using huggingface mirrors
export HF_ENDPOINT=https://hf-mirror.com

# download GeoAgent model from huggingface
huggingface-cli download --resume-download ghost233lism/GeoAgent --local-dir ghost233lism/GeoAgent

Quick Inference

We provide the quick inference scripts for single/batch image input in infer/. Please refer to infer/README for detailed information.

Training

bash tools/train_sft.sh 
bash tools/train_grpo.sh

Citation

@article{jin2026geoagent,
  title={GeoAgent: Learning to Geolocate Everywhere with Reinforced Geographic Characteristics},
  author={Jin, Modi and Zhang, Yiming and Sun, Boyuan and Zhang, Dingwen and Cheng, Ming-Ming and Hou, Qibin},
  journal={arXiv preprint arXiv:2602.12617},
  year={2026}
}

License

This code is licensed under the Creative Commons Attribution-NonCommercial 4.0 International for non-commercial use only.

Please note that any commercial use of this code requires formal permission prior to use.

Contact

For technical questions, please contact jin_modi[AT]mail.nankai.edu.cn

For commercial licensing, please contact andrewhoux[AT]gmail.com.

Acknowledgments

We sincerely thank Yue Zhang, H.M., Haowen He, Yuke Jun, and other experts in geography, as well as outstanding geolocation game players, for their valuable guidance, prompt design suggestions, and data support throughout the construction of the GeoSeek dataset.

We also thank Zhixiang Wang, Chilin Chen, Jincheng Shi, Liupeng Zhang, Yuan Gu, Yanghang Shao, Jinhua Zhang, Jiachen Zhu, Gucheng Qiuyue, Qingyang Guo, Jingchen Yang, Weilong Kong, Xinyuan Li, and Mr. Xu (an anonymous volunteer) for their outstanding contributions in providing high-quality reasoning process data.

Downloads last month
34
Safetensors
Model size
8B params
Tensor type
BF16
·
Inference Providers NEW
This model isn't deployed by any Inference Provider. 🙋 Ask for provider support

Model tree for ghost233lism/GeoAgent

Finetuned
(998)
this model

Dataset used to train ghost233lism/GeoAgent

Paper for ghost233lism/GeoAgent