- Configuration: `config.yaml`
- Run inference with a multimodal LLM: `infer.py`
- Extract specific results (e.g. latitude and longitude): `extract_info.py`
- Check and refine the results with ChatGPT: `check_GPT.py`
- Change the prompting method: `method.py`
- Calculate the distance between prediction and ground truth: `calc_dist.py`
- Automation script: `run.py`
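The distance step (`calc_dist.py`) presumably measures great-circle distance between the predicted and ground-truth coordinates. A minimal sketch using the standard haversine formula (the function name `haversine_km` and the use of this particular formula are assumptions, not the repository's confirmed implementation):

```python
import math

def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two (lat, lon) points.

    Hypothetical helper illustrating the prediction-vs-ground-truth
    distance computation; coordinates are in decimal degrees.
    """
    r = 6371.0  # mean Earth radius in km
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2
    return 2 * r * math.asin(math.sqrt(a))
```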
```json
[
    {
        "image_file": "file_name",
        "gt": {
            "latitude": "value",
            "longitude": "value"
        },
        "model_1": {
            "method_1": {
                "output": " ",
                "latitude": " ",
                "longitude": " ",
                "location": " ",
                "xxx": " "
            }
        },
        "model_2": {
            "method_1": {
                "output": " ",
                "xxx": " "
            }
        }
    },
    ...
]
```
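Given a results file in the format above, the ground-truth and predicted coordinates for one model/method pair can be pulled out like this (a sketch: the helper name `iter_coords` is hypothetical, and it assumes the model's method block contains parsed `latitude`/`longitude` fields as shown):

```python
import json

def iter_coords(records, model, method):
    """Yield (image_file, gt_latlon, pred_latlon) tuples from loaded results.

    `records` is the parsed JSON array; `model` and `method` select one
    nested block, e.g. "model_1" / "method_1".
    """
    for rec in records:
        gt = (float(rec["gt"]["latitude"]), float(rec["gt"]["longitude"]))
        pred = rec[model][method]
        yield (
            rec["image_file"],
            gt,
            (float(pred["latitude"]), float(pred["longitude"])),
        )

# Typical use (file name assumed):
# with open("results.json") as f:
#     for name, gt, pred in iter_coords(json.load(f), "model_1", "method_1"):
#         ...
```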
- `image_file`: The name of the image file.
- `gt`: A nested object containing the ground-truth location.
  - `latitude`: The latitude value.
  - `longitude`: The longitude value.
- `model_name`: A nested object containing model-related information.
  - `method`: A nested object containing method-related information.
    - `output`: The output result of the method.
    - `xxx`: Other relevant information.
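The `output` field holds free-form model text, from which `extract_info.py` recovers the structured `latitude`/`longitude` fields. A minimal regex-based sketch of that extraction step (the patterns and the function name `extract_latlon` are illustrative assumptions, not the repository's actual code):

```python
import re

# Match signed decimal degrees in text like "Latitude: 48.8584, Longitude: 2.2945".
_LAT = re.compile(r"lat(?:itude)?\s*[:=]?\s*(-?\d+(?:\.\d+)?)", re.IGNORECASE)
_LON = re.compile(r"lon(?:gitude)?\s*[:=]?\s*(-?\d+(?:\.\d+)?)", re.IGNORECASE)

def extract_latlon(text):
    """Return (lat, lon) floats parsed from free-form model output, or None."""
    lat_m, lon_m = _LAT.search(text), _LON.search(text)
    if lat_m and lon_m:
        return float(lat_m.group(1)), float(lon_m.group(1))
    return None
```

Real model outputs vary widely, so a production extractor would likely need more patterns (DMS notation, cardinal suffixes) or a second LLM pass, which is what the ChatGPT checking step appears to provide.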
View the interactive map in your local browser.
Feel free to run a simple demo:
```shell
python infer_demo.py --image-path /path/to/image
```