
transgeo2022's People

Contributors

jeff-zilence


transgeo2022's Issues

About Output MLP head layer

Hello, @Jeff-Zilence

I'm wondering about the MLP head layer that is applied to the cls token.
In BERT, the final MLP head is used for the classification task.
But when solving the CVGL task, why do you apply a final MLP head to the cls token instead of using the cls token itself as the output feature?

Does it make a difference whether the final MLP head is used or not?
I also wonder whether task performance changes depending on the feature dimension of the MLP head.

Thank you so much for your reply :)
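
A minimal sketch of the two options being compared, using hypothetical layer names rather than the repository's actual module structure: taking the cls token directly as the descriptor versus projecting it with an extra MLP head to the retrieval dimension (the --dim argument):

    import torch
    import torch.nn as nn

    embed_dim, out_dim = 384, 1000   # e.g. DeiT-S width and the --dim value; both are assumptions here

    # Hypothetical MLP head; the repository's exact head may differ.
    mlp_head = nn.Sequential(nn.LayerNorm(embed_dim), nn.Linear(embed_dim, out_dim))

    tokens = torch.randn(8, 1 + 256, embed_dim)   # [batch, 1 + num_patches, embed_dim]
    cls_token = tokens[:, 0]                      # option 1: use the cls token itself (dim 384)
    descriptor = mlp_head(cls_token)              # option 2: project it to the retrieval dimension (dim 1000)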

How to properly determine the ASAM parameter rho?

Hello, @Jeff-Zilence

I'm trying to apply ASAM when training a vision transformer for the CVGL task.
I used your ASAM code and tried to adjust rho within 0.5 to 3.0,
but the test accuracy during training converged to 0 after 20k steps in all cases.

How should the ASAM parameter rho be set properly?
And have you ever encountered the same phenomenon when applying ASAM while training TransGeo?

Thank you for your reply.
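
For context, here is a minimal sketch of where rho enters an ASAM-style update, written from the ASAM paper's formulation rather than copied from this repository's optimizer: rho scales an adaptive, weight-normalized ascent step taken before the second forward/backward pass, so very large values can push the weights far enough from the current minimum to destabilize training.

    import torch

    @torch.no_grad()
    def asam_perturb(params, rho):
        """Add the ASAM perturbation e_w = rho * |w|^2 * g / || |w| * g || in place and return it."""
        ws = [p for p in params if p.grad is not None]
        scaled = [w.abs() * w.grad for w in ws]
        norm = torch.norm(torch.stack([s.norm() for s in scaled])) + 1e-12
        eps = [rho * w.abs().pow(2) * w.grad / norm for w in ws]
        for w, e in zip(ws, eps):
            w.add_(e)                 # ascend to the perturbed point
        return ws, eps

    # Usage sketch: first backward pass, then asam_perturb(...), then a second forward/backward
    # at the perturbed weights, subtract eps again, and finally optimizer.step().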

get stuck after one training iteration

Thank you for your amazing work.
I had a problem running the code: the network seemed to get stuck after one training iteration. Did the author encounter such a problem?
The program did not report an error; the process still occupies memory and keeps running.

(Screenshot from 2022-06-16 attached.)

About the dataset: CVPR subset-selected

Hi, I want to ask about the dataset. I downloaded the CVUSA dataset, but the layout of its contents does not correspond to the way the data is loaded in the code. Did you encounter the same issue?

CVPR subset-selected

  • bingmap
  • splits
  • streetview
  • all.csv

Inference on single image

Hi TransGeo authors/team,

Congrats on this work!

I was wondering whether it is possible to run inference on a single image?

Thanks
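
Not an official answer, but single-image inference for a retrieval model of this kind usually amounts to encoding the one query image with the query branch and comparing it against precomputed reference (aerial) embeddings. A self-contained sketch follows; query_net, reference_embeddings, the file name, and the resize resolution are all placeholders, not the repository's API:

    import torch
    import torch.nn.functional as F
    import torchvision.transforms as T
    from PIL import Image

    transform = T.Compose([
        T.Resize((384, 768)),        # example size only; match the resolution the model was trained with
        T.ToTensor(),
        T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
    ])

    # Dummy stand-ins so the sketch runs; in practice these would be the trained query branch
    # and the L2-normalized features of every aerial reference image.
    query_net = torch.nn.Sequential(torch.nn.AdaptiveAvgPool2d(1), torch.nn.Flatten(), torch.nn.Linear(3, 1000))
    reference_embeddings = F.normalize(torch.randn(100, 1000), dim=1)

    img = transform(Image.open("query_street_view.jpg").convert("RGB")).unsqueeze(0)
    with torch.no_grad():
        q = F.normalize(query_net(img), dim=1)              # [1, dim] query descriptor
        best = (q @ reference_embeddings.T).argmax(dim=1)   # index of the best-matching reference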

Decline in evaluation results

Hi,

I ran your model and code, and the results on CVUSA and CVACT are normal.
But there is a drop (about 5 points, from 61.48 to 57.6) on VIGOR; maybe the provided model is not the best one?
(Screenshots attached: my results and the paper's results.)

Besides, do you provide the models for vigor-cross?

Best.

super slow when calculating accuracy

Hi, thanks for your nice work!
I am trying to train on the VIGOR dataset following the original run_VIGOR.sh script.
The training in each epoch goes well, but the time used for calculating accuracy seems far too long; you can check this:
(Three screenshots of the timing logs attached.)
One epoch of training took about an hour, but calculating accuracy took about 6 hours.
I haven't made many changes to other files, except adding two lines in train.py to work around the error "OSError: image file is truncated (91 bytes not processed)". I only used one RTX 3090 for training.
What leads to this?
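
The two added lines are not shown above; for reference, the workaround commonly used for "OSError: image file is truncated" is to let PIL decode partially written files, which may or may not match what the reporter added:

    from PIL import ImageFile
    ImageFile.LOAD_TRUNCATED_IMAGES = True   # allow PIL to decode images with truncated data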

training on single GPU

Hello,

Using the provided code, can I utilize the VIGOR dataset with a single GPU?

It seems that args.distributed is only set when either multiprocessing_distributed is enabled or world_size > 1; otherwise the training doesn't proceed.

Is it possible to train with a single GPU?

Thank you.
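
For reference, scripts that follow the usual MoCo-style launch pattern (which this repository appears to do, though the snippet below is a sketch rather than a quote of its train.py) derive the distributed flag from the launch arguments, so a single-GPU run typically still passes --multiprocessing-distributed --world-size 1 --rank 0 with one visible GPU rather than disabling distributed mode:

    import argparse
    import torch

    parser = argparse.ArgumentParser()
    parser.add_argument('--world-size', type=int, default=1)
    parser.add_argument('--rank', type=int, default=0)
    parser.add_argument('--multiprocessing-distributed', action='store_true')
    args = parser.parse_args(['--world-size', '1', '--rank', '0', '--multiprocessing-distributed'])

    ngpus_per_node = max(torch.cuda.device_count(), 1)
    # Distributed mode is enabled whenever multiprocessing is requested or world_size > 1,
    # so a single visible GPU still gets a (world_size = 1) process group.
    args.distributed = args.world_size > 1 or args.multiprocessing_distributed
    print(args.distributed, ngpus_per_node)

In practice that would mean something like CUDA_VISIBLE_DEVICES=0 python train.py ... --multiprocessing-distributed --world-size 1 --rank 0, assuming the repository keeps this pattern.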

About how to structure datasets

Thanks a lot for sharing this project. Could you please tell me how you constructed the three datasets used in this project and how you preprocessed them? Thanks again.

Dataset Problem

What is the format of the all.csv file? Why does the one I downloaded contain latitude and longitude, and why is the ATTENTION folder missing?
(Screenshot attached.)

Polar coordinate conversion

Hello! Can you provide the polar coordinate conversion code? If polar coordinate conversion is used, what is the input size of the street-view image to the model? I really hope to receive a reply! Thank you for your outstanding work!
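
Not the repository's code, but a sketch of the polar transform commonly used in cross-view pipelines: every pixel of the target panorama-shaped image samples the aerial image along a ray from its center. The sign and orientation conventions (which direction is north, clockwise versus counter-clockwise azimuth) vary between implementations, so they are explicitly an assumption here and may need flipping:

    import numpy as np
    import cv2

    def polar_transform(aerial, target_h=128, target_w=512):
        """Map a square aerial image (S x S) to a (target_h x target_w) pseudo-panorama."""
        S = aerial.shape[0]
        i, j = np.meshgrid(np.arange(target_h), np.arange(target_w), indexing="ij")
        radius = (S / 2.0) * (target_h - i) / target_h   # distance from the aerial-image center
        theta = 2.0 * np.pi * j / target_w               # azimuth angle around the center
        # The +sin / -cos pairing below is one common choice; flip signs if the result is
        # mirrored or rotated relative to the ground-view panoramas.
        map_x = (S / 2.0 + radius * np.sin(theta)).astype(np.float32)
        map_y = (S / 2.0 - radius * np.cos(theta)).astype(np.float32)
        return cv2.remap(aerial, map_x, map_y, cv2.INTER_LINEAR)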

Request for Pretrained model for CVUSA Limited FoV.

Hello,

First, I want to say that I think this is great work.
I'm looking for a model for CVUSA limited FoV for some experiments. I have seen that you already uploaded one here, but the link has expired.

Could you please reupload it?

Thank you very much.

RuntimeError

Thanks for your sharing!
I get a RuntimeError when I try to train the model on VIGOR.
I have tried several versions of torch and torchvision, but I cannot solve the problem.
Which versions should I use, or is there another solution?

File "E:/pycharm/PyCharm Community Edition 2023.2.1/plugins/python-ce/helpers/pydev/pydevd.py", line 1500, in _exec
pydev_imports.execfile(file, globals, locals) # execute the script
File "E:\pycharm\PyCharm Community Edition 2023.2.1\plugins\python-ce\helpers\pydev_pydev_imps_pydev_execfile.py", line 18, in execfile
exec(compile(contents+"\n", file, 'exec'), glob, loc)
File "F:\TransGeo2022-main\train.py", line 706, in
main()
File "F:\TransGeo2022-main\train.py", line 191, in main
main_worker(args.gpu, ngpus_per_node, args)
File "F:\TransGeo2022-main\train.py", line 346, in main_worker
sampler=torch.utils.data.distributed.DistributedSampler(train_scan_dataset),
File "E:\anaconda\envs\pytorch\lib\site-packages\torch\utils\data\distributed.py", line 68, in init
num_replicas = dist.get_world_size()
File "E:\anaconda\envs\pytorch\lib\site-packages\torch\distributed\distributed_c10d.py", line 1196, in get_world_size
return _get_group_size(group)
File "E:\anaconda\envs\pytorch\lib\site-packages\torch\distributed\distributed_c10d.py", line 576, in _get_group_size
default_pg = _get_default_group()
File "E:\anaconda\envs\pytorch\lib\site-packages\torch\distributed\distributed_c10d.py", line 707, in _get_default_group
raise RuntimeError(
RuntimeError: Default process group has not been initialized, please make sure to call init_process_group.
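
One possible workaround, offered as a sketch rather than the author's recommendation: DistributedSampler requires torch.distributed to be initialized, so either launch the script through its distributed path (--multiprocessing-distributed --world-size 1 --rank 0) or initialize a single-process group before the sampler is built, for example:

    import torch.distributed as dist

    if not dist.is_initialized():
        dist.init_process_group(
            backend="gloo",                       # gloo works on Windows/CPU; nccl is the usual choice on Linux GPUs
            init_method="tcp://localhost:10001",  # matches the --dist-url style used by the run scripts
            world_size=1,
            rank=0,
        )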

Help about parameter setting

Hello, I would like to ask about training on CVUSA with a single GPU: gpu is set to 1, lr to 0.0001, batch-size to 32, dist-url to 'tcp://localhost:10001', world-size to 1, rank to 0, epochs to 100, op to sam, wd to 0.03, dataset to cvusa, cos to True, dim to 1000, asam to True, and rho to 2.5. But the result of the first stage is very bad; I would like to ask whether I made a mistake. I took a screenshot of the specific parameter settings, thank you.

About accuracy

Hello, are there any other operations needed besides the settings provided in the code? Why did my first stage of training on the VIGOR dataset reach only 53%?
The settings I used are as follows (I only changed the batch size):
python -u train.py --lr 0.00005 --batch-size 16 --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --epochs 50 --save_path ./result_vigor --op sam --wd 0.03 --mining --dataset vigor --cos --dim 1000 --asam --rho 2.5

FileNotFoundError: [Errno 2] No such file or directory: './result_vigor/attention/train/1415.png'

Hi,

Really thanks for your open-source code and VIGOR dataset!

I ran into a strange issue while studying your code. I ran the run_VIGOR.sh file and adjusted the epochs to 10 for both stages. When stage 1 completed and the script automatically moved on to stage 2, there was an error:

FileNotFoundError: [Errno 2] No such file or directory: './result_vigor/attention/train/1415.png'

I then checked the /attention/train folder and found only 50608 attention PNG images, without 1415.png.
Did you ever meet a similar issue when running the code?

Annotation Data

I want to train with a personal dataset.
Is training possible without annotation data?
If it is not possible, can you tell me how to obtain the annotation data?

Learning rate setting

Hello! Your work is outstanding. If I use 4 or 8 GPUs for training, how should I set the learning rate, and what value should I use? I hope to hear back from you. Thank you!
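
Not an official answer, but the common heuristic for data-parallel training is the linear scaling rule (Goyal et al., 2017): keep the per-GPU batch size fixed and scale the base learning rate in proportion to the total batch size, usually with a short warmup. A quick sketch:

    def scaled_lr(base_lr, base_batch, per_gpu_batch, num_gpus):
        """Linear scaling rule: the learning rate grows with the effective (global) batch size."""
        return base_lr * (per_gpu_batch * num_gpus) / base_batch

    # Example: if 1 GPU uses lr = 1e-4 with batch 32, then 4 GPUs with batch 32 each
    # would use roughly 4e-4 (with a few warmup epochs).
    print(scaled_lr(1e-4, 32, 32, 4))   # 0.0004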

No images in 'trains' or 'val'

Thanks for your excellent work!
Initially I used your pretrained model for evaluation, and no issues were encountered during this process.
But when I started training the CVUSA model, I found that the results differed from the pretrained one, and there were no images in 'trains' or 'val'.
At the same time, during training, warnings were printed and no loss values were displayed:

Time 9.938 ( 9.938) Data 8.832 ( 8.832) Loss nan (nan) Mean-P 0.34 ( 0.34) Mean-N nan ( nan)

Warning:
UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.
grad.sizes() = [1, 1, 384], strides() = [99072, 384, 1]
bucket_view.sizes() = [1, 1, 384], strides() = [384, 384, 1] (Triggered internally at /opt/conda/conda-bld/pytorch_1656352430114/work/torch/csrc/distributed/c10d/reducer.cpp:326.)

How can I solve this issue? Looking forward to your reply.

Poor reproduction results

Hello! I am very interested in your work, and it has greatly inspired me; this is an outstanding job. I encountered difficulties in reproducing the results of the paper and urgently need your help. The results reported in the paper on the VIGOR dataset are: top1 61.48, top5 87.54, top10 91.88, top1% 99.56. The results I obtained by validating the trained model you provided are: top1 57.57, top5 83.98, top10 89.24, top1% 99.41 (time: 5480 s). I haven't changed any parameters, and R@1 shows a significant difference. What should I do? Hope for your reply! Thank you!

Request for Pretrained model for CVUSA Limited FoV.

Thank you for your amazing work.

I could not find the pretrained model for CVUSA limited FoV.
Could you provide it? It would be of great help!

Thanks for your time. Looking forward to your response.

Graduation Project

I have some problems running this model, for example loading the CVUSA dataset and how to run the model; I'm doing my graduation project at BUAA. I would appreciate it if I could contact you on WeChat; my WeChat is 13613677455.

the parameter configuration of the CVACT dataset

Hello, I would like to ask about the parameter configuration for the CVACT dataset. Is it similar to: python -u train.py --lr 0.0001 --batch-size 32 --dist-url 'tcp://localhost:10001' --multiprocessing-distributed --world-size 1 --rank 0 --epochs 100 --save_path ./result --op sam --wd 0.03 --mining --dataset cvusa --cos --dim 1000 --asam --rho 2.5? Thank you!

dataset

The website for the CVUSA dataset no longer works; would you be willing to share the data?

Have semi-positives been used for training?

I noticed that in the VIGOR dataset, each query (ground image) not only has one positive satellite image but also has three semi-positives, which are not specifically aligned with the query image. However, I didn't see them being used as positive samples in your code. Can I assume that one query (ground image) is only paired with one satellite image in training?

Issue with the Loss function?

Hey Sijie,

I wanted to ask if you have seen negative mean similarity during training and how it affected convergence?

Use pretrained model

I want to use the pretrained model "CVUSA_model/result/model_best.pth.tar" by adding --resume='/TransGeo2022/result/CVUSA_model/result/model_best.pth.tar', but I get this error:

raise RuntimeError('Error(s) in loading state_dict for {}:\n\t{}'.format(
RuntimeError: Error(s) in loading state_dict for DistributedDataParallel:
size mismatch for module.reference_net.pos_embed: copying a param with shape torch.Size([1, 402, 384]) from checkpoint, the shape in current model is torch.Size([1, 258, 384]).
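
Not a confirmed fix, but a pos_embed size mismatch like this usually means the current model is built for a different input resolution (a different number of patch tokens) than the checkpoint was trained with. Besides matching the resolution arguments to the ones used for the checkpoint, a common ViT trick is to interpolate the positional embeddings to the new length; the sketch below does a simple 1-D interpolation and ignores any special handling of a cls or distillation token:

    import torch
    import torch.nn.functional as F

    def resize_pos_embed(pos_embed, new_len):
        """Interpolate a [1, old_len, dim] positional embedding to [1, new_len, dim]."""
        pe = pos_embed.permute(0, 2, 1)                                      # [1, dim, old_len]
        pe = F.interpolate(pe, size=new_len, mode="linear", align_corners=False)
        return pe.permute(0, 2, 1)                                           # [1, new_len, dim]

    ckpt_pe = torch.randn(1, 402, 384)           # shape stored in the checkpoint
    model_pe = resize_pos_embed(ckpt_pe, 258)    # shape the current model expects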

about features of train set

Hi, I'm deeply impressed by your work on geo-localization, and I want to learn more about how VIGOR (the method) and TransGeo work by analyzing the difference between the two networks' training features. Could you please provide the features of TransGeo's training set (inferred with the pretrained models for both same_area and cross_area on the VIGOR dataset)? I would appreciate it very much; looking forward to more discussion with you, thank you!

Also, I'm now a PhD student at the School of Software, Tsinghua University. Can I add you on WeChat for further discussion or potential cooperation (if there is an opportunity)? Looking forward to your reply, thank you!

Supplementary materials

Hi there,
It's good work and I like it very much! But I noticed the paper mentions supplementary materials. May I know where I can find them? I cannot seem to find a link.
Thanks!

Grad strides do not match bucket view strides

Such a warning occurs at runtime:
/.conda/envs/pytor1/lib/python3.7/site-packages/torch/autograd/__init__.py:175: UserWarning: Grad strides do not match bucket view strides. This may indicate grad was not created according to the gradient layout contract, or that the param's strides changed since DDP was constructed. This is not an error, but may impair performance.
grad.sizes() = [1, 1, 384], strides() = [99072, 384, 1]
bucket_view.sizes() = [1, 1, 384], strides() = [384, 384, 1] (Triggered internally at /opt/conda/conda-bld/pytorch_1656352430114/work/torch/csrc/distributed/c10d/reducer.cpp:326.)

And the training results are unreasonable:
Time 10.881 (10.881) Data 9.173 ( 9.173) Loss nan (nan) Mean-P 0.34 ( 0.34) Mean-N nan ( nan)

This problem seems to be caused by distributed training, but I set the GPU to 0. How can I solve it?

the dataset


Could you provide me with the VIGOR dataset? I cannot contact its author.
