tsingularity / dift

[NeurIPS'23] Emergent Correspondence from Image Diffusion

Home Page: https://diffusionfeatures.github.io

License: MIT License

Jupyter Notebook 12.76% Python 86.42% Shell 0.81%

correspondences diffusion-models

dift's Issues

About the parameter ensemble_size in SDFeaturizer.forward

Hi! Thanks for your great work.
I don't understand why the input image is repeated 8 times. Can ensemble_size be set to 1?
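
A minimal sketch of the averaging idea behind an ensemble, assuming (this is not confirmed in the issue) that each repeated copy of the image receives independent random noise and that the resulting features are averaged. The function `noisy_feature` is a hypothetical stand-in for one noisy forward pass, not the repository's code. Under that assumption, an ensemble size of 1 still runs; the feature is just noisier.

```python
import random
import statistics


def noisy_feature(clean_value: float, rng: random.Random, noise_std: float = 1.0) -> float:
    """Hypothetical stand-in for one noisy forward pass: clean feature plus Gaussian noise."""
    return clean_value + rng.gauss(0.0, noise_std)


def ensemble_feature(clean_value: float, ensemble_size: int, rng: random.Random) -> float:
    """Average the feature over `ensemble_size` independently noised copies."""
    samples = [noisy_feature(clean_value, rng) for _ in range(ensemble_size)]
    return statistics.fmean(samples)


def spread(ensemble_size: int, trials: int = 2000, seed: int = 0) -> float:
    """Empirical standard deviation of the ensembled feature across repeated trials."""
    rng = random.Random(seed)
    values = [ensemble_feature(0.5, ensemble_size, rng) for _ in range(trials)]
    return statistics.pstdev(values)


if __name__ == "__main__":
    # Averaging N independent noisy copies shrinks the spread by roughly sqrt(N).
    print("std with ensemble_size=1:", round(spread(1), 3))
    print("std with ensemble_size=8:", round(spread(8), 3))
```

The trade-off in this toy model is memory and compute (the batch grows with the ensemble size) against feature stability.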

About Ablated Diffusion Model (ADM)

Hi, congrats on the great work!
I am interested in trying your Ablated Diffusion Model (ADM) baseline. Would you be able to share with us the implementation? Thank you.

Implementation of Ablated Diffusion Model (ADM)

Congratulations on your great work!

I'm very interested in the Ablated Diffusion Model (ADM) baseline. Could you please release your code?

Thanks a lot!

OSError: stabilityai/stable-diffusion-2-1 does not appear to have a file named config.json.

Hello! When I ran extract_dift.py following the README, I came across this problem. Would you mind helping me solve it?
Thanks

Question about Prompt for stable diffusion to obtain Feature maps

Thanks to the authors for the nice work. I have some questions about obtaining feature maps from the Stable Diffusion model. According to your code, if I read it correctly, you need a text prompt, e.g., "a photo of cat", to obtain the diffusion features.

I wonder how the authors obtain the text prompts when evaluating on the label propagation benchmarks or other benchmarks. Do you need to annotate them roughly by hand?

How to choose features from OpenCLIP?

Hello! Thanks for your great work.
I don't know how to use OpenCLIP to find correspondences. Could you please share the code for this?
Thanks again.

CUDA error

Thanks for your great work!
I tried eval_davis.py with the ADM model after creating the environment with conda env create -f environment.yml.

But it raises a CUDA error:

I passed CUDA_LAUNCH_BLOCKING=1, then:

I tried deleting code that might have an effect, including gc.collect() and torch.cuda.empty_cache(), but it didn't help.

Would you mind helping me solve this?
Thanks a lot!

Hi, Can you please provide me with a basic example of the viewpoint change?

Thank you for this awesome project. The project webpage says that it can change the viewpoint of the image object as well, but no such demo is given. Please provide a basic example of how I can do this if possible. I really appreciate any help you can provide.

Code about sparse feature matching

Hi! Your work is amazing and I think it may be helpful for some of my projects. I read your paper and I am interested in DIFT sparse feature matching. It seems that your code doesn't include this part. Could you please share this code and some tips for running it? Thanks!

Batch Inference

Thanks for sharing the code!

A question regarding the demo: does the code support batch inference?

The docstring says that the input should be a single image tensor and a single text prompt:

Args:
    img_tensor: should be a single torch tensor in the shape of [1, C, H, W] or [C, H, W]
    prompt: the prompt to use, a string
    t: the time step to use, should be an int in the range of [0, 1000]
    up_ft_index: which upsampling block of the U-Net to extract feature, you can choose [0, 1, 2, 3]
    ensemble_size: the number of repeated images used in the batch to extract features
Return:

So I was wondering how to do batch inference.

Thanks

Correspondence Layers in SDXL

Hello,

Hi @Tsingularity ! Thank you for this amazing work. Could you provide some intuitions on the applicability of SDXL and the best layer to extract features in SDXL?

Also, do you think the method would apply to a purely transformer based architecture like SD3/DiT as well?

How to use clip image feature to do correlation

Hey! I have seen the same question in a closed issue, but the response there was about the input image size.

What I want to ask is this: the CLIP image encoder only outputs a 1D token, e.g. of size (640), but the image is actually 2D, e.g. at (256, 256) resolution. So how do you use the aligned embedding token to do feature correlation? The dimensions don't seem to match.

Looking forward to your reply!
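
One common way to get a 2D map out of a ViT-style encoder is to skip the pooled 1D embedding and instead take the per-patch tokens, then arrange them on the patch grid. The sketch below only illustrates that reshaping step. It is an assumption about a typical approach, not the authors' confirmed method, and `tokens_to_grid` is a hypothetical helper name.

```python
from typing import List, Sequence


def tokens_to_grid(tokens: Sequence[Sequence[float]], grid_h: int, grid_w: int) -> List[List[List[float]]]:
    """Arrange a row-major sequence of patch tokens into a [grid_h][grid_w][dim] map."""
    if len(tokens) != grid_h * grid_w:
        raise ValueError(
            f"expected {grid_h * grid_w} patch tokens, got {len(tokens)}"
        )
    return [
        [list(tokens[row * grid_w + col]) for col in range(grid_w)]
        for row in range(grid_h)
    ]


if __name__ == "__main__":
    # Hypothetical numbers: a 256x256 image with 16x16 patches gives a 16x16 grid.
    patch_tokens = [[float(i), float(i) / 2] for i in range(16 * 16)]
    feature_map = tokens_to_grid(patch_tokens, 16, 16)
    print(len(feature_map), len(feature_map[0]), len(feature_map[0][0]))
```

Once the tokens sit on a grid, each grid cell carries its own feature vector, so per-location correlation between two images becomes possible.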

How to reimplement segmentation with DIFT

Hello,
Great work!
I would like to know how to use DIFT for the segmentation task mentioned in the paper.

Thank you very much!

Using Application: Edit Propagation

Hello,
Great work!
I would like to know how I would go about using the "Edit Propagation" method as seen in the last example.
Thank you very much!

Question regarding edit propagation

Hello,
Thanks for great work! DIFT is truly impressive and I believe it offers endless possibilities for downstream tasks.

I have a question about the Edit Propagation discussed in your paper. From my understanding, one would initially paste a sticker onto the source image, then extract a matching mask in the target image using DIFT, and subsequently apply the transformation from source mask to target mask. Am I understanding this correctly?

If so, I have a question about how the system handles features that are present in the source image but missing in the target image through DIFT. For instance, in the project page, there's an example of a dog wearing a Santa hat. Given that the target image doesn't have the hat, it would seem challenging to extract the corresponding feature map from the target. Could you kindly explain it please?

Thank you so much!

Recreating results from paper

What commands were used to get the numbers in the paper for SPair-71k? I'm running the command suggested in the repo and getting worse results than those listed in the paper.

(dift) ehedlin@dory:dift$ python eval_spair.py --dataset_path ./SPair-71k --save_path ./spair_ft --dift_model sd --img_size 768 768 --t 261 --up_ft_index 2 --ensemble_size 8
main path: /scratch/iamerich/dift
dataset_path: ./SPair-71k
save_path: ./spair_ft
dift_model: sd
img_size: [768, 768]
t: 261
up_ft_index: 2
ensemble_size: 8
saving all test images' features...
100%|██████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 18/18 [15:14<00:00, 50.82s/it]
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 702/702 [00:09<00:00, 73.55it/s]
motorbike per image PCK@0.1: 22.07
motorbike per point PCK@0.1: 24.04
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 600/600 [00:11<00:00, 51.32it/s]
horse per image PCK@0.1: 26.61
horse per point PCK@0.1: 29.55
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 646/646 [00:10<00:00, 64.40it/s]
chair per image PCK@0.1: 11.92
chair per point PCK@0.1: 13.25
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 870/870 [00:16<00:00, 52.79it/s]
bottle per image PCK@0.1: 25.35
bottle per point PCK@0.1: 26.52
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 600/600 [00:16<00:00, 36.05it/s]
cat per image PCK@0.1: 59.13
cat per point PCK@0.1: 58.86
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 702/702 [00:12<00:00, 57.22it/s]
bird per image PCK@0.1: 41.07
bird per point PCK@0.1: 43.91
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 650/650 [00:10<00:00, 61.80it/s]
bicycle per image PCK@0.1: 26.51
bicycle per point PCK@0.1: 28.09
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 644/644 [00:12<00:00, 52.04it/s]
bus per image PCK@0.1: 24.37
bus per point PCK@0.1: 33.25
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 756/756 [00:22<00:00, 34.08it/s]
train per image PCK@0.1: 48.58
train per point PCK@0.1: 50.81
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 650/650 [00:11<00:00, 54.38it/s]
person per image PCK@0.1: 26.87
person per point PCK@0.1: 30.52
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 690/690 [00:13<00:00, 49.57it/s]
aeroplane per image PCK@0.1: 32.07
aeroplane per point PCK@0.1: 34.82
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 664/664 [00:10<00:00, 64.55it/s]
sheep per image PCK@0.1: 25.98
sheep per point PCK@0.1: 33.42
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 692/692 [00:19<00:00, 34.80it/s]
tvmonitor per image PCK@0.1: 23.60
tvmonitor per point PCK@0.1: 24.71
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 600/600 [00:13<00:00, 46.04it/s]
dog per image PCK@0.1: 30.61
dog per point PCK@0.1: 33.44
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 862/862 [00:12<00:00, 67.95it/s]
pottedplant per image PCK@0.1: 27.44
pottedplant per point PCK@0.1: 29.76
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 640/640 [00:14<00:00, 43.27it/s]
cow per image PCK@0.1: 39.09
cow per point PCK@0.1: 44.68
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 702/702 [00:09<00:00, 73.83it/s]
boat per image PCK@0.1: 15.73
boat per point PCK@0.1: 18.28
100%|████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 564/564 [00:09<00:00, 60.91it/s]
car per image PCK@0.1: 22.34
car per point PCK@0.1: 30.62
All per image PCK@0.1: 29.35
All per point PCK@0.1: 34.31
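
The log reports two averages per category, "per image" and "per point", and they can differ noticeably. A minimal sketch of how such a pair of numbers is commonly computed is below. It assumes a keypoint counts as correct when it falls within alpha times the larger side of the target bounding box, which is the usual PCK convention but is not confirmed here as the exact rule used by eval_spair.py. The function names are hypothetical.

```python
import math
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def correct_flags(pred: Sequence[Point], gt: Sequence[Point],
                  bbox_size: Tuple[float, float], alpha: float = 0.1) -> List[bool]:
    """Mark a prediction correct if it lies within alpha * max(bbox w, h) of the ground truth."""
    threshold = alpha * max(bbox_size)  # assumed normalisation, see lead-in
    return [math.dist(p, g) <= threshold for p, g in zip(pred, gt)]


def pck_scores(pairs: Sequence[List[bool]]) -> Tuple[float, float]:
    """Return (per-image PCK, per-point PCK) in percent for a list of image pairs."""
    # Per image: compute accuracy inside each pair first, then average over pairs.
    per_image = sum(sum(flags) / len(flags) for flags in pairs) / len(pairs)
    # Per point: pool all keypoints from all pairs, then compute one accuracy.
    per_point = sum(sum(flags) for flags in pairs) / sum(len(flags) for flags in pairs)
    return 100.0 * per_image, 100.0 * per_point


if __name__ == "__main__":
    # Pair A: 2 keypoints, both far off. Pair B: 8 keypoints, all within threshold.
    pair_a = correct_flags([(0, 0), (50, 50)], [(30, 0), (50, 90)], (100, 100))
    pair_b = correct_flags([(i, i) for i in range(8)],
                           [(i + 3, i + 4) for i in range(8)], (100, 100))
    print(pck_scores([pair_a, pair_b]))
```

In this toy case the per-image score is about 50 while the per-point score is about 80, because pairs with many keypoints weigh more in the pooled number. When comparing against a paper, it helps to check which of the two variants the reported table uses.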

Ask for evaluation code

Nice work! Could you please provide the benchmark evaluation code, or point to similar evaluation code that you referred to?
I am especially interested in the evaluation on the SPair-71k, PF-WILLOW and CUB-200-2011 datasets.

Demo error

Thanks for great work!

While running the demo, I encountered an error.

It worked well when I set do_classifier_free_guidance to True (it was originally set to False).

It seems that the negative prompt embedding becomes NoneType when do_classifier_free_guidance is set to False.

If this change is wrong, please tell me!

stabilityai/stable-diffusion-2-1 does not appear to have a file named config.json.

Hello, when I ran the demo, I got the error "stabilityai/stable-diffusion-2-1 does not appear to have a file named config.json." It seems that config.json cannot be downloaded. Where can I download this file? The full error output is as follows:

/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/huggingface_hub/file_download.py:1132: FutureWarning: resume_download is deprecated and will be removed in version 1.0.0. Downloads always resume when possible. If you want to force a new download, use force_download=True.
warnings.warn(
Traceback (most recent call last):
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/urllib3/connection.py", line 174, in _new_conn
conn = connection.create_connection(
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/urllib3/util/connection.py", line 95, in create_connection
raise err
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/urllib3/util/connection.py", line 85, in create_connection
sock.connect(sa)
OSError: [Errno 101] Network is unreachable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/urllib3/connectionpool.py", line 703, in urlopen
httplib_response = self._make_request(
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/urllib3/connectionpool.py", line 386, in _make_request
self._validate_conn(conn)
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/urllib3/connectionpool.py", line 1042, in _validate_conn
conn.connect()
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/urllib3/connection.py", line 363, in connect
self.sock = conn = self._new_conn()
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/urllib3/connection.py", line 186, in _new_conn
raise NewConnectionError(
urllib3.exceptions.NewConnectionError: <urllib3.connection.HTTPSConnection object at 0x7fa269a40d90>: Failed to establish a new connection: [Errno 101] Network is unreachable

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/requests/adapters.py", line 486, in send
resp = conn.urlopen(
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/urllib3/connectionpool.py", line 787, in urlopen
retries = retries.increment(
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/urllib3/util/retry.py", line 592, in increment
raise MaxRetryError(_pool, url, error or ResponseError(cause))
urllib3.exceptions.MaxRetryError: HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /stabilityai/stable-diffusion-2-1/resolve/main/unet/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa269a40d90>: Failed to establish a new connection: [Errno 101] Network is unreachable'))

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1722, in _get_metadata_or_catch_error
metadata = get_hf_file_metadata(url=url, proxies=proxies, timeout=etag_timeout, headers=headers)
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1645, in get_hf_file_metadata
r = _request_wrapper(
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 372, in _request_wrapper
response = _request_wrapper(
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 395, in _request_wrapper
response = get_session().request(method=method, url=url, **params)
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/requests/sessions.py", line 589, in request
resp = self.send(prep, **send_kwargs)
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/requests/sessions.py", line 703, in send
r = adapter.send(request, **kwargs)
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/huggingface_hub/utils/_http.py", line 66, in send
return super().send(request, *args, **kwargs)
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/requests/adapters.py", line 519, in send
raise ConnectionError(e, request=request)
requests.exceptions.ConnectionError: (MaxRetryError("HTTPSConnectionPool(host='huggingface.co', port=443): Max retries exceeded with url: /stabilityai/stable-diffusion-2-1/resolve/main/unet/config.json (Caused by NewConnectionError('<urllib3.connection.HTTPSConnection object at 0x7fa269a40d90>: Failed to establish a new connection: [Errno 101] Network is unreachable'))"), '(Request ID: 1efb8410-3b12-49f4-b553-4de46f64da86)')

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/diffusers/configuration_utils.py", line 337, in load_config
config_file = hf_hub_download(
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/huggingface_hub/utils/_validators.py", line 114, in _inner_fn
return fn(*args, **kwargs)
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1221, in hf_hub_download
return _hf_hub_download_to_cache_dir(
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1325, in _hf_hub_download_to_cache_dir
_raise_on_head_call_error(head_call_error, force_download, local_files_only)
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/huggingface_hub/file_download.py", line 1826, in _raise_on_head_call_error
raise LocalEntryNotFoundError(
huggingface_hub.utils._errors.LocalEntryNotFoundError: An error happened while trying to locate the file on the Hub and we cannot find the requested files in the local cache. Please check your connection and try again or make sure your Internet connection is on.

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/home/pcl/DETR/SDAseg/others/DIFT-main/demo.py", line 14, in <module>
dift = SDFeaturizer()
File "/home/pcl/DETR/SDAseg/others/DIFT-main/src/models/dift_sd.py", line 192, in __init__
unet = MyUNet2DConditionModel.from_pretrained(sd_id, subfolder="unet")
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/diffusers/models/modeling_utils.py", line 472, in from_pretrained
config, unused_kwargs, commit_hash = cls.load_config(
File "/home/pcl/anaconda3/envs/PY310/lib/python3.10/site-packages/diffusers/configuration_utils.py", line 364, in load_config
raise EnvironmentError(
OSError: stabilityai/stable-diffusion-2-1 does not appear to have a file named config.json.
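
The final message in the traceback talks about a missing config.json, but the first exception in the chain is "[Errno 101] Network is unreachable", which suggests the machine could not reach huggingface.co at all. A small sketch of how to surface the first exception in such a chain with the standard library is below. `simulate_failed_download` is a toy reproduction that makes no network call, and the helper names are hypothetical.

```python
from typing import List, Optional


def exception_chain(exc: BaseException) -> List[BaseException]:
    """Return the chain from the outermost exception down to the first one raised."""
    chain: List[BaseException] = []
    seen = set()
    current: Optional[BaseException] = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        chain.append(current)
        # Prefer an explicit `raise ... from ...` cause, else the implicit context.
        current = current.__cause__ or current.__context__
    return chain


def root_cause(exc: BaseException) -> BaseException:
    """Return the innermost (first raised) exception in the chain."""
    return exception_chain(exc)[-1]


def simulate_failed_download() -> None:
    """Toy reproduction of the chained errors; it performs no network access."""
    try:
        try:
            raise OSError(101, "Network is unreachable")
        except OSError as err:
            raise ConnectionError("Max retries exceeded") from err
    except ConnectionError:
        raise EnvironmentError("repo does not appear to have a file named config.json.")


if __name__ == "__main__":
    try:
        simulate_failed_download()
    except OSError as outer:
        print("reported:", outer)
        print("root cause:", root_cause(outer))
```

If the root cause is a connectivity problem, the usual remedies are restoring network access or loading the model from a copy that was downloaded elsewhere, rather than searching for a separate config.json.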

The training code and Geometric Correspondence demo code

Hi, when will the training code and the Geometric Correspondence demo code be released?

Another related work

Thanks for this impressive work. Microsoft has previously proposed the CoCosNet series of works (Cross-domain Correspondence Learning for Exemplar-based Image Translation, CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation), which establish dense correspondence for cross-domain images using GANs. The idea there is also to exploit the hidden knowledge learned alongside the generative process. Could you please have a look at the two papers and mention them in your work?

Thanks.

About the use of the ft tensor

Hello, thanks for your great work! I am curious about the meaning of the output feature tensor in your demo, and how it can be used in the downstream tasks mentioned in your paper, such as image matching and segmentation. For instance, the output tensor of your demo has shape [2, 1280, 48, 48]. The two 48s are the H and W dimensions, and the 2 refers to the input image and output image respectively. What is the meaning of 1280?
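
In a tensor laid out as [N, C, H, W], the second axis is the channel axis, so 1280 is most likely the length of the feature vector stored at each of the 48 x 48 spatial locations. A minimal sketch of how such per-location vectors can be used for matching is below: take the vector at a source location and pick the target location with the highest cosine similarity. It uses nested lists in place of tensors and hypothetical helper names, and it is an illustration of nearest-neighbour matching in general rather than the repository's implementation.

```python
import math
from typing import List, Sequence, Tuple

FeatureMap = List[List[List[float]]]  # indexed as [channel][row][col]


def vector_at(ft: FeatureMap, row: int, col: int) -> List[float]:
    """Collect the feature vector (one value per channel) at a spatial location."""
    return [channel[row][col] for channel in ft]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two vectors; 0.0 if either has zero norm."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm > 0 else 0.0


def best_match(src_ft: FeatureMap, tgt_ft: FeatureMap, row: int, col: int) -> Tuple[int, int]:
    """Return the (row, col) in the target whose feature is most similar to the source query."""
    query = vector_at(src_ft, row, col)
    height, width = len(tgt_ft[0]), len(tgt_ft[0][0])
    candidates = [(r, c) for r in range(height) for c in range(width)]
    return max(candidates, key=lambda rc: cosine(query, vector_at(tgt_ft, rc[0], rc[1])))


if __name__ == "__main__":
    # Toy 3-channel, 2x2 maps; the target is the source flipped left to right.
    src = [[[1, 0], [0, 1]], [[0, 1], [0, 1]], [[0, 0], [1, 0]]]
    tgt = [[row[::-1] for row in channel] for channel in src]
    print(best_match(src, tgt, 0, 0))
```

For segmentation-style tasks the same similarity can be computed for every location against labelled reference features, with the label of the most similar reference assigned to each location.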
