Thank you for your contribution. I wonder if you plan to release the mask prediction v


MAE prediction visualization code about mae-pytorch HOT 9 OPEN

pengzhiliang commented on July 27, 2024 2

MAE prediction visualization code

from mae-pytorch.

Comments (9)

tikboaHIT commented on July 27, 2024 9

I have implemented the visualization code here. Can I submit a merge request?


avitrost commented on July 27, 2024 2

Hi, I was wondering how to perform inference and run the full encoder-decoder network on a complete, unmasked image? In other words, after it is trained, how would I call the model such that it encodes a complete image with no masks, and then reconstructs the original image using the decoder?


pengzhiliang commented on July 27, 2024

Unfortunately, our current visualization code also has some bugs. I will try to fix them!


pengzhiliang commented on July 27, 2024

Of course. Thank you for your contributions.

And can you provide some visualization results here?


tikboaHIT commented on July 27, 2024

Of course. Can you provide pre-trained models and test images? The current model is mainly based on a custom dataset.


pengzhiliang commented on July 27, 2024

I have uploaded the weights to Google Drive; please see the latest readme.txt.


pengzhiliang commented on July 27, 2024

Hello @avitrost, there may currently be some bugs when the mask is all zeros.
But if you really want to see how MAE performs when used as a pure auto-encoder, you need to make two changes.
For the encoder:
you can directly set x_vis = x in this line.
And for the inputs to the decoder, the current code is:

x_vis = self.encoder_to_decoder(x_vis) # [B, N_vis, C_d]
B, N, C = x_vis.shape
expand_pos_embed = self.pos_embed.expand(B, -1, -1).type_as(x).to(x.device).clone().detach()
pos_emd_vis = expand_pos_embed[~mask].reshape(B, -1, C)
pos_emd_mask = expand_pos_embed[mask].reshape(B, -1, C)
x_full = torch.cat([x_vis + pos_emd_vis, self.mask_token + pos_emd_mask], dim=1)

x = self.decoder(x_full, pos_emd_mask.shape[1]) # [B, N_mask, 3 * 16 * 16]

It needs to be changed to:

x_vis = self.encoder_to_decoder(x_vis) # [B, N, C_d]
B, N, C = x_vis.shape  # still needed: B is used to expand pos_embed below
expand_pos_embed = self.pos_embed.expand(B, -1, -1).type_as(x).to(x.device).clone().detach()
x = self.decoder(x_vis + expand_pos_embed, x_vis.shape[1]) # [B, N, 3 * 16 * 16]

There may be some other bugs, so you may need to debug a little.
Hope this helps!
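
As a rough illustration of the two paths described above, here is a minimal, self-contained sketch. `ToyMAE` is a hypothetical stand-in, not the repo's model: the encoder and decoder are single linear layers instead of ViT blocks, the layer sizes are made up, and the `mask=None` switch is an assumption added for the example. Only the token bookkeeping (positional embeddings, mask tokens, which outputs are returned) mirrors the snippets in this comment.

```python
import torch
import torch.nn as nn


class ToyMAE(nn.Module):
    """Hypothetical stand-in for the pretraining model (not the repo's classes)."""

    def __init__(self, num_patches=16, patch_dim=48, enc_dim=32, dec_dim=24):
        super().__init__()
        self.encoder = nn.Linear(patch_dim, enc_dim)  # placeholder for the ViT encoder
        self.encoder_to_decoder = nn.Linear(enc_dim, dec_dim, bias=False)
        self.mask_token = nn.Parameter(torch.zeros(1, 1, dec_dim))
        # Fixed positional embedding for the decoder, stored as a buffer.
        self.register_buffer("pos_embed", torch.randn(1, num_patches, dec_dim))
        self.decoder = nn.Linear(dec_dim, patch_dim)  # placeholder for the ViT decoder

    def forward(self, x, mask=None):
        # x: [B, N, patch_dim]; mask: [B, N] bool with True = masked, or None.
        B, N, D = x.shape
        pos = self.pos_embed.expand(B, -1, -1).type_as(x)
        C = pos.shape[-1]

        if mask is None:
            # Pure auto-encoder path: every patch is visible, no mask tokens.
            x_vis = self.encoder_to_decoder(self.encoder(x))  # [B, N, C_d]
            return self.decoder(x_vis + pos)                  # [B, N, patch_dim]

        # Masked path: encode visible patches, append mask tokens, decode.
        # Assumes every sample masks the same number of patches.
        x_vis = self.encoder(x[~mask].reshape(B, -1, D))      # [B, N_vis, C_e]
        x_vis = self.encoder_to_decoder(x_vis)                # [B, N_vis, C_d]
        pos_vis = pos[~mask].reshape(B, -1, C)
        pos_mask = pos[mask].reshape(B, -1, C)
        x_full = torch.cat([x_vis + pos_vis, self.mask_token + pos_mask], dim=1)
        n_mask = pos_mask.shape[1]
        # Only the predictions for the masked positions are returned.
        return self.decoder(x_full)[:, -n_mask:]              # [B, N_mask, patch_dim]


model = ToyMAE().eval()
patches = torch.randn(2, 16, 48)  # 2 images, 16 patches, 48 values per patch

with torch.no_grad():
    full = model(patches)                      # reconstruct all patches
    mask = torch.zeros(2, 16, dtype=torch.bool)
    mask[:, 4:] = True                         # mask 12 of 16 patches per image
    masked = model(patches, mask)              # predict only masked patches

print(full.shape, masked.shape)
```

In the unmasked call the output covers all N patches, whereas the masked call returns only the N_mask predicted patches, which is why the second argument to the decoder changes in the modified code.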


Pter61 commented on July 27, 2024

expand_pos_embed = self.pos_embed

Thank you for your contribution! I have a question about this change. Why has it not been applied in the latest code?


mouxinyue1 commented on July 27, 2024

Hello, I would like to ask whether the weight file loaded for visualization is the pre-trained weight file or the fine-tuned weight file. I get an error when I load the fine-tuned weight file:
