the-ai-summer / self-attention-cv

Implementation of various self-attention mechanisms focused on computer vision. Ongoing repository.

License: MIT License

Python 84.32% Jupyter Notebook 15.68%

deep-learning transformer transformers self-attention attention-mechanism attention machine-learning machine-learning-algorithms artificial-intelligence

self-attention-cv's Issues

Running test_TransUnet.py raises a RuntimeError

Thank you very much for the code. When I run test_TransUnet.py, it fails with the following error. Why is that?
```
Traceback (most recent call last):
  File "self-attention-cv/tests/test_TransUnet.py", line 14, in <module>
    test_TransUnet()
  File "/self-attention-cv/tests/test_TransUnet.py", line 11, in test_TransUnet
    y = model(a)
  File "C:\Users\dell.conda\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "self-attention-cv\self_attention_cv\transunet\trans_unet.py", line 88, in forward
    y = self.project_patches_back(y)
  File "C:\Users\dell.conda\envs\myenv\lib\site-packages\torch\nn\modules\module.py", line 727, in _call_impl
    result = self.forward(*input, **kwargs)
  File "C:\Users\dell.conda\envs\myenv\lib\site-packages\torch\nn\modules\linear.py", line 93, in forward
    return F.linear(input, self.weight, self.bias)
  File "C:\Users\dell.conda\envs\myenv\lib\site-packages\torch\nn\functional.py", line 1692, in linear
    output = input.matmul(weight.t())
RuntimeError: mat1 dim 1 must match mat2 dim 0

Process finished with exit code 1
```
Could you please help me solve it? Thank you.
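
For anyone hitting the same message: it generally means that the last dimension of the tensor passed to an `nn.Linear` differs from that layer's `in_features`. The snippet below is only a minimal sketch in plain PyTorch. The sizes and the `check_linear_input` helper are hypothetical and are not taken from trans_unet.py; it just shows how to reproduce and detect this kind of mismatch.

```python
import torch
import torch.nn as nn

# Hypothetical sizes, not taken from trans_unet.py.
token_dim = 512
project_back = nn.Linear(token_dim, 1024)


def check_linear_input(layer: nn.Linear, x: torch.Tensor) -> bool:
    """Return True when x's last dimension matches layer.in_features."""
    return x.shape[-1] == layer.in_features


good = torch.randn(2, 64, token_dim)      # (batch, tokens, dim)
bad = torch.randn(2, 64, token_dim // 2)  # wrong feature size

print(check_linear_input(project_back, good))  # shapes agree
print(check_linear_input(project_back, bad))   # shapes disagree

try:
    project_back(bad)
except RuntimeError as err:
    # The exact wording of this message depends on the PyTorch version.
    print("RuntimeError:", err)
```

Printing `y.shape` just before the failing call and comparing it with the layer's `in_features` is usually enough to see which constructor argument is inconsistent.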

Request for Including UNETR

Thanks for the great work! I noticed a nice implementation of this paper (https://arxiv.org/abs/2103.10504) here:

https://github.com/tamasino52/UNETR/blob/main/unetr.py

It would be great if this could also be included in your repo, since the repo comes with lots of other great features and would let us explore more.

Thanks ~

TransUNet - Why is the patch_dim set to 1?

Hi,

Could you please explain why patch_dim is set to 1 in the TransUNet class? Thank you in advance!

self-attention-cv/self_attention_cv/transunet/trans_unet.py

Line 54 in 8280009

patch_dim=1,
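
One plausible reading, not confirmed in this thread: in TransUNet the transformer runs on a CNN feature map rather than on raw pixels, so a patch size of 1 turns every spatial location of that feature map into one token. A minimal sketch of that tokenization, with a hypothetical feature-map shape:

```python
import torch

# Hypothetical encoder output: (batch, channels, height, width).
features = torch.randn(2, 256, 16, 16)

# With 1x1 patches, each spatial position is a token of size `channels`.
tokens = features.flatten(2).transpose(1, 2)  # (batch, height*width, channels)
print(tokens.shape)
```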

How do I import a pre-trained ViT model into the TransUnet interface?

Looking forward to your reply

ResNet + Pyramid Vision Transformer Version 2

Thank you for your work and the clear explanations. As you know, ViT does not work well on small datasets, so I am implementing ResNet34 with Pyramid Vision Transformer v2 (PVT v2) to improve the results. The architectures of ViT and PVT v2 are completely different. Could you please help me implement it?

AxialAttentionBlock: doesn't work on GPU

The code currently seems to work only on CPU. I tried running it on a GPU, but it gives the following error in the relative_pos_enc_qkv.py file. I tried moving the inputs to the GPU, but it is still not working.

```
/usr/local/lib/python3.7/dist-packages/self_attention_cv/pos_embeddings/relative_pos_enc_qkv.py in forward(self)
     36 
     37     def forward(self):
---> 38         all_embeddings = torch.index_select(self.relative, 1, self.relative_index_2d)  # [head_planes , (dim*dim)]
     39 
     40         all_embeddings = rearrange(all_embeddings, ' c (x y)  -> c x y', x=self.dim)

RuntimeError: Input, output and indices must be on the current device
```
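
A common cause of this error is an index tensor stored as a plain attribute, which `.to(device)` and `.cuda()` do not move. Registering it as a buffer is the usual PyTorch remedy. The module below is a toy sketch with hypothetical names, not the repo's class, and whether the same change applies to relative_pos_enc_qkv.py is an assumption.

```python
import torch
import torch.nn as nn


class RelativeEmbeddingSketch(nn.Module):
    """Toy module, not the repo's class: a learned table plus a fixed index."""

    def __init__(self, channels: int, dim: int):
        super().__init__()
        self.relative = nn.Parameter(torch.randn(channels, 2 * dim - 1))
        # Relative offsets (key - query), shifted to be non-negative.
        query = torch.arange(dim).unsqueeze(0)
        key = torch.arange(dim).unsqueeze(1)
        index = (key - query + dim - 1).view(-1)
        # A buffer follows the module in .to(device) / .cuda();
        # a plain tensor attribute would stay on the CPU.
        self.register_buffer("relative_index", index)

    def forward(self) -> torch.Tensor:
        return torch.index_select(self.relative, 1, self.relative_index)


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
module = RelativeEmbeddingSketch(channels=8, dim=4).to(device)
out = module()
print(out.shape, out.device)
```

Moving only the inputs to the GPU does not help in this situation, because the index lives inside the module rather than in the input batch.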

Regression with attention

Hello!

thanks for sharing this nice repo :)

I'm trying to use ViT to do regression on images. I'd like to predict 6 floats per image.

My understanding is that I'd need to simply define the network as

vit = ViT(img_dim=128,
          in_channels=3,
          patch_dim=16,
          num_classes=6,
          dim=512)

and during training call

vit(x)

and compute the loss as MSE instead of CE.

The network actually runs but it doesn't seem to converge. Is there something obvious I am missing?

many thanks!
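
Two things that are often worth checking in a regression setup like this are the scale of the targets and the learning rate. The following is a minimal sketch, not a diagnosis of this particular case: a small linear model stands in for the ViT (any module that maps an image batch to 6 raw outputs would do), and the data, sizes, and learning rate are all made up for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# Stand-in for the ViT: any module mapping images to 6 raw outputs.
model = nn.Sequential(nn.Flatten(), nn.Linear(3 * 8 * 8, 6))

images = torch.randn(16, 3, 8, 8)            # dummy batch
targets = torch.randn(16, 6) * 50.0 + 200.0  # dummy targets on a large scale

# Standardize targets per output so every float contributes comparably.
mean, std = targets.mean(dim=0), targets.std(dim=0)
targets_norm = (targets - mean) / std

criterion = nn.MSELoss()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)

losses = []
for _ in range(50):
    optimizer.zero_grad()
    loss = criterion(model(images), targets_norm)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

print(f"first loss {losses[0]:.3f}, last loss {losses[-1]:.3f}")

# Map predictions back to the original units at inference time.
with torch.no_grad():
    predictions = model(images) * std + mean
```

Overfitting a single small batch in this way is a quick sanity check: if the loss does not fall even there, the problem is more likely in the data pipeline or the optimizer settings than in the choice of MSE.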

ImageNet Pretrained TimeSformer

I see you have recently added the TimeSformer model to this repository. In the paper, the authors initialize the model from the ImageNet-pretrained weights of ViT. Does your implementation offer this too? Thanks!

Question: Sliding Window Module for Transformer3dSeg Object

I was wondering whether you have implemented an example that uses the network in a 3D medical segmentation task. If this network only outputs the center slice of a patch, then we would need a wrapper function that iterates through all patches in an image to get the final prediction for the entire volume. From the original paper, I assume they choose 10 patches at random from an image during training, but it is not clear how they pieced everything together during testing.

Your thoughts on this would be greatly appreciated!

See:

self-attention-cv/self_attention_cv/Transformer3Dsegmentation/tranf3Dseg.py

Line 10 in 33ddf02

class Transformer3dSeg(nn.Module):

So, how do I use it? I want to test this method on the Cityscapes dataset.

Convolution-Free Medical Image Segmentation using Transformers

Thank you very much for your contribution. As a novice, I have a question. In tranf3Dseg, the model outputs the predicted segmentation of the center patch only, so how can I get the segmentation of the whole input image? I am looking forward to any reply.
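
One common way to cover a whole volume when a model only predicts the central part of each input block is a sliding window: enumerate block origins, run the model on each block, and write each central prediction into the output volume. The thread does not say how the original paper does this at test time, so the following is only a sketch of the coordinate bookkeeping, with hypothetical function names and sizes.

```python
from itertools import product


def window_starts(size: int, patch: int, stride: int) -> list:
    """Start offsets along one axis so that windows cover [0, size)."""
    if patch > size:
        raise ValueError("patch is larger than the volume along this axis")
    starts = list(range(0, size - patch + 1, stride))
    # Add a final window flush with the border if the stride left a gap.
    if starts[-1] != size - patch:
        starts.append(size - patch)
    return starts


def patch_origins(volume_shape, patch_shape, stride):
    """All (z, y, x) corners of the blocks that tile a 3D volume."""
    axes = [window_starts(s, p, stride) for s, p in zip(volume_shape, patch_shape)]
    return list(product(*axes))


origins = patch_origins(volume_shape=(10, 10, 10), patch_shape=(4, 4, 4), stride=3)
print(len(origins), origins[:3])
```

Choosing the stride equal to the size of the predicted central region makes neighbouring predictions tile without gaps. Voxels near the border would additionally need the volume to be padded, since they can never sit at the center of a block.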

Axial attention

What is the meaning of qkv_channels?

self-attention-cv/self_attention_cv/axial_attention_deeplab/axial_attention.py

Line 32 in 5246e55

self.qkv_channels = self.dim_head_v + self.dim_head_kq * 2
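
Reading that line on its own, qkv_channels looks like the total number of channels that a single projection produces per head, to be split afterwards into a query and a key of dim_head_kq channels each and a value of dim_head_v channels. This is an interpretation that the thread does not confirm. A small sketch with hypothetical sizes:

```python
import torch

# Hypothetical per-head sizes; the repo's defaults may differ.
dim_head_kq, dim_head_v = 8, 16
qkv_channels = dim_head_v + dim_head_kq * 2  # 16 + 2 * 8 channels per head

# Stand-in for one head's projected features: (batch, channels, length).
qkv = torch.randn(2, qkv_channels, 10)

# Split the channel axis into query, key and value chunks.
q, k, v = torch.split(qkv, [dim_head_kq, dim_head_kq, dim_head_v], dim=1)
print(q.shape, k.shape, v.shape)
```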

Use AxialAttention on GPU

I tried to use AxialAttention on a GPU, but I get an error. Can you give me some tips on using AxialAttention on a GPU?
Thanks!
Error:
RuntimeError: expected self and mask to be on the same device, but got mask on cpu and self on cuda:0

Do the encoder modules incorporate positional encoding?

If I use, say, the LinformerEncoder, do I have to add the positional encoding myself, or is that already done? From the source files it does not seem to be there, but I am not sure how to include the positional encodings, as they seem to need the query, which is not available when passing data directly to the LinformerEncoder. I may well be missing something, so any help would be great. Perhaps an example using positional encoding would be good.
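
If the encoder itself adds no positional information, one simple option is a learned absolute position embedding added to the tokens before they enter the encoder, which needs no access to the query. The wrapper below is a sketch under that assumption. `WithLearnedPositions` is a hypothetical name, and `nn.Identity` merely stands in for an encoder such as LinformerEncoder, whose constructor is not shown in this thread.

```python
import torch
import torch.nn as nn


class WithLearnedPositions(nn.Module):
    """Hypothetical wrapper: adds a learned position embedding, then encodes."""

    def __init__(self, encoder: nn.Module, tokens: int, dim: int):
        super().__init__()
        self.encoder = encoder
        self.pos_embedding = nn.Parameter(torch.zeros(1, tokens, dim))
        nn.init.normal_(self.pos_embedding, std=0.02)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, tokens, dim); the embedding broadcasts over the batch.
        return self.encoder(x + self.pos_embedding)


# nn.Identity stands in for a real encoder module.
model = WithLearnedPositions(nn.Identity(), tokens=64, dim=128)
out = model(torch.randn(4, 64, 128))
print(out.shape)
```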

Segmentation for full image

Hi,

Thank you for your effort and time in implementing this. I have a quick question: I want to get the segmentation for the full image, not just for the middle token. Would it be correct to change self.tokens to self.p here:

self-attention-cv/self_attention_cv/Transformer3Dsegmentation/tranf3Dseg.py

Line 66 in 5246e55

self.mlp_seg_head = nn.Linear(dim, self.tokens * self.num_classes)

and change this:

self-attention-cv/self_attention_cv/Transformer3Dsegmentation/tranf3Dseg.py

Line 94 in 5246e55

y = self.mlp_seg_head(y[:, self.mid_token, :])

to this:

y = self.mlp_seg_head(y)
