dwromero / g_selfatt
Code repository for the paper "Group Equivariant Stand-Alone Self Attention For Vision" published at ICLR 2021. https://openreview.net/forum?id=JkfYjnOEo6M
License: MIT License
hello~~
Hi, I would like to ask how much memory is required to run the model when the input image is 1024×1024. I ran the demo 0_local_p4_selfattention.ipynb with a picture of my own; the input tensor has shape torch.Size([1, 96, 256, 256]). LiftLocalSelfAttention runs fine, but when I run "out_2 = sa_2(out_1)" an error is reported:
RuntimeError: [enforce fail at CPUAllocator.cpp:68] . DefaultCPUAllocator: can't allocate memory
Is the model unable to handle large images, and is it possible to reduce patch_size?
thank you!
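For a rough sense of the scaling, here is a back-of-the-envelope estimate of the memory needed just for the attention logits of one local self-attention layer. It assumes a hypothetical logit shape of [batch, heads, |G|, height, width, patch_size, patch_size]; the head count (9) and group size (4, for p4) below are illustrative placeholders, not values taken from the repository:

```python
def local_selfatt_logit_bytes(batch, heads, height, width, patch_size,
                              num_group_elements=1, dtype_bytes=4):
    """Rough lower bound (in bytes) on the memory needed to hold the
    attention logits of a single local self-attention layer.

    Assumes logits of shape
    [batch, heads, |G|, height, width, patch_size, patch_size];
    this is a simplification, not the exact implementation.
    """
    n_logits = (batch * heads * num_group_elements
                * height * width * patch_size ** 2)
    return n_logits * dtype_bytes

# The 256x256 feature map from the notebook (hypothetical 9 heads, p4 group):
small = local_selfatt_logit_bytes(1, 9, 256, 256, patch_size=5,
                                  num_group_elements=4)
# Memory grows linearly with height * width, so a 1024x1024 map costs 16x more:
large = local_selfatt_logit_bytes(1, 9, 1024, 1024, patch_size=5,
                                  num_group_elements=4)
print(f"{small / 2**20:.0f} MiB -> {large / 2**30:.2f} GiB per layer")
```

Since the cost is quadratic in patch_size, shrinking the patch (or downsampling the input before the attention layers) reduces memory much faster than shrinking the batch.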
hello~
I would like to ask whether I can build a rotation-equivariant Swin Transformer architecture with your g_selfatt library?
thank you!!
Hi,
I aim to replicate the reported results on the CIFAR-10 dataset.
However, training on CIFAR-10 always diverges, even when I disable automatic mixed precision while using the cosine learning-rate scheduler.
Could you please share your configuration files so that I can replicate the reported results?
Based on the paper, here is my configuration file for training on CIFAR-10:
import ml_collections


def get_config():
    default_config = dict(
        # --------------------------
        # General parameters
        dataset="CIFAR10",
        # The dataset to be used, e.g., MNIST.
        model="z2sa",
        # The model to be used, e.g., p4sa.
        optimizer="SGD",
        # The optimizer to be used, e.g., Adam.
        optimizer_momentum=0.9,
        # If optimizer == SGD, this specifies the momentum of the SGD.
        device="cuda",
        # The device on which the model will be deployed, e.g., cuda.
        scheduler="linear_warmup_cosine",
        # The lr scheduler to be used, e.g., multistep, constant.
        sched_decay_steps=(400,),
        # If scheduler == multistep, this specifies the steps at which
        # the scheduler should be decreased.
        sched_decay_factor=1.0,
        # The factor by which the lr will be reduced, e.g., 5, 10.
        lr=0.01,
        # The lr to be used, e.g., 0.001.
        norm_type="LayerNorm",
        # The normalization type to be used in the network, e.g., LayerNorm.
        attention_type="Local",
        # The type of self-attention to be used in the network, e.g., Local, Global.
        activation_function="Swish",
        # The activation function used in the network, e.g., ReLU, Swish.
        patch_size=5,
        # If attention_type == Local, the extent of the receptive field on which
        # self-attention is calculated.
        dropout_att=0.1,
        # Layer-wise dropout factor applied on the computed attention coefficients, e.g., 0.1.
        dropout_values=0.1,
        # Layer-wise dropout factor applied on the value coefficients from self-att layers, e.g., 0.1.
        whitening_scale=1.0,
        # Factor by which the current variance initialization is weighted.
        weight_decay=0.0001,
        # L2 penalty on the magnitude of the weights in the network, e.g., 1e-4.
        batch_size=24,
        # The batch size to be used, e.g., 64.
        epochs=350,
        # The number of epochs of training, e.g., 200.
        seed=0,
        # The seed of the run, e.g., 0.
        comment="",
        # An additional comment added to the config.path parameter specifying where
        # the network parameters will be saved / loaded from.
        pretrained=False,
        # Specifies whether a pretrained model should be loaded.
        train=True,
        # Specifies whether training should be performed.
        augment=False,  # No augmentation used in our experiments.
        path="",
        # Derived automatically from the other parameters of the run; specifies
        # the path where the network parameters will be saved / loaded from.
    )
    default_config = ml_collections.ConfigDict(default_config)
    return default_config
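For context, the linear-warmup + cosine-decay learning-rate schedule I have in mind for scheduler="linear_warmup_cosine" can be sketched as follows; the warmup length and the exact decay form are my own assumptions, not necessarily what the repository implements:

```python
import math

def linear_warmup_cosine_lr(step, total_steps, warmup_steps, base_lr):
    """Learning rate at `step` under linear warmup followed by cosine decay.

    A sketch only: the repository's schedule may differ in warmup length,
    final lr floor, or whether it steps per-batch or per-epoch.
    """
    if step < warmup_steps:
        # Linearly ramp the lr from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Cosine-decay the lr from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Example: lr=0.01 as in the config above, 350 "steps" with 10 warmup steps.
for step in (0, 10, 180, 350):
    print(step, linear_warmup_cosine_lr(step, 350, 10, 0.01))
```

If divergence happens in the first epochs, a longer warmup (or a lower base lr) is the usual first thing to try, since the cosine part only lowers the lr later in training.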
Aside from that, could you please answer a couple of questions?
Best,
Byungjin Kim
Hi,
I tried to evaluate the p4sa model in order to compare it with other group-equivariant convolution-based models such as G-CNN and E(2)-CNN.
However, the results were worse than those of these CNN-based models.
Have you ever evaluated the p4sa model on the CIFAR-10 dataset?
If so, could you please tell me what accuracy it reached?
Best,
Byungjin Kim
Hi
I'd like to reproduce the results from the experiments section.
With an NVIDIA TITAN Xp, training takes about twice as long as reported in Table 1.
Could you please tell me which GPU you used?