dwromero / g_selfatt
Code repository for the paper "Group Equivariant Stand-Alone Self Attention For Vision" published at ICLR 2021. https://openreview.net/forum?id=JkfYjnOEo6M
License: MIT License
hello~~
Hi, I would like to ask how much memory is required to run the model when the input image is 1024×1024. I ran the demo 0_local_p4_selfattention.ipynb with a picture of my own; the input tensor has shape torch.Size([1, 96, 256, 256]). LiftLocalSelfAttention runs fine, but when I run "out_2 = sa_2(out_1)" an error is reported:
RuntimeError: [enforce fail at CPUAllocator.cpp:68] . DefaultCPUAllocator: can't allocate memory
Is the model unable to handle large images, and is it possible to reduce patch_size?
thank you!
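For a rough sense of the scaling, here is a back-of-the-envelope estimate of the memory needed just for the attention logits of one local self-attention layer. It assumes a hypothetical logit shape of [batch, heads, |G|, height, width, patch_size, patch_size]; the head count (9) and group size (4, for p4) below are illustrative placeholders, not values taken from the repository:

```python
def local_selfatt_logit_bytes(batch, heads, height, width, patch_size,
                              num_group_elements=1, dtype_bytes=4):
    """Rough lower bound (in bytes) on the memory needed to hold the
    attention logits of a single local self-attention layer.

    Assumes logits of shape
    [batch, heads, |G|, height, width, patch_size, patch_size];
    this is a simplification, not the exact implementation.
    """
    n_logits = (batch * heads * num_group_elements
                * height * width * patch_size ** 2)
    return n_logits * dtype_bytes

# The 256x256 feature map from the notebook (hypothetical 9 heads, p4 group):
small = local_selfatt_logit_bytes(1, 9, 256, 256, patch_size=5,
                                  num_group_elements=4)
# Memory grows linearly with height * width, so a 1024x1024 map costs 16x more:
large = local_selfatt_logit_bytes(1, 9, 1024, 1024, patch_size=5,
                                  num_group_elements=4)
print(f"{small / 2**20:.0f} MiB -> {large / 2**30:.2f} GiB per layer")
```

Since the cost is quadratic in patch_size, shrinking the patch (or downsampling the input before the attention layers) reduces memory much faster than shrinking the batch.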
hello~
I would like to ask whether I can build a rotation-equivariant Swin Transformer architecture with your g_selfatt library?
thank you!!
Hi,
I aim to replicate the reported results on the CIFAR-10 dataset.
However, training on CIFAR-10 always diverges, even when I disable automatic mixed precision while using the cosine learning-rate scheduler.
Could you please share your configuration files so that I can replicate the reported results?
Based on the paper, here is my configuration file for training on CIFAR-10:
import ml_collections


def get_config():
    default_config = dict(
        # --------------------------
        # General parameters
        dataset="CIFAR10",
        # The dataset to be used, e.g., MNIST.
        model="z2sa",
        # The model to be used, e.g., p4sa.
        optimizer="SGD",
        # The optimizer to be used, e.g., Adam.
        optimizer_momentum=0.9,
        # If optimizer == SGD, this specifies the momentum of the SGD.
        device="cuda",
        # The device on which the model will be deployed, e.g., cuda.
        scheduler="linear_warmup_cosine",
        # The lr scheduler to be used, e.g., multistep, constant.
        sched_decay_steps=(400,),
        # If scheduler == multistep, this specifies the steps at which
        # the scheduler should be decreased.
        sched_decay_factor=1.0,
        # The factor by which the lr will be reduced, e.g., 5, 10.
        lr=0.01,
        # The lr to be used, e.g., 0.001.
        norm_type="LayerNorm",
        # The normalization type to be used in the network, e.g., LayerNorm.
        attention_type="Local",
        # The type of self-attention to be used in the network, e.g., Local, Global.
        activation_function="Swish",
        # The activation function used in the network, e.g., ReLU, Swish.
        patch_size=5,
        # If attention_type == Local, the extent of the receptive field on which
        # self-attention is calculated.
        dropout_att=0.1,
        # Layer-wise dropout factor applied on the computed attention coefficients, e.g., 0.1.
        dropout_values=0.1,
        # Layer-wise dropout factor applied on the value coefficients from self-att layers, e.g., 0.1.
        whitening_scale=1.0,
        # Factor by which the current variance initialization is weighted.
        weight_decay=0.0001,
        # L2 penalty on the magnitude of the weights in the network, e.g., 1e-4.
        batch_size=24,
        # The batch size to be used, e.g., 64.
        epochs=350,
        # The number of epochs of training, e.g., 200.
        seed=0,
        # The seed of the run, e.g., 0.
        comment="",
        # An additional comment added to the config.path parameter specifying where
        # the network parameters will be saved / loaded from.
        pretrained=False,
        # Specifies whether a pretrained model should be loaded.
        train=True,
        # Specifies whether training should be performed.
        augment=False,  # No augmentation used in our experiments.
        path="",
        # Derived automatically from the other parameters of the run; specifies
        # the path where the network parameters will be saved / loaded from.
    )
    default_config = ml_collections.ConfigDict(default_config)
    return default_config
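For context, the linear-warmup + cosine-decay learning-rate schedule I have in mind for scheduler="linear_warmup_cosine" can be sketched as follows; the warmup length and the exact decay form are my own assumptions, not necessarily what the repository implements:

```python
import math

def linear_warmup_cosine_lr(step, total_steps, warmup_steps, base_lr):
    """Learning rate at `step` under linear warmup followed by cosine decay.

    A sketch only: the repository's schedule may differ in warmup length,
    final lr floor, or whether it steps per-batch or per-epoch.
    """
    if step < warmup_steps:
        # Linearly ramp the lr from 0 up to base_lr over the warmup phase.
        return base_lr * step / warmup_steps
    # Cosine-decay the lr from base_lr down to 0 over the remaining steps.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return base_lr * 0.5 * (1.0 + math.cos(math.pi * progress))

# Example: lr=0.01 as in the config above, 350 "steps" with 10 warmup steps.
for step in (0, 10, 180, 350):
    print(step, linear_warmup_cosine_lr(step, 350, 10, 0.01))
```

If divergence happens in the first epochs, a longer warmup (or a lower base lr) is the usual first thing to try, since the cosine part only lowers the lr later in training.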
Aside from that, could you please answer a couple of questions?
Best,
Byungjin Kim
Hi,
I tried to evaluate the p4sa model in order to compare it with other group-equivariant convolution-based models such as G-CNN and E(2)-CNN.
However, the results were worse than those of these CNN-based models.
Have you ever evaluated the p4sa model on the CIFAR-10 dataset?
If so, could you please tell me what accuracy it reached?
Best,
Byungjin Kim
Hi
I'd like to reproduce the results from the experiments section.
With an NVIDIA TITAN Xp, training takes about twice as long as reported in Table 1.
Could you please tell me which GPU you used?