Comments (3)
It's caused by swin_transformer_v2.py#L95 `attn = keras.layers.Add()([attn, mask])`. It can be fixed by using `attn = attn + mask` instead:

```python
# Use `attn = attn + mask` instead of `attn = keras.layers.Add()([attn, mask])`
from tensorflow import keras
from keras_cv_attention_models import swin_transformer_v2

mm = swin_transformer_v2.SwinTransformerV2Small_ns()
mm.save('aa.h5')
bb = keras.models.load_model('aa.h5')
```
But then it will throw an error when running on TPU with bfloat16 during model saving... Not sure if any method fits both situations. Currently, you may reload using `load_weights`:

```python
from keras_cv_attention_models import swin_transformer_v2

mm = swin_transformer_v2.SwinTransformerV2Small_ns(input_shape=..., num_classes=...)
# Any other layers
mm.load_weights("{pretrained.h5}")
```
Maybe wrapping `make_window_attention_mask` as a layer can work for both situations. Will give it a try.
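A minimal sketch of what such a wrapper might look like. The `AddMask` name and the serialization details are assumptions for illustration, not the library's actual implementation; the idea is that a custom layer can both serialize with `model.save()` and cast the mask to the layer's compute dtype (e.g. bfloat16 on TPU):

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

class AddMask(keras.layers.Layer):
    """Adds a precomputed attention mask, cast to the incoming tensor's dtype.

    Hypothetical sketch: storing the mask in `get_config` lets the layer
    round-trip through `model.save()` / `load_model(..., custom_objects=...)`,
    while the explicit cast keeps it compatible with mixed-precision dtypes.
    """

    def __init__(self, mask, **kwargs):
        super().__init__(**kwargs)
        self.mask = np.asarray(mask, dtype="float32")

    def call(self, attn):
        # Cast at call time so bfloat16 / float16 compute dtypes still work.
        return attn + tf.cast(self.mask, attn.dtype)

    def get_config(self):
        config = super().get_config()
        config.update({"mask": self.mask.tolist()})
        return config
```

When loading a saved model, the layer would need to be passed via `custom_objects={"AddMask": AddMask}`.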
It should work now, for both situations. You may save and load your model again:

```python
from tensorflow import keras
from keras_cv_attention_models import swin_transformer_v2

mm = swin_transformer_v2.SwinTransformerV2Small_ns(input_shape=..., num_classes=...)
# Any other layers
mm.load_weights("{pretrained.h5}")
mm.save("aa.h5")
bb = keras.models.load_model('aa.h5')
```
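The save/load round-trip behind the `attn + mask` fix can be sanity-checked on a toy functional model (shapes and names here are illustrative, not taken from the library): adding a constant mask with plain `+` serializes through h5, and the reloaded model should reproduce the original predictions.

```python
import numpy as np
import tensorflow as tf
from tensorflow import keras

# Toy stand-in for the windowed attention logits plus a constant mask.
inputs = keras.Input((4, 4))
mask = tf.constant(np.random.uniform(size=(4, 4)).astype("float32"))
outputs = inputs + mask  # plain `+` instead of keras.layers.Add()
mm = keras.Model(inputs, outputs)

# Round-trip through h5 and compare predictions.
mm.save("aa.h5")
bb = keras.models.load_model("aa.h5")

xx = np.random.uniform(size=(1, 4, 4)).astype("float32")
print(np.allclose(mm(xx), bb(xx)))
```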
It works now, amazing job, thanks!