This was a fun Kaggle competition (https://www.kaggle.com/c/google-quest-challenge). Instead of trying to win, I wanted to see how far I could stretch the good old LSTM to compete against the bigger transformer models used by other competitors.
Verdict: You can't, unless you try some crazy ensemble of a ton of them.