Understanding of neural nets and classical ML: Transformer architecture, CNNs, LSTMs, U-Nets, as well as SVMs, XGBoost (parameter selection, EDA), etc.
Understanding of maths: statistics, probability, linear algebra, a little bit of calculus (a little bit of everything)
Implementation of papers that I can train on my laptop: smaller versions of large models / fine-tuning
Python all the way down
Implementation of Data Pipelines for both image and text
Developer stuff: Docker, AWS and API building etc.
Writing down stuff (maybe): Research findings as well as technical documentation
Stuff I want to be good at:
Implementation of complex models from scratch, and not just toy versions
Training a large model on a multi-GPU cluster, which involves not just the training but the entire jazz
A valuable contribution to an open-source project
Being a data scientist who can reason about model performance, and not just build by trial and error
Writing clean and optimized code
Being less pretentious, and writing more code than words :)