Machine Learning & Big Data 2022 Fall Homework 1: Mini-batch SGD
https://github.com/keyork/mlbd2022fall-minibatch-sgd
pip install numpy pandas matplotlib colorlog
python train.py -h
python train.py --args ...
- Use mini-batch gradient descent on the example from slides 31-33.
- Test the performance with different batch sizes.
Four main parts: Dataloader, Linear Model, SGD, Back Line Search
Dataloader: implemented with a Python iterator; it randomly rearranges all the data, then loads `batch_size` samples at a time.
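A minimal sketch of such a loader, assuming NumPy arrays for the features and targets (the name `batch_iterator` is illustrative, not the repo's actual API):

```python
import numpy as np

def batch_iterator(X, y, batch_size, rng=None):
    """Shuffle all samples once, then yield them batch_size at a time."""
    if rng is None:
        rng = np.random.default_rng()
    order = rng.permutation(len(X))  # randomly rearrange all data
    for start in range(0, len(X), batch_size):
        idx = order[start:start + batch_size]
        yield X[idx], y[idx]
```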
Linear Model: computed with NumPy array operations directly instead of a Python loop; the parameters are also a `np.array`: $\hat{y} = X\beta$, with $\beta = [\beta_0, \beta_1, \beta_2, \beta_3]$.
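A sketch of the vectorized model, assuming `X` carries a leading column of ones for the intercept and that the loss is mean squared error (function names here are illustrative):

```python
import numpy as np

def predict(X, beta):
    """X @ beta computes beta0 + beta1*x1 + beta2*x2 + beta3*x3 for every
    row at once, with no explicit Python loop over samples."""
    return X @ beta

def mse_loss(X, y, beta):
    """Mean squared error of the linear model on a (mini-)batch."""
    residual = predict(X, beta) - y
    return float(np.mean(residual ** 2))
```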
SGD: searches in the direction in which the loss gets smaller to obtain the updated parameters: $\beta \leftarrow \beta - \alpha\,\nabla_\beta L(\beta)$, with the gradient estimated on the current mini-batch.
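Assuming the MSE loss sketched above, one mini-batch update step could look like this (a sketch, not the repo's exact code):

```python
import numpy as np

def mse_gradient(X, y, beta):
    """Gradient of mean((X @ beta - y)**2) with respect to beta."""
    return 2.0 * X.T @ (X @ beta - y) / len(y)

def sgd_step(X_batch, y_batch, beta, lr):
    """Move beta a step of size lr against the mini-batch gradient."""
    return beta - lr * mse_gradient(X_batch, y_batch, beta)
```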
Back Line Search: sets an initial learning rate, then repeatedly shrinks it until the step gives a sufficient decrease in the loss, so the update does not overshoot.
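One standard formulation of this is backtracking with the Armijo condition; the constants `alpha0`, `rho`, and `c` below are illustrative defaults, not values taken from the repo:

```python
def backtracking_lr(X, y, beta, grad, loss_fn, alpha0=1.0, rho=0.5, c=1e-4):
    """Shrink the step size until the loss drops by at least
    c * alpha * ||grad||^2 relative to the current loss (Armijo condition)."""
    alpha = alpha0
    base = loss_fn(X, y, beta)
    slope = c * float(grad @ grad)
    while loss_fn(X, y, beta - alpha * grad) > base - alpha * slope:
        alpha *= rho
        if alpha < 1e-12:  # give up on pathological batches
            break
    return alpha
```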
Set iterations = {1, 3, 20, 50}, use back line search to choose the learning rate, set batch_size = {1, 10, 50, 100, 500, 1000, 4000}, and record the result and loss curve for each combination.
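Tying together the helpers sketched above, a training loop for one such configuration might look as follows ("iteration" is interpreted here as a full pass over the shuffled data, which is an assumption):

```python
import numpy as np

def train(X, y, n_iters, batch_size, use_bls=True, lr=0.01, seed=0):
    """Mini-batch SGD with an optional backtracking line search,
    recording the full-data loss after every update for the loss curve."""
    rng = np.random.default_rng(seed)
    beta = np.zeros(X.shape[1])
    curve = []
    for _ in range(n_iters):
        for Xb, yb in batch_iterator(X, y, batch_size, rng):
            grad = mse_gradient(Xb, yb, beta)
            step = backtracking_lr(Xb, yb, beta, grad, mse_loss) if use_bls else lr
            beta = beta - step * grad
            curve.append(mse_loss(X, y, beta))
    return beta, curve
```

With this shape, `use_bls=False` reproduces the no-line-search runs, and `batch_size=len(X)` removes mini-batching entirely.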
Repeat the experiments without back line search. As further ablations, remove bls entirely and remove mini-batching (i.e., full-batch gradient descent).
With everything else fixed, the larger the batch size, the slower the model converges. Back Line Search keeps the learning rate appropriate, which avoids divergence and lets the model converge quickly.
We take the result at {iter=50, batch_size=50, back line search=True} as a good outcome: $\beta=[87.31551772, 8.87405893, 0.4220265, -1.78599689]$