K-Means is an algorithm that automatically clusters similar data points together. First, it assigns every training example (point) to its closest centroid.
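The assignment step can be sketched in a few lines of plain Python. This is a minimal illustration, not the program's actual code: the function name `find_closest_centroids` and the sample points are assumptions, and squared Euclidean distance is assumed as the closeness measure.

```python
def find_closest_centroids(points, centroids):
    """Return, for each point, the index of its nearest centroid."""
    assignments = []
    for p in points:
        # Squared Euclidean distance from this point to every centroid.
        dists = [sum((a - b) ** 2 for a, b in zip(p, c)) for c in centroids]
        # Index of the smallest distance (ties go to the lowest index).
        assignments.append(dists.index(min(dists)))
    return assignments


# Hypothetical sample data: two obvious groups of 2-D points.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
centroids = [(1.0, 1.5), (8.5, 9.0)]
print(find_closest_centroids(points, centroids))  # [0, 0, 1, 1]
```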
The second part of the algorithm recomputes each centroid as the mean of the points that were assigned to it.
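A matching sketch of the update step follows. The name `compute_centroids` and the sample data are illustrative assumptions; keeping an empty cluster's centroid where it was is also an assumption, since the text does not say how that case is handled.

```python
def compute_centroids(points, assignments, centroids):
    """Move each centroid to the mean of the points assigned to it."""
    k = len(centroids)
    dim = len(points[0])
    sums = [[0.0] * dim for _ in range(k)]
    counts = [0] * k
    for p, j in zip(points, assignments):
        counts[j] += 1
        for d in range(dim):
            sums[j][d] += p[d]
    new_centroids = []
    for j in range(k):
        if counts[j] == 0:
            # Assumption: a centroid with no points keeps its old position.
            new_centroids.append(list(centroids[j]))
        else:
            new_centroids.append([s / counts[j] for s in sums[j]])
    return new_centroids


# Hypothetical sample data with assignments already computed.
points = [(1.0, 1.0), (1.5, 2.0), (8.0, 8.0), (9.0, 9.5)]
assignments = [0, 0, 1, 1]
centroids = [(1.0, 1.5), (8.5, 9.0)]
print(compute_centroids(points, assignments, centroids))
# [[1.25, 1.5], [8.5, 8.75]]
```

Alternating the assignment step and this update step until the assignments stop changing gives the full K-Means loop.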
Initially, the image has 128 x 128 pixel locations with 24 bits at each location, which is 393,216 bits. After the algorithm groups similar colors together, the new image takes 65,920 bits. This compression reduces the size by a factor of about 6.
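The arithmetic behind these figures can be checked with a short script. The number of colors is not stated above; 16 is assumed here because it reproduces the 65,920-bit figure exactly (a 16-entry palette of 24-bit colors plus a 4-bit palette index per pixel).

```python
height, width = 128, 128
bits_per_pixel = 24
num_colors = 16  # assumption: not stated in the text, chosen to match 65,920

# Original image: every pixel stores a full 24-bit color.
original_bits = height * width * bits_per_pixel

# Compressed image: a palette of centroid colors plus one index per pixel.
bits_per_index = (num_colors - 1).bit_length()  # 4 bits address 16 colors
palette_bits = num_colors * bits_per_pixel
index_bits = height * width * bits_per_index
compressed_bits = palette_bits + index_bits

ratio = original_bits / compressed_bits
print(original_bits, compressed_bits)  # 393216 65920
print(f"reduction factor: {ratio:.2f}")
```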
Compute the covariance matrix of the data.
Then use the SVD function to compute its eigenvectors.
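These two steps might look like the following in NumPy. This is a sketch under stated assumptions: the function name `pca` and the synthetic data are illustrative, the data matrix is assumed to hold one example per row, and it is assumed to be mean-normalized before the covariance is computed.

```python
import numpy as np


def pca(X):
    """Return eigenvectors (columns of U) and eigenvalues S of the covariance of X.

    X has shape (m, n): m examples, n features, assumed mean-normalized.
    """
    m = X.shape[0]
    sigma = (X.T @ X) / m              # n x n covariance matrix
    U, S, _ = np.linalg.svd(sigma)     # singular values come back in descending order
    return U, S


# Synthetic stand-in data (hypothetical, not the face dataset).
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X = X - X.mean(axis=0)                 # mean normalization

U, S = pca(X)
print(U.shape, S.shape)  # (3, 3) (3,)
```

Because the covariance matrix is symmetric and positive semi-definite, the columns of `U` are its eigenvectors and `S` holds the matching eigenvalues, largest first.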
Each face image starts with 32 x 32 = 1024 pixels.
Each is then compressed into a vector in R^100, a reduction of more than 10 times.
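The projection from 1024 dimensions down to 100 can be sketched as below. The helper name `project_data` is an assumption, and random numbers stand in for the face images, so only the shapes are meaningful here.

```python
import numpy as np


def project_data(X, U, k):
    """Project each row of X onto the first k eigenvectors in U."""
    return X @ U[:, :k]


# Synthetic stand-in for 200 flattened 32 x 32 images (hypothetical data).
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 32 * 32))
X = X - X.mean(axis=0)                          # mean normalization

# Eigenvectors of the covariance matrix, as in the PCA steps above.
U, _, _ = np.linalg.svd((X.T @ X) / X.shape[0])

Z = project_data(X, U, 100)
print(X.shape, "->", Z.shape)  # (200, 1024) -> (200, 100)
```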
A learning algorithm can then be trained on the compressed data much more efficiently.
The images can then be approximately reconstructed from the compressed data.
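A sketch of the reconstruction: the compressed vectors are mapped back through the same eigenvectors, which recovers an approximation of the original data. The name `recover_data` and the small synthetic dataset are assumptions made for illustration.

```python
import numpy as np


def recover_data(Z, U, k):
    """Map k-dimensional projections back to the original feature space."""
    return Z @ U[:, :k].T


# Small synthetic dataset (hypothetical): 100 examples, 10 features.
rng = np.random.default_rng(0)
X = rng.normal(size=(100, 10))
X = X - X.mean(axis=0)
U, _, _ = np.linalg.svd((X.T @ X) / X.shape[0])

errors = {}
for k in (3, 5, 10):
    Z = X @ U[:, :k]                   # compress to k dimensions
    X_rec = recover_data(Z, U, k)      # approximate reconstruction
    errors[k] = float(np.mean((X - X_rec) ** 2))
    print(k, errors[k])
```

Keeping more components lowers the reconstruction error, and keeping all of them recovers the data up to floating-point rounding.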
Program based on Stanford's Machine Learning Course.