Understanding Neural Nets through Examining the Effects of Noisy and Incorrect Labeling
ABSTRACT: In this paper we propose an approach to increase the robustness of multi-class artificial neural networks (ANNs) to noise. Our approach is to train the network as a binary classifier, which lowers the effect that noisy data has on the weights and biases of the ANN. The Multiplication Algorithm trains on whether a randomly generated guess matches the image label, but is tested in the same way as a multi-class classification algorithm. The Multiplication Algorithm is compared against the standard PyTorch ANN, both of which are trained on the MNIST dataset. We conclude that the Multiplication Algorithm is more noise-resistant than the standard algorithm, but requires more training data.
SUMMARY: For an ANN to produce accurate results, it must meet two conditions: it must be trained on a large amount of data, and that data must have a low error bound. In a research setting specifically, it is often easier to generate more experimental data than to reduce noise. This paper shows that training an artificial neural network on binary labels increases the algorithm's robustness to noisy data at the expense of requiring more data. Specifically, two novel ANN algorithms, the Concatenation and Multiplication Algorithms, are compared to the standard PyTorch multi-class classification algorithm. The Concatenation and Multiplication Algorithms are trained on binary labels indicating whether a randomly generated guess is the correct label. All three ANNs were tested on the problem of classifying handwritten digits because of the accessibility of the MNIST dataset, but the approach can be generalized to any multi-class classification problem.
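To make the binary relabeling concrete, the following is a minimal sketch of the data transformation the summary describes: each multi-class example is paired with a random class guess, and the new target records whether that guess equals the true label. The function name `to_binary_examples` and the toy dataset are illustrative assumptions, not the paper's actual code; in the paper's algorithms, the guess would be concatenated with (or multiplied into) the input features before being fed to the network.

```python
import random

def to_binary_examples(dataset, num_classes=10, seed=0):
    """Relabel a multi-class dataset as binary examples.

    Each (features, label) pair becomes (features, guess, is_correct):
    a random class guess is drawn, and the binary target says whether
    that guess equals the true label. Combining the guess with the
    features (by concatenation or multiplication) is left to the model.
    """
    rng = random.Random(seed)
    binary = []
    for features, label in dataset:
        guess = rng.randrange(num_classes)
        binary.append((features, guess, int(guess == label)))
    return binary

# Toy stand-in for MNIST: (features, digit label) pairs.
toy = [([0.1, 0.9], 3), ([0.7, 0.2], 7), ([0.5, 0.5], 1)]
relabeled = to_binary_examples(toy)
```

At test time, consistent with the summary's note that the models are evaluated as multi-class classifiers, one would presumably query the binary network once per candidate class as the "guess" and predict the class with the highest correctness score.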