


# Final_Project

## English Alphabet Machine Learning Model

This project uses the EMNIST Letters image dataset, an extension of the MNIST database of handwritten digits. EMNIST datasets can be found at: https://www.nist.gov/node/1298471/emnist-dataset. MNIST datasets can be found at: http://yann.lecun.com/exdb/mnist/. The data was imported by pip-installing the `emnist` (and `mnist`) libraries.

Note: the "data/python-mnist" folder is from sorki/python-mnist on GitHub. That repository can also be used for EMNIST testing, but this project did not use the sorki folder.
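As a rough illustration of the import step, the sketch below loads the Letters split through the `emnist` package and scales the pixels. It is a minimal sketch, not the notebook's code: the helper names `scale_pixels` and `load_emnist_letters` are hypothetical, and scaling to the range 0 to 1 is an assumption about the preprocessing rather than something the README states.

```python
import numpy as np


def scale_pixels(images: np.ndarray) -> np.ndarray:
    """Convert uint8 pixel values (0-255) to float32 values in [0, 1].

    Assumption: the README does not say how the notebook scales its inputs.
    """
    return images.astype("float32") / 255.0


def load_emnist_letters():
    """Return ((X_train, y_train), (X_test, y_test)) for EMNIST Letters.

    The `emnist` package downloads and caches the data on first use,
    so this call needs network access the first time it runs.
    """
    # Imported lazily so scale_pixels works without the package installed.
    from emnist import extract_training_samples, extract_test_samples

    X_train, y_train = extract_training_samples("letters")
    X_test, y_test = extract_test_samples("letters")
    return (scale_pixels(X_train), y_train), (scale_pixels(X_test), y_test)
```

Calling `load_emnist_letters()` once should be enough to populate the package's local cache; later calls read from disk.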

#### Notes on EMNIST 'Letters' Dataset

"The EMNIST Balanced dataset contains a set of characters with an equal number of samples per class. The EMNIST Letters dataset merges a balanced set of the uppercase and lowercase letters into a single 26-class task. The EMNIST Digits and EMNIST MNIST dataset provide balanced handwritten digit datasets directly compatible with the original MNIST dataset. The EMNIST Letters dataset seeks to further reduce the errors occurring from case confusion by merging all the uppercase and lowercase classes to form a balanced 26-class classification task. In a similar vein, the EMNIST Digits class contains a balanced subset of the digits dataset containing 28,000 samples of each digit."

## About Model

This model uses the EMNIST 'Letters' dataset. Refer to 'EMNIST.ipynb' for a detailed look at the model. The model reached about 95% accuracy on the training set and about 91% accuracy on the test set.

```
model.fit(X_train, y_train, batch_size=128, epochs=10, shuffle=True, verbose=2)
```

```
Epoch 10/10
1248000/1248000 - 19s - loss: 0.1345 - acc: 0.9512
```

```
model_loss, model_accuracy = model.evaluate(X_test, y_test, verbose=2)
print(f"Loss: {model_loss}, Accuracy: {model_accuracy}")
```

```
20800/20800 - 2s - loss: 0.4131 - acc: 0.9130
Loss: 0.41310153188356846, Accuracy: 0.9129807949066162
```
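To put the reported figures in concrete terms, the short calculation below converts the test accuracy into a count of correctly classified images and measures the gap between training and test accuracy. It uses only the numbers printed in the log above. The variable names are illustrative, and the training accuracy is the rounded value from the last epoch.

```python
# Figures copied from the training and evaluation logs above.
test_samples = 20800
test_accuracy = 0.9129807949066162
train_accuracy = 0.9512  # rounded value reported for epoch 10

# Number of test images classified correctly and incorrectly.
correct = round(test_accuracy * test_samples)
misclassified = test_samples - correct

# Difference between training and test accuracy.
gap = train_accuracy - test_accuracy

print(correct, misclassified, round(gap, 4))  # 18990 1810 0.0382
```

A gap of roughly four percentage points suggests some overfitting to the training data, although the README does not investigate this further.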

## Predicting the Model

A: 1, B: 2, C: 3, D: 4, E: 5, F: 6, G: 7, H: 8, I: 9, J: 10, K: 11, L: 12, M: 13, N: 14, O: 15, P: 16, Q: 17, R: 18, S: 19, T: 20, U: 21, V: 22, W: 23, X: 24, Y: 25, Z: 26
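
The mapping above can be written as a small lookup helper. This is a minimal sketch: the function names `label_to_letter` and `letter_to_label` are hypothetical and do not come from EMNIST.ipynb, and it assumes the predicted class is reported as the 1-26 label listed above.

```python
import string


def label_to_letter(label: int) -> str:
    """Map an EMNIST Letters label (1-26) to its uppercase letter."""
    if not 1 <= label <= 26:
        raise ValueError(f"label must be between 1 and 26, got {label}")
    return string.ascii_uppercase[label - 1]


def letter_to_label(letter: str) -> int:
    """Map a letter of either case to its EMNIST Letters label (1-26)."""
    # The Letters dataset merges uppercase and lowercase into one class.
    return string.ascii_uppercase.index(letter.upper()) + 1
```

Because the Letters dataset merges both cases into one class, label 1 stands for either "A" or "a".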

The model was first checked against a couple of letters from the test set. Both were predicted correctly.

The model was also tested on some images collected from the internet (https://graphemica.com), which are stored in the 'images' folder. The first image was predicted incorrectly, but the same letter in a different font was predicted correctly. The second image appears to be more similar to the EMNIST data.

Finally, the model was tested on newly handwritten data (written and imported by me) to compare the EMNIST-trained model against actual handwriting. The results were not as expected. Each capital letter of the alphabet was photographed and uploaded in four sets: Pencil, Pen, Sharpie, and Marker. The goal was to determine which writing utensil the EMNIST model would recognize most accurately. However, the imported pictures are all sideways, and rotating them made the images much lighter and illegible, so every prediction tested failed (only one image per utensil was predicted in EMNIST.ipynb).

A separate model trained on the 26 uploaded images for each utensil was also created, but it reached an accuracy of 0. More photos would need to be taken, and better pixel conversion is needed to reduce the image arrays from 4D to 2D.
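
One possible way to approach the rotation and "4D to 2D" issues described above is sketched below with NumPy. It assumes that "4D" refers to the four RGBA channels of each photo, that the photos show dark ink on light paper, and that a quarter-turn rotation is enough to set them upright. `photo_to_gray` is a hypothetical helper, not code from the notebook. Resizing to 28x28 would still be needed before prediction and is left out here.

```python
import numpy as np


def photo_to_gray(pixels: np.ndarray, quarter_turns: int = 0) -> np.ndarray:
    """Collapse an (H, W, 3 or 4) uint8 photo array to a 2D float array in [0, 1].

    quarter_turns is the number of counterclockwise 90-degree rotations.
    """
    # Drop the alpha channel if present and keep only R, G, B.
    rgb = pixels[..., :3].astype("float32")
    # Weighted sum of the colour channels (ITU-R BT.601 luma weights).
    gray = rgb @ np.array([0.299, 0.587, 0.114], dtype="float32")
    # rot90 only rearranges pixels, so nothing is resampled or lightened.
    gray = np.rot90(gray, k=quarter_turns)
    # EMNIST images have light strokes on a dark background, so invert
    # (assumption: the photo is dark ink on light paper).
    return 1.0 - gray / 255.0
```

Because `np.rot90` handles only multiples of 90 degrees, this sketch would not help with photos that are tilted by an arbitrary angle.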

## Conclusions

The EMNIST model itself performs well, reaching about 91% accuracy on the test set. The model run on the API images needs more testing, although it did predict correct results. The handwritten data needs to be redone to get better predictions. With the self-made model, this is a case of bad data going in and bad results coming out. Better-quality pictures need to be taken and uploaded so that they do not have to be rotated with the 'pillow' library.
