
Michael Znidarsic's Projects

bert

TensorFlow code and pre-trained models for BERT

bert-fake-news

Predicts the reliability of news text with over 91% validation accuracy. Uses Google BERT encodings as input to a deep bidirectional LSTM neural network. The dataset consists of reasonably long articles, balanced for political leaning and spanning a broad spectrum of reliability to reflect the real-world news landscape. Initial research for this model is available at https://github.com/michaelznidarsic/FakeNewsDetection

cs224d

Code for Stanford CS224D: deep learning for natural language understanding

fakenewsdetection

Novel approaches to detecting intentionally fake and willfully misleading news articles. The end result of this study is an ensemble learning binary classifier of news (fake vs. real, or more accurately: unreliable vs. reliable). Attributes fed into the submodels include normalized word frequencies (e.g. TF-IDF), lexical cues, and distributions of word sentiment severity. The formatting of the PowerPoint may have been somewhat distorted in a conversion process. The key source for most of the compiled dataset was several27's excellent FakeNewsCorpus at https://github.com/several27/FakeNewsCorpus
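As a rough illustration of the normalized word-frequency attributes mentioned above, here is a minimal TF-IDF sketch in plain Python. The whitespace tokenizer, the smoothed idf formula, the `tf_idf` function name, and the sample sentences are all assumptions made for this example; the study's actual preprocessing and weighting are not described here.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per document.

    Assumptions: lowercase whitespace tokenization and the smoothed
    idf = ln((1 + N) / (1 + df)) + 1 variant.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    # Document frequency: number of documents containing each term.
    df = Counter(term for tokens in tokenized for term in set(tokens))
    weights = []
    for tokens in tokenized:
        counts = Counter(tokens)
        total = len(tokens) or 1  # avoid dividing by zero on empty docs
        weights.append({
            term: (count / total) * (math.log((1 + n_docs) / (1 + df[term])) + 1)
            for term, count in counts.items()
        })
    return weights


# Made-up sentences, not drawn from any real corpus.
docs = [
    "shocking secret they do not want you to know",
    "officials confirm the budget report",
    "the report was shocking",
]
vectors = tf_idf(docs)
```

Terms that appear in fewer documents receive a higher weight than equally frequent terms that appear in many, which is what makes the feature useful as classifier input.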

pixel-importance-image-classification

An exploration of the predictive importance of individual pixels in a deep convolutional neural network using SHAP values. Neural Network architecture inspired by VGG16. Image classification on the Intel Scene Classification dataset available at https://www.kaggle.com/nitishabharathi/scene-classification.

purchaseprediction-customersegmentation

A series of projects all attempting to link customer traits/actions to target behavior. Unsupervised methods including KMeans clustering and Principal Component Analysis are used for Customer Segmentation. Machine Learning models such as XGBoost, RandomForest, SVMs, and Deep Neural Networks are used to predict customer behavior. Datasets are generally from banks or markets.
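To make the segmentation step more concrete, the following is a small sketch of KMeans clustering (Lloyd's algorithm) written in plain Python. It covers only the clustering part, not PCA or the supervised models. The `kmeans` function, the two-feature customer tuples, and the choice of `k=2` are hypothetical and chosen for illustration; the projects themselves presumably rely on a library implementation.

```python
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain Lloyd's algorithm on a list of equal-length tuples."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        assignments = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = tuple(sum(dim) / len(members) for dim in zip(*members))
    return centroids, assignments


# Hypothetical (spend, visits) pairs forming two obvious groups.
customers = [(1.0, 1.2), (0.8, 1.0), (1.1, 0.9),
             (8.0, 8.5), (8.2, 7.9), (7.8, 8.1)]
centroids, assignments = kmeans(customers, k=2)
```

Each customer ends up with a cluster index in `assignments`, which can then serve as a segment label or as an extra feature for a downstream model.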

speech-recognition-convolutional-nn

Experiment in speech recognition on Google's Speech Commands dataset using TensorFlow/Keras. Achieves 88-89% validation accuracy classifying spoken digits (zero through nine) using an MFCC transformation and a deep CNN. Work in progress; a couple of preprocessing functions are borrowed and credited as such in the code.

split-gene-classification

A neural network that takes as input a sequence of 60 nitrogenous bases (DNA) and predicts whether the sequence contains an intron/exon boundary (IE), an exon/intron boundary (EI), or neither (N). A maximum validation accuracy of 96.24% was reached. Data obtained at https://archive.ics.uci.edu/ml/datasets/Molecular+Biology+%28Splice-junction+Gene+Sequences%29
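One common way to feed a 60-base sequence to a neural network is one-hot encoding, sketched below. This is an assumption about the input representation rather than a description of the repository's code; the `one_hot` helper, the base ordering, and the handling of non-ACGT characters are illustrative choices.

```python
BASES = "ACGT"  # assumed column order for the encoding
LABELS = ["EI", "IE", "N"]  # the three classes named above


def one_hot(sequence):
    """Encode a DNA string as a flat list with 4 values per base.

    Characters outside A/C/G/T (for example ambiguity codes) become
    all-zero vectors; this is an illustrative choice.
    """
    encoded = []
    for base in sequence.upper():
        encoded.extend(1.0 if base == b else 0.0 for b in BASES)
    return encoded


window = "ACGT" * 15  # a made-up 60-base window, not real data
features = one_hot(window)
```

A 60-base window becomes a 240-value input vector, and the network's output layer would then have one unit per entry in `LABELS`.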

syracuseuniversityadmissionsprediction

This study creates and compares the efficacy of several machine learning models for the prediction of whether or not an undergraduate student offered admission at Syracuse University will accept admission. The dataset is proprietary and cannot be shared.

textualentailmentbilstmattention

A Bi-Directional LSTM with Neural Attention and word embeddings. Tackles the difficult problem of Textual Entailment using the Stanford Natural Language Inference (SNLI) corpus. Demonstrates that a 3-class validation accuracy of 76%+ can be obtained on the corpus without resorting to pre-training or recursion/trees. Concept pioneered in "Reasoning about Entailment with Neural Attention" by Rocktäschel et al. Inspiration taken from https://github.com/shyamupa/snli-entailment. Please find data corpus at https://nlp.stanford.edu/projects/snli/
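The core idea of attention over premise states can be sketched in a few lines of plain Python. Note the simplification: the scoring here is a plain dot product with no learned parameters, which is a stand-in chosen for brevity and not the formulation used in the cited paper or in this repository. The `attend` function and the toy vectors are hypothetical.

```python
import math


def softmax(scores):
    peak = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(premise_states, query):
    """Return (weights, context) for simplified dot-product attention."""
    # Score each premise state against the hypothesis summary vector.
    scores = [sum(h * q for h, q in zip(state, query)) for state in premise_states]
    weights = softmax(scores)
    # Context vector: attention-weighted sum of the premise states.
    context = [
        sum(w * state[i] for w, state in zip(weights, premise_states))
        for i in range(len(query))
    ]
    return weights, context


premise_states = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]  # toy LSTM outputs
hypothesis_state = [2.0, 0.0]  # toy final hypothesis state
weights, context = attend(premise_states, hypothesis_state)
```

The premise position most similar to the hypothesis vector receives the largest weight, so the context vector leans toward that position's representation.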

textualentailmentdualembeddedcnn

A 2-input Convolutional Neural Network with word embeddings. Tackles the difficult problem of Textual Entailment using the Stanford Natural Language Inference (SNLI) corpus. Demonstrates that a 3-class validation accuracy of 73%+ can be obtained on the corpus without resorting to pre-training, recursion/trees, attention, or LSTM/RNNs. Please find data corpus at https://nlp.stanford.edu/projects/snli/
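A two-input convolutional encoder can be sketched as follows: each sentence (premise and hypothesis) is passed through 1D filters with ReLU and max-over-time pooling, and the two feature vectors are concatenated for a classifier. Sharing the same filters across both inputs, the function names, and the toy embeddings are assumptions for this sketch; the repository's actual architecture may differ.

```python
def conv_maxpool(embedded, kernel):
    """Slide one filter over a sentence and keep its strongest response.

    embedded: list of word vectors; kernel: list of `width` weight vectors.
    """
    width = len(kernel)
    responses = []
    for start in range(len(embedded) - width + 1):
        window = embedded[start:start + width]
        total = sum(
            w * x
            for k_vec, x_vec in zip(kernel, window)
            for w, x in zip(k_vec, x_vec)
        )
        responses.append(max(0.0, total))  # ReLU
    # Sentences shorter than the filter width yield no responses.
    return max(responses, default=0.0)


def encode_pair(premise, hypothesis, kernels):
    """Encode each input with the filters and concatenate the features."""
    return ([conv_maxpool(premise, k) for k in kernels]
            + [conv_maxpool(hypothesis, k) for k in kernels])


# Toy 2-dimensional word embeddings and two width-2 filters.
premise = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
hypothesis = [[0.0, 1.0], [0.0, 1.0]]
kernels = [
    [[1.0, 0.0], [0.0, 1.0]],
    [[-1.0, 0.0], [0.0, -1.0]],
]
pair_features = encode_pair(premise, hypothesis, kernels)
```

In a full model, `pair_features` would feed dense layers ending in a 3-way softmax over the SNLI classes (entailment, contradiction, neutral).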
