pycon2020

Natural Language Processing (NLP) in Python: a tutorial given at the PyCon 2020 remote conference.

Link to video: https://youtu.be/vyOgWhwUmec

Resources

Here is a list of resources for the topics covered in the video.

Good libraries for NLP:

Bag of words

Overview: https://machinelearningmastery.com/gentle-introduction-bag-words-model/
Sklearn Code: https://scikit-learn.org/stable/modules/feature_extraction.html#text-feature-extraction
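
As a rough illustration of the bag-of-words idea, here is a minimal sketch using only the standard library, not the scikit-learn vectorizer linked above. The `bag_of_words` helper and the example sentences are made up for illustration, and the whitespace tokenizer is a simplifying assumption.

```python
from collections import Counter


def bag_of_words(docs):
    """Return (vocabulary, vectors) with one count vector per document."""
    # Lowercase and split on whitespace; real tokenizers are more careful.
    tokenized = [doc.lower().split() for doc in docs]
    # A sorted vocabulary gives every word a fixed column index.
    vocab = sorted({word for tokens in tokenized for word in tokens})
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        # Counter returns 0 for words that do not occur in this document.
        vectors.append([counts[word] for word in vocab])
    return vocab, vectors


# Hypothetical example documents.
docs = ["I love the book", "I love the story of the book"]
vocab, vectors = bag_of_words(docs)
print(vocab)
print(vectors)
```

Each document becomes a vector of word counts over a shared vocabulary, so word order is discarded and only frequencies remain.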

Word Vectors

Overview: https://medium.com/@jayeshbahire/introduction-to-word-vectors-ea1d4e4b84bf
spaCy info: https://spacy.io/usage/vectors-similarity
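
Word vectors are usually compared with cosine similarity. The sketch below shows that comparison with the standard library only. The three-dimensional vectors are invented for illustration (real word vectors come from a trained model and have hundreds of dimensions), and `cosine_similarity` is a hypothetical helper, not a spaCy function.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up toy vectors; the values are assumptions, not real embeddings.
vectors = {
    "book": [0.9, 0.1, 0.0],
    "novel": [0.8, 0.2, 0.1],
    "banana": [0.0, 0.1, 0.9],
}

sim_close = cosine_similarity(vectors["book"], vectors["novel"])
sim_far = cosine_similarity(vectors["book"], vectors["banana"])
print(sim_close > sim_far)
```

Vectors pointing in a similar direction score close to 1, while unrelated ones score close to 0.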

Regexes

Python overview: https://docs.python.org/3/howto/regex.html
Regex Cheatsheet: https://cheatography.com/davechild/cheat-sheets/regular-expressions/
Regex tester: https://regex101.com/
Regex golf (to practice): https://alf.nu/RegexGolf
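
A small sketch of using a regex from Python's `re` module. The pattern and the example phrases are made up for illustration.

```python
import re

# Hypothetical pattern: match "read", "reads" or "reading" as whole words.
# \b marks a word boundary, so the "read" inside "Already" is not matched.
pattern = re.compile(r"\bread(?:s|ing)?\b", re.IGNORECASE)

phrases = ["I am reading a book", "He reads daily", "Already done", "Read this"]
matches = [bool(pattern.search(phrase)) for phrase in phrases]
print(matches)
```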

Stemming/Lemmatizing

Overview & NLTK Code: https://www.guru99.com/stemming-lemmatization-python-nltk.html
spaCy: https://spacy.io/api/lemmatizer

Stopwords

Quick overview + code: https://www.geeksforgeeks.org/removing-stop-words-nltk-python/
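
Stopword removal can be sketched without any library. The tiny `STOPWORDS` set below is hand-written for illustration only; NLTK, covered in the link above, ships a much fuller list. The `remove_stopwords` helper is likewise hypothetical.

```python
# Tiny hand-written stopword list, for illustration only.
STOPWORDS = {"a", "an", "the", "is", "of", "and", "to", "in", "this"}


def remove_stopwords(text):
    """Lowercase, split on whitespace, and drop words found in STOPWORDS."""
    return [word for word in text.lower().split() if word not in STOPWORDS]


tokens = remove_stopwords("This is an example of the stopword removal")
print(tokens)
```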

Parts of speech

TextBlob usage: https://textblob.readthedocs.io/en/dev/api_reference.html
List of tags: https://www.ling.upenn.edu/courses/Fall_2003/ling001/penn_treebank_pos.html

Transformers:

Attention is all you need: https://arxiv.org/pdf/1706.03762.pdf
Good overview of these architectures: https://www.youtube.com/watch?v=TQQlZhbC5ps
Illustrated Transformer: http://jalammar.github.io/illustrated-transformer/

Transformer Types:

BERT: https://arxiv.org/pdf/1810.04805.pdf
OpenAI GPT: https://openai.com/blog/better-language-models/
