Automated Product Tagging

This Automated Product Tagging project is an internal project for my internship at Onestdata.

Every product carries several tags that describe its characteristics. Tags can cover anything about the product, e.g. color, size, and type, and they let visitors filter products by the categories they want to explore.
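The tag-based filtering described above can be sketched in a few lines. This is only an illustration: the catalog, the product names, and the `filter_by_tags` helper are hypothetical and not taken from the project.

```python
# Hypothetical catalog; product names and tags are made up for illustration.
products = [
    {"name": "Oxford shirt", "tags": {"blue", "cotton", "shirt"}},
    {"name": "Summer dress", "tags": {"red", "silk", "dress"}},
    {"name": "Slim jeans", "tags": {"blue", "denim", "jeans"}},
]


def filter_by_tags(items, wanted):
    """Keep the items whose tag set contains every wanted tag."""
    wanted = set(wanted)
    return [item for item in items if wanted <= item["tags"]]


blue_items = [p["name"] for p in filter_by_tags(products, ["blue"])]
print(blue_items)
```

Storing tags as a set keeps the membership test cheap and makes "all of these tags" a single subset comparison.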

The algorithm is largely based on NLTK (Natural Language Toolkit), a leading platform for building Python programs that work with human language data. Because the dataset has a description column containing human language, this package is well suited to producing tags for products. For more details, see the NLTK documentation.
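As a rough illustration of how NLTK can turn a description into candidate tags, the sketch below tokenizes the text, drops common words, and keeps the most frequent remaining tokens. It is not the project's actual algorithm: the `candidate_tags` helper and the tiny `STOPWORDS` set are assumptions chosen so the example runs without downloading any NLTK corpora.

```python
from nltk import FreqDist
from nltk.tokenize import RegexpTokenizer

# Hypothetical, deliberately tiny stopword list so the sketch runs without
# downloading NLTK data; a real project would likely use nltk.corpus.stopwords.
STOPWORDS = {"a", "an", "and", "the", "with", "for", "in", "of", "this", "is"}


def candidate_tags(description, top_n=3):
    """Return the top_n most frequent non-stopword tokens in a description."""
    tokenizer = RegexpTokenizer(r"[a-z]+")  # keep runs of lowercase letters
    tokens = [
        token
        for token in tokenizer.tokenize(description.lower())
        if token not in STOPWORDS and len(token) > 2
    ]
    return [word for word, _count in FreqDist(tokens).most_common(top_n)]


description = "Blue cotton shirt with a slim fit. This cotton shirt is blue."
tags = candidate_tags(description)
print(tags)
```

Frequency alone is a crude signal; part-of-speech filtering or stemming could refine the candidates, at the cost of needing extra NLTK resources.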

The machine learning model, on the other hand, is based on scikit-learn's TfidfVectorizer. It tokenizes documents, learns the vocabulary and the inverse document frequency weights, and lets you encode new documents with them. For more details, see the TfidfVectorizer documentation.
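A minimal sketch of that fit-then-encode cycle, assuming scikit-learn is installed. The three sample descriptions are made up for illustration and are not from the project's dataset.

```python
from sklearn.feature_extraction.text import TfidfVectorizer

# Made-up product descriptions standing in for the dataset's description column.
descriptions = [
    "blue cotton shirt",
    "red cotton dress",
    "blue denim jeans",
]

vectorizer = TfidfVectorizer()

# fit_transform learns the vocabulary and IDF weights, then encodes the corpus.
X = vectorizer.fit_transform(descriptions)

# Map each learned term to its inverse document frequency weight.
idf = {term: vectorizer.idf_[index] for term, index in vectorizer.vocabulary_.items()}

# transform encodes a new document using the vocabulary learned above.
X_new = vectorizer.transform(["blue cotton jeans"])

print(sorted(vectorizer.vocabulary_))
print(X.shape, X_new.shape)
```

Terms that appear in fewer documents (such as "shirt") receive a higher IDF weight than terms shared across documents (such as "blue"), which is what lets rare, descriptive words stand out.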

Alongside the vectorizer I chose LinearSVC (Linear Support Vector Classification). This model fits the data you provide and returns a "best fit" hyperplane that divides, or categorizes, your data. Once you have the hyperplane, you can feed features to the classifier to see what the "predicted" class is. See the LinearSVC documentation. Because products can carry multiple tags, the task is multilabel classification, and LinearSVC is a good fit for it.
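The sketch below shows one way to combine TF-IDF features with LinearSVC for multilabel tagging, assuming scikit-learn is installed. LinearSVC on its own predicts a single class per sample, so the example wraps it in `OneVsRestClassifier` and encodes the tag lists with `MultiLabelBinarizer`. That wrapping is an assumption, since the source does not say how the project handles multiple tags, and the training data is made up.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multiclass import OneVsRestClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MultiLabelBinarizer
from sklearn.svm import LinearSVC

# Made-up training data: each description is paired with a list of tags.
descriptions = [
    "blue cotton shirt",
    "red cotton shirt",
    "blue denim jeans",
    "black denim jeans",
    "red silk dress",
    "blue silk dress",
]
tags = [
    ["blue", "shirt"],
    ["red", "shirt"],
    ["blue", "jeans"],
    ["black", "jeans"],
    ["red", "dress"],
    ["blue", "dress"],
]

# Turn the tag lists into a binary indicator matrix, one column per tag.
binarizer = MultiLabelBinarizer()
Y = binarizer.fit_transform(tags)

# Assumed setup: one LinearSVC per tag, trained on TF-IDF features.
model = make_pipeline(TfidfVectorizer(), OneVsRestClassifier(LinearSVC()))
model.fit(descriptions, Y)

# Predict an indicator row for a new description and map it back to tag names.
predicted = model.predict(["blue cotton shirt"])
predicted_tags = binarizer.inverse_transform(predicted)
print(predicted_tags)
```

Training one binary classifier per tag means each tag gets its own hyperplane, so a product can be assigned any number of tags, including none.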

Workflow

[Figure: workflow diagram]

UI Home page to use the machine learning model

UI Upload CSV page to upload a file

Installation

Use the package manager pip to install the needed libraries.

pip install -r requirements.txt

Run

flask run

or

python app.py

Contributors

wolfsinem
