IMDB Review Classification

This repository contains code for text classification of IMDB movie reviews. The goal is to predict whether a movie review is positive or negative. The dataset consists of movie reviews from the Internet Movie Database (IMDB).

Dataset

The IMDB movie reviews dataset is used for training and evaluation. The dataset contains movie reviews as text, and each review is labeled as either positive or negative sentiment.

Main_LSTM.py

Data Loading and Preprocessing

  • The dataset is loaded using the tensorflow_datasets library, which provides the imdb_reviews/subwords8k configuration with subword tokenization already applied.
  • The training and test data are shuffled and padded to a fixed length using padded_batch.
  • The subword tokenizer is obtained from the dataset's info object; it maps between review text and sequences of subword IDs, which are then padded.
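
The shuffle-and-pad step can be sketched on toy data as follows. This is a minimal illustration, not the code in Main_LSTM.py: the generator toy_reviews, the buffer and batch sizes, and the fixed width of 8 are all assumptions made for the example. Note that padded_batch pads to the longest sequence in each batch by default; a truly fixed length needs padded_shapes.

```python
import tensorflow as tf

# Hypothetical stand-in for subword-encoded reviews: (token_ids, label) pairs.
def toy_reviews():
    yield [5, 8, 2], 1
    yield [7, 1], 0
    yield [4, 9, 3, 6, 2], 1
    yield [3], 0

dataset = tf.data.Dataset.from_generator(
    toy_reviews,
    output_signature=(
        tf.TensorSpec(shape=(None,), dtype=tf.int64),  # variable-length IDs
        tf.TensorSpec(shape=(), dtype=tf.int64),       # 0 = negative, 1 = positive
    ),
)

BUFFER_SIZE = 4  # assumed value; real pipelines use a much larger buffer
BATCH_SIZE = 2   # assumed value

# Shuffle, then pad every batch with zeros up to its longest sequence.
train_batches = dataset.shuffle(BUFFER_SIZE).padded_batch(BATCH_SIZE)
batches = [(ids.numpy(), labels.numpy()) for ids, labels in train_batches]

# Variant: force one fixed width (8 here) for every batch.
fixed_batches = dataset.padded_batch(BATCH_SIZE, padded_shapes=([8], []))
fixed_shapes = [ids.numpy().shape for ids, _ in fixed_batches]
```

With the default call, the width of each batch depends on which reviews the shuffle happens to group together; with padded_shapes, every batch has the same width.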

LSTM-based Model

  • The LSTM-based model architecture consists of an Embedding layer followed by two Bidirectional LSTM layers to capture sequential information from the text.
  • The model has a Dense layer for feature extraction and a final Dense layer with a sigmoid activation function for binary classification (positive or negative).
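
A model with this shape could look roughly like the sketch below. The layer sizes, the vocabulary size, and the function name build_lstm_model are assumptions chosen for illustration; the README does not state the values used in Main_LSTM.py.

```python
import tensorflow as tf

VOCAB_SIZE = 8192  # assumed; in practice read it from the dataset's tokenizer
EMBED_DIM = 64     # assumed embedding width

def build_lstm_model(vocab_size=VOCAB_SIZE, embed_dim=EMBED_DIM):
    """Embedding -> two Bidirectional LSTMs -> Dense -> sigmoid output."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=(None,), dtype="int32"),  # variable-length ID sequences
        tf.keras.layers.Embedding(vocab_size, embed_dim),
        # The first LSTM returns the full sequence so the second can consume it.
        tf.keras.layers.Bidirectional(
            tf.keras.layers.LSTM(64, return_sequences=True)),
        tf.keras.layers.Bidirectional(tf.keras.layers.LSTM(32)),
        tf.keras.layers.Dense(64, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # P(review is positive)
    ])

model = build_lstm_model()

# Forward pass on a dummy batch of 2 sequences, 12 token IDs each.
probs = model(tf.zeros((2, 12), dtype=tf.int32))
```

The first LSTM layer must set return_sequences=True; otherwise it emits only its final state and the second LSTM has no sequence to read.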

Training and Evaluation

  • The model is trained using binary cross-entropy loss and the Adam optimizer for 10 epochs.
  • The training and validation accuracy and loss are plotted to assess the model's performance.
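
To make the loss concrete, here is a plain-Python sketch of binary cross-entropy for sigmoid outputs. It only illustrates the formula; the function name and the clipping constant eps are assumptions, and the actual training relies on the framework's built-in loss.

```python
import math

def binary_cross_entropy(y_true, y_pred, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)] over all examples."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)  # clip to avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)

# Labels: first review positive (1), second negative (0).
confident_right = binary_cross_entropy([1, 0], [0.9, 0.1])  # about 0.105
confident_wrong = binary_cross_entropy([1, 0], [0.1, 0.9])  # about 2.303
```

The loss is small when the predicted probability agrees with the label and grows sharply when the model is confidently wrong, which is why it is a natural fit for a single sigmoid output.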

Main.py

Data Loading and Preprocessing

  • The dataset is loaded using tensorflow_datasets, and the movie reviews and their corresponding labels are extracted.
  • The sentences are converted to sequences using Tokenizer, and the sequences are padded to a fixed length.
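
The idea behind these two steps can be sketched in plain Python. This is a simplified illustration, not the Tokenizer used in Main.py: the helper names, the "<OOV>" token, the regular expression, and the choice to pad and truncate at the end of each sequence are all assumptions.

```python
import re
from collections import Counter

def _words(sentence):
    # Lowercase and split on anything that is not a letter, digit, or apostrophe.
    return re.findall(r"[a-z0-9']+", sentence.lower())

def fit_word_index(sentences, oov_token="<OOV>"):
    """Map each word to an integer ID; more frequent words get lower IDs."""
    counts = Counter(w for s in sentences for w in _words(s))
    word_index = {oov_token: 1}  # 0 is reserved for padding, 1 for unknown words
    for word, _ in counts.most_common():
        word_index[word] = len(word_index) + 1
    return word_index

def texts_to_padded(sentences, word_index, max_len):
    """Convert sentences to ID sequences, then truncate or zero-pad to max_len."""
    rows = []
    for s in sentences:
        ids = [word_index.get(w, 1) for w in _words(s)][:max_len]
        rows.append(ids + [0] * (max_len - len(ids)))
    return rows

train_sentences = ["A great movie", "A dull, slow movie"]
word_index = fit_word_index(train_sentences)
padded = texts_to_padded(["a great slow thriller"], word_index, max_len=6)
# padded == [[2, 4, 6, 1, 0, 0]]  ("thriller" is unseen, so it maps to 1)
```

The vocabulary is built from the training sentences only, so words that appear only in the test set fall back to the out-of-vocabulary ID.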

Embedding-based Model

  • The embedding-based model architecture consists of an Embedding layer followed by a Flatten layer to flatten the sequence data.
  • The model then has a Dense layer for feature extraction and a final Dense layer with a sigmoid activation function for binary classification.
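
A sketch of this architecture in Keras is shown below. The vocabulary size, embedding width, sequence length, hidden layer size, and the name build_embedding_model are assumptions for illustration, not the values from Main.py.

```python
import tensorflow as tf

VOCAB_SIZE = 10000  # assumed
EMBED_DIM = 16      # assumed
MAX_LEN = 120       # assumed fixed sequence length after padding

def build_embedding_model(vocab_size=VOCAB_SIZE, embed_dim=EMBED_DIM,
                          max_len=MAX_LEN):
    """Embedding -> Flatten -> Dense -> sigmoid output."""
    return tf.keras.Sequential([
        # Flatten needs a known sequence length, so the input shape is fixed.
        tf.keras.Input(shape=(max_len,), dtype="int32"),
        tf.keras.layers.Embedding(vocab_size, embed_dim),
        tf.keras.layers.Flatten(),  # (max_len, embed_dim) -> max_len * embed_dim
        tf.keras.layers.Dense(6, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])

model = build_embedding_model()

# Forward pass on a dummy batch of 3 padded sequences.
probs = model(tf.zeros((3, MAX_LEN), dtype=tf.int32))
```

Unlike the LSTM variant, this model treats each position independently, so every input must be padded or truncated to exactly the same length.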

Training and Evaluation

  • The model is trained using binary cross-entropy loss and the Adam optimizer for 100 epochs.
  • The training and validation accuracy and loss are plotted to assess the model's performance.

Save Embedding

  • The save_for_embedding function is included to save the word embeddings for visualization using the Embedding Projector.
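
The Embedding Projector accepts a tab-separated file of vectors plus a matching metadata file with one label per line. The sketch below shows one way to write such files; it is not the repository's save_for_embedding function. The name write_projector_files, the file names, and the toy weights are assumptions.

```python
import os
import tempfile

def write_projector_files(word_index, weights, out_dir):
    """Write vecs.tsv (one vector per line) and meta.tsv (one word per line)."""
    index_word = {idx: word for word, idx in word_index.items()}
    vecs_path = os.path.join(out_dir, "vecs.tsv")
    meta_path = os.path.join(out_dir, "meta.tsv")
    with open(vecs_path, "w", encoding="utf-8") as vecs, \
         open(meta_path, "w", encoding="utf-8") as meta:
        # Row 0 is the padding ID and has no word, so start at 1.
        for idx in range(1, len(weights)):
            if idx not in index_word:
                continue
            meta.write(index_word[idx] + "\n")
            vecs.write("\t".join(str(x) for x in weights[idx]) + "\n")
    return vecs_path, meta_path

# Toy data: 3 rows (padding, "<OOV>", "movie") with 2-dimensional embeddings.
toy_word_index = {"<OOV>": 1, "movie": 2}
toy_weights = [[0.0, 0.0], [0.1, 0.2], [0.3, -0.4]]
out_dir = tempfile.mkdtemp()
vecs_path, meta_path = write_projector_files(toy_word_index, toy_weights, out_dir)
```

With a trained Keras model, the weight matrix would typically come from the Embedding layer's get_weights()[0]; both files can then be uploaded to the Embedding Projector.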

Feel free to experiment with different hyperparameters, model architectures, or other methods to further improve the classification accuracy.

For any questions or suggestions, please contact Francesco Alotto. Happy movie review classification with AI! 🎥🤖
