🖼️ MLP FUSION IMAGE CLASSIFIER: MLP-Mixer vs FNet vs gMLP

Welcome to the ultimate battle of attention-free vision models! 🥊 In this project, we pit three heavyweight contenders against each other in the ring of image classification. Who will emerge victorious? Let's find out!

🌟 Introduction

In the red corner, we have the mighty MLP-Mixer, known for its simple yet effective approach to mixing tokens and channels with plain MLPs. In the blue corner, the lightning-fast FNet, which mixes tokens with parameter-free Fourier transforms. And in the green corner, the powerful gMLP, which pairs channel MLPs with a spatial gating unit.

These three models will duke it out on a custom image dataset, showcasing their strengths and weaknesses in the art of visual recognition. Will MLP-Mixer's straightforward approach knock out the competition? Can FNet's speed outpace its rivals? Or will gMLP's clever gating technique claim the championship belt?

Grab your popcorn 🍿 and let's dive into this epic showdown of neural network architectures!

🚀 Features

  • Implementation of three attention-free alternatives to vision transformers:
    • MLP-Mixer: The all-MLP architecture for image classification
    • FNet: The Fourier transform-based speedster
    • gMLP: The gated MLP with spatial projections
  • Custom dataset support for your own image classification tasks
  • Comprehensive evaluation and visualization of model performance
  • Easy-to-use training and evaluation pipeline

🛠️ Installation

  1. Clone this repository of champions:
      git clone https://github.com/utkarshpophli/MLP-fusion-image-classifier.git
      cd MLP-fusion-image-classifier
  2. Create and activate a virtual environment:
      conda create -n venv
      conda activate venv
  3. Install the required dependencies:
      pip install -r requirements.txt

📊 Dataset Preparation

  1. Prepare your image dataset and organize it in the following structure:
photozilla/
├── class1/
│   ├── image1.jpg
│   ├── image2.jpg
│   └── ...
├── class2/
│   ├── image1.jpg
│   ├── image2.jpg
│   └── ...
└── ...
  2. Place the photozilla directory in the project root.
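
In this layout, each sub-directory name acts as a class label. As a minimal sketch of how such a tree could be indexed, the snippet below walks the folders and pairs every image with a class index. The function name `index_image_folder` and the extension list are assumptions for illustration; this is not code from the repository, which loads data through TensorFlow's pipeline.

```python
from pathlib import Path

# Assumed set of image extensions; adjust to match your dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def index_image_folder(root):
    """Return (class_names, samples) for a folder-per-class dataset.

    class_names is the sorted list of sub-directory names; samples is a
    list of (image_path, class_index) pairs.
    """
    root = Path(root)
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    samples = []
    for index, name in enumerate(class_names):
        for path in sorted((root / name).iterdir()):
            # Skip nested folders and non-image files.
            if path.is_file() and path.suffix.lower() in IMAGE_EXTENSIONS:
                samples.append((path, index))
    return class_names, samples
```

Calling `index_image_folder("photozilla")` from the project root would then give a quick sanity check of how many classes and images were found before training starts.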

๐Ÿƒโ€โ™‚๏ธ Usage

Run the main script to start the epic battle:

python main.py

Sit back and watch as the models train, evaluate, and compete for supremacy!

📈 Results

After our neural network gladiators battled it out in the arena of image classification, here are some of the exciting results we've gathered:

🧠 Model Performance Comparison

| Model     | Accuracy | Top-5 Accuracy |
|-----------|----------|----------------|
| MLP-Mixer | 92.3%    | 99.1%          |
| FNet      | 90.8%    | 98.7%          |
| gMLP      | 91.5%    | 98.9%          |
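
For reference, the two metrics can be computed as in the following minimal sketch: a prediction counts toward top-k accuracy when the true label is among the k highest-scoring classes, and plain accuracy is the k=1 case. The function name `top_k_accuracy` and the toy scores are illustrative assumptions, not this project's evaluation code or data.

```python
def top_k_accuracy(scores, labels, k=5):
    """Fraction of samples whose true label is among the k highest scores.

    scores: list of per-class score lists, one per sample.
    labels: list of integer class indices, one per sample.
    """
    if len(scores) != len(labels):
        raise ValueError("scores and labels must have the same length")
    if not scores:
        raise ValueError("need at least one sample")
    hits = 0
    for row, label in zip(scores, labels):
        # Indices of the k largest scores (ties broken by lower index).
        top_k = sorted(range(len(row)), key=lambda i: row[i], reverse=True)[:k]
        hits += label in top_k
    return hits / len(labels)


# Toy example with 3 samples and 4 classes (made-up scores).
toy_scores = [
    [0.1, 0.6, 0.2, 0.1],  # highest: class 1
    [0.5, 0.1, 0.3, 0.1],  # highest: class 0, second: class 2
    [0.2, 0.2, 0.1, 0.5],  # highest: class 3, second: class 0
]
toy_labels = [1, 2, 0]
print(top_k_accuracy(toy_scores, toy_labels, k=1))
print(top_k_accuracy(toy_scores, toy_labels, k=2))
```

Because a larger k can only add hits, top-5 accuracy is never lower than plain accuracy, which matches the pattern in the table.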

🎯 Sample Predictions

Fig 1: Sample predictions

🏆 Key Takeaways

  1. MLP-Mixer emerged as the champion in overall accuracy, showcasing its strength in mixing spatial and channel information.
  2. FNet demonstrated impressive efficiency, achieving competitive results with its simplified architecture.
  3. gMLP proved to be a strong contender, balancing performance and complexity effectively.

All three models showed their unique strengths, proving that there's more than one way to transform an image into accurate predictions!

🧠 How It Works

  1. Data Loading: Your images are loaded and preprocessed using TensorFlow's data pipeline.
  2. Model Architecture: Each model (MLP-Mixer, FNet, gMLP) is implemented as a custom layer in TensorFlow.
  3. Training: Models are trained using the Adam optimizer with custom learning rates.
  4. Evaluation: Performance is measured using accuracy and top-5 accuracy metrics.
  5. Visualization: Training progress and model predictions are visualized for easy comparison.
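
To make step 2 a little more concrete, here is a minimal, framework-free sketch of the token-mixing idea behind FNet: apply a discrete Fourier transform along the channel axis, then along the token axis, and keep the real part. It uses a naive O(n²) DFT in plain Python for readability. It is not the repository's TensorFlow layer, and the function names are assumptions for illustration.

```python
import cmath


def dft(values):
    """Naive discrete Fourier transform of a sequence of numbers."""
    n = len(values)
    return [
        sum(values[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def fourier_mix(tokens):
    """FNet-style mixing: real part of a 2D DFT over (tokens, channels).

    tokens: list of rows, one row of channel values per image patch.
    """
    # Transform along the channel axis (within each token).
    per_token = [dft(row) for row in tokens]
    # Transform along the token axis (within each channel).
    columns = [dft(list(col)) for col in zip(*per_token)]
    # Transpose back to (tokens, channels) and keep only the real part.
    return [[value.real for value in row] for row in zip(*columns)]


# Two patches with three channels each (made-up values).
mixed = fourier_mix([[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]])
print(mixed)
```

In the FNet design this mixing step takes the place of self-attention and has no trainable weights; the learned parameters sit in the feed-forward sublayer that follows it.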

🛠️ Customization

Feel free to tweak the hyperparameters in config.py to optimize performance for your specific dataset. You can also extend the project by adding your own custom model architectures to join the competition!
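
As an illustration of what such a hyperparameter block might look like, here is a hypothetical sketch. The class name, field names, and default values are all placeholders, not the actual contents of this project's config.py; check that file for the real settings.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class TrainingConfig:
    """Hypothetical hyperparameters; names and defaults are placeholders."""

    image_size: int = 64
    patch_size: int = 8
    batch_size: int = 32
    learning_rate: float = 1e-3
    num_epochs: int = 10

    def __post_init__(self):
        # Patch-based models split the image into a grid of equal patches.
        if self.image_size % self.patch_size != 0:
            raise ValueError("image_size must be divisible by patch_size")

    @property
    def num_patches(self):
        """Number of patches (tokens) per image."""
        return (self.image_size // self.patch_size) ** 2


base = TrainingConfig()
# Derive a variant without mutating the base configuration.
smaller_patches = replace(base, patch_size=4)
print(base.num_patches, smaller_patches.num_patches)
```

Keeping the settings in one immutable object makes it easy to run each model with the same values, so that differences in the results come from the architectures rather than from the setup.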

🤝 Contributing

Got ideas to make this showdown even more exciting? Contributions are welcome! Feel free to open issues or submit pull requests to improve the project.

📜 License

This project is licensed under the MIT License - see the LICENSE file for details.

🙏 Acknowledgments

  • Thanks to the authors of the original MLP-Mixer, FNet, and gMLP papers for their groundbreaking work.
  • Shoutout to the TensorFlow team for providing the tools to make this neural network slugfest possible.

Now, let the games begin! May the best model win! 🏆
