koushikvikram / multimodal-image-retrieval Goto Github PK

📝🔍🖼️ A deep learning application for retrieving images by searching with text.

Python 7.48% Jupyter Notebook 92.51% Shell 0.01%

computer-vision convolutional-neural-networks deep-learning github-actions image-retrieval image-search koushik-vikram koushikvikram machine-learning-testing mlops model-testing multimodal-deep-learning multimodal-learning natural-language-processing nlp pytest python pytorch streamlit word2vec

multimodal-image-retrieval's Introduction

MLOps and Data Engineering.

multimodal-image-retrieval's People

Contributors

Stargazers

Watchers

multimodal-image-retrieval's Issues

Create DataSet classes loading Training, Validation and Test Images.

For training and validation sets, load the caption embeddings and corresponding images in memory.
For test set, load only images in memory. Don't load captions.

We'll use images along with their caption embeddings to train an "encoder" that learns how to embed images in caption embedding space.

GitHub Action for Python Tests

Setup GitHub Action for running Python tests.

Refer to the following articles:

Explore Word2Vec Embeddings

Explore words for similarity and belonging in the word2vec model after training. Plot the vectors along with their labels using t-SNE.

Reference:

https://www.kaggle.com/jeffd23/visualizing-word-vectors-with-t-sne

Use Google Style Python Docstrings for Classes and their Methods

Read the following article and update docstrings in Classes and their Methods:

https://sphinxcontrib-napoleon.readthedocs.io/en/latest/example_google.html

Develop and train a ResNet to embed images in the caption embedding space

Train a Convolutional Neural Network to regress Caption Embeddings from Images.

Start with a ResNet pre-trained on ImageNet.

We're teaching the Neural Network to embed Images in the Caption Embedding space.

Write tests for the Caption Class

Write Unit Tests for the Caption Class.

Generate Embeddings for the Test Images

Run the test images through the trained Convolutional Neural Network and get their caption embeddings.

Save these embeddings along with the Image IDs to disk.

Interactive Word2Vec Exploration App

Tasks:

Convert Colab Notebook interactive using Widgets.
Make an interactive app using Voila for exploring the trained Word2Vec model.
Deploy the app on Heroku and provide link in README.md

Model and loss function details

Hi, can you please let me know the model that you are using for training and also the multimodal loss function that you are using.

Write tests for the Word2Vec Model

Test the word2vec model. This goes along with exploration and visualization of the model.

Generate Documentation Automatically using Sphinx GitHub Action

Setup the following GitHub Action to automatically generate documentation when code is pushed:

https://github.com/marketplace/actions/sphinx-build

Deploy as WebApp to Streamlit Cloud

Offer the Image Retrieval System as a WebApp using Streamlit and deploy it to Streamlit cloud from the GitHub repo.

https://streamlit.io/cloud

Generate train, validation and test sets for training and testing Convolutional Neural Network

Generate embeddings for each caption and split them into train, validation and test sets.

Save the split datasets to disk. They will be used for training and testing the Convolutional Neural Network.

Write tests for CaptionDataset class

Test the Dataset class.