neonwatty / meme_search

Index your memes by their content and text, making them easily retrievable for your meme warfare pleasures. Find funny fast.

Home Page: https://memesearch.co/

License: Apache License 2.0

Languages: Jupyter Notebook 99.14%, Python 0.83%, Dockerfile 0.03%, CSS 0.01%

Topics: demo-app, generative-ai, large-language-models, machine-learning, text-embedding, vector-database, vision-language-model

meme_search's Introduction

Meme Search app, walkthrough, and demo

Use Python and AI to index your memes by their content and text, making them easily retrievable for your meme warfare pleasures.

Introduction

This repository contains code, a walkthrough notebook (meme_search_walkthrough.ipynb), and a streamlit demo app for indexing, searching, and easily retrieving your memes based on semantic search of their content and text.

All processing - from image-to-text extraction, to vector embedding, to search - is performed locally.

Pipeline overview

This meme search pipeline is built using the following open source components:

  • moondream: a tiny, kickass vision language model used for image captioning / extracting image text
  • all-MiniLM-L6-v2: a very popular text embedding model
  • faiss: a fast and efficient vector db
  • sqlite: the greatest database of all time, used for data indexing
  • streamlit: for serving up the app

Installation instructions

To create a handy tool for your own memes, pull the repo and install the requirements file:

pip install -r requirements.txt

Note that the particular pinned requirements here are necessary to avoid a current nasty segmentation fault involving sentence-transformers as of 6/5/2024.

Alternatively, you can install all the requirements using docker via the compose file found in the repo. The command to install the above requirements and start the server using docker compose is:

docker compose up

Start the streamlit server

After indexing your memes you can start the streamlit app, which lets you semantically search for and retrieve your memes:

python -m streamlit run meme_search/app.py

To start the app via docker-compose use

docker compose up

Note: you can drag and drop any retrieved meme directly from the streamlit app into any messenger app of your choice.

Index your own memes

Place any images / memes you would like indexed for the search app in this repo's subdirectory

data/input/

You can clear out the default test images in this location first, or leave them.

Next, click the "refresh index" button in the app to update your index whenever images are added to or removed from the image directory. Only the newly added or removed images are processed.
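The incremental refresh described above amounts to a set comparison between the file names on disk and the names already in the index. The following is a minimal sketch under that assumption, not the repo's actual code: `directory_status` and `IMAGE_SUFFIXES` are hypothetical names, and the indexed names are passed in as a plain list rather than read from sqlite.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; the real app may accept a different set.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".webp"}


def directory_status(img_dir, indexed_names):
    """Return (stale, new).

    stale: names that are indexed but no longer on disk (to be removed).
    new:   names that are on disk but not yet indexed (to be processed).
    """
    on_disk = {
        p.name
        for p in Path(img_dir).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    }
    indexed = set(indexed_names)
    return sorted(indexed - on_disk), sorted(on_disk - indexed)


# Example: one image was deleted from disk and one was added since the last run.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("keep.jpg", "new.png"):
        (Path(tmp) / name).touch()
    stale, new = directory_status(tmp, ["keep.jpg", "old.png"])
    print(stale, new)  # ['old.png'] ['new.png']
```

Images that appear in both sets are left alone, which is why a refresh only costs as much as the changes made since the last one.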

Alternatively, run the following command at your terminal:

python meme_search/utilities/create.py

or, if running the server via docker, use

docker exec meme_search python meme_search/utilities/create.py

You will see printouts at the terminal indicating success of the 3 main stages for making your memes searchable. These steps are

  1. extract: get text descriptions of each image, including ocr of any text on the image, using the kickass tiny vision-llm moondream

  2. embed: window and embed each image's text description using a popular embedding model - sentence-transformers/all-MiniLM-L6-v2

  3. index: index the embeddings in faiss, an open source and local vector database, and store references connecting the embeddings to their images in the greatest little db of all time - sqlite
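The three stages above can be illustrated with a standard-library-only toy. This is a sketch under loud assumptions, not the project's implementation: the captions are hard-coded where moondream would produce them, a normalized bag-of-words dictionary stands in for all-MiniLM-L6-v2 embeddings, and a plain Python list with brute-force cosine search stands in for the faiss index. Only the sqlite part (mapping each vector row back to its image) uses the same tool as the pipeline, and the table layout and all function names are hypothetical.

```python
import math
import sqlite3
from collections import Counter


def window(text, size=4, step=2):
    """Split a description into overlapping word windows."""
    words = text.lower().split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), step)]


def embed(text):
    """Toy embedding: L2-normalized word counts (stand-in for a real model)."""
    counts = Counter(text.lower().split())
    norm = math.sqrt(sum(c * c for c in counts.values())) or 1.0
    return {word: c / norm for word, c in counts.items()}


def build_index(descriptions):
    """Embed every window and record which image each vector row belongs to."""
    vectors = []  # stand-in for the faiss index: row i holds vector i
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE chunks (row_id INTEGER PRIMARY KEY, img_name TEXT, chunk TEXT)")
    for img_name, description in descriptions.items():
        for chunk in window(description):
            db.execute("INSERT INTO chunks VALUES (?, ?, ?)", (len(vectors), img_name, chunk))
            vectors.append(embed(chunk))
    db.commit()
    return vectors, db


def search(query, vectors, db, top_k=3):
    """Return image names ranked by their best-matching chunk (cosine similarity)."""
    q = embed(query)
    scored = sorted(
        ((sum(w * v.get(word, 0.0) for word, w in q.items()), i) for i, v in enumerate(vectors)),
        reverse=True,
    )
    results = []
    for score, row_id in scored:
        if score <= 0 or len(results) == top_k:
            break
        (img_name,) = db.execute("SELECT img_name FROM chunks WHERE row_id = ?", (row_id,)).fetchone()
        if img_name not in results:
            results.append(img_name)
    return results


# "extract" is faked here with hand-written captions.
descriptions = {
    "cat.jpg": "a grumpy cat staring at a salad",
    "dog.jpg": "a dog sitting in a burning room saying this is fine",
}
vectors, db = build_index(descriptions)
print(search("grumpy cat", vectors, db))    # ['cat.jpg']
print(search("this is fine", vectors, db))  # ['dog.jpg']
```

Unlike this word-overlap toy, a real embedding model would also match queries that share meaning but no words with the caption, which is the point of the semantic search described in this README.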

Changelog

Meme Search is under active development! See the CHANGELOG.md in this repo for a record of the most recent changes.

Feature requests and contributing

Feature requests and contributions are welcome!

See the discussion section of this repository for suggested enhancements to contribute to / weigh in on!

Please see CONTRIBUTING.md for some boilerplate ground rules for contributing.

Running tests

Tests can be run by first installing the test requirements as

pip install -r requirements.test

Then the test suite can be run as

python -m pytest tests/

meme_search's People

Contributors

jasonyang-ee, neonwatty, stroescutheo, thijsvanloef

meme_search's Issues

get_current_indexed_img_names failed with exception unable to open database file

Hi, I tried to use your docker compose:

version: '3.8'

volumes:
  meme_search-config:
    driver_opts:
      type: "nfs"
      o: "addr={{IP}},rw,noatime,rsize=8192,wsize=8192,tcp,timeo=14,nfsvers=4"
      device: ":/volume1/docker-config/meme_search-config"
  syno-image:
    driver_opts:
      type: "nfs"
      o: "addr={{IP}},rw,noatime,rsize=8192,wsize=8192,tcp,timeo=14,nfsvers=4"
      device: ":/volume2/multimedia/image"
      
services:
  meme_search:
    image: ghcr.io/neonwatty/meme-search:latest
    container_name: meme_search
    ports:
      - 8501:8501
    volumes:
      - meme_search-config:/home/data
      - syno-image:/home/data/input
    ## uncomment to enable GPU support for the container
    # deploy:
    #   resources:
    #     reservations:
    #       devices:
    #         - driver: nvidia
    #           count: 1
    #           capabilities: [gpu]

From the webapp I tried to click on 'refresh index' but I got this error:

ValueError: FAILURE: get_current_indexed_img_names failed with exception unable to open database file
Traceback:

File "/usr/local/lib/python3.10/site-packages/streamlit/runtime/scriptrunner/script_runner.py", line 589, in _run_script
    exec(code, module.__dict__)
File "/home/meme_search/app.py", line 33, in <module>
    val = process()
File "/home/meme_search/utilities/create.py", line 8, in process
    old_imgs_to_be_removed, new_imgs_to_be_indexed = get_input_directory_status(img_dir, sqlite_db_path)
File "/home/meme_search/utilities/status.py", line 24, in get_input_directory_status
    current_indexed_names = get_current_indexed_img_names(sqlite_db_path)
File "/home/meme_search/utilities/status.py", line 18, in get_current_indexed_img_names
    raise ValueError(f"FAILURE: get_current_indexed_img_names failed with exception {e}")

I also tried executing this command from the container shell:

python meme_search/utilities/create.py

But I got the same error.
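SQLite reports "unable to open database file" when it cannot open or create the file at the given path, which commonly means the parent directory is missing or the process lacks write permission on it. A minimal diagnostic sketch, not part of meme_search: `diagnose_db_path` is a hypothetical helper, and the path passed to it is a placeholder for wherever the app keeps its sqlite file inside the container.

```python
import os
import sqlite3
from pathlib import Path


def diagnose_db_path(db_path):
    """Return a list of likely reasons sqlite cannot open db_path (empty if none found).

    Note: if the location is usable and the file does not exist yet,
    connecting may create an empty database file as a side effect.
    """
    problems = []
    parent = Path(db_path).resolve().parent
    if not parent.is_dir():
        problems.append("parent directory does not exist")
    elif not os.access(parent, os.W_OK | os.X_OK):
        problems.append("parent directory is not writable by this user")
    if not problems:
        try:
            sqlite3.connect(db_path).close()
        except sqlite3.OperationalError as exc:
            problems.append(f"sqlite could not open the file: {exc}")
    return problems


# Example call (placeholder path, replace with the real database location):
# print(diagnose_db_path("/path/to/database.db"))
```

Running a check like this from the container shell would at least separate a missing directory from a permissions problem on the mounted volume.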

[Feature Suggestion] Allow indexing of assets in subfolders

Pretty much what it says in the title. It would be useful to be able to use subfolders on the main data folder:

  • Easier to import existing images
  • Keeps some low tech organization in case of app technical issues
  • In docker, allows for easy mounting of other folders without having to move pictures around
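One way such a feature could discover images is a recursive walk that keys each image by its path relative to the data folder, so that files with the same name in different subfolders stay distinct. This is a sketch of the idea only: `find_images` and `IMAGE_SUFFIXES` are hypothetical names, and nothing here reflects how meme_search currently scans its input directory.

```python
import tempfile
from pathlib import Path

# Assumed set of image extensions; adjust to whatever the app accepts.
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png", ".gif", ".webp"}


def find_images(root):
    """Return all image paths under root, relative to root, in sorted order."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES
    )


# Example: images at the top level and inside a nested subfolder.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "reactions" / "cats").mkdir(parents=True)
    (Path(tmp) / "top.png").touch()
    (Path(tmp) / "reactions" / "cats" / "grumpy.jpg").touch()
    print(find_images(tmp))  # ['reactions/cats/grumpy.jpg', 'top.png']
```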

Missing LICENSE

I see you have no LICENSE file for this project. The default is copyright.

I would suggest releasing the code under the GPL-3.0-or-later or AGPL-3.0-or-later license so that others are encouraged to contribute changes back to your project.

ERROR: The Compose file './compose.yaml' is invalid because: 'name' does not match any of the regexes: '^x-'

When attempting to run the docker-compose step I am getting the following error:

ERROR: The Compose file './compose.yaml' is invalid because:
'name' does not match any of the regexes: '^x-'

You might be seeing this error because you're using the wrong Compose file version. Either specify a supported version (e.g "2.2" or "3.3") and place your service definitions under the `services` key, or omit the `version` key and place your service definitions at the root of the file to use version 1.
For more on the Compose file format versions, see https://docs.docker.com/compose/compose-file/
