Ray Distributed Computing with FastEmbed and Qdrant

This repository demonstrates how to use the Ray distributed computing framework together with FastEmbed for embedding generation and Qdrant for similarity search. It shows how to generate embeddings for text data in parallel, store them in Qdrant, and run similarity search queries against them.

Requirements

  • Python 3.x
  • Jupyter Notebook (for running RayQdrant.ipynb)
  • Docker (for running the Qdrant server)
  • PyPDF2
  • nltk
  • ray
  • fastembed
  • qdrant_client

You can install the required libraries using pip:

pip install PyPDF2 nltk fastembed "qdrant-client[fastembed]"
pip install -U "ray[data,train,tune,serve]"
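To confirm the installation before opening the notebook, a small check along these lines may help. It is a minimal sketch that uses only the standard library; the helper name `missing_modules` is hypothetical, and the list holds the import names of the packages from the Requirements section.

```python
import importlib.util

# Import names (not pip names) of the packages listed under Requirements.
REQUIRED_MODULES = ["PyPDF2", "nltk", "ray", "fastembed", "qdrant_client"]


def missing_modules(names):
    """Return the subset of `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

The check only looks the modules up without importing them, so it stays fast even for heavy packages such as ray.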

Usage

  1. Clone the Repository:

    Clone this repository to your local machine:

    git clone https://github.com/yash9439/RayQdrantFastEmbed.git
  2. Start Docker Environment:

    Pull the Qdrant image and start a container. The command below exposes ports 6333 and 6334 and persists data in ./qdrant_storage:

    sudo docker pull qdrant/qdrant
    sudo docker run -p 6333:6333 -p 6334:6334 -v $(pwd)/qdrant_storage:/qdrant/storage:z qdrant/qdrant
  3. Run Jupyter Notebook:

    Open the RayQdrant.ipynb file using Jupyter Notebook:

    jupyter notebook RayQdrant.ipynb

    Run the cells in order. Make sure the dependencies are installed and the Qdrant container from step 2 is running first.

  4. Interpret Results:

    The notebook reports the time taken to generate embeddings with Ray, followed by the results of the similarity search queries run against Qdrant.
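The workflow described in the steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the notebook's actual code: the function names, the collection name `docs_demo`, the worker count, and the model `BAAI/bge-small-en-v1.5` are all hypothetical choices, and the Qdrant calls assume a recent qdrant-client release that provides `collection_exists` and `query_points`.

```python
import time


def split_batches(items, n_batches):
    """Split `items` into at most `n_batches` contiguous, near-equal batches."""
    n_batches = max(1, min(n_batches, len(items)))
    size, extra = divmod(len(items), n_batches)
    batches, start = [], 0
    for i in range(n_batches):
        end = start + size + (1 if i < extra else 0)
        batches.append(items[start:end])
        start = end
    return batches


def embed_with_ray(texts, n_workers=4, model_name="BAAI/bge-small-en-v1.5"):
    """Embed `texts` in parallel Ray tasks; return (vectors, elapsed_seconds)."""
    # Third-party imports are local so split_batches works without them.
    import ray
    from fastembed import TextEmbedding

    ray.init(ignore_reinit_error=True)

    @ray.remote
    def embed_batch(batch):
        # Assumption: each task loads its own model copy (simple, not memory-lean).
        model = TextEmbedding(model_name=model_name)
        return [vec.tolist() for vec in model.embed(batch)]

    start = time.perf_counter()
    futures = [embed_batch.remote(b) for b in split_batches(texts, n_workers)]
    vectors = [vec for part in ray.get(futures) for vec in part]
    return vectors, time.perf_counter() - start


def index_and_search(texts, vectors, query_vector, collection="docs_demo", top_k=3):
    """Store vectors in Qdrant and return (score, text) pairs for the nearest hits."""
    from qdrant_client import QdrantClient
    from qdrant_client.models import Distance, PointStruct, VectorParams

    # Port 6333 matches the docker run command in step 2.
    client = QdrantClient(url="http://localhost:6333")
    if client.collection_exists(collection):
        client.delete_collection(collection)
    client.create_collection(
        collection_name=collection,
        vectors_config=VectorParams(size=len(vectors[0]), distance=Distance.COSINE),
    )
    client.upsert(
        collection_name=collection,
        points=[
            PointStruct(id=i, vector=vec, payload={"text": text})
            for i, (text, vec) in enumerate(zip(texts, vectors))
        ],
    )
    hits = client.query_points(
        collection_name=collection, query=query_vector, limit=top_k
    ).points
    return [(hit.score, hit.payload["text"]) for hit in hits]
```

With the Qdrant container running, one might call `vectors, seconds = embed_with_ray(chunks)` to get the timing, embed the query the same way with `embed_with_ray([query], n_workers=1)`, and pass the resulting vector to `index_and_search`. Splitting the input into one batch per task keeps scheduling overhead low compared with one task per text.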

Folder Structure

  • Docs/: This directory contains the PDF documents for which embeddings are generated.
  • RayQdrant.ipynb: Jupyter Notebook containing the code for embedding generation using Ray and similarity search using Qdrant.
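As an illustration of how the PDFs in Docs/ might be turned into text chunks for embedding, here is a minimal sketch using PyPDF2 and nltk from the Requirements list. The function names, the 500-character chunk limit, and the greedy sentence-packing strategy are assumptions made for this example; the notebook may prepare its text differently.

```python
from pathlib import Path


def chunk_sentences(sentences, max_chars=500):
    """Greedily pack consecutive sentences into chunks of at most `max_chars`.

    A single sentence longer than `max_chars` becomes its own chunk.
    """
    chunks, current = [], ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if current and len(candidate) > max_chars:
            chunks.append(current)
            current = sentence
        else:
            current = candidate
    if current:
        chunks.append(current)
    return chunks


def load_pdf_chunks(docs_dir="Docs", max_chars=500):
    """Extract text from every PDF in `docs_dir` and return a list of chunks."""
    # Local imports so chunk_sentences stays usable without these packages.
    import nltk
    from PyPDF2 import PdfReader

    # Tokenizer data for sent_tokenize; newer nltk releases use "punkt_tab".
    for resource in ("punkt", "punkt_tab"):
        nltk.download(resource, quiet=True)

    chunks = []
    for pdf_path in sorted(Path(docs_dir).glob("*.pdf")):
        reader = PdfReader(str(pdf_path))
        # extract_text() may return None or "" for image-only pages.
        text = " ".join((page.extract_text() or "") for page in reader.pages)
        chunks.extend(chunk_sentences(nltk.sent_tokenize(text), max_chars))
    return chunks
```

Chunking on sentence boundaries keeps each embedded passage readable when it comes back as a search result, at the cost of uneven chunk lengths.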

License

This code is provided under the Apache License 2.0.

Feel free to modify and distribute it as needed. If you find issues or have suggestions for improvements, please open an issue or create a pull request.
