towhee's Introduction

https://towhee.io

x2vec, Towhee is all you need!

What is Towhee?

Towhee is a flexible, application-oriented framework for computing embedding vectors over unstructured data. It aims to democratize anything2vec, allowing everyone - from beginner developers to large organizations - to train and deploy complex machine learning pipelines with just a few lines of code.

Towhee has pre-built pipelines for a variety of tasks, including audio/music embeddings, image embeddings, celebrity recognition, and more. For a full list of pipelines, feel free to visit our Towhee hub.

Key features

  • Easy embedding for everyone: Transform your data into vectors with fewer than five lines of code.

  • Rich operators and pipelines: No more reinventing the wheel! Collaborate and share pipelines with the open source community.

  • Automatic versioning: Our versioning mechanism for pipelines and operators ensures that you never run into dependency hell.

  • Support for fine-tuning models*: Feed your dataset into our Trainer and get a new model in just a few easy steps.

  • Deploy to cloud*: Ready-made pipelines can be deployed to the cloud with minimal effort.

Features marked with a star (*) are on our roadmap and have not yet been implemented. Help is always appreciated, so come join our Slack or check out our docs for more information.

Getting started

Towhee requires Python 3.6+ and PyTorch 1.4.0+. Support for TensorFlow and scikit-learn models is coming soon. Towhee can be installed via pip:

% pip install -U pip  # if you run into installation issues, try updating pip
% pip install towhee

Towhee provides a variety of pre-built embedding pipelines. For example, generating an image embedding takes fewer than five lines of code:

>>> from towhee import pipeline

# Use our built-in embedding pipeline
>>> img_path = 'towhee_logo.png'
>>> embedding_pipeline = pipeline('image-embedding')
>>> embedding = embedding_pipeline(img_path)

Your image embedding is now stored in embedding. It's that simple.
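Once you have embeddings, a common next step is to compare them. The sketch below is illustrative and not part of Towhee: `cosine_similarity` is a hypothetical helper, and the vectors are stand-ins, since the exact return type of a pipeline call is not specified here. It assumes each embedding can be flattened into a plain sequence of floats.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Return the cosine similarity of two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)


# Stand-in vectors (assumption): in practice these would be derived from
# the output of embedding_pipeline(...) for two different images.
logo_vec = [0.1, 0.3, 0.5]
same_direction_vec = [0.2, 0.6, 1.0]
unrelated_vec = [0.5, -0.1, -0.04]

print(cosine_similarity(logo_vec, same_direction_vec))
print(cosine_similarity(logo_vec, unrelated_vec))
```

Values close to 1 indicate vectors pointing in nearly the same direction, which is usually read as "similar content".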

Dive deeper

If you find that one of our default embedding pipelines does not suit you, you can also specify a custom pipeline from the hub as follows:

>>> embedding_pipeline = pipeline('towhee/image-embedding-resnet101')

For a full list of supported pipelines, visit our docs page.

Custom machine learning pipelines can be defined in a YAML file or via a Spark-like high-level programming interface (coming soon™). The first time you instantiate and use a pipeline, all Python functions, configuration files, and model weights are automatically downloaded from the Towhee hub. To ease the development process, pipelines which already exist in the local Towhee cache ($HOME/.towhee/pipelines) will be automatically loaded:

# This will load the pipeline defined at $HOME/.towhee/pipelines/fzliu/my-embedding-pipeline.yaml
>>> embedding_pipeline = pipeline('fzliu/my-embedding-pipeline')
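To check whether a pipeline is already cached before instantiating it, the name-to-path mapping described above can be sketched as follows. This is an illustration only: `local_pipeline_path` is a hypothetical helper, not a Towhee API, and the layout it assumes ($HOME/.towhee/pipelines/<author>/<name>.yaml) is inferred from the example path above.

```python
from pathlib import Path
from typing import Optional


def local_pipeline_path(name: str, cache_root: Optional[Path] = None) -> Path:
    """Map 'author/pipeline-name' to its expected YAML file in the local cache."""
    if cache_root is None:
        # Assumed default cache location, taken from the example path above.
        cache_root = Path.home() / ".towhee" / "pipelines"
    author, sep, pipeline_name = name.partition("/")
    if not sep or not author or not pipeline_name or "/" in pipeline_name:
        raise ValueError(f"expected 'author/pipeline-name', got {name!r}")
    return cache_root / author / f"{pipeline_name}.yaml"


path = local_pipeline_path("fzliu/my-embedding-pipeline")
print(path, "(cached)" if path.is_file() else "(not cached)")
```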

Architecture overview

Towhee is composed of three main building blocks - Pipelines, Operators, and a singleton Engine.

  • Pipeline: A Pipeline is a single embedding generation task that is composed of several operators. Operators are connected together within the pipeline via a directed acyclic graph.

  • Operator: An Operator is a single node within a pipeline. An operator can be a machine learning model, a complex algorithm, or a Python function. All files needed to run the operator (code, configs, models, etc.) are contained within a single directory.

  • Engine: The Engine sits at Towhee's core. Given a Pipeline, the Engine will drive dataflow between individual operators, schedule tasks, and monitor compute resource usage (CPU, GPU, etc.). We provide a basic Engine within Towhee to run pipelines on a single-instance machine - K8s and other more complex Engine implementations are coming soon.
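The relationship between the three building blocks can be illustrated with a toy model. This is a conceptual sketch, not Towhee's implementation: `ToyPipeline` and `ToyEngine` are hypothetical names, operators are plain Python callables, and the "engine" simply runs operators one at a time in topological order of the graph (no scheduling or resource monitoring). It uses `graphlib` from the standard library, which requires Python 3.9+.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+
from typing import Any, Callable, Dict, List, Sequence


class ToyPipeline:
    """A directed acyclic graph of named operators (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self.operators: Dict[str, Callable[..., Any]] = {}
        self.inputs: Dict[str, List[str]] = {}

    def add_operator(self, name: str, func: Callable[..., Any],
                     inputs: Sequence[str] = ()) -> None:
        # `inputs` names the upstream operators whose outputs feed this one.
        self.operators[name] = func
        self.inputs[name] = list(inputs)


class ToyEngine:
    """Drives data between operators in dependency order."""

    def run(self, pipeline: ToyPipeline, pipeline_input: Any) -> Dict[str, Any]:
        results: Dict[str, Any] = {}
        # static_order() yields every operator after all of its upstream operators.
        for name in TopologicalSorter(pipeline.inputs).static_order():
            upstream = pipeline.inputs[name]
            # Operators without upstream dependencies receive the pipeline input.
            args = [results[dep] for dep in upstream] if upstream else [pipeline_input]
            results[name] = pipeline.operators[name](*args)
        return results


pipeline = ToyPipeline()
pipeline.add_operator("decode", lambda path: [1.0, 2.0, 3.0])  # stand-in for image decoding
pipeline.add_operator("doubled", lambda v: [2 * x for x in v], inputs=["decode"])
pipeline.add_operator("total", lambda v: sum(v), inputs=["decode"])
pipeline.add_operator("embed", lambda v, t: [x / t for x in v], inputs=["doubled", "total"])

results = ToyEngine().run(pipeline, "towhee_logo.png")
print(results["embed"])
```

Because "doubled" and "total" both depend only on "decode", a more capable engine could run them in parallel; this sketch only guarantees that each operator runs after its inputs are ready.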

For a deeper dive into Towhee and its architecture, check out the Towhee docs.

Contributing

Remember that writing code is not the only way to contribute! Submitting issues, answering questions, and improving documentation are some of the many ways you can join our growing community. Check out our contributing page for more information.

Special thanks go to these folks for contributing to Towhee, either on GitHub, our Towhee Hub, or elsewhere:

binbinlv, chiiizzzy, derekdqc, filip-halt, fzliu, guorentong, jaelgu, jeffoverflow, junjiejiangjjj, oneseer, shiyu22, sre-ci-robot, sutcalag, wxywb, zhousicong
