backprop-ai / backprop

Backprop makes it simple to use, finetune, and deploy state-of-the-art ML models.

License: Other

Python 100.00%

natural-language-processing nlp question-answering bert language-model text-classification multilingual-models image-classification fine-tuning transfer-learning transformers

backprop's Introduction

Solve a variety of tasks with pre-trained models or finetune them in one line for your own tasks.

Out-of-the-box tasks you can solve with Backprop:

Conversational question answering in English
Text Classification in 100+ languages
Image Classification
Text Vectorisation in 50+ languages
Image Vectorisation
Summarisation in English
Emotion detection in English
Text Generation

For more specific use cases, you can adapt a task with little data and a single line of code via finetuning.

⚡ Getting started	Installation, a few-minute introduction
💡 Examples	Finetuning and usage examples
📙 Docs	In-depth documentation about task inference and finetuning
⚙️ Models	Overview of available models

Getting started

Installation

Install Backprop via PyPI:

pip install backprop

Basic task inference

Tasks act as interfaces that let you easily use a variety of supported models.

import backprop

context = "Take a look at the examples folder to see use cases!"

qa = backprop.QA()

# Start building!
answer = qa("Where can I see what to build?", context)

print(answer)
# Prints
"the examples folder"

You can run all tasks and models on your own machine, or in production with our inference API, simply by specifying your api_key.

See how to use all available tasks.

Basic finetuning and uploading

Each task implements finetuning that lets you adapt a model for your specific use case in a single line of code.

A finetuned model is easy to upload to production, letting you focus on building great applications.

import backprop

tg = backprop.TextGeneration("t5-small")

# Any text works as training data
inp = ["I really liked the service I received!", "Meh, it was not impressive."]
out = ["positive", "negative"]

# Finetune with a single line of code
tg.finetune({"input_text": inp, "output_text": out})

# Use your trained model
prediction = tg("I enjoyed it!")

print(prediction)
# Prints
"positive"

# Upload to Backprop for production ready inference
# Describe your model
name = "t5-sentiment"
description = "Predicts positive and negative sentiment"

tg.upload(name=name, description=description, api_key="abc")

See finetuning for other tasks.

Why Backprop?

No experience needed
- Entrance to practical AI should be simple
- Get state-of-the-art performance in your task without being an expert
Data is a bottleneck
- Solve real world tasks without any data
- With transfer learning, even a small amount of data can adapt a task to your niche requirements
There is an overwhelming number of models
- We offer a curated selection of the best open-source models and make them simple to use
- A few general models can accomplish more with less optimisation
Deploying models cost-effectively is hard work
- If our models suit your use case, no deployment is needed: just call our API
- Adapt and deploy your own model with just a few lines of code
- Our API scales, is always available, and you only pay for usage

Examples

Solve any text-based task with finetuning (GitHub, Colab)
Search for images using text (GitHub)
Finding answers from text (GitHub)
More finetuning and task examples

Documentation

Check out our docs for in-depth task inference and finetuning.

Model Hub

Curated list of state-of-the-art models.

Demos

Zero-shot image classification with CLIP.

Credits

Backprop relies on many great libraries to work, most notably:

Feedback

Found a bug or have ideas for new tasks and models? Open an issue.

backprop's Issues

Finetuning Image Text Vectorizer with CLIP

Hello, I tried finetuning the Image-Text Vectorizer CLIP model using the above approach, but I get stuck with the error -

Link to full code - Colab

What I need is something that gives the cosine similarity between an image and a text. Should I finetune with the triplet variant or with cosine similarity? If it is cosine similarity, how do I obtain the similarity values?

The triplet variant takes a text and an image and gives one normalised vector. I am a bit confused, because I thought it would give a cosine similarity.
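
For reference, the cosine similarity asked about here can be computed directly once an image vector and a text vector are available. The sketch below is library-independent: `cosine_similarity` is a hypothetical helper name, and the two example vectors are made-up placeholders rather than output from Backprop or CLIP.

```python
import math

def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

# Placeholder vectors standing in for an image embedding and a text embedding
image_vector = [0.6, 0.8, 0.0]
text_vector = [0.8, 0.6, 0.0]

similarity = cosine_similarity(image_vector, text_vector)
print(similarity)
```

If both vectors are already normalised to unit length, the dot product alone equals the cosine similarity, so the division becomes a no-op.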

How can I execute this on GPU?

I am trying to run the code below on a GPU. Where should I specify the device, and what does the command look like?

Is it device='gpu' or device='cuda', and where should I set it?

This is your old code:

from transformers import T5ForConditionalGeneration, T5Tokenizer

MODEL = "kiri-ai/t5-base-qa-summary-emotion"
TOKENIZER = "t5-base"

# Module-level cache, filled on the first call to generate()
model = None
tokenizer = None

def generate(input_text, model_name: str = None, tokenizer_name: str = None):
    # Refer to global variables
    global model
    global tokenizer
    # Setup
    # Initialise model
    if model is None:
        # Use the default model
        if model_name is None:
            model = T5ForConditionalGeneration.from_pretrained(MODEL)
        # Use the user defined model
        else:
            model = T5ForConditionalGeneration.from_pretrained(model_name)

    # Initialise tokenizer
    if tokenizer is None:
        # Use the default tokenizer
        if tokenizer_name is None:
            tokenizer = T5Tokenizer.from_pretrained(TOKENIZER)
        # Use the user defined tokenizer
        else:
            tokenizer = T5Tokenizer.from_pretrained(tokenizer_name)

    is_list = False
    if isinstance(input_text, list):
        is_list = True

    features = tokenizer(input_text, padding=True, return_tensors='pt')
    tokens = model.generate(input_ids=features['input_ids'],
                            attention_mask=features['attention_mask'], max_length=512)
    if is_list:
        # Decode each generated sequence separately
        return [tokenizer.decode(output, skip_special_tokens=True) for output in tokens]
    else:
        return tokenizer.decode(tokens[0], skip_special_tokens=True)

def process_item(item):
    return f"emotion: {item}"

def emotion(input_text, model_name: str = None, tokenizer_name: str = None):

    if isinstance(input_text, list):
        input_text = [process_item(item) for item in input_text]
    else:
        input_text = process_item(input_text)

    return generate(input_text, model_name=model_name,
                    tokenizer_name=tokenizer_name)

Best,
Chirag
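
On the device question: PyTorch uses the device strings "cuda" and "cpu", and "gpu" is not a valid device name. The sketch below shows one possible way to pick a device and put a model and its inputs on it. `move_to_device` is a hypothetical helper, not part of this library, and the guarded import only exists so the snippet also runs where torch is not installed.

```python
# torch is imported defensively so the snippet also runs where it is not installed
try:
    import torch
    cuda_available = torch.cuda.is_available()
except ImportError:
    cuda_available = False

# Fall back to the CPU when no CUDA device is visible
device = "cuda" if cuda_available else "cpu"

def move_to_device(model, features, device):
    """Put a model and every tensor in `features` on the same device."""
    model = model.to(device)
    features = {name: tensor.to(device) for name, tensor in features.items()}
    return model, features

print(device)
```

In the `generate` function quoted above, such a helper would be called after `tokenizer(...)` and before `model.generate(...)`, so that `input_ids` and `attention_mask` live on the same device as the model.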

self._tokenizer.no_truncation() RuntimeError: Already borrowed

Hi,

I am running the emotion service on CUDA and I get the RuntimeError shown below.

I am sending a batch of 256 texts to extract emotions, calling generate(text, do_sample=False, max_length=512).
Is this related to the maximum sequence length?

File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 2447, in wsgi_app
response = self.full_dispatch_request()
File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1952, in full_dispatch_request
rv = self.handle_user_exception(e)
File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1821, in handle_user_exception
reraise(exc_type, exc_value, tb)
File "/opt/conda/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
raise value
File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
rv = self.dispatch_request()
File "/opt/conda/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
return self.view_functions[rule.endpoint](**req.view_args)
File "/root/sales_emotion/main.py", line 34, in task_level_emotion
response = emotion_task_to_db(task_id)
File "/root/sales_emotion/src/services/emotion_task_level.py", line 67, in emotion_task_to_db
df_insert = task_level_emo(task_id)
File "/root/sales_emotion/src/services/emotion_task_level.py", line 40, in task_level_emo
emotion_labels.extend(kiri.emotion(text))
File "/root/sales_emotion/core.py", line 50, in emotion
return self._emotion(input_text)
File "/root/sales_emotion/models/tasks/emotion.py", line 42, in __call__
return self.model.emotion(text)
File "/root/sales_emotion/models/custom_models.py", line 25, in emotion
return self.generate(text, do_sample=False, max_length=512)
File "/root/sales_emotion/models/models.py", line 138, in generate
features = self.tokenizer(text, return_tensors="pt")
File "/opt/conda/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 2356, in __call__
**kwargs,
File "/opt/conda/lib/python3.7/site-packages/transformers/tokenization_utils_base.py", line 2426, in encode_plus
**kwargs,
File "/opt/conda/lib/python3.7/site-packages/transformers/tokenization_utils_fast.py", line 465, in _encode_plus
**kwargs,
File "/opt/conda/lib/python3.7/site-packages/transformers/tokenization_utils_fast.py", line 372, in _batch_encode_plus
pad_to_multiple_of=pad_to_multiple_of,
File "/opt/conda/lib/python3.7/site-packages/transformers/tokenization_utils_fast.py", line 325, in set_truncation_and_padding
self._tokenizer.no_truncation()
RuntimeError: Already borrowed

Please help.

Chirag Sanghvi
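
A commonly reported trigger for "Already borrowed" is several threads sharing one Rust-backed fast tokenizer, for example under a threaded Flask server like the one in this traceback. Whether that is the cause here is an assumption. If it is, one possible workaround is to serialise tokenizer calls with a lock, as in this minimal sketch. `LockedTokenizer` is a hypothetical wrapper, not a transformers or Backprop class.

```python
import threading

class LockedTokenizer:
    """Wrap a tokenizer-like callable so only one thread uses it at a time."""

    def __init__(self, tokenizer):
        self._tokenizer = tokenizer
        self._lock = threading.Lock()

    def __call__(self, *args, **kwargs):
        # Hold the lock for the whole call so concurrent requests queue up
        with self._lock:
            return self._tokenizer(*args, **kwargs)
```

The wrapper would replace the shared tokenizer once at start-up, for example `self.tokenizer = LockedTokenizer(self.tokenizer)`. Giving each thread its own tokenizer instance is an alternative that avoids the queueing.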

Example for Fine Tuning of CLIP

Hi, thank you for the amazing work.

Could you please add the example notebook/code for fine-tuning of the CLIP model?

How can I finetune the CLIP model for the image classification task? Could you please add an example notebook other than the EfficientNet finetuning one?

Documentation of training on Emotion

Hi,

First of all, thanks a ton for this repository. The results I am seeing are quite promising without any downstream task training. :)

Are you planning to publish a paper or documentation on the data T5 was trained on for emotion?

Best,
Chirag Sanghvi

Finetuning & Cuda

Hello Backprop Team!

Great job on the library.

I was trying to replicate the Generate Questions Fine-tuning example: https://github.com/backprop-ai/backprop/blob/main/examples/Finetuning_GettingStarted.ipynb

However, I'm facing the following error:

Exception: You need a cuda capable (Nvidia) GPU for fine-tuning.

When I pass device="cuda" when constructing the TextGeneration model, the error changes to "No CUDA GPUs are available".

I'm using the following AWS EC2 instance which claims to have NVIDIA CUDA: https://aws.amazon.com/marketplace/pp/Amazon-Web-Services-AWS-Deep-Learning-Base-AMI-Ubu/B07Y3VDBNS#pdp-overview

Moreover, when I run the command nvcc --version, I see the following output:

Please help. Where am I going wrong?

Best,
Karan
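
Note that `nvcc --version` only reports the installed CUDA toolkit compiler. It does not show whether PyTorch can see a GPU, and an AMI is only a disk image, so a GPU is present only if the chosen EC2 instance type includes one. The sketch below collects what PyTorch itself reports. `cuda_report` is a hypothetical helper, and the guarded import only exists so the snippet also runs where torch is not installed.

```python
def cuda_report():
    """Collect what PyTorch can see; defaults are kept if torch is missing."""
    report = {
        "torch_version": None,
        "built_with_cuda": None,
        "cuda_available": False,
        "device_count": 0,
    }
    try:
        import torch
    except ImportError:
        return report
    report["torch_version"] = torch.__version__
    report["built_with_cuda"] = torch.version.cuda  # None for CPU-only builds
    report["cuda_available"] = torch.cuda.is_available()
    report["device_count"] = torch.cuda.device_count()
    return report

report = cuda_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If `built_with_cuda` is None, the installed PyTorch is a CPU-only build. If it is set but `device_count` is 0, the driver or the instance type is the more likely place to look.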

Can't use the ai

Hello, whenever I go to clip.backprop.co, add a picture with some labels and click on "Predict image", I get "something went wrong, try again". Why is that?

Thank you

PDF issue

Hello, I have multiple PDFs that I convert to text with the pdftotext library. I have 2 questions:

- QA: If I pass the entire PDF object as the context, the result is 'unknown', whereas using each page as the context (for page in pdf) gives the correct result.
- Is there a way to use multiple docs/PDFs and generate the result based on the most likely answer?
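
One possible approach to both points is to run QA page by page and then combine the per-page answers. The sketch below uses a simple majority vote as a stand-in for "most likely", since nothing here states whether the QA task returns a confidence score. `answer_over_pages` is a hypothetical helper, and `qa` is assumed to be any callable taking `(question, context)` and returning a string, as in the README example.

```python
from collections import Counter

def answer_over_pages(qa, question, pages, unknown="unknown"):
    """Ask `question` against each page and return the most frequent answer."""
    answers = []
    for page in pages:
        answer = qa(question, page)
        # Skip empty results and the model's "unknown" marker
        if answer and answer.strip().lower() != unknown:
            answers.append(answer.strip())
    if not answers:
        return unknown
    # Majority vote; ties resolve to the answer seen first
    return Counter(answers).most_common(1)[0][0]
```

For several PDFs, the pages can be flattened into one list first, for example `pages = [page for pdf in pdfs for page in pdf]`, where `pdfs` is a hypothetical list of parsed documents.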

CLIP Model

Which CLIP model and weights are used? Is it the same as "openai/clip-vit-base-patch32"?