Comments (6)
Personally would love to have streaming support in pipelines - itβs the one missing feature. Currently, streaming is quite difficult to use, but this would make it so much easier.
from transformers.
Hi @not-lain, thanks for opening a feature request!
using tokenizer.apply_chat_template then other stuff then model.generate is pretty repetitive
Could you elaborate on this a bit e.g. with a code snippet? Is is the streaming feature when generating you wish to be able to use?
from transformers.
@amyeroberts
normally when someone wants to stream their output (example: https://huggingface.co/spaces/ysharma/Chat_with_Meta_llama3_8b) they need to apply all that code, and this has been quite a repetitive process for AI models, and I thought we can implement this within the transformers library.
from transformers.
I was thinking about integrating this with only text-generation models, but I think we can do that too with image-to-text models.
this is a good resource for that: https://huggingface.co/blog/idefics#getting-started-with-idefics
from transformers.
Thanks for sharing an example!
I'm not sure this is really something we want to add to the pipelines. Pipelines are intended to be simple objects which enable users to get predictions in one line, they're not intended to support all transformers' functionality. In this case, I think it makes sense to leave streaming outside as it enables the user to have full control of the threads and yielding logic.
cc @Rocketknight1 @gante for your thoughts
from transformers.
Yeah, I'm on @amyeroberts's side here - pipelines are (imo) a sort of high-level "on-ramp" API for transformers
, which make it easy for users to quickly get outputs from common workflows. We definitely don't want to pack them full of features to handle every use-case - that's what the lower-level API is for! If we make pipelines very feature-heavy, then they become very big and confusing for new users, which defeats their purpose.
Once users are streaming output and working with threads/yielding/async/etc. they're probably advanced enough that they don't need the pipelines anyway.
from transformers.
Related Issues (20)
- DDP error with load_best_model_at_end enabled
- Error while moving model to GPU `NotImplementedError: Cannot copy out of meta tensor; no data!` HOT 6
- KV cache with CPU offloading HOT 6
- Refusal rejection removal as a feature
- Add static cache support for Whisper HOT 8
- from_pretrained torch_dtype DO NOT affect model buffers HOT 4
- Error with tf-keras when trying to geneate random seeds HOT 2
- Error while runing T5 trainer: TypeError: argument 'ids': 'list' object cannot be interpreted as an integer HOT 3
- Is `model. generate` supported during the training process? HOT 4
- CLIPProcessor is not loading the saved Processor of the same version HOT 12
- Failed to Download GPT2-large Model from Hub HOT 3
- Add TableTransformerImageProcessor HOT 3
- error when convert llama1 ckpts to hf formath HOT 7
- `hub_strategy="every_save"` won't push the model to the Hub if large
- Support for Multiple Datasets and Domain-Specific Loss Calculation in Trainer HOT 2
- AttributeError: 'HQQLinear' object has no attribute 'weight' HOT 8
- Assisted model doesn't seem to be working for Meta-Llama-3-8B HOT 2
- Mixtral past_key_values and output_router_logits incompatible HOT 1
- Disable Progress Bar? HOT 1
- Meet problems when I use the file src/transformers/models/llama/convert_llama_weights_to_hf.py to transfer LlaMa-7B HOT 4
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
π Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. πππ
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google β€οΈ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from transformers.