Comments (8)
Hi @krrishdholakia that docker pull command works on my M1 MacBook. Any further context you can provide about your environment?
I did uncover an issue with the latest nightly docker though, so we will be looking into that:
➜ ~ docker pull ghcr.io/neuralmagic/deepsparse-nightly
Using default tag: latest
latest: Pulling from neuralmagic/deepsparse-nightly
no matching manifest for linux/arm64/v8 in the manifest list entries
EDIT: It seems we do not have a Docker image available for ARM. Please try installing deepsparse-nightly from PyPI instead.
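As a rough illustration of the advice above, the sketch below picks an install route from the host architecture. The helper name `suggested_install_route` and the set of ARM architecture strings are assumptions made for this example; they are not part of deepsparse.

```python
import platform

# Architecture names that platform.machine() commonly reports on ARM hosts
# (assumption: "arm64" on Apple Silicon macOS, "aarch64" on ARM Linux).
ARM_MACHINES = {"arm64", "aarch64"}


def suggested_install_route(machine: str) -> str:
    """Hypothetical helper: return "pip" on ARM hosts, "docker" otherwise.

    Mirrors the thread's advice that no ARM Docker image is published,
    so ARM users should install from PyPI instead.
    """
    return "pip" if machine.lower() in ARM_MACHINES else "docker"


if __name__ == "__main__":
    # Report the route suggested for the machine running this script.
    print(suggested_install_route(platform.machine()))
```

On an M1 MacBook this would suggest the pip route, which matches the `no matching manifest for linux/arm64/v8` error shown above.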
from deepsparse.
@mgoin I'm trying to get a local server running so I can add support for it via litellm. How do I do that with the pip package?
pip install deepsparse-nightly
deepsparse.server \
--task sentiment-analysis \
--model_path zoo:nlp/sentiment_analysis/obert-base/pytorch/huggingface/sst2/pruned90_quant-none
<img width="825" alt="Screenshot 2023-12-09 at 10 57 49 AM" src="https://github.com/neuralmagic/deepsparse/assets/17561003/9aadd52b-9327-4252-89b3-375083f801d9">
Hi @krrishdholakia, what version of Python do you have? We support Python 3.8-3.11 as of 1.6 stable or nightly.
If you want to use the server and transformers, you need to install those extras, as in pip install -U deepsparse-nightly[server,transformers].
Here is a colab notebook as an example: https://colab.research.google.com/drive/1Ng10jwBLUs81SDzZLE9P8G-q8D2YyKeL?usp=sharing
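The supported Python range quoted above can be checked before installing. This is a minimal sketch; `python_supported` and the two constants are hypothetical names for this example, and the bounds are taken only from the 3.8-3.11 range stated in this thread.

```python
import sys

# Range quoted in the thread for deepsparse 1.6 (stable or nightly).
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 11)


def python_supported(version=None) -> bool:
    """Hypothetical helper: True if the (major, minor) version is in range.

    `version` may be any tuple-like such as sys.version_info; when omitted,
    the running interpreter's version is used.
    """
    if version is None:
        version = sys.version_info
    major_minor = tuple(version[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    print(python_supported())
```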
Thanks for the colab @mgoin. I have 3.11.
@mgoin any suggestions for how I can actually use / test the server from the colab? I was thinking about running it via ngrok, but I'm not sure how I could wrap it.
Hey @krrishdholakia, I wouldn't recommend trying to host a server from colab since it isn't a supported flow from Google.
You should be able to run that notebook locally on your MacBook since we have native ARM macOS support in the deepsparse release. I just shared the colab as an example of an environment working without Docker.
Please let me know if you need a Docker image built for ARM; otherwise, please go the pip install -U deepsparse-nightly[server,llm] route in your local Python environment.
We have Docker images ready if you are on an x86 machine. Here is an example running on Docker for Windows:
Server:
docker run -p 5543:5543 -it ghcr.io/neuralmagic/deepsparse-nightly:20231220 deepsparse.server --task text-generation --integration openai --model_path hf:mgoin/llama2.c-stories15M-ds
Client:
curl http://localhost:5543/v1/chat/completions \
-H "Content-Type: application/json" \
-H "Authorization: Bearer dummy" \
-d '{
"model": "hf:mgoin/llama2.c-stories15M-ds",
"messages": "Once upon a time"
}'
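For anyone who would rather test the server from Python than from curl, here is a minimal standard-library sketch of the same request. The function names `build_chat_request` and `send_chat_request` are made up for this example. The URL, headers, model name, and payload shape are copied from the curl command above, and the response is returned as parsed JSON without assuming anything about its fields.

```python
import json
import urllib.request

# Values copied from the curl example in this thread.
DEFAULT_MODEL = "hf:mgoin/llama2.c-stories15M-ds"
DEFAULT_BASE_URL = "http://localhost:5543"


def build_chat_request(prompt, model=DEFAULT_MODEL, base_url=DEFAULT_BASE_URL):
    """Build (but do not send) the POST request used by the curl example."""
    # The payload mirrors the curl body: "messages" is sent as a plain string.
    payload = {"model": model, "messages": prompt}
    return urllib.request.Request(
        url=f"{base_url}/v1/chat/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            "Authorization": "Bearer dummy",
        },
        method="POST",
    )


def send_chat_request(prompt, timeout=60.0):
    """Send the request to a running server and return the decoded JSON body."""
    request = build_chat_request(prompt)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the server from the command above running, `send_chat_request("Once upon a time")` should return whatever JSON the server responds with. Building the request separately from sending it makes it possible to inspect the payload without a live server.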
Hi @krrishdholakia
As some time has passed with no further updates, I am going to go ahead and close out this issue.
Please re-open if you want to continue the conversation.
Best, Jeannie / Neural Magic