Comments (5)
This is the command I used to export the Transformers model to ONNX:
!optimum-cli export onnx --model /content/bert-base-indonesian-1.5G-sentiment-analysis-smsa bert-sentiment-onnx-fp16-opset/ --opset 13 --task text-classification --optimize 'O1' --device 'cuda' --fp16
from deepsparse.
Hi @farizalmustaqim, fp16 ONNX models aren't supported in DeepSparse, or in CPU runtimes generally, so please try your command with these edits:
optimum-cli export onnx --model /content/bert-base-indonesian-1.5G-sentiment-analysis-smsa bert-sentiment-onnx-fp32-opset/ --opset 13 --task text-classification
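The edit above amounts to dropping three flags from the original command. As a rough sketch of that transformation (the helper name `strip_fp16_flags` and the `DROPPED_FLAGS` table are hypothetical, made up for illustration; only the flag names come from the two commands in this thread), it could be written as:

```python
import shlex

# Flags that appear in the original command but not in the suggested one.
# The boolean says whether the flag is followed by a value to drop as well.
# This table is an illustration, not part of optimum or DeepSparse.
DROPPED_FLAGS = {"--fp16": False, "--device": True, "--optimize": True}


def strip_fp16_flags(command: str) -> str:
    """Return the command with the fp16/GPU-related flags removed."""
    kept = []
    skip_next = False
    for token in shlex.split(command):
        if skip_next:
            # This token is the value of a flag that was just dropped.
            skip_next = False
            continue
        if token in DROPPED_FLAGS:
            skip_next = DROPPED_FLAGS[token]
            continue
        kept.append(token)
    return shlex.join(kept)
```

Note that this sketch does not rename the output directory; the suggested command also changes `fp16` to `fp32` in the directory name by hand.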
Oh really? I was able to run YOLOv8 inference in a DeepSparse pipeline using an ONNX model exported with the fp16 option to reduce the model size. Is that not possible for NLP models?
@farizalmustaqim That is interesting to hear. I guess it might be possible, though it would probably just run in a naive backend. Even in the optimum codebase, they raise an exception if you try to export fp16 on a CPU device: https://github.com/huggingface/optimum/blob/5017d06603488f396537e69ff77055907fae79d0/optimum/exporters/onnx/__main__.py#L295
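For readers who don't want to follow the link, the kind of check described here can be sketched as below. This is an illustrative guard written for this thread, not a copy of the optimum source; the function name `check_export_options` and the error message are assumptions.

```python
def check_export_options(device: str, fp16: bool) -> None:
    """Hypothetical guard: reject an fp16 export unless a CUDA device is requested."""
    if fp16 and not device.startswith("cuda"):
        raise ValueError(
            "fp16 export requires a GPU; pass a CUDA device (e.g. --device cuda) "
            "or drop the fp16 option."
        )


# fp32 on CPU and fp16 on CUDA pass silently; fp16 on CPU raises ValueError.
check_export_options("cpu", fp16=False)
check_export_options("cuda", fp16=True)
```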
Hi @farizalmustaqim
As some time has passed with no further updates, I am going to go ahead and close out this issue.
Please re-open if you want to continue the conversation.
Best, Jeannie / Neural Magic