microsoft / onnxruntime

ONNX Runtime: cross-platform, high performance ML inferencing and training accelerator

Home Page: https://onnxruntime.ai

License: MIT License

Batchfile 0.01% Shell 0.06% CMake 0.24% C# 1.13% C++ 89.19% C 2.74% PowerShell 0.02% Python 3.29% Assembly 0.82% Cuda 0.91% Objective-C 0.08% Java 0.24% HLSL 0.01% Jupyter Notebook 0.09% JavaScript 0.39% TypeScript 0.69% Pascal 0.01% Objective-C++ 0.07% NASL 0.01% Kotlin 0.01%
deep-learning onnx neural-networks machine-learning ai-framework hardware-acceleration pytorch tensorflow scikit-learn

onnxruntime's Introduction

ONNX Runtime is a cross-platform inference and training machine-learning accelerator.

ONNX Runtime inference can enable faster customer experiences and lower costs, supporting models from deep learning frameworks such as PyTorch and TensorFlow/Keras as well as classical machine learning libraries such as scikit-learn, LightGBM, XGBoost, etc. ONNX Runtime is compatible with different hardware, drivers, and operating systems, and provides optimal performance by leveraging hardware accelerators where applicable alongside graph optimizations and transforms. Learn more →
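For a flavor of the inference API, here is a minimal Python sketch (assuming the onnxruntime package and a float32 model saved as model.onnx; names are illustrative):

import numpy as np
import onnxruntime as rt

# Load the model and run one inference on random input.
sess = rt.InferenceSession("model.onnx")
inp = sess.get_inputs()[0]
# Substitute 1 for any symbolic/unknown dimensions in this toy example.
shape = [d if isinstance(d, int) else 1 for d in inp.shape]
x = np.random.rand(*shape).astype(np.float32)
outputs = sess.run(None, {inp.name: x})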

ONNX Runtime training can accelerate the model training time on multi-node NVIDIA GPUs for transformer models with a one-line addition for existing PyTorch training scripts. Learn more →
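The "one-line addition" refers to wrapping an existing model with ORTModule from the torch-ort package; a sketch, assuming an existing torch.nn.Module (MyPyTorchModel is hypothetical):

from torch_ort import ORTModule

model = MyPyTorchModel()   # any existing torch.nn.Module
model = ORTModule(model)   # the one-line change; the training loop stays the same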

Get Started & Resources

Built-in Pipeline Status

Build status badges per platform (Inference and Training pipelines): Windows, Linux, Mac, Android, iOS, Web, and Other.

Third-party Pipeline Status

Build status badge (Inference and Training pipelines): Linux.

Data/Telemetry

Windows distributions of this project may collect usage data and send it to Microsoft to help improve our products and services. See the privacy statement for more details.

Contributions and Feedback

We welcome contributions! Please see the contribution guidelines.

For feature requests or bug reports, please file a GitHub Issue.

For general discussion or questions, please use GitHub Discussions.

Code of Conduct

This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact [email protected] with any additional questions or comments.

License

This project is licensed under the MIT License.

onnxruntime's People

Contributors

adrianlizarraga, askhade, baijumeswani, chilo-ms, cloudhan, dependabot[bot], edgchen1, er3x3, fdwr, fs-eire, guoyu-wang, hariharans29, hectorsvc, jywu-msft, mszhanyi, peixuanzuo, pengwa, pranavsharma, randysheriffh, ryanunderhill, skottmckay, smk2007, snnn, thiagocrepaldi, tianleiwu, tracysh, wangyems, xadupre, yufenglee, yuslepukhin


onnxruntime's Issues

Python API: onnxruntime load model error

RuntimeError: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Type Error: Type 'tensor(int32)' of input parameter (Reshape_2/shape) of operator (Reshape) in node (Reshape_2) is invalid.

Errors while using docker build

Host OS: Ubuntu 16.04
Docker version: 18.09.1

When trying to build ONNX runtime from source by running:

export BUILD_DEVICE=cpu
./tools/ci_build/github/linux/run_dockerbuild.sh

The build fails with the following errors. Any tips on what could be missing?

2019-01-29 19:33:24,143 Build [DEBUG] - Running subprocess in '/home/onnxruntimedev'
['/usr/bin/python3', '-m', 'pip', 'install', '--trusted-host', 'files.pythonhosted.org', 'setuptools', 'wheel', 'numpy']
The directory '/home/onnxruntimedev/.cache/pip/http' or its parent directory is not owned by the current user and the cache has been disabled. Please check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
The directory '/home/onnxruntimedev/.cache/pip' or its parent directory is not owned by the current user and caching wheels has been disabled. check the permissions and owner of that directory. If executing pip with sudo, you may want sudo's -H flag.
Collecting setuptools
Downloading https://files.pythonhosted.org/packages/bf/ae/a23db1762646069742cc21393833577d3fa438eecaa59d11fb04fa57fcd5/setuptools-40.7.1-py2.py3-none-any.whl (574kB)
Collecting wheel
Downloading https://files.pythonhosted.org/packages/ff/47/1dfa4795e24fd6f93d5d58602dd716c3f101cfd5a77cd9acbe519b44a0a9/wheel-0.32.3-py2.py3-none-any.whl
Collecting numpy
Downloading https://files.pythonhosted.org/packages/64/24/2e9c72f44cec8c872000d78c54230e40550c494647e352d1d06724cdaee6/numpy-1.16.0-cp35-cp35m-manylinux1_x86_64.whl (17.2MB)
Installing collected packages: setuptools, wheel, numpy
Exception:
Traceback (most recent call last):
File "/usr/lib/python3/dist-packages/pip/basecommand.py", line 209, in main
status = self.run(options, args)
File "/usr/lib/python3/dist-packages/pip/commands/install.py", line 335, in run
prefix=options.prefix_path,
File "/usr/lib/python3/dist-packages/pip/req/req_set.py", line 732, in install
**kwargs
File "/usr/lib/python3/dist-packages/pip/req/req_install.py", line 837, in install
self.move_wheel_files(self.source_dir, root=root, prefix=prefix)
File "/usr/lib/python3/dist-packages/pip/req/req_install.py", line 1039, in move_wheel_files
isolated=self.isolated,
File "/usr/lib/python3/dist-packages/pip/wheel.py", line 247, in move_wheel_files
prefix=prefix,
File "/usr/lib/python3/dist-packages/pip/locations.py", line 153, in distutils_scheme
i.finalize_options()
File "/usr/lib/python3.5/distutils/command/install.py", line 350, in finalize_options
self.create_home_path()
File "/usr/lib/python3.5/distutils/command/install.py", line 575, in create_home_path
os.makedirs(path, 0o700)
File "/usr/lib/python3.5/os.py", line 241, in makedirs
mkdir(name, mode)
PermissionError: [Errno 13] Permission denied: '/home/onnxruntimedev/.local'
You are using pip version 8.1.1, however version 19.0.1 is available.
You should consider upgrading via the 'pip install --upgrade pip' command.
Traceback (most recent call last):
File "/onnxruntime_src/tools/ci_build/github/linux/../../build.py", line 604, in
sys.exit(main())
File "/onnxruntime_src/tools/ci_build/github/linux/../../build.py", line 561, in main
install_python_deps()
File "/onnxruntime_src/tools/ci_build/github/linux/../../build.py", line 200, in install_python_deps
run_subprocess([sys.executable, '-m', 'pip', 'install', '--trusted-host', 'files.pythonhosted.org'] + dep_packages)
File "/onnxruntime_src/tools/ci_build/github/linux/../../build.py", line 156, in run_subprocess
return subprocess.run(args, cwd=cwd, check=True, stdout=stdout, stderr=stderr, env=my_env, shell=shell)
File "/usr/lib/python3.5/subprocess.py", line 708, in run
output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/usr/bin/python3', '-m', 'pip', 'install', '--trusted-host', 'files.pythonhosted.org', 'setuptools', 'wheel', 'numpy']' returned non-zero exit status 2

Inconsistency of ONNX proto

Describe the bug

https://github.com/Microsoft/onnxruntime/blob/dc8b37f4c491872f75d4fe1963525f98fdb9974b/onnxruntime/core/protobuf/onnx-ml.proto#L524-L563

This part of the code masks the missing #include <onnx/onnx-operators_pb.h>.
When a library links against both the official onnx and ORT, we get compile errors like "OperatorStatus is not a member of onnx".

Maybe adding onnx-operators_pb.h here would help:
https://github.com/Microsoft/onnxruntime/blob/b7cc611563cfd1bafdff14e38fb50ec9c48c3d68/include/onnxruntime/core/graph/onnx_protobuf.h#L29

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Win10
  • ONNX Runtime installed from (source or binary): source
  • ONNX Runtime version: master
  • Python version: 3.7.2
  • GCC/Compiler version (if compiling from source): VS 2017 15.9

Test failure

Describe the bug

I found the following test failures while building ORT.

[ RUN      ] Scan9.ShortSequenceOneInBatchOneLoopStateVar_NoShapeInMainGraph_NoTypeAndShapeInSubgraph
2019-01-21 12:36:35.1219826 [E:onnxruntime:Default, provider_test_utils.cc:361 onnxruntime::test::OpTester::Run] Resolve failed with status: [ShapeInferenceError] Axis value -1952214784 is invalid for a tensor of rank 2
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(362): error: Value of: status.IsOK()
  Actual: false
Expected: true
[ShapeInferenceError] Axis value -1952214784 is invalid for a tensor of rank 2
[  FAILED  ] Scan9.ShortSequenceOneInBatchOneLoopStateVar_NoShapeInMainGraph_NoTypeAndShapeInSubgraph (2 ms)
[ RUN      ] Scan9.OnnxScalarLoopState
2019-01-21 12:36:35.1243646 [E:onnxruntime:Default, provider_test_utils.cc:361 onnxruntime::test::OpTester::Run] Resolve failed with status: [ShapeInferenceError] Axis value -1952211872 is invalid for a tensor of rank 2
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(362): error: Value of: status.IsOK()
  Actual: false
Expected: true
[ShapeInferenceError] Axis value -1952211872 is invalid for a tensor of rank 2
[  FAILED  ] Scan9.OnnxScalarLoopState (24 ms)
[ RUN      ] Scan9.OuterScopeAccess_NoShapeInMainGraph_TypeAndShapeInSubgraph
2019-01-21 12:36:35.1494243 [E:onnxruntime:Default, provider_test_utils.cc:361 onnxruntime::test::OpTester::Run] Resolve failed with status: [ShapeInferenceError] Axis value -1952214784 is invalid for a tensor of rank 2
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(362): error: Value of: status.IsOK()
  Actual: false
Expected: true
[ShapeInferenceError] Axis value -1952214784 is invalid for a tensor of rank 2
[  FAILED  ] Scan9.OuterScopeAccess_NoShapeInMainGraph_TypeAndShapeInSubgraph (2 ms)
[ RUN      ] Scan9.OuterScopeAccess_ShapeInMainGraph_NoTypeAndShapeInSubgraph
2019-01-21 12:36:35.1517906 [E:onnxruntime:Default, provider_test_utils.cc:361 onnxruntime::test::OpTester::Run] Resolve failed with status: [ShapeInferenceError] Axis value -1952213664 is invalid for a tensor of rank 2
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(362): error: Value of: status.IsOK()
  Actual: false
Expected: true
[ShapeInferenceError] Axis value -1952213664 is invalid for a tensor of rank 2
[  FAILED  ] Scan9.OuterScopeAccess_ShapeInMainGraph_NoTypeAndShapeInSubgraph (1 ms)
[ RUN      ] Scan9.OuterScopeAccess_NoShapeInMainGraph_NoTypeAndShapeInSubgraph
2019-01-21 12:36:35.1541204 [E:onnxruntime:Default, provider_test_utils.cc:361 onnxruntime::test::OpTester::Run] Resolve failed with status: [ShapeInferenceError] Axis value -1952211312 is invalid for a tensor of rank 2
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(362): error: Value of: status.IsOK()
  Actual: false
Expected: true
[ShapeInferenceError] Axis value -1952211312 is invalid for a tensor of rank 2
[  FAILED  ] Scan9.OuterScopeAccess_NoShapeInMainGraph_NoTypeAndShapeInSubgraph (2 ms)
[ RUN      ] Scan9.BadShape
[       OK ] Scan9.BadShape (1 ms)
[ RUN      ] Scan9.ReversedInput
2019-01-21 12:36:35.2563468 [E:onnxruntime:Default, provider_test_utils.cc:361 onnxruntime::test::OpTester::Run] Resolve failed with status: [ShapeInferenceError] Axis value -1952213664 is invalid for a tensor of rank 2
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(362): error: Value of: status.IsOK()
  Actual: false
Expected: true
[ShapeInferenceError] Axis value -1952213664 is invalid for a tensor of rank 2
[  FAILED  ] Scan9.ReversedInput (2 ms)
[ RUN      ] Scan9.ReversedOutput
2019-01-21 12:36:35.2586460 [E:onnxruntime:Default, provider_test_utils.cc:361 onnxruntime::test::OpTester::Run] Resolve failed with status: [ShapeInferenceError] Axis value -1952212544 is invalid for a tensor of rank 2
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(362): error: Value of: status.IsOK()
  Actual: false
Expected: true
[ShapeInferenceError] Axis value -1952212544 is invalid for a tensor of rank 2
[  FAILED  ] Scan9.ReversedOutput (2 ms)
[ RUN      ] Scan9.TransposeInput
2019-01-21 12:36:35.2609655 [E:onnxruntime:Default, provider_test_utils.cc:361 onnxruntime::test::OpTester::Run] Resolve failed with status: Node:node1 Unrecognized attribute: axes for operator Scan
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(362): error: Value of: status.IsOK()
  Actual: false
Expected: true
Node:node1 Unrecognized attribute: axes for operator Scan
[  FAILED  ] Scan9.TransposeInput (1 ms)
[ RUN      ] Scan9.InvalidInput
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(359): error: Value of: status.ErrorMessage()
Expected: has substring "Invalid values in 'scan_input_directions'."
  Actual: "[ShapeInferenceError] Axis value -1952622704 is invalid for a tensor of rank 2"
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(359): error: Value of: status.ErrorMessage()
Expected: has substring "Invalid values in 'scan_output_directions'."
  Actual: "[ShapeInferenceError] Axis value -1952214448 is invalid for a tensor of rank 2"
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(359): error: Value of: status.ErrorMessage()
Expected: has substring "Number of entries in 'scan_input_directions' was 3 but expected 2"
  Actual: "[ShapeInferenceError] Axis value -1952211984 is invalid for a tensor of rank 2"
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(359): error: Value of: status.ErrorMessage()
Expected: has substring "Number of entries in 'scan_output_directions' was 3 but expected 4"
  Actual: "[ShapeInferenceError] Axis value -1952211984 is invalid for a tensor of rank 2"
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(359): error: Value of: status.ErrorMessage()
Expected: has substring "Invalid value in axes for input 0 of 2. Input tensor rank was 2"
  Actual: "Node:node1 Unrecognized attribute: axes for operator Scan"
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\providers\provider_test_utils.cc(359): error: Value of: status.ErrorMessage()
Expected: has substring "[ShapeInferenceError] Number of axes specified (3) is not equal to number of scan inputs (2)."
  Actual: "Node:node1 Unrecognized attribute: axes for operator Scan"
[  FAILED  ] Scan9.InvalidInput (8 ms)

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Win10
  • ONNX Runtime installed from (source or binary): source
  • ONNX Runtime version: master
  • Python version: 3.7.2
  • GCC/Compiler version (if compiling from source): VS2017 15.9.4


CUDA Test failed on Windows

Describe the bug

CUDA tests failed on Windows with an old GPU.
See the logs below for details.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Win10
  • ONNX Runtime installed from (source or binary): source
  • ONNX Runtime version: master
  • Python version: 3.7
  • GCC/Compiler version (if compiling from source): VS2017
  • CUDA/cuDNN version: 10.0/7.4.2.24
  • GPU model and memory: Quadro K620 (also have 1080Ti on the system but K620 is GPU0)

To Reproduce

Run onnxruntime_test_all.exe and onnxruntime_shared_lib_test.exe.

Screenshots

onnxruntime_test_all:

[----------] 2 tests from CUDAFenceTests
[ RUN      ] CUDAFenceTests.TileWithInitializer
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(185): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 5.60519e-45
  expected_output[i]
    Which is: -1
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(185): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: -1.4013e-45
  expected_output[i]
    Which is: 2
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(185): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: -1
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(185): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: 2
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(185): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: 3
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(185): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: -4
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(185): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: 3
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(185): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: -4
[  FAILED  ] CUDAFenceTests.TileWithInitializer (1716 ms)
[ RUN      ] CUDAFenceTests.TileWithComputedInput
2018-12-27 18:55:50.6266368 [E:onnxruntime:Default, cuda_call.cc:93 onnxruntime::CudaCall] CUDA failure 77: an illegal memory access was encountered ; GPU=0 ; hostname=LTL-DP ; expr=cudaMemcpy_ptds(dst_data, src_data, bytes, cudaMemcpyDeviceToHost);
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: -1
  expected_output[i]
    Which is: 7
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 2
  expected_output[i]
    Which is: -10
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 3
  expected_output[i]
    Which is: 7
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: -4
  expected_output[i]
    Which is: -10
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: -15
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: 22
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: -15
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: 22
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: 7
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: -10
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: 7
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: -10
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: -15
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: 22
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: -15
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\framework\cuda\fence_cuda_test.cc(251): error: Expected equality of these values:
  output.template Data<float>()[i]
    Which is: 0
  expected_output[i]
    Which is: 22

onnxruntime_shared_lib_test:

[ RUN      ] CApiTestWithProviders/CApiTestWithProvider.simple/1
Running simple inference with cuda provider
2018-12-27 19:06:10.7618910 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 256
2018-12-27 19:06:10.7628040 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 512
2018-12-27 19:06:10.7629832 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 1024
2018-12-27 19:06:10.7631395 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 2048
2018-12-27 19:06:10.7632994 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 4096
2018-12-27 19:06:10.7634528 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 8192
2018-12-27 19:06:10.7636054 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 16384
2018-12-27 19:06:10.7637581 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 32768
2018-12-27 19:06:10.7639104 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 65536
2018-12-27 19:06:10.7640627 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 131072
2018-12-27 19:06:10.7642152 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 262144
2018-12-27 19:06:10.7643677 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 524288
2018-12-27 19:06:10.7645194 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 1048576
2018-12-27 19:06:10.7646717 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 2097152
2018-12-27 19:06:10.7648234 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 4194304
2018-12-27 19:06:10.7649758 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 8388608
2018-12-27 19:06:10.7651281 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 16777216
2018-12-27 19:06:10.7652814 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 33554432
2018-12-27 19:06:10.7654338 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 67108864
2018-12-27 19:06:10.7655868 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 134217728
2018-12-27 19:06:10.7657393 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 268435456
2018-12-27 19:06:10.7659054 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 256
2018-12-27 19:06:10.7660596 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 512
2018-12-27 19:06:10.7662132 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 1024
2018-12-27 19:06:10.7663660 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 2048
2018-12-27 19:06:10.7665177 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 4096
2018-12-27 19:06:10.7666698 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 8192
2018-12-27 19:06:10.7668210 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 16384
2018-12-27 19:06:10.7669726 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 32768
2018-12-27 19:06:10.7671262 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 65536
2018-12-27 19:06:10.7673427 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 131072
2018-12-27 19:06:10.7675006 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 262144
2018-12-27 19:06:10.7676537 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 524288
2018-12-27 19:06:10.7678051 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 1048576
2018-12-27 19:06:10.7679572 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 2097152
2018-12-27 19:06:10.7681093 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 4194304
2018-12-27 19:06:10.7682677 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 8388608
2018-12-27 19:06:10.7684198 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 16777216
2018-12-27 19:06:10.7685717 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 33554432
2018-12-27 19:06:10.7687234 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 67108864
2018-12-27 19:06:10.7688753 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 134217728
2018-12-27 19:06:10.7690271 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 268435456
2018-12-27 19:06:10.7694816 [I:onnxruntime:InferenceSession, inference_session.cc:350 onnxruntime::InferenceSession::Impl::Initialize] Initializing session.
2018-12-27 19:06:10.7696910 [I:onnxruntime:InferenceSession, inference_session.cc:364 onnxruntime::InferenceSession::Impl::Initialize] Adding default CPU execution provider.
2018-12-27 19:06:10.7698515 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 256
2018-12-27 19:06:10.7700105 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 512
2018-12-27 19:06:10.7701609 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 1024
2018-12-27 19:06:10.7703700 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 2048
2018-12-27 19:06:10.7705302 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 4096
2018-12-27 19:06:10.7706810 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 8192
2018-12-27 19:06:10.7708314 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 16384
2018-12-27 19:06:10.7709824 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 32768
2018-12-27 19:06:10.7711326 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 65536
2018-12-27 19:06:10.7918542 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 131072
2018-12-27 19:06:10.7920390 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 262144
2018-12-27 19:06:10.7922010 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 524288
2018-12-27 19:06:10.7923545 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 1048576
2018-12-27 19:06:10.7925052 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 2097152
2018-12-27 19:06:10.7926556 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 4194304
2018-12-27 19:06:10.7928185 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 8388608
2018-12-27 19:06:10.7929689 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 16777216
2018-12-27 19:06:10.7931194 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 33554432
2018-12-27 19:06:10.7933425 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 67108864
2018-12-27 19:06:10.7935094 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 134217728
2018-12-27 19:06:10.7936630 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 268435456
2018-12-27 19:06:10.7949537 [I:onnxruntime:Default, kernel_registry.cc:220 onnxruntime::KernelRegistry::TryFindKernel] Mul kernel is not supported in CUDAExecutionProvider Encountered following errors: Op: Mul Execution provider mismatch. Expected: CUDAExecutionProvider Acutal: CPUExecutionProvider Op: Mul Execution provider mismatch. Expected: CUDAExecutionProvider Acutal: CPUExecutionProvider Op: Mul Execution provider mismatch. Expected: CUDAExecutionProvider Acutal: CPUExecutionProvider Op: Mul Execution provider mismatch. Expected: CUDAExecutionProvider Acutal: CPUExecutionProvider
2018-12-27 19:06:10.7957519 [I:onnxruntime:Default, kernel_registry.cc:220 onnxruntime::KernelRegistry::TryFindKernel] Mul kernel is not supported in CPUExecutionProvider Encountered following errors: Op: Mul Execution provider mismatch. Expected: CPUExecutionProvider Acutal: CUDAExecutionProvider Op: Mul Execution provider mismatch. Expected: CPUExecutionProvider Acutal: CUDAExecutionProvider Op: Mul Execution provider mismatch. Expected: CPUExecutionProvider Acutal: CUDAExecutionProvider Op: Mul Execution provider mismatch. Expected: CPUExecutionProvider Acutal: CUDAExecutionProvider Op: Mul Execution provider mismatch. Expected: CPUExecutionProvider Acutal: CUDAExecutionProvider Op: Mul Execution provider mismatch. Expected: CPUExecutionProvider Acutal: CUDAExecutionProvider Op: Mul Execution provider mismatch. Expected: CPUExecutionProvider Acutal: CUDAExecutionProvider
2018-12-27 19:06:10.7967512 [I:onnxruntime:Default, kernel_registry.cc:220 onnxruntime::KernelRegistry::TryFindKernel] Mul kernel is not supported in  Encountered following errors:
2018-12-27 19:06:10.7969745 [I:onnxruntime:InferenceSession, session_state_initializer.cc:137 onnxruntime::SaveMLValueNameIndexMapping] SaveMLValueNameIndexMapping
2018-12-27 19:06:10.7972119 [I:onnxruntime:InferenceSession, session_state_initializer.cc:183 onnxruntime::SaveMLValueNameIndexMapping] Done saving MLValue mappings.
2018-12-27 19:06:10.7973902 [I:onnxruntime:Default, kernel_registry.cc:220 onnxruntime::KernelRegistry::TryFindKernel] Mul kernel is not supported in  Encountered following errors:
2018-12-27 19:06:10.7977274 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 256
2018-12-27 19:06:10.7979259 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 512
2018-12-27 19:06:10.7981400 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 1024
2018-12-27 19:06:10.7983035 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 2048
2018-12-27 19:06:10.7985619 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 4096
2018-12-27 19:06:10.7987218 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 8192
2018-12-27 19:06:10.7988733 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 16384
2018-12-27 19:06:10.7990237 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 32768
2018-12-27 19:06:10.7991751 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 65536
2018-12-27 19:06:10.7993321 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 131072
2018-12-27 19:06:10.7994840 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 262144
2018-12-27 19:06:10.7996905 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 524288
2018-12-27 19:06:10.7998502 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 1048576
2018-12-27 19:06:10.8000008 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 2097152
2018-12-27 19:06:10.8001503 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 4194304
2018-12-27 19:06:10.8185119 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 8388608
2018-12-27 19:06:10.8186971 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 16777216
2018-12-27 19:06:10.8188517 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 33554432
2018-12-27 19:06:10.8190039 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 67108864
2018-12-27 19:06:10.8191546 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 134217728
2018-12-27 19:06:10.8193067 [I:onnxruntime:Default, bfc_arena.cc:25 onnxruntime::BFCArena::BFCArena] Creating bin of max chunk size 268435456
2018-12-27 19:06:10.8194734 [I:onnxruntime:Default, kernel_registry.cc:220 onnxruntime::KernelRegistry::TryFindKernel] Mul kernel is not supported in  Encountered following errors:
2018-12-27 19:06:10.8196636 [I:onnxruntime:InferenceSession, session_state_initializer.cc:252 onnxruntime::SaveInitializedTensorsWithMemPattern] Saving initialized tensors.
2018-12-27 19:06:10.8208022 [I:onnxruntime:Default, bfc_arena.cc:102 onnxruntime::BFCArena::Extend] Extending allocation by 1048576 bytes.
2018-12-27 19:06:10.8210931 [I:onnxruntime:Default, bfc_arena.cc:106 onnxruntime::BFCArena::Extend] Total allocated bytes: 1048576
2018-12-27 19:06:10.8213834 [I:onnxruntime:Default, bfc_arena.cc:109 onnxruntime::BFCArena::Extend] Allocated memory at 0000001409000000 to 0000001409100000
2018-12-27 19:06:10.8217212 [I:onnxruntime:Default, bfc_arena.cc:102 onnxruntime::BFCArena::Extend] Extending allocation by 1048576 bytes.
2018-12-27 19:06:10.8220052 [I:onnxruntime:Default, bfc_arena.cc:106 onnxruntime::BFCArena::Extend] Total allocated bytes: 1048576
2018-12-27 19:06:10.8223281 [I:onnxruntime:Default, bfc_arena.cc:109 onnxruntime::BFCArena::Extend] Allocated memory at 000002016EAD1040 to 000002016EBD1040
2018-12-27 19:06:10.8227791 [I:onnxruntime:InferenceSession, session_state_initializer.cc:324 onnxruntime::SaveInitializedTensorsWithMemPattern] Done saving initialized tensors
2018-12-27 19:06:10.8229741 [I:onnxruntime:InferenceSession, session_state_initializer.cc:412 onnxruntime::SaveKernels] Saving kernels.
2018-12-27 19:06:10.8232252 [I:onnxruntime:Default, kernel_registry.cc:220 onnxruntime::KernelRegistry::TryFindKernel] Mul kernel is not supported in CUDAExecutionProvider Encountered following errors:
2018-12-27 19:06:10.8236029 [I:onnxruntime:InferenceSession, session_state_initializer.cc:421 onnxruntime::SaveKernels] Done saving kernels.
2018-12-27 19:06:10.8237616 [I:onnxruntime:Default, kernel_registry.cc:220 onnxruntime::KernelRegistry::TryFindKernel] Mul kernel is not supported in  Encountered following errors:
2018-12-27 19:06:10.8242002 [I:onnxruntime:Default, kernel_registry.cc:220 onnxruntime::KernelRegistry::TryFindKernel] Mul kernel is not supported in  Encountered following errors:
2018-12-27 19:06:10.8245266 [I:onnxruntime:InferenceSession, inference_session.cc:410 onnxruntime::InferenceSession::Impl::Initialize] Session successfully initialized.
2018-12-27 19:06:12.4997850 [I:onnxruntime:, sequential_executor.cc:38 onnxruntime::SequentialExecutor::Execute] Begin execution
C:\Users\tolia\AppData\Local\Temp\onnxruntime\onnxruntime\test\shared_lib\test_inference.cc(51): error: Expected equality of these values:
  values_y[i]
    Which is: 4
  f[i]
    Which is: 2
[  FAILED  ] CApiTestWithProviders/CApiTestWithProvider.simple/1, where GetParam() = 1 (2333 ms)

Lack of reproducibility

Describe the bug

This CircleCI job executes shufflenet.onnx 2 times with the same input and checks for differences between the 2 outputs. The job failed because the outputs differed, which means the results are not reproducible.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):

Linux (CircleCI, debian stretch container)

  • ONNX Runtime installed from (source or binary):
pip3 install onnxruntime
Collecting onnxruntime
  Downloading https://files.pythonhosted.org/packages/e6/97/39c630134268a29a7c26f5f1c8fd2f7ff089ccee567cb076087ddf1cb6e0/onnxruntime-0.1.4-cp35-cp35m-manylinux1_x86_64.whl (4.6MB)
    100% |################################| 4.6MB 389kB/s 
Installing collected packages: onnxruntime
Successfully installed onnxruntime-0.1.4
  • ONNX Runtime version:

0.1.4

  • Python version:

3.5.3

  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:

To Reproduce

Clone https://github.com/notogawa/test-onnxruntime and register it with CircleCI.

Expected behavior

Every execution outputs the same result.

When executed on a MacBook Pro (macOS, debian stretch container) and a desktop machine (Ubuntu, debian stretch container), it works as I expect.

Screenshots

see https://circleci.com/gh/notogawa/test-onnxruntime/7

CPU runtime inference speed slow

Describe the bug
The inference speed of the CPU build for an ONNX model (converted from Keras) is nearly 6x slower than CPU Keras (TensorFlow with the Intel® MKL-DNN backend) on the same model.
Have we integrated MKL into the runtime, or can we otherwise expect an approach to speed up inference?

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
  • ONNX Runtime installed from (source or binary): binary
  • ONNX Runtime version: 0.1.4
  • Python version: 3.6.0
  • GCC/Compiler version (if compiling from source): 5.4.0 20160609
  • CUDA/cuDNN version: No
  • GPU model and memory: No

To Reproduce

import time
import numpy as np

# Random input matching the model's expected NHWC image shape, converted to float32 once.
x = np.random.random((1, 416, 416, 3)).astype(np.float32)
iters = 20

import onnxruntime as rt

sess = rt.InferenceSession("image.onnx")

input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name

# Time ONNX Runtime inference.
_start_time = time.time()
for i in range(iters):
    sess.run([label_name], {input_name: x})[0]
duration = time.time() - _start_time

print("ONNX prediction total duration is %s, per round is %.3f" %
      (duration, duration / iters))

from keras.models import load_model

model = load_model("image.h5")

# Time Keras inference on the same input.
_start_time = time.time()
for i in range(iters):
    model.predict(x)
duration = time.time() - _start_time

print("Keras prediction total duration is %s, per round is %.3f" %
      (duration, duration / iters))


GTest submodule conflicts

Describe the bug

A custom-built gtest 1.8.1 is installed under /usr/local.
This causes conflicts (undefined symbols) with the gtest submodule.
A better approach would be for CMake to look for a system gtest first using find_package, and fall back to the embedded gtest only when it is not found.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): CentOS 7
  • ONNX Runtime installed from (source or binary): source
  • ONNX Runtime version: master
  • Python version: 3.6 (rh-python36)
  • GCC/Compiler version (if compiling from source): 7.3.1 (devtoolset-7)
  • CUDA/cuDNN version: 10
  • GPU model and memory: Volta

How to choose CPU/GPU as the onnxruntime engine?

Is your feature request related to a problem? Please describe.
I am testing the performance of ONNX Runtime on a machine with both a CPU and a GPU. Since I have installed both MKL-DNN and TensorRT, I am confused about whether my model runs on the CPU or the GPU. I have installed the onnxruntime and onnxruntime-gpu packages from PyPI.

System information

  • ONNX Runtime version (you are using):
    onnxruntime 0.1.3 and onnxruntime-gpu 0.1.3

Describe the solution you'd like
I want to choose the engine myself through a Python API.
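For reference, in onnxruntime releases newer than the 0.1.3 mentioned above, the session constructor accepts a providers list, so engine selection looks roughly like this sketch:

import onnxruntime as rt

# Ask the installed build which execution providers it supports.
print(rt.get_available_providers())

# Order expresses priority: try CUDA first, fall back to CPU.
sess = rt.InferenceSession(
    "model.onnx",
    providers=["CUDAExecutionProvider", "CPUExecutionProvider"],
)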

Setting session_thread_pool_size fails

According to the API docs, the class onnxruntime.SessionOptions should have the attributes enable_sequential_execution and session_thread_pool_size. But when I set the thread pool size, I get this error.

>>> import onnxruntime
>>> options = onnxruntime.SessionOptions()
>>> options.enable_sequential_execution = False
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions' object has no attribute 'enable_sequential_execution'
>>> options.session_thread_pool_size = 1
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
AttributeError: 'onnxruntime.capi.onnxruntime_pybind11_state.SessionOptions' object has no attribute 'session_thread_pool_size'
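In later onnxruntime releases these options were renamed; a minimal sketch of the current equivalents (assuming a recent version):

import onnxruntime as rt

options = rt.SessionOptions()
# Successor to enable_sequential_execution: choose sequential or parallel execution.
options.execution_mode = rt.ExecutionMode.ORT_PARALLEL
# Successors to session_thread_pool_size: intra-op and inter-op thread counts.
options.intra_op_num_threads = 1
options.inter_op_num_threads = 1

sess = rt.InferenceSession("model.onnx", sess_options=options)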

ONNXRuntime Issue: Output:Y [ShapeInferenceError] Mismatch between number of source and target dimensions

Describe the bug
I am trying to build an ONNX graph using the helper APIs. The simplest example I started with is the following: a MatMul op that takes two [1] matrix inputs (X and W) and produces a [1] matrix output Y.

import numpy as np
import onnxruntime as rt
from onnx import helper, checker, TensorProto
from onnxmltools.utils import save_model

initializer = []
initializer.append(helper.make_tensor(name="W", data_type=TensorProto.FLOAT, dims=(1,), vals=np.ones(1).tolist()))

graph = helper.make_graph(
    [
        helper.make_node('MatMul', ["X", "W"], ["Y"]),
    ],
    "TEST",
    [
        helper.make_tensor_value_info('X', TensorProto.FLOAT, [1]),
        helper.make_tensor_value_info('W', TensorProto.FLOAT, [1]),
    ],
    [
        helper.make_tensor_value_info('Y', TensorProto.FLOAT, [1]),
    ],
    initializer=initializer,
)

checker.check_graph(graph)
model = helper.make_model(graph, producer_name='TEST')
save_model(model, "model.onnx")
sess = rt.InferenceSession('model.onnx')

When I run this, it complains:

Traceback (most recent call last):
File "onnxruntime_test.py", line 35, in <module>
sess = rt.InferenceSession('model.onnx')
File "/usr/local/lib/python3.5/dist-packages/onnxruntime/capi/session.py", line 29, in __init__
self._sess.load_model(path_or_bytes)
RuntimeError: [ONNXRuntimeError] : 1 : GENERAL ERROR : Node: Output:Y [ShapeInferenceError] Mismatch between number of source and target dimensions. Source=0 Target=1

I have been stuck here for hours. Could anybody please give me some help?
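The error reads as "inferred rank 0 vs. declared rank 1": MatMul follows numpy matmul semantics, so multiplying two 1-D tensors yields a rank-0 scalar, while Y is declared with shape [1]. A sketch of two possible fixes under that reading:

from onnx import helper, TensorProto

# Fix 1: declare Y as a rank-0 scalar (empty shape) ...
y_scalar = helper.make_tensor_value_info('Y', TensorProto.FLOAT, [])

# Fix 2: ... or keep Y as [1, 1] by making both inputs 2-D.
x_2d = helper.make_tensor_value_info('X', TensorProto.FLOAT, [1, 1])
w_2d = helper.make_tensor_value_info('W', TensorProto.FLOAT, [1, 1])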

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • ONNX Runtime installed from (source or binary): binary
  • ONNX Runtime version: 0.1.4
  • Python version: 3.5.2
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version: 9.1
  • GPU model and memory: Titan V

To Reproduce
Describe steps/code to reproduce the behavior:
python3 CODE.py

Expected behavior
I expect the InferenceSession to be created successfully without any issues.


Wrong model predictions for Keras application models NASNetMobile and NASNetLarge

Describe the bug
I am converting Keras application models with https://github.com/onnx/keras-onnx. (1) With ONNX Runtime release 0.1.4 I can convert the Keras models (NASNetMobile and NASNetLarge) to ONNX models, but they give wrong predictions: for NASNetMobile all predictions are NaN, and for NASNetLarge most of them are 0.0. (2) When I check out release 0.1.4 and build it locally on my machine using "build.bat --config RelWithDebInfo --build_wheel --enable_pybind", it gives the correct predictions. So the official release fails on this prediction.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Microsoft Windows 10 Enterprise Version 10.0.17763 Build 17763
  • ONNX Runtime installed from (source or binary): both (see above)
  • ONNX Runtime version: 0.1.4
  • Python version: 3.6
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version:
  • GPU model and memory:

To Reproduce
Describe steps/code to reproduce the behavior:
Put the following code into keras-onnx/tests/test_layer.py, then run unit tests.

def test_NASNetMobile(self):
    from keras.applications.nasnet import NASNetMobile
    model = NASNetMobile(include_top=True, weights='imagenet')
    self._test_keras_model(model)

def test_NASNetLarge(self):
    from keras.applications.nasnet import NASNetLarge
    model = NASNetLarge(include_top=True, weights='imagenet')
    self._test_keras_model(model, img_size=331)

Expected behavior
The output should be a real-valued 1*1000 vector of probabilities (summing to 1).


Interface between ONNX Runtime and Execution Providers

Is there a standard interface between ONNX Runtime and an Execution Provider, or is it similar to ONNXIFI?
As mentioned in the reference documentation, ONNX Runtime partitions a model graph into subgraphs. Does that mean ONNX Runtime is responsible for transforming the initial ONNX-format graph into subgraphs in a new (non-ONNX) format, based on the capabilities of the target Execution Provider?

Build test "failed": the "bidirectional simple weights no bias" test from GRUTest takes too much time

Describe the bug
I am building ONNX Runtime (master branch) in an Ubuntu 16.04 virtual machine. The build finished and the test phase started, but it stopped at GRUTest.BidirectionalDefaultActivationsSimpleWeightNoBiasTwoRows. I have waited more than 4 hours at this step, so I think something is wrong.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Linux Ubuntu 16.04
  • ONNX Runtime installed from (source or binary): building
  • ONNX Runtime version: master branch
  • Python version: 3.7.1
  • GCC/Compiler version (if compiling from source): gcc 5.4.0
  • CUDA/cuDNN version: -
  • GPU model and memory: -

To Reproduce
Describe steps/code to reproduce the behavior:
git clone
./build.sh --config RelWithDebInfo --build_wheel
or
./build.sh --build_wheel

Expected behavior
Build and tests complete successfully.


Missing boundary case in GEMM: zero-size input

In file onnxruntime/core/providers/cpu/math/gemm_helper.h, line 43:
ORT_ENFORCE(M_ > 0 && N_ > 0 && K_ > 0);
The check is too strict; M or N can be 0.
In some cases the input to an FC layer can be empty: in Fast-RCNN, if you feed an empty image, the number of ROIs after the proposal layer can be 0, so the input to the following FC layer is a zero-size tensor. You can't say that is invalid input.
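For comparison, numpy (whose matmul semantics apply here) accepts zero-size operands; a small sketch of the Fast-RCNN scenario described above (shapes are illustrative):

import numpy as np

# A batch of zero ROIs entering an FC layer: (0, K) @ (K, N) -> (0, N).
rois = np.zeros((0, 1024), dtype=np.float32)            # no proposals survived
weights = np.random.rand(1024, 91).astype(np.float32)   # hypothetical FC weights
out = rois @ weights
print(out.shape)   # (0, 91): a valid empty result, not an error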

Exception when Loop inputs "M" and "cond" are empty strings

Describe the bug
In the ONNX Loop spec, the "M" and "cond" inputs can be empty strings if we don't need them.

Running the Loop, we get the following error:

RuntimeError: [ONNXRuntimeError] : 10 : INVALID_GRAPH : Node:Loop__5 Node (Loop__5)'s input 1 is marked single but has an empty string in the graph

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows 10
  • ONNX Runtime installed from (source or binary): source
  • ONNX Runtime version: master
  • Python version: 3.5
  • GCC/Compiler version (if compiling from source): VS2017
  • CUDA/cuDNN version:
  • GPU model and memory:

To Reproduce
Define a Loop with empty strings for the M or cond inputs and run it (see the sketch below).
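A minimal sketch of such a node, using the ONNX convention that an empty string marks an omitted optional input (names are illustrative, and loop_body is a loop-body GraphProto assumed to be built elsewhere):

from onnx import helper

# '' marks an omitted optional input; here the trip count "M" is omitted,
# giving a pure while-loop driven by the "cond_in" value.
loop_node = helper.make_node(
    'Loop',
    inputs=['', 'cond_in', 'state_in'],
    outputs=['state_out'],
    body=loop_body,   # hypothetical loop-body GraphProto
)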

Expected behavior
It should run without errors.


Few newbie questions

I managed to build onnxruntime and take it out for a spin, and I have a bunch of absolute newbie questions. I apologize if this is not the right forum, or if this is already covered in the API docs.

  1. Is it fair to say that we can use onnxruntime to build an inference application specifically for ONNX models, with some additional code on top (something similar to tensorflow-serving)?

  2. We do not do any sort of training in onnxruntime. This is purely for running inference from the pre-trained ONNX model. Is that correct?

  3. Question on concurrency: I noticed the API provides parallel execution with a session thread pool size. Are these application-level threads, as opposed to the concurrency we can control by setting OMP_NUM_THREADS (in case we use the MKL-DNN execution provider)? How do the two relate? Does session_thread_pool_size override OMP_NUM_THREADS?

  4. Is it possible to batch multiple input requests together while running them via the InferenceSession? (see the sketch after this list)
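On question 4, a common pattern is to stack requests along a new leading batch axis and answer them in one Run call; a sketch, assuming a model exported with a dynamic batch dimension (model path and shapes are illustrative):

import numpy as np
import onnxruntime as rt

sess = rt.InferenceSession("model.onnx")
input_name = sess.get_inputs()[0].name

# Three independent requests, each shaped like a single sample.
requests = [np.random.rand(3, 224, 224).astype(np.float32) for _ in range(3)]

# Stack into a (3, 3, 224, 224) batch and run all three in one call.
batch = np.stack(requests, axis=0)
outputs = sess.run(None, {input_name: batch})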

Cross compile error on Linux for ARM

Describe the bug
I tried to build an ARM package by cross-compiling on my Linux PC (Ubuntu 16.04), following the instructions in BUILD.md. CMake completed, but during make I saw the following error:

  :
[ 31%] Built target protoc
[ 31%] Running cpp protocol buffer compiler on /home/ytakeda/Projects/onnxruntime/onnxruntime/onnxruntime/core/protobuf/onnx-operators-ml.proto
../external/protobuf/cmake/protoc: 1: ../external/protobuf/cmake/protoc: Syntax error: word unexpected (expecting ")")
onnx/CMakeFiles/onnx_proto.dir/build.make:70: recipe for target 'onnx/onnx-operators-ml.pb.h' failed
make[2]: *** [onnx/onnx-operators-ml.pb.h] Error 2
CMakeFiles/Makefile2:1185: recipe for target 'onnx/CMakeFiles/onnx_proto.dir/all' failed
make[1]: *** [onnx/CMakeFiles/onnx_proto.dir/all] Error 2
Makefile:140: recipe for target 'all' failed
make: *** [all] Error 2

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
  • ONNX Runtime installed from (source or binary): source
  • ONNX Runtime version: master rev 4801e and release-0.1.5 (the same result)
  • Python version: 2.7.12 and 3.5.2 installed (I guess not used by cmake/make)
  • GCC/Compiler version (if compiling from source): gcc-linaro-6.3.1-2017.05-x86_64_arm-linux-gnueabihf
  • CUDA/cuDNN version: n/a
  • GPU model and memory: n/a

To Reproduce
First, I downloaded gcc-linaro-6.3.1 as instructed and extracted it into /opt/.
Then I downloaded a pre-compiled protoc from here and moved ./bin/protoc into /usr/bin/.
Then, in the terminal...

   export PATH=/opt/gcc-linaro-6.3.1-2017.05-x86_64_arm-linux-gnueabihf/bin:$PATH
   export CC=arm-linux-gnueabihf-gcc
   export CXX=arm-linux-gnueabihf-g++

Next, I created a tool.cmake file as follows:

set(CMAKE_SYSTEM_NAME Linux)
set(CMAKE_SYSTEM_PROCESSOR arm)
set(CMAKE_CXX_COMPILER arm-linux-gnueabihf-c++)
set(CMAKE_C_COMPILER arm-linux-gnueabihf-gcc)
set(CMAKE_FIND_ROOT_PATH_MODE_PROGRAM NEVER)
set(CMAKE_FIND_ROOT_PATH_MODE_LIBRARY ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_INCLUDE ONLY)
set(CMAKE_FIND_ROOT_PATH_MODE_PACKAGE ONLY)

Note: both arm-linux-gnueabihf-c++ and arm-linux-gnueabihf-gcc had to be changed from what BUILD.md specified (the document should be updated).

git clone --recursive https://github.com/Microsoft/onnxruntime
cd onnxruntime
mkdir my_build
cd my_build
cmake -DCMAKE_TOOLCHAIN_FILE=../../tool.cmake ../cmake

Then:

make

Expected behavior
The make command should complete successfully.


Please let me know what could cause the error. Thanks.

[Q] How to run a test file with onnxruntime_exec?

Commit: 260639c
I have cross-built the onnxruntime_exec binary for an ODROID-XU4 (ARM Ubuntu 16.04).
When I try to run an ONNX model file (https://s3.amazonaws.com/download.onnx/models/opset_9/resnet50.tar.gz), it shows the following error.

./onnxruntime_exec -m model.onnx -t ./test_data_2.npz
2019-01-15 04:25:03.793697147 [W:onnxruntime:Default, graph.cc:2178 CleanUnusedInitializers] gpu_0/imagenet1k_blobs_queue_f22e83c9-22cd-4a8b-a66d-113af6b832b4_0 exists in this graph's initializers but it is not used by any node
'model.onnx' loaded successfully.
Done loading model: model.onnx
Execution Status: DATA_LOADING_FAILURE
Exception msg: Not enough features in sample.

Build downloads too much test data (> 4 GB) by default

Describe the bug
The build downloads 4.1 GB of test data. This is huge, and the download can take quite long (~1 h).

System information

  • OS: Windows
  • ONNX Runtime installed from source
  • Python version: 3.7

To Reproduce
python %~dp0onnxruntime\tools\ci_build\build.py --build_dir onnxruntime\build\Windows --config Release --build_wheel --use_mkldnn

Expected behavior
Provide a way to skip the download.

Is InferenceSession.Run thread-safe?

Is it true that I can keep a single instance per model and call the Run method concurrently without problems? Or should I lock around Run, or make a pool of InferenceSessions?
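
For context, the pattern I am asking about looks like this minimal sketch (model path and input shape are placeholders; whether this is safe is exactly the question):

import threading
import numpy as np
import onnxruntime as rt

sess = rt.InferenceSession("model.onnx")  # one shared instance for one model
input_name = sess.get_inputs()[0].name

def worker():
    x = np.random.rand(1, 3, 224, 224).astype(np.float32)
    sess.run(None, {input_name: x})  # concurrent Run calls, no lock

threads = [threading.Thread(target=worker) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()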

How to get rid of large native/onnxruntime.pdb

Describe the bug
The onnxruntime.pdb is ~100 MB and makes the published netcoreapp2.1 package very large. Is there any way to opt out of this file? Thanks.

BTW, I would argue that, given how large the file is, it would be better to make packaging this .pdb file opt-in rather than opt-out.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows
  • ONNX Runtime installed from (source or binary): dotnet nuget package v0.1.5
  • ONNX Runtime version: dotnet nuget package v0.1.5

Build fails, protobuf needs to handle Russian characters in the path of the build command

Is your feature request related to a problem? Please describe.
Build fails; protobuf needs to handle Russian characters in the path of the build command.
Command:
C:<path_in_russian>\microsoft\onnxruntime>build\Windows\Debug\external\protobuf\cmake\Debug\protoc.exe --cpp_out C:/<path_in_russian>/microsoft/onnxruntime/build/Windows/Debug/onnx -I C:/<path_in_russian>/microsoft/onnxruntime/onnxruntime/core/protobuf C:/<path_in_russian>/microsoft/onnxruntime/onnxruntime/core/protobuf/onnx-ml.proto

Describe the solution you'd like
I am trying to move the build of protoc.exe to Unicode. After enabling the UNICODE preprocessor definition and the Unicode flag for the build, it appears the issue is in promoting ASCII strings to Unicode for the parameters passed to the executable.

Describe alternatives you've considered
Currently converting argument parsing routines to consider Unicode characters.

Additional context
C:/чюфўшщ/ушЄ/microsoft/onnxruntime/onnxruntime/core/protobuf: warning: directory does not exist.
C:/чюфўшщ/ушЄ/microsoft/onnxruntime/onnxruntime/core/protobuf/onnx-ml.proto: No such file or directory

Please note the garbled symbols чюфўшщ/ушЄ; these characters were corrupted due to incorrect handling of Unicode.

What's the next step after building onnxruntime from source?

I followed the "Build ONNX Runtime" instructions in an Ubuntu 16.04 environment with the following commands:
git clone --recursive https://github.com/Microsoft/onnxruntime
cd onnxruntime
./build.sh --config RelWithDebInfo --build_wheel
The build succeeded and started running the tests.
Since the tests run for a long time, I used CTRL-C to break out of them.

Now I am trying to run a pre-trained model with this ONNX Runtime build,
but I can't find any useful information on how to do that.

The examples use Python to import onnxruntime.
I have no idea which file the Python program needs to import, or where that file is.
Where can I find the related information?

Thanks for your time.
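
For anyone with the same question, a minimal sketch of the typical next step, assuming the wheel produced by --build_wheel ends up in the build directory's dist folder (the exact path below is an assumption; check your build output):

# pip install build/Linux/RelWithDebInfo/dist/onnxruntime-*.whl  (assumed path)
import onnxruntime as rt

sess = rt.InferenceSession("model.onnx")  # path to any pre-trained ONNX model
print([(i.name, i.shape) for i in sess.get_inputs()])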

Get reshape error for Tensorflow BERT model

Hi,

I converted a TensorFlow BERT model to ONNX with https://github.com/onnx/tensorflow-onnx, but when I run inference I get this error:

GENERAL ERROR : c:\agent_work\3\s\onnxruntime\core\providers\cpu\tensor\reshape_helper.h:43 onnxruntime::ReshapeHelper::ReshapeHelper gsl::narrow_cast<int64_t>(input_shape.Size()) == size was false. The input tensor cannot be reshaped to the requested shape. Input shape:{70,767}

The model has three inputs with shape [100, 70]. I don't know where the 767 came from; can anyone help?
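
In case it helps with debugging, a quick way to inspect the input and output shapes the exported model actually declares (the model filename is a placeholder):

import onnxruntime as rt

sess = rt.InferenceSession("bert.onnx")
for i in sess.get_inputs():
    print("input:", i.name, i.shape, i.type)   # compare against the [100, 70] tensors being fed
for o in sess.get_outputs():
    print("output:", o.name, o.shape, o.type)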

How long until the GPU version for C#/C++/C is available?

We are planning a Windows desktop project which needs the GPU version of the C#/C++/C API; any of those languages would be great. The status right now is "coming soon" or TBD. We would like to know how soon we can expect to use these features.
And lastly, great thanks for your work!

pip install failed

Describe the bug

exlsunshine@exlsunshine:~/downloads $ pip install onnxruntime
Collecting onnxruntime
  Could not find a version that satisfies the requirement onnxruntime (from versions: )
No matching distribution found for onnxruntime

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
No LSB modules are available.
Distributor ID: Ubuntu
Description:    Ubuntu 18.04.1 LTS
Release:        18.04
Codename:       bionic
  • ONNX Runtime installed from (source or binary):
    pip install

  • ONNX Runtime version:
    0.1.4

  • Python version:

exlsunshine@exlsunshine:~/downloads $ python -V
Python 2.7.15rc1
exlsunshine@exlsunshine:~/downloads $ python3 -V
Python 3.6.7
  • GCC/Compiler version (if compiling from source):
exlsunshine@exlsunshine:~/downloads $ gcc --version
gcc (Ubuntu 7.3.0-27ubuntu1~18.04) 7.3.0
Copyright (C) 2017 Free Software Foundation, Inc.
This is free software; see the source for copying conditions.  There is NO
warranty; not even for MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.
  • CUDA/cuDNN version:

To Reproduce
pip install onnxruntime

Expected behavior
The package installs successfully.

ONNX Runtime and ONNXIFI

Hi,

I'm trying to understand the relationship between ONNX Runtime and ONNXIFI. ONNXIFI was released as part of ONNX 1.3; a Glow backend has already been implemented by Facebook, and nGraph also has plans to support it. ONNX Runtime, meanwhile, has different execution providers (such as nGraph, CUDA, TensorRT, etc.) to support various hardware accelerators. To avoid fragmentation in the ONNX ecosystem, how can ONNX Runtime and ONNXIFI work together?

Unable to load DLL 'onnxruntime.dll': The specified module could not be found.

Describe the bug
After installing the Nuget package and attempting to create a session, I get the following error in a WinForms app on .NET Framework 4.6.1:

Unable to load DLL 'onnxruntime.dll': The specified module could not be found. (Exception from HRESULT: 0x8007007E)

at line 33 in Microsoft.ML.OnnxRuntime.SessionOptions:

public SessionOptions()
{
     _nativeOption = new NativeOnnxObjectHandle(NativeMethods.ONNXRuntimeCreateSessionOptions());
}

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Windows
  • ONNX Runtime installed from (source or binary): Nuget Installed Microsoft.ML.OnnxRuntime in VS17
  • ONNX Runtime version: 0.1.5

To Reproduce
Added Nuget package and attempted the following code with a valid onnx file:

var options = SessionOptions.Default;
options.AppendExecutionProvider(ExecutionProvider.Cpu);
_session = new InferenceSession(file, options);

I tried running from Visual Studio on x64.

Expected behavior
The dll should load and the session should start.

Note
I am sure this is a silly error on my part but maybe others will run into it.

AlexNet test error

Describe the bug
RuntimeError: [ONNXRuntimeError] : 1 : GENERAL ERROR : No suitable kernel definition found for op Dropout(6)

System information

  • OS: Windows 10
  • Package: onnxruntime-0.1.4-cp36-cp36m-win_amd64.whl
  • ONNX Runtime version: 0.1.4
  • Python version: 3.6

code:
from torch.autograd import Variable
import torch.onnx
import torchvision
import os

def get_runningfile_path():
    return os.path.dirname(os.path.abspath(__file__))

root_path = get_runningfile_path()
dummy_input = Variable(torch.randn(1, 3, 224, 224))
model = torchvision.models.alexnet(pretrained=True)
torch.onnx.export(model, dummy_input, os.path.join(root_path, 'alexnet.onnx'))

# Compute the prediction with ONNX Runtime
import onnxruntime as rt
import numpy

img = numpy.random.rand(1, 3, 224, 224)
sess = rt.InferenceSession(os.path.join(root_path, 'alexnet.onnx'))
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
pred_onx = sess.run([label_name], {input_name: img.astype(numpy.float32)})[0]
print(pred_onx)

onnxruntime python package conflicts with pytorch

Describe the bug
I cannot successfully import onnxruntime; it raises an error.

System information

  • Docker image:
    • pytorch/pytorch:1.0-cuda10.0-cudnn7-devel
  • Runtime installed from (source or binary)
    • pip install onnxruntime
  • ONNX Runtime version:
    • 0.1.4
  • Python version:
    • 3.6.7

To Reproduce
Describe steps/code to reproduce the behavior:

import torch
from onnxruntime import backend

Expected behavior
The code runs successfully.

Additional context

/opt/conda/lib/python3.6/site-packages/onnxruntime/capi/_pybind_state.py:12: UserWarning: Cannot load onnxruntime.capi. Error: '/opt/conda/lib/python3.6/site-packages/onnxruntime/capi/onnxruntime_pybind11_state.so: undefined symbol: mkldnn_sgemm'
  warnings.warn("Cannot load onnxruntime.capi. Error: '{0}'".format(str(e)))
Traceback (most recent call last):
  File "info.py", line 2, in <module>
    from onnxruntime import backend
  File "/opt/conda/lib/python3.6/site-packages/onnxruntime/__init__.py", line 21, in <module>
    from onnxruntime.capi._pybind_state import RunOptions, SessionOptions, get_device, NodeArg, ModelMetadata

Deploy on ARM device (Android)

Is your feature request related to a problem? Please describe.
I want to deploy onnxruntime on android device, but I can't make progress because of the lack of
related guide from github, only find "Cross compiling on Linux" ,but this is not enough for me, would
you please update the ARM build guide in more detail?
System information

  • ONNX Runtime version (you are using): 0.1.5

Describe the solution you'd like
A detailed ARM (Linux) build guide.

Describe alternatives you've considered
N/A

Additional context
N/A

segmentation fault when Tile's repeat contains 0

Describe the bug
When the repeats input of the Tile op contains a 0, onnxruntime crashes with a segmentation fault.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Ubuntu 16.04
  • ONNX Runtime installed from (source or binary): binary
  • ONNX Runtime version: 0.1.3
  • Python version: 3.6
  • GCC/Compiler version (if compiling from source):
  • CUDA/cuDNN version: None
  • GPU model and memory: None

To Reproduce
Describe steps/code to reproduce the behavior:

test.zip

import numpy as np
import onnxruntime

m = onnxruntime.InferenceSession('./test.pb')
x = np.array([0, 1, 2, 3], np.float32)
y = np.array([0], np.int64)
# the input/output names below are assumptions; check m.get_inputs() / m.get_outputs() for the real ones
m.run(['z'], {'x': x, 'y': y})
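
For comparison, numpy's tile treats a zero repeat as a valid request for an empty result rather than crashing:

import numpy as np

print(np.tile(np.array([0, 1, 2, 3], np.float32), 0))  # array([], dtype=float32), no segfault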

Expected behavior
No segmentation fault; an error message or an empty output tensor instead of a crash.

Build error

Describe the bug
Build error.

System information

  • OS Platform and Distribution: Linux Ubuntu 16.04
  • ONNX Runtime installed from: source
  • Python version: 3

To Reproduce

./build.sh --config RelWithDebInfo --build_wheel

Expected behavior
Build successful.

Screenshots

2018-12-21 09:10:49,430 Build [DEBUG] - Defaulting to running update, build and test.
2018-12-21 09:10:49,431 Build [INFO] - Build started
2018-12-21 09:10:49,431 Build [DEBUG] - Running subprocess in '/home/exlsunshine/projects/onnxruntime' ['git', 'submodule', 'update', '--init', '--recursive']
2018-12-21 09:10:49,638 Build [INFO] - Generating CMake build tree
2018-12-21 09:10:49,639 Build [DEBUG] - Running subprocess in '/home/exlsunshine/projects/onnxruntime/build/Linux/Debug' ['/usr/local/bin/cmake', '/home/exlsunshine/projects/onnxruntime/cmake', '-Donnxruntime_RUN_ONNX_TESTS=OFF', '-Donnxruntime_GENERATE_TEST_REPORTS=ON', '-Donnxruntime_DEV_MODE=ON', '-DPYTHON_EXECUTABLE=/usr/bin/python3', '-Donnxruntime_USE_CUDA=OFF', '-Donnxruntime_CUDA_HOME=', '-Donnxruntime_CUDNN_HOME=', '-Donnxruntime_USE_JEMALLOC=OFF', '-Donnxruntime_ENABLE_PYTHON=ON', '-Donnxruntime_BUILD_CSHARP=OFF', '-Donnxruntime_BUILD_SHARED_LIB=OFF', '-Donnxruntime_USE_EIGEN_FOR_BLAS=ON', '-Donnxruntime_USE_OPENBLAS=OFF', '-Donnxruntime_USE_MKLDNN=OFF', '-Donnxruntime_USE_MKLML=OFF', '-Donnxruntime_USE_OPENMP=OFF', '-Donnxruntime_USE_TVM=OFF', '-Donnxruntime_USE_LLVM=OFF', '-Donnxruntime_ENABLE_MICROSOFT_INTERNAL=OFF', '-Donnxruntime_USE_BRAINSLICE=OFF', '-Donnxruntime_USE_NUPHAR=OFF', '-DCMAKE_BUILD_TYPE=Debug']

CMake Error at CMakeLists.txt:6 (onnxruntime_protobuf_generate):
  Unknown CMake command "onnxruntime_protobuf_generate".

CMake Warning (dev) in CMakeLists.txt:
  No cmake_minimum_required command is present. A line of code such as
    cmake_minimum_required(VERSION 3.13)
  should be added at the top of the file. The version specified may be lower
  if you wish to support older CMake versions for this project. For more
  information run "cmake --help-policy CMP0000".
This warning is for project developers. Use -Wno-dev to suppress it.

-- Configuring incomplete, errors occurred!
See also "/home/exlsunshine/projects/onnxruntime/cmake/CMakeFiles/CMakeOutput.log".

Traceback (most recent call last):
  File "/home/exlsunshine/projects/onnxruntime/tools/ci_build/build.py", line 572, in <module>
    sys.exit(main())
  File "/home/exlsunshine/projects/onnxruntime/tools/ci_build/build.py", line 545, in main
    args, cmake_extra_args)
  File "/home/exlsunshine/projects/onnxruntime/tools/ci_build/build.py", line 315, in generate_build_tree
    run_subprocess(cmake_args + ["-DCMAKE_BUILD_TYPE={}".format(config)], cwd=config_build_dir)
  File "/home/exlsunshine/projects/onnxruntime/tools/ci_build/build.py", line 143, in run_subprocess
    result = subprocess.run(args, cwd=cwd, check=True, env=my_env)
  File "/usr/lib/python3.6/subprocess.py", line 418, in run
    output=stdout, stderr=stderr)
subprocess.CalledProcessError: Command '['/usr/local/bin/cmake', '/home/exlsunshine/projects/onnxruntime/cmake', '-Donnxruntime_RUN_ONNX_TESTS=OFF', '-Donnxruntime_GENERATE_TEST_REPORTS=ON', '-Donnxruntime_DEV_MODE=ON', '-DPYTHON_EXECUTABLE=/usr/bin/python3', '-Donnxruntime_USE_CUDA=OFF', '-Donnxruntime_CUDA_HOME=', '-Donnxruntime_CUDNN_HOME=', '-Donnxruntime_USE_JEMALLOC=OFF', '-Donnxruntime_ENABLE_PYTHON=ON', '-Donnxruntime_BUILD_CSHARP=OFF', '-Donnxruntime_BUILD_SHARED_LIB=OFF', '-Donnxruntime_USE_EIGEN_FOR_BLAS=ON', '-Donnxruntime_USE_OPENBLAS=OFF', '-Donnxruntime_USE_MKLDNN=OFF', '-Donnxruntime_USE_MKLML=OFF', '-Donnxruntime_USE_OPENMP=OFF', '-Donnxruntime_USE_TVM=OFF', '-Donnxruntime_USE_LLVM=OFF', '-Donnxruntime_ENABLE_MICROSOFT_INTERNAL=OFF', '-Donnxruntime_USE_BRAINSLICE=OFF', '-Donnxruntime_USE_NUPHAR=OFF', '-DCMAKE_BUILD_TYPE=Debug']' returned non-zero exit status 1.

Inconsistency of Squeeze spec between Ort and ONNX

Describe the bug

Ort uses a 0-based index for axes in Squeeze, but it looks like ONNX says it should be 1-based:
https://github.com/onnx/onnx/blob/master/docs/Operators.md#Squeeze

axes : list of ints
List of positive integers, indicate the dimensions to squeeze.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Win10
  • ONNX Runtime installed from (source or binary): source
  • ONNX Runtime version: master
  • Python version: 3.7.2
  • GCC/Compiler version (if compiling from source): VS2017
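
For concreteness, numpy's squeeze, which matches Ort's current behavior, reads the axes list as 0-based:

import numpy as np

x = np.ones((1, 3, 1, 5))
print(np.squeeze(x, axis=(0, 2)).shape)  # (3, 5): axes 0 and 2 interpreted as 0-based
# under the 1-based reading of the ONNX doc quoted above, the same request
# would instead be written as axes = [1, 3]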

Error: DLL load failed

Describe the bug
The program does not work.

C:\Users\guyan\AppData\Local\Programs\Python\Python36\lib\site-packages\onnxruntime\capi\_pybind_state.py:12: UserWarning: Cannot load onnxruntime.capi. Error: 'DLL load failed: The specified module could not be found.'
  warnings.warn("Cannot load onnxruntime.capi. Error: '{0}'".format(str(e)))
Traceback (most recent call last):
  File ".\inference.py", line 1, in <module>
    import onnxruntime
  File "C:\Users\guyan\AppData\Local\Programs\Python\Python36\lib\site-packages\onnxruntime\__init__.py", line 21, in <module>
    from onnxruntime.capi._pybind_state import RunOptions, SessionOptions, get_device, NodeArg, ModelMetadata
ImportError: cannot import name 'RunOptions'

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04):
    Windows 10 Enterprise

  • ONNX Runtime installed from (source or binary):
    pip install onnxruntime

  • ONNX Runtime version:
    0.1.4

  • Python version:
    3.6.6
To Reproduce
run the following code:

import onnxruntime

session = onnxruntime.InferenceSession("alexnet.onnx")

Expected behavior
No error messages.

Slice op: 'starts' and 'ends' values resulted in a negative dimension

Sometimes Slice needs a -1 in starts or ends; the ONNX doc actually allows this: https://github.com/onnx/onnx/blob/master/docs/Changelog.md#slice-1

starts and ends attributes to specify the start and end dimension for each axis in the list of axes, it uses this information to slice the input data tensor. If a negative value is passed for any of the start or end indices, it represent number of elements before the end of that dimension.

So I think this is a bug. It is blocking a real model conversion; could anybody familiar with this take a look? Thanks a lot.
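
For reference, the quoted semantics match ordinary Python/numpy negative indexing:

import numpy as np

x = np.array([1, 2, 3, 4])
print(x[0:-1])  # [1 2 3]: ends = -1 stops one element before the end of the axis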

Installation issue on Windows 10 RS4 with Python 3.6 (both CPU and CUDA)

pip install onnxruntime
Requirement already satisfied: onnxruntime in c:\program files (x86)\microsoft visual studio\shared\python36_64\lib\site-packages (0.1.4)

python -c "import onnxruntime"
C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\onnxruntime\capi\_pybind_state.py:12: UserWarning: Cannot load onnxruntime.capi. Error: 'DLL load failed: The specified module could not be found.'
  warnings.warn("Cannot load onnxruntime.capi. Error: '{0}'".format(str(e)))
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "C:\Program Files (x86)\Microsoft Visual Studio\Shared\Python36_64\lib\site-packages\onnxruntime\__init__.py", line 21, in <module>
    from onnxruntime.capi._pybind_state import RunOptions, SessionOptions, get_device, NodeArg, ModelMetadata
ImportError: cannot import name 'RunOptions'

Delay load cudart

Describe the bug

The CUDA build doesn't work on a non-CUDA machine because cudart is not delay-loaded.
Since cublas and cudnn are already delay-loaded, please also delay-load cudart.

System information

  • OS Platform and Distribution (e.g., Linux Ubuntu 16.04): Win10
  • ONNX Runtime installed from (source or binary): source
  • ONNX Runtime version: master
  • Python version: 3.7.2
  • GCC/Compiler version (if compiling from source): VS 2017 15.9
  • CUDA/cuDNN version: 10/7
  • GPU model and memory: 1080Ti
