tencent / tnn

TNN: a uniform deep learning inference framework for mobile, desktop, and server, developed by Tencent Youtu Lab and Guangying Lab. TNN is distinguished by several outstanding features, including cross-platform capability, high performance, model compression, and code pruning. Based on ncnn and Rapidnet, TNN further strengthens support and performance optimization for mobile devices, and also draws on the good extensibility and high performance of existing open source efforts. TNN has been deployed in multiple apps from Tencent, such as Mobile QQ, Weishi, and Pitu. Contributions are welcome; work in collaboration with us to make TNN a better framework.

License: Other

deep-learning mnn ncnn inference pytorch tensorflow coreml tensorrt tengine openvino

tnn's Introduction

Chinese Version

Introduction

TNN: a high-performance, lightweight neural network inference framework open sourced by Tencent Youtu Lab. It offers many outstanding advantages such as cross-platform support, high performance, model compression, and code pruning. Building on the original Rapidnet and ncnn frameworks, TNN further strengthens support and performance optimization for mobile devices, draws on the high performance and good scalability of the industry's mainstream open source frameworks, and extends support to X86 and NVIDIA GPUs. On mobile, TNN is already used by many applications such as Mobile QQ, Weishi, and Pitu. As a basic acceleration framework for Tencent Cloud AI, TNN has provided acceleration support for the deployment of many businesses. Everyone is welcome to participate and help make the TNN inference framework even better.

Effect Example

  • Face Detection (blazeface): model links: tflite, tnn
  • Face Alignment (from Tencent Youtu Lab): model link: tnn
  • Hair Segmentation (from Tencent Guangying Lab): model link: tnn
  • Pose Estimation (from Tencent Guangliu): model link: tnn
  • Pose Estimation (blazepose): model links: tflite, tnn
  • Chinese OCR: model links: onnx, tnn
  • Object Detection (yolov5s): model links: onnx, tnn
  • Object Detection (MobilenetV2-SSD): model links: tensorflow, tnn
  • Reading Comprehension (bertsquad10): model links: onnx, tnn

The Chinese OCR demo is the TNN implementation of the chineseocr_lite project. It is lightweight and supports tilted, rotated, and vertical text recognition.

The support for each demo is shown in the following table. In the repository README, each ✅ links to the entry code for that demo.

demo (columns: ARM, OpenCL, Metal, Huawei NPU, Apple NPU, X86, CUDA)

  • Face Detection (blazeface)
  • Object Detection (yolov5s)
  • Face Alignment
  • Hair Segmentation
  • Pose Estimation (from Tencent Guangliu)
  • Pose Estimation (blazepose)
  • Chinese OCR
  • Reading Comprehension

Quick Start

Using TNN is very simple. If you have a trained model, it can be deployed on the target platform in three steps.

  1. Convert the trained model into a TNN model. We provide a wealth of tools to help you complete this step; whether you are using TensorFlow, PyTorch, or Caffe, you can easily complete the conversion. Detailed hands-on tutorials can be found in How to Create a TNN Model.

  2. When you have finished converting the model, the second step is to compile the TNN engine for the target platform. You can choose among acceleration solutions such as ARM, OpenCL, Metal, NPU, X86, and CUDA according to your hardware. For these platforms, TNN provides convenient one-click build scripts. For detailed steps, please refer to How to Compile TNN.

  3. The final step is to use the compiled TNN engine for inference. You can call TNN from inside your application; we provide rich and detailed demos as a reference to help you complete the integration, and a minimal calling sequence is sketched below.
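For orientation, the sketch below shows a minimal version of that calling sequence on the C++ API. The file names, input shape, and choice of DEVICE_ARM are placeholder assumptions; the bundled demos remain the authoritative reference.

```cpp
#include <fstream>
#include <memory>
#include <sstream>
#include <string>
#include <vector>

#include "tnn/core/tnn.h"
#include "tnn/utils/blob_converter.h"

// Read a whole file into a string (TNN takes the proto/model as in-memory buffers).
static std::string ReadFile(const std::string& path) {
    std::ifstream file(path, std::ios::binary);
    std::stringstream buffer;
    buffer << file.rdbuf();
    return buffer.str();
}

int main() {
    // 1. Load the converted model: a .tnnproto / .tnnmodel pair (names assumed).
    tnn::ModelConfig model_config;
    model_config.model_type = tnn::MODEL_TYPE_TNN;
    model_config.params = {ReadFile("model.tnnproto"), ReadFile("model.tnnmodel")};

    tnn::TNN net;
    if (net.Init(model_config) != tnn::TNN_OK) return -1;

    // 2. Create an instance on the desired backend (ARM CPU here).
    tnn::NetworkConfig network_config;
    network_config.device_type = tnn::DEVICE_ARM;
    tnn::Status status;
    auto instance = net.CreateInst(network_config, status);
    if (status != tnn::TNN_OK) return -1;

    // 3. Feed an input Mat, run, and fetch the output (shape is a placeholder).
    std::vector<int> dims = {1, 3, 224, 224};                // NCHW
    std::vector<float> input_data(1 * 3 * 224 * 224, 0.0f);  // fill with real data
    auto input_mat = std::make_shared<tnn::Mat>(
        tnn::DEVICE_NAIVE, tnn::NCHW_FLOAT, dims, input_data.data());

    tnn::MatConvertParam param;  // optional scale/bias preprocessing
    instance->SetInputMat(input_mat, param);
    instance->Forward();

    std::shared_ptr<tnn::Mat> output_mat;
    instance->GetOutputMat(output_mat);
    return 0;
}
```

The demos referenced above show the same flow with real pre- and post-processing.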

Technical Solutions

At present, TNN has been adopted in various major businesses, and the following characteristics have been widely praised.

  • Computation optimization

    • The backend operators are finely optimized to make the best use of the computing power of different architectures, taking into account instruction issue, throughput, latency, cache bandwidth, cache latency, registers, etc.
    • The TNN performance on mainstream hardware platforms (CPU: ARMv7, ARMv8, X86, GPU: Mali, Adreno, Apple, NV GPU, NPU) has been greatly tuned and improved.
    • The convolution function is implemented by various algorithms such as Winograd, Tile-GEMM, Direct Conv, etc., to ensure efficiency under different parameters and sizes.
    • Op fusion: TNN performs offline analysis of the network graph, fusing multiple simple operations to reduce overhead such as redundant memory access and kernel launch cost.
  • Low precision computation acceleration

    • TNN supports INT8/FP16 mode, reduces model size & memory consumption, and utilizes specific hardware low-precision instructions to accelerate calculations.
    • TNN supports an INT8 Winograd algorithm (with 6-bit input), further reducing model computation complexity without sacrificing accuracy.
    • TNN supports mixed precision within one model, speeding up inference while preserving accuracy.
  • Memory optimization

    • Efficient "memory pool" implementation: Based on a full network DAG analysis, the implementation reuses memory between non-dependent nodes which reduces memory cost by 90%.
    • Cross-model memory reduces: This supports external real-time design for network memory so that multiple models can share mutual memory.
  • The performance of mainstream models on TNN: benchmark data

  • TNN architecture diagram:

  • TNN supports TensorFlow, PyTorch, MXNet, Caffe, and other training frameworks through ONNX, leveraging the continuous improvement of the ONNX open-source community. Currently, TNN supports 100+ ONNX operators, covering most mainstream CNN and NLP operators.

  • TNN runs on mainstream operating systems (Android, iOS, embedded Linux, Windows, Linux) and is compatible with ARM CPU, X86 CPU, GPU, and NPU hardware platforms.

  • TNN is constructed with a modular design, which abstracts and isolates components such as model parsing, graph construction, graph optimization, low-level hardware adaptation, and high-performance kernels. It uses the factory pattern to register and build devices, minimizing the cost of supporting additional hardware and acceleration solutions.

  • The mobile dynamic library is only around 400KB and provides basic image conversion operations, making it lightweight and convenient. TNN uses unified models and interfaces across platforms; switching backends only requires configuring a single parameter, as the sketch below illustrates.
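As an illustration of that single-parameter switch, here is a minimal sketch. The enum values follow TNN's public headers, but which backends are actually available depends on how the library was built.

```cpp
#include <memory>
#include "tnn/core/tnn.h"

// Create an instance of an already-initialized network on the chosen backend;
// only NetworkConfig::device_type changes between platforms.
std::shared_ptr<tnn::Instance> CreateOn(tnn::TNN& net, tnn::DeviceType device) {
    tnn::NetworkConfig network_config;
    network_config.device_type = device;  // e.g. tnn::DEVICE_ARM, tnn::DEVICE_OPENCL,
                                          // tnn::DEVICE_METAL, tnn::DEVICE_X86,
                                          // tnn::DEVICE_CUDA
    tnn::Status status;
    auto instance = net.CreateInst(network_config, status);
    return (status == tnn::TNN_OK) ? instance : nullptr;
}
```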

Learn About TNN Abilities

Manual

API Document

Contribute to TNN

Roadmap

Acknowledgement

TNN referenced the following projects:

License

FAQ

Join Us

  • Everyone is welcome to participate in building the best inference framework in the industry.

  • Technical discussion QQ group: 704900079 (join answer: TNN)

  • Scan the QR code to join the TNN discussion group:

tnn's People

Contributors

103yiran, 1627180283, alohali, bluaxe, bug1989, darrenyao87, devandong, gttiankai, johnzlli, lnmdlong, maosquerade, masterwan96, mpjlu, neiltian-tencent, nihui, powerpwang, qengineering, quinnrong94, seanxcwang, shaundai-tencent, shenpenwang, simple0simple, sjfeng1999, stephehuang, teslawho, tpoisonooo, winggan, xionghc, yl16417, yulv-git


tnn's Issues

Does TNN support heterogeneous execution and third-party vendor libraries?

As the title says:

  1. Is heterogeneous execution supported, e.g. NPU + GPU? That is, automatically partitioning the network, splitting the model into subgraphs that can run on the NPU and the GPU.
  2. Is calling libraries provided by third-party vendors supported, e.g. calling Qualcomm's SNPE directly on Qualcomm hardware, or Eden directly on Samsung hardware?

arm cross-compiled demo fails at initialization with the build machine's paths

As the title says: I compiled on x86 machine A and moved the build to board B. Running it prints:

E/tnn: virtual tnn::Status tnn::TNNSDKSample::Init(const string&, const string&, const string&, tnn::TNNComputeUnits, std::vector) [File /search/odin/xdrao/envs/TNN-master/examples/samples/TNNSDKSample.cc][Line 81] instance.net init failed 4098
E/tnn: virtual tnn::Status tnn::TNNSDKSample::Init(const string&, const string&, const string&, tnn::TNNComputeUnits, std::vector) [File /search/odin/xdrao/envs/TNN-master/examples/samples/TNNSDKSample.cc][Line 81] instance.net init failed 4098
TNN API ERROR:0x1002
E/tnn: virtual tnn::Status tnn::TNNSDKSample::Init(const string&, const string&, const string&, tnn::TNNComputeUnits, std::vector) [File /search/odin/xdrao/envs/TNN-master/examples/samples/TNNSDKSample.cc][Line 81] instance.net init failed 4098
/userdata/zhy/TNN-master/examples/armlinux/build #

where /search/odin/xdrao/envs/TNN-master/ is the path on my x86 machine. I haven't yet found where this initialization happens. Any help would be appreciated.

Model conversion tools contain formatting violations and duplicated code

tools/onnx2tnn/src/core/objseri/objseri.h has duplicated code, non-standard naming, and non-standard header guard macros.
tools/onnx2tnn/src/core/onnx2tnn_prefix.h has duplicated code and non-standard naming.
......
The model conversion tools were merged into the TNN project as an independent project and need to be made consistent with the main project.

Android demo gradle failure

Following the Android demo in https://github.com/Tencent/TNN/blob/master/doc/cn/user/demo.md, the following problem occurred:
```
debug|armeabi-v7a :-- Arm: ON
debug|armeabi-v7a :-- Metal: OFF
debug|armeabi-v7a :-- OpenCL: ON
debug|armeabi-v7a :-- CUDA: OFF
debug|armeabi-v7a :-- DSP: OFF
debug|armeabi-v7a :-- Atlas: OFF
debug|armeabi-v7a :-- NPU: OFF
debug|armeabi-v7a :-- OpenMP: OFF
debug|armeabi-v7a :-- TEST: OFF
debug|armeabi-v7a :-- --Unit Test: OFF
debug|armeabi-v7a :-- Qantization: OFF
debug|armeabi-v7a :-- ModelCheck: OFF
debug|armeabi-v7a :-- DEBUG:
debug|armeabi-v7a :-- PROFILE: OFF
debug|armeabi-v7a :-- BENCHMARK: OFF
debug|armeabi-v7a :-- BENCHMARK Layer: OFF
debug|armeabi-v7a :-- enable armv7 neon
debug|armeabi-v7a :CMake Error at ../../../platforms/android/CMakeLists.txt:14 (if):
debug|armeabi-v7a : if given arguments:
debug|armeabi-v7a : "21" "GREATER_EQUAL" "21"
debug|armeabi-v7a : Unknown arguments specified
debug|armeabi-v7a :Call Stack (most recent call first):
debug|armeabi-v7a : ../../../CMakeLists.txt:272 (include)
debug|armeabi-v7a :-- Configuring incomplete, errors occurred!
debug|armeabi-v7a :See also "D:/Projects/mobile_frameworks/TNN/examples/android/demo/.cxx/cmake/debug/armeabi-v7a/CMakeFiles/CMakeOutput.log".
executing external native build for cmake D:\Projects\mobile_frameworks\TNN\examples\android\demo\CMakeLists.txt

FAILURE: Build failed with an exception.
```

Conversion tool build has been stuck for a long time

At first, pip install was stuck; after switching to a mirror address it got past that.
I only need caffe/onnx to tnn conversion; why does it also have to download tensorflow?

Now it is stuck here, not moving. Probably a network problem as well; without a VPN it cannot get through:

  • [ -d build ]
  • /usr/bin/python3 script/detect_dependency.py

Could anyone provide a pre-built image?

caffemodel to tnnmodel conversion does not support Power

The list of ops supported by TNN includes Power, and my caffe model has a Power layer, but conversion reports the following error:
Failed type not support: Power
How should this be resolved?

layer {
  name: "input"
  type: "Input"
  top: "data"
  input_param {
    shape {
      dim: 1
      dim: 1
      dim: 256
      dim: 256
    }
  }
}
layer {
  name: "trans"
  type: "Power"
  bottom: "data"
  top: "trans"
  power_param {
    power: 1.0
    scale: 0.0078125
    shift: -1.0
  }
}
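For reference, Caffe's Power layer computes y = (shift + scale * x)^power elementwise, so the layer above (power = 1.0) is simply the linear normalization x * 0.0078125 - 1.0. A plain C++ sketch of those semantics, independent of TNN:

```cpp
#include <cmath>
#include <vector>

// Semantics of Caffe's Power layer: y = (shift + scale * x) ^ power.
// With power = 1.0, scale = 0.0078125 (1/128), shift = -1.0, this is just
// x / 128 - 1, i.e. roughly [0, 255] -> [-1, 1] input normalization.
std::vector<float> PowerLayer(const std::vector<float>& x,
                              float power, float scale, float shift) {
    std::vector<float> y(x.size());
    for (size_t i = 0; i < x.size(); ++i) {
        y[i] = std::pow(shift + scale * x[i], power);
    }
    return y;
}
```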

Error converting a tensorflow pb model

Hi, many thanks to the Tencent folks for TNN!

I ran into a small problem and came here for advice. The error message is as follows:

...
weight = const_fold_opt__100
weight = Const__82
python3: xxxx/onnx_utility.cc:544: const float* get_tensor_proto_data(const onnx::TensorProto&): Assertion `0` Aborted (core dumped).

From my analysis, the failure is the assert(0), but the printf statement at line 544 of onnx_utility.cc does not print the corresponding data type to the console, and switching to std::cout does not print it either.

How can I view the printf output, so I can check what the problem is?
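One plausible explanation, offered as a sketch rather than a confirmed diagnosis: a failed assert calls abort(), which can end the process before stdio buffers are flushed, so output printed immediately beforehand is lost. Writing to the unbuffered stderr, or flushing explicitly, usually makes the message visible:

```cpp
#include <cassert>
#include <cstdio>

// stdout is typically block-buffered when redirected, and abort() (called by a
// failed assert) terminates the process without flushing stdio buffers, so
// output printed just before the assert can be lost. Writing to stderr
// (unbuffered) or flushing explicitly makes the message visible.
void ReportAndAssert(int data_type) {
    fprintf(stderr, "unsupported tensor data_type = %d\n", data_type);
    fflush(stdout);  // flush anything already queued on stdout
    assert(0);
}
```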

No DSP or NPU code

Not found under source/tnn/device, although CMakeLists.txt has these build options.

Could tflite-to-tnn conversion be supported?

I'm a novice user and don't know much about tensorflow models, but I have used tf_lite and MNN before, and both converted tensorflow models to a runnable state very well. Specifically:
tf_lite converts tensorflow directly in code, which is very convenient.
MNN's path is tensorflow -> tflite -> MNN, and the process does not require the -in/-on parameters. Novice users may only be borrowing an existing model to try its effect, without really understanding the model structure, so they don't know what to pass for those parameters.

Missing file metal_cpu_adapter_acc.mm

1 warning generated.
error: Build input file cannot be found: '/Users/chaostong/Downloads/TNN/source/tnn/device/metal/acc/metal_cpu_adapter_acc.mm' (in target 'tnn' from project 'tnn')
warning: Could not read serialized diagnostics file: Cannot Load File: Failed to open diagnostics file (in target 'tnn' from project 'tnn')
note: Using new build system
note: Planning build
note: Constructing build description
** BUILD FAILED **

model_check has a build bug

When TNN_BUILD_SHARED is set to OFF and the alignment tool model_check is built, running model_check fails with:

 tnn init falied: code: 0x2000 msg: not support mode type! model_checker init failed!

The reason is that model_check's CMakeLists.txt does not yet support static linking.

fix compile warnings

DEPRECATED("use Mat(DeviceType, MatType, DimsVector, void*) instead");

Cross-compiling for android on linux produces a large number of warnings.

/GitProjects/TNN/include/tnn/utils/blob_converter.h:54:5: warning: declaration does not declare anything [-Wmissing-declarations]
    __attribute__((deprecated ("use Mat(DeviceType, MatType, DimsVector, void*) instead")));
    ^

Is this meant to be an #error, or simply an __attribute__((deprecated)) annotation on an interface?
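For context, here is a sketch of the conventional placement, attaching the attribute to the old constructor declaration itself so callers get a deprecation warning rather than -Wmissing-declarations firing on a stray statement. This is illustrative only, not the actual TNN header:

```cpp
#include <vector>

// Illustrative only: the attribute belongs on the declaration being deprecated,
// so callers get a -Wdeprecated-declarations warning at the call site.
class Mat {
public:
    __attribute__((deprecated("use Mat(DeviceType, MatType, DimsVector, void*) instead")))
    Mat(int device_type, int mat_type, void* data);  // old constructor

    Mat(int device_type, int mat_type, std::vector<int> dims, void* data);
};
```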

Poor VGG16 performance on Kirin 710F

model            cpu 1 thread (ms)   gpu time (ms)
VGG16            1323                2560
Mobilenet_v2     95                  24
squeezenet_v1.1  64                  25
Shufflenet_v2.0  15                  7
