An MLIR-based inference engine for both edge and server systems. It has the following features:
- Native aggressive kernel fusion across multiple levels of the memory hierarchy
- Native support for static memory allocation (scratchpad memory and DDR)
- Support for diverse AI accelerators
- Native support for flexible heterogeneous execution at multiple levels:
  - SoC level
  - Processor level
- A very fast and efficient auto-scheduler, including an accurate and fast cost model
- Multiple frontends:
  - tvm.relay
  - pytorch
  - mhlo
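
To make the kernel-fusion feature concrete, here is a minimal NumPy sketch of the general idea (not this engine's API): a fused kernel computes a chain of ops in one pass, so intermediate tensors never round-trip through slower memory such as DDR.

```python
import numpy as np

def unfused(x, w, b):
    # Each step materializes a full intermediate tensor in memory.
    t1 = x @ w                   # matmul
    t2 = t1 + b                  # bias add
    return np.maximum(t2, 0.0)   # relu

def fused(x, w, b):
    # A fused kernel evaluates matmul + bias + relu together,
    # keeping intermediates in registers/scratchpad instead of DDR.
    return np.maximum(x @ w + b, 0.0)

x = np.random.rand(4, 8)
w = np.random.rand(8, 3)
b = np.random.rand(3)
assert np.allclose(unfused(x, w, b), fused(x, w, b))
```

Both functions produce the same result; fusion changes only where the intermediates live, which is why the engine applies it at multiple levels of the memory hierarchy.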
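
Static memory allocation can likewise be sketched in a few lines. This is a hypothetical greedy first-fit planner, not this engine's allocator: at compile time each tensor gets a fixed offset in a buffer (e.g. scratchpad), and space is reused once a tensor's lifetime ends.

```python
def plan_offsets(tensors):
    """tensors: list of (name, size, start_step, end_step).
    Greedy first-fit: place each tensor at the lowest offset that
    does not overlap any live, already-placed tensor."""
    placed = []   # (offset, size, start, end)
    offsets = {}
    for name, size, start, end in sorted(tensors, key=lambda t: t[2]):
        # Already-placed tensors whose lifetimes overlap this one.
        live = sorted((o, s) for o, s, s0, e0 in placed
                      if not (e0 < start or end < s0))
        offset = 0
        for o, s in live:
            if offset + size <= o:
                break             # fits in the gap before this block
            offset = max(offset, o + s)
        placed.append((offset, size, start, end))
        offsets[name] = offset
    return offsets

offs = plan_offsets([
    ("a", 64, 0, 1),
    ("b", 32, 1, 2),
    ("c", 64, 2, 3),   # lifetime disjoint from "a", so it reuses a's space
])
# offs → {"a": 0, "b": 64, "c": 0}
```

Because all offsets are decided ahead of time, no allocator runs during inference, which matters for deterministic latency on edge devices with small scratchpad memories.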