Comments (5)
NVDLA supports Winograd not only for 3x3 kernels but also for some other kernel sizes.
A kernel can be supported by Winograd when it meets this condition:
int((original_kernel_width + conv_x_stride - 1) / conv_x_stride) == 3 &&
int((original_kernel_height + conv_y_stride - 1) / conv_y_stride) == 3
Once this condition is met, the kernel can be run in Winograd mode with some extra steps.
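In short, NVDLA's Winograd mode can support kernels other than 3x3: as long as the kernel width divided by the horizontal stride equals 3 (rounded up) and the kernel height divided by the vertical stride also equals 3 (rounded up), the kernel can be accelerated with Winograd. Some extra kernel adjustment steps are required, of course.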
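The condition above can be sketched in C as follows. This is only an illustration of the arithmetic: the function names `ceil_div` and `winograd_kernel_supported` are hypothetical and are not part of any NVDLA API, and the sketch assumes positive integer kernel sizes and strides.

```c
#include <assert.h>
#include <stdbool.h>

/* Ceiling division for positive integers: ceil(a / b). */
int ceil_div(int a, int b)
{
    assert(a > 0 && b > 0);
    return (a + b - 1) / b;
}

/* Hypothetical helper (not an NVDLA API): true when the kernel meets the
 * condition quoted in the comment above, i.e. ceil(size / stride) == 3
 * in both the horizontal and the vertical direction. */
bool winograd_kernel_supported(int kernel_w, int kernel_h,
                               int stride_x, int stride_y)
{
    return ceil_div(kernel_w, stride_x) == 3 &&
           ceil_div(kernel_h, stride_y) == 3;
}
```

For example, a 5x5 kernel with stride 2 passes the check (ceil(5/2) = 3), while a 7x7 kernel with stride 2 does not (ceil(7/2) = 4).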
from doc.
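Thank you very much for your reply; this is very important information for me. When I was studying Winograd-related algorithms I never considered this case, and because I saw that the inverse transform of the Winograd output activation is implemented with adders that have fixed connections, I always assumed only one transform could be supported. Is there a related paper on the kernel adjustment that I could refer to? Thanks again!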
Thanks so much! This is very important to me, since I had never considered that case and I assumed the inverse transform of the output activation was fixed by the adders. Could you please share some related papers on the kernel adjustment?
By the way, I think "int" is a bit confusing; "ceiling" would be clearer.
from doc.
Actually, it is still the 3x3 case. A non-3x3 kernel that meets the formulas above can be transformed into a 3x3 kernel whose stride is 1x1. This is shown in the first two diagrams of http://nvdla.org/_images/format_channel_extension_and_conversion_for_wingorad.svg
The input activation must be remapped accordingly. The CDMA in NVDLA handles this conversion of the input activation automatically.
PS: I'm used to C code :-)
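In fact, this is still a 3x3 kernel. Any non-3x3 kernel that satisfies the two expressions above can be converted into a 3x3 kernel with a 1x1 stride.
Correspondingly, the input activation also has to be rearranged to match the converted kernel. This is done by the CDMA in NVDLA.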
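To illustrate the conversion described above, here is a minimal C sketch for a single input channel. It treats each position inside a stride x stride block as an extra channel, so a k x k kernel with the given stride becomes a 3x3, stride-1 kernel over stride * stride channels, with zeros filling the taps and pixels that fall outside the original data. All function names are hypothetical, and the channel ordering `c = sy * stride + sx` is an assumption made for illustration; the ordering NVDLA actually uses is not stated in this thread.

```c
#include <assert.h>

/* Element (c, y, x) of the remapped input. Sub-position (sy, sx) inside each
 * stride x stride block becomes extra channel c = sy * stride + sx (assumed
 * ordering). Reads outside the original h x w image return 0. */
int ext_input(const int *in, int h, int w, int stride, int c, int y, int x)
{
    int sy = c / stride, sx = c % stride;
    int iy = y * stride + sy, ix = x * stride + sx;
    return (iy < h && ix < w) ? in[iy * w + ix] : 0;
}

/* Element (c, j, i) of the converted 3x3 kernel. Taps that fall outside the
 * original k x k kernel are zero. */
int ext_weight(const int *wt, int k, int stride, int c, int j, int i)
{
    int sy = c / stride, sx = c % stride;
    int ky = j * stride + sy, kx = i * stride + sx;
    return (ky < k && kx < k) ? wt[ky * k + kx] : 0;
}

/* Reference: one output of the original k x k convolution with this stride. */
int conv_direct(const int *in, int h, int w, const int *wt, int k,
                int stride, int oy, int ox)
{
    assert(oy * stride + k <= h && ox * stride + k <= w);
    int acc = 0;
    for (int ky = 0; ky < k; ky++)
        for (int kx = 0; kx < k; kx++)
            acc += in[(oy * stride + ky) * w + (ox * stride + kx)] *
                   wt[ky * k + kx];
    return acc;
}

/* The same output computed as a 3x3, stride-1 convolution over
 * stride * stride channels of the remapped input. */
int conv_as_3x3(const int *in, int h, int w, const int *wt, int k,
                int stride, int oy, int ox)
{
    assert(stride > 0 && (k + stride - 1) / stride == 3);
    int acc = 0;
    for (int c = 0; c < stride * stride; c++)
        for (int j = 0; j < 3; j++)
            for (int i = 0; i < 3; i++)
                acc += ext_input(in, h, w, stride, c, oy + j, ox + i) *
                       ext_weight(wt, k, stride, c, j, i);
    return acc;
}
```

For a 5x5 kernel with stride 2 on a 9x9 input, `conv_direct` and `conv_as_3x3` return the same value at every output position, which is why the hardware only needs the 3x3 Winograd datapath once the kernel and input have been remapped.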
from doc.
Thanks! Happy New Year!
from doc.
@pengwubj can you please get in touch with me at [email protected]? It concerns a systems programming course I teach.
from doc.
Related Issues (14)
- Regarding the compilation of latex files HOT 2
- doc/doc/hw/v1/ias/ias_image45_cdp_curve.svg is disappeared HOT 2
- nv_small and nv_full difference HOT 1
- Hardware available with DLA cores
- Where is Compilation tools Parser and Compiling?
- Length of a Stripe Operation
- Pooling error for large images
- more details about the register definition HOT 1
- what's the MAX_BUSY_CYCLE mean? HOT 1
- Address Values Wrong in Table HOT 2
- Build fails on Ubuntu with "sw_vers not found" HOT 3
- Genral Questions on NVDLA HOT 3
- Power estimation