
armneonoptimization's Introduction

ArmNeonOptimization

Environment

HiSilicon Hi3519A (2 x Cortex-A53)

Build And Run

  1. Remove all localized settings

     Run command:

         export LC_ALL=C

  2. Make

     Run command:

         make clean; make -j4

Speed Test

1. Box Filter Algorithm

| Image Resolution | Radius | Optimization Algorithm | Loop Count | Time | Cores |
|---|---|---|---|---|---|
| 4032x3024 | 3 | Origin Algorithm | 10 | 2313.40ms | 1xA17 |
| 4032x3024 | 3 | RowAndCol Split | 10 | 784.98ms | 1xA17 |
| 4032x3024 | 3 | RowAndCol Split && Reduce Repeated Computations | 10 | 2496.55ms | 1xA17 |
| 4032x3024 | 3 | Reduce Cache Miss | 10 | 302.00ms | 1xA17 |
| 4032x3024 | 3 | Neon Intrinsics | 10 | 188.37ms | 1xA17 |
| 4032x3024 | 3 | Neon Assembly | 10 | 187.7ms | 1xA17 |
| 4032x3024 | 3 | Neon Assembly+pld | 10 | 158.70ms | 1xA17 |
| 4032x3024 | 3 | Neon Assembly+Diff Predeal | 10 | 181.40ms | 1xA17 |
| 4032x3024 | 3 | Neon AssemblyV2 | 10 | 145.92ms | 1xA17 |
| 4032x3024 | 3 | NCNN Origin | 10 | 281.26ms | 1xA17 |
| 4032x3024 | 3 | NCNN Neon Intrinsics | 10 | 236.82ms | 1xA17 |
| 4032x3024 | 3 | NCNN Neon Assembly | 10 | 68.54ms | 1xA17 |
| 4032x3024 | 3 | NCNN Neon AssemblyV2 | 10 | 61.63ms | 1xA17 |
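
For readers unfamiliar with the variants named in the table, the following is a minimal scalar sketch of the row/column split combined with a running column sum. It is not the repository's code: the function name `boxFilterSeparable`, the `std::vector` interface, and the border handling (windows are clipped at the image edge and the result is a plain sum, not a mean) are assumptions made for illustration.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: sum of all pixels in a (2*radius+1)^2 window,
// clipped at the image border. Row pass first, then column pass.
std::vector<float> boxFilterSeparable(const std::vector<float>& src,
                                      int width, int height, int radius) {
    assert(width > 0 && height > 0 && radius >= 0);
    assert(src.size() == static_cast<std::size_t>(width) * height);

    std::vector<float> rowSum(src.size(), 0.0f);
    std::vector<float> dst(src.size(), 0.0f);

    // Horizontal pass: slide a 1D window along each row, adding the pixel
    // that enters the window and subtracting the one that leaves it.
    for (int y = 0; y < height; ++y) {
        const float* in = src.data() + static_cast<std::size_t>(y) * width;
        float* out = rowSum.data() + static_cast<std::size_t>(y) * width;
        float acc = 0.0f;
        for (int x = 0; x <= std::min(radius, width - 1); ++x) acc += in[x];
        out[0] = acc;
        for (int x = 1; x < width; ++x) {
            if (x + radius < width) acc += in[x + radius];
            if (x - radius - 1 >= 0) acc -= in[x - radius - 1];
            out[x] = acc;
        }
    }

    // Vertical pass: keep one running sum per column and walk the rows in
    // memory order, so every access is contiguous.
    std::vector<float> colSum(width, 0.0f);
    for (int y = 0; y <= std::min(radius, height - 1); ++y) {
        const float* row = rowSum.data() + static_cast<std::size_t>(y) * width;
        for (int x = 0; x < width; ++x) colSum[x] += row[x];
    }
    std::copy(colSum.begin(), colSum.end(), dst.begin());
    for (int y = 1; y < height; ++y) {
        if (y + radius < height) {
            const float* add = rowSum.data() + static_cast<std::size_t>(y + radius) * width;
            for (int x = 0; x < width; ++x) colSum[x] += add[x];
        }
        if (y - radius - 1 >= 0) {
            const float* sub = rowSum.data() + static_cast<std::size_t>(y - radius - 1) * width;
            for (int x = 0; x < width; ++x) colSum[x] -= sub[x];
        }
        std::copy(colSum.begin(), colSum.end(),
                  dst.begin() + static_cast<std::size_t>(y) * width);
    }
    return dst;
}
```

The inner loops over `x` in the vertical pass touch contiguous memory only, which is the kind of loop that NEON intrinsics or assembly can then process four floats at a time.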

2. WinoGrad3x3s1 F(6, 3) Algorithm Version1.0

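| inputHeight | inputWidth | inputChannel | KernelSize | outChannel | Optimization Algorithm | Loop Count | Time | Cores |
|---|---|---|---|---|---|---|---|---|
| 15 | 15 | 512 | 3x3 | 1024 | Hand-optimized | 10 | 582.67ms | 1 |
| 15 | 15 | 512 | 3x3 | 1024 | WinoGrad Version1.0 | 10 | 336.81ms | 1 |
| 56 | 56 | 64 | 3x3 | 128 | Hand-optimized | 10 | 124.63ms | 1 |
| 56 | 56 | 64 | 3x3 | 128 | WinoGrad Version1.0 | 10 | 60.41ms | 1 |
| 56 | 56 | 64 | 3x3 | 128 | Hand-optimized | 10 | 61.16ms | 2 |
| 56 | 56 | 64 | 3x3 | 128 | WinoGrad Version1.0 | 10 | 32.69ms | 2 |
| 6 | 6 | 512 | 3x3 | 1024 | Hand-optimized | 10 | 74.03ms | 1 |
| 6 | 6 | 512 | 3x3 | 1024 | WinoGrad Version1.0 | 10 | 76.30ms | 1 |
| 6 | 6 | 512 | 3x3 | 1024 | Hand-optimized | 10 | 36.49ms | 2 |
| 6 | 6 | 512 | 3x3 | 1024 | WinoGrad Version1.0 | 10 | 41.67ms | 2 |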
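
As background for the benchmark above, here is a minimal sketch of the Winograd idea in its smallest form, 1D F(2, 3), rather than the 2D F(6, 3) used in this project. It trades multiplications for additions: each tile of 4 inputs produces 2 outputs with 4 multiplications instead of 6. The function name `conv1dWinogradF23` and its interface are hypothetical and do not come from the repository.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: valid 1D correlation of `in` with a 3-tap filter `g`
// using Winograd F(2, 3) tiles.
std::vector<float> conv1dWinogradF23(const std::vector<float>& in,
                                     const std::array<float, 3>& g) {
    // This sketch only handles an even number of outputs (in.size() - 2).
    assert(in.size() >= 4 && in.size() % 2 == 0);

    // Filter transform: computed once and reused for every tile.
    const float u0 = g[0];
    const float u1 = 0.5f * (g[0] + g[1] + g[2]);
    const float u2 = 0.5f * (g[0] - g[1] + g[2]);
    const float u3 = g[2];

    std::vector<float> out(in.size() - 2);
    for (std::size_t i = 0; i + 3 < in.size(); i += 2) {
        const float d0 = in[i], d1 = in[i + 1], d2 = in[i + 2], d3 = in[i + 3];
        // Input transform followed by the four element-wise products.
        const float m0 = (d0 - d2) * u0;
        const float m1 = (d1 + d2) * u1;
        const float m2 = (d2 - d1) * u2;
        const float m3 = (d1 - d3) * u3;
        // Output transform: two results per tile.
        out[i] = m0 + m1 + m2;      // d0*g0 + d1*g1 + d2*g2
        out[i + 1] = m1 - m2 - m3;  // d1*g0 + d2*g1 + d3*g2
    }
    return out;
}
```

The larger F(6, 3) tiles follow the same three steps (filter transform, input transform, output transform) with bigger transform matrices, which saves more multiplications per output but adds transform overhead. That overhead is one plausible reason the 6x6 input case in the table shows no gain, though the source does not state the cause.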

armneonoptimization's People

Contributors

bbuf


armneonoptimization's Issues

some bugs?

    //shuipin
    for (int Y = 0; Y < Radius; Y++) {
        int Stride = Y * Width;
        for (int X = 0; X < Width; X++) {
            colsumPtr[X] += colsumPtr[Stride + X];
        }
    }

should be

    //shuipin
    for (int Y = 0; Y < Radius; Y++) {
        int Stride = Y * Width;
        for (int X = 0; X < Width; X++) {
            colsumPtr[X] += cachePtr[Stride + X];
        }
    }

In the BoxFilterNeonIntrinsics function:

    for (int Y = 0; Y < Radius; Y++) {
        int Stride = Y * Width;
        float* tmpColSumPtr = colsumPtr;
        float* tmpCachePtr = cachePtr;

should be

    for (int Y = 0; Y < Radius; Y++) {
        int Stride = Y * Width;
        float* tmpColSumPtr = colsumPtr;
        float* tmpCachePtr = cachePtr + Stride;
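
The fix proposed in this issue can be sketched in isolation as follows. The helper name `initColSum` and the `std::vector` interface are hypothetical; the point is only that the column sums are seeded from the row-sum cache (`cachePtr + Stride` in the issue's terms), not from the column-sum buffer itself.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: add the first `radius` rows of the row-sum cache
// into a per-column accumulator of length `width`.
std::vector<float> initColSum(const std::vector<float>& cache,
                              int width, int radius) {
    assert(width > 0 && radius >= 0);
    assert(cache.size() >= static_cast<std::size_t>(width) * radius);

    std::vector<float> colSum(width, 0.0f);
    for (int y = 0; y < radius; ++y) {
        // Equivalent of `cachePtr + Stride` with Stride = y * width.
        const float* row = cache.data() + static_cast<std::size_t>(y) * width;
        for (int x = 0; x < width; ++x) {
            colSum[x] += row[x];
        }
    }
    return colSum;
}
```

Reading from the column-sum buffer at offset `Stride + X` would, for `Y >= 1`, index past the first `Width` elements of that buffer, so the result depends on whatever happens to be stored there.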

Useless code fragment.

Location: boxFilter.cpp, function void BoxFilterCache(..):

    //shuipin
    for (int Y = 0; Y < Radius; Y++) {
        int Stride = Y * Width;
        for (int X = 0; X < Width; X++) {
            colsumPtr[X] += colsumPtr[Stride + X];
        }
    }

It is actually unused.
