Comments (7)
I'd need to look closer into 1.
Regarding 2: `input_x` and `input_y` are computed in unsigned arithmetic, where negative numbers wrap around and become very large positive numbers. Thus, the comparison `input_x < input_width` is correct: if `input_x` is negative when viewed as a signed number, it is a very large positive number when viewed as an unsigned number, and `input_x < input_width` (an unsigned comparison) is false.
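A minimal C sketch of the trick described above, not the actual XNNPACK code: the helper names (`in_bounds_signed`, `in_bounds_unsigned`, `input_column`, `check_equivalence`) and the parameters `stride` and `padding_left` are made up for illustration. It shows that one unsigned comparison gives the same answer as the two signed comparisons it replaces, as long as the width itself is non-negative.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Signed version: two comparisons are needed to reject out-of-range columns. */
static bool in_bounds_signed(ptrdiff_t input_x, ptrdiff_t input_width) {
  return input_x >= 0 && input_x < input_width;
}

/* Unsigned version: a "negative" index has wrapped around to a huge value,
 * so the single comparison rejects it as well. */
static bool in_bounds_unsigned(size_t input_x, size_t input_width) {
  return input_x < input_width;
}

/* Hypothetical index computation done entirely in size_t arithmetic.
 * If padding_left exceeds output_x * stride + kernel_x, the result wraps
 * modulo SIZE_MAX + 1 instead of becoming negative. */
static size_t input_column(size_t output_x, size_t stride,
                           size_t kernel_x, size_t padding_left) {
  return output_x * stride + kernel_x - padding_left;
}

/* Checks that both versions agree for indices around the valid range. */
static void check_equivalence(void) {
  const ptrdiff_t input_width = 8;
  for (ptrdiff_t x = -4; x < 12; x++) {
    assert(in_bounds_signed(x, input_width) ==
           in_bounds_unsigned((size_t) x, (size_t) input_width));
  }
}
```

For example, `input_column(0, 1, 0, 1)` wraps to `SIZE_MAX`, so `in_bounds_unsigned` reports it as out of range without a separate `>= 0` test.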
from xnnpack.
@Maratyszcza Thanks for your reply. I didn't notice the `size_t` type. So using an unsigned type here saves one branch that would otherwise have to be added, right?
Right, using an unsigned type removes half of the comparisons (and branches).
Thanks, I get this brilliant trick now. What about the first problem?
The first problem is a bug. It should be `kernel_size + (output_width - 1) * step_width * kernel_height`, as you suggested.
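As a quick arithmetic check, the corrected expression can be written as a small C function. This is only a sketch: the thread does not say which quantity the expression sizes, so the function name `corrected_size` is hypothetical, and the sample values below assume a 3x3 kernel with `kernel_size = kernel_height * kernel_width`, which is an assumption rather than something stated here.

```c
#include <stddef.h>

/* Hypothetical name; evaluates the corrected expression from the comment:
 * kernel_size + (output_width - 1) * step_width * kernel_height.
 * Assumes output_width >= 1 so that the unsigned subtraction does not wrap. */
static size_t corrected_size(size_t kernel_size, size_t output_width,
                             size_t step_width, size_t kernel_height) {
  return kernel_size + (output_width - 1) * step_width * kernel_height;
}
```

With `kernel_size = 9`, `output_width = 4`, `step_width = 1`, and `kernel_height = 3`, this gives 9 + 3 * 1 * 3 = 18.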
@Maratyszcza Got it. Thanks!
Fixed in 03ff294