yyyy3531614 / math-neon

Automatically exported from code.google.com/p/math-neon

C 100.00%

math-neon's Issues

Won't compile on ARM64

ARM changed the NEON instructions for 64-bit, so none of this inline assembly 
will compile on ARM64. The code needs to be guarded using the __arm64__ define.
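
A minimal sketch of such a guard, assuming GCC or Clang predefined macros. __arm64__ is the name Apple toolchains define, while __aarch64__ is what GCC and Clang define on 64-bit ARM targets in general, so the sketch checks both. MATH_NEON_USE_ASM and math_neon_backend() are hypothetical names used for illustration; they are not part of math-neon.

```c
/* Enable the 32-bit NEON inline assembly only on 32-bit ARM targets.
 * MATH_NEON_USE_ASM is a hypothetical macro name, not a math-neon one. */
#if defined(__arm__) && defined(__ARM_NEON__) \
    && !defined(__aarch64__) && !defined(__arm64__)
#define MATH_NEON_USE_ASM 1
#else
#define MATH_NEON_USE_ASM 0
#endif

/* Hypothetical helper: reports which code path the guard selected. */
const char *math_neon_backend(void)
{
#if MATH_NEON_USE_ASM
    return "neon-asm";
#else
    return "portable-c";
#endif
}
```

Each *_neon_hfp / *_neon_sfp body could then sit inside `#if MATH_NEON_USE_ASM`, with the existing *_c variant used otherwise.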

Original issue reported on code.google.com by [email protected] on 13 Feb 2014 at 2:03

atan2f_neon_sfp calls atan2f_c with reverse arguments

Line 164 of math_atan2f.c should read atan2f_c(x, y)...

(Or, more appropriately, rename the arguments in _sfp to y, x and change line 
161 to call atan2f_neon_hfp(y, x) instead.)

Original issue reported on code.google.com by [email protected] on 7 Aug 2011 at 7:51

simple makefile for build math_debug

I cooked up a simple makefile for building math_debug:

http://gitorious.org/vjaquez-misc/math-neon/commit/14ba470caad37c33cf7245be69efc9a1366d8f99?format=patch

Original issue reported on code.google.com by [email protected] on 25 Mar 2011 at 11:59

Operands access

Architecture: Xilinx Zynq (ARM Cortex-A9)
Compiler: arm-xilinx-eabi-gcc (Sourcery CodeBench Lite 2012.09-105) 4.7.2
Arguments: -Wall -O0 -g3 -c -fmessage-length=0 -I../../cpu0_bsp/ps7_cortexa9_0/include
3.

The following warnings appear:
math_sinf.c:123:1: warning: control reaches end of non-void function [-Wreturn-type]
math_sinf.c:111:1: warning: control reaches end of non-void function [-Wreturn-type]

Also the function sinf_neon() does not return the correct value. However, the 
following code behaves correctly:

float sinf_neon_rms(float x)
{
    asm volatile (

        "vld1.32                d3, [%1]                                \n\t"   //d3 = {invrange, range}
        "vdup.f32               d0, %3                                  \n\t"   //d0 = {x, x}
        "vabs.f32               d1, d0                                  \n\t"   //d1 = {ax, ax}

        "vmul.f32               d2, d1, d3[0]                           \n\t"   //d2 = d1 * d3[0]
        "vcvt.u32.f32           d2, d2                                  \n\t"   //d2 = (int) d2
        "vmov.i32               d5, #1                                  \n\t"   //d5 = 1
        "vcvt.f32.u32           d4, d2                                  \n\t"   //d4 = (float) d2
        "vshr.u32               d7, d2, #1                              \n\t"   //d7 = d2 >> 1
        "vmls.f32               d1, d4, d3[1]                           \n\t"   //d1 = d1 - d4 * d3[1]

        "vand.i32               d5, d2, d5                              \n\t"   //d5 = d2 & d5
        "vclt.f32               d18, d0, #0                             \n\t"   //d18 = (d0 < 0.0)
        "vcvt.f32.u32           d6, d5                                  \n\t"   //d6 = (float) d5
        "vmls.f32               d1, d6, d3[1]                           \n\t"   //d1 = d1 - d6 * d3[1]
        "veor.i32               d5, d5, d7                              \n\t"   //d5 = d5 ^ d7
        "vmul.f32               d2, d1, d1                              \n\t"   //d2 = d1*d1 = {x^2, x^2}

        "vld1.32                {d16, d17}, [%2]                        \n\t"   //q8 = {p7, p3, p5, p1}
        "veor.i32               d5, d5, d18                             \n\t"   //d5 = d5 ^ d18
        "vshl.i32               d5, d5, #31                             \n\t"   //d5 = d5 << 31
        "veor.i32               d1, d1, d5                              \n\t"   //d1 = d1 ^ d5

        "vmul.f32               d3, d2, d2                              \n\t"   //d3 = d2*d2 = {x^4, x^4}
        "vmul.f32               q0, q8, d1[0]                           \n\t"   //q0 = q8 * d1[0] = {p7x, p3x, p5x, p1x}
        "vmla.f32               d1, d0, d2[0]                           \n\t"   //d1 = d1 + d0*d2 = {p5x + p7x^3, p1x + p3x^3}
        "vmla.f32               d1, d3, d1[0]                           \n\t"   //d1 = d1 + d3*d1[0] = {...., p1x + p3x^3 + p5x^5 + p7x^7}

        "vmov.f32               %0, s3                                  \n\t"   //x = s3 (high lane of d1)
        : "=r"(x)
        : "r"(__sinf_rng), "r"(__sinf_lut), "r"(x)
        : "q0", "q1", "q2", "q3", "q8", "q9"
    );

    return x;
}

Original issue reported on code.google.com by [email protected] on 27 Jun 2013 at 9:18

Compilation errors

What steps will reproduce the problem?
1. Compile math_acosf.c
2. Compile math_vec2.c
3. Compile math_vec4.c

What is the expected output? What do you see instead?
There are errors in the functions dot4_neon_hfp, dot2_neon_hfp, and 
acosf_neon_hfp. The compiler error is "expected string literal before ')' 
token", and it points at what appears to be a missing register string at the 
end of the asm block. I do not know ARM NEON assembly, and I am very rusty on 
assembly in general, so I am still trying to figure out how to fix it.
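
For reference, a minimal sketch of the four sections of a GCC extended asm statement, assuming the error comes from a clobber section that is opened with a colon but left empty (some GCC versions reject that, which would match the "missing register string" described above). The empty template keeps the sketch buildable on any GCC or Clang target; asm_passthrough() is a hypothetical name, not one of the math-neon functions.

```c
/* Hypothetical demo function showing the layout of extended asm:
 *   template : outputs : inputs : clobbers
 * Every clobber must be a string literal. If nothing is clobbered, the
 * safest form is to drop the last colon entirely rather than leave an
 * empty section before the closing parenthesis. */
int asm_passthrough(int v)
{
    __asm__ volatile(
        ""            /* template: no instructions in this sketch */
        : "+r"(v)     /* outputs: v is read and written in a register */
        :             /* inputs: none */
        : "memory"    /* clobbers: at least one string literal */
    );
    return v;
}
```

In the NEON functions the clobber list would name the registers the asm body overwrites, for example "d0", "d1", instead of "memory".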


What version of the product are you using? On what operating system?
I'm using the only code I've been able to find on the SVN.  The OS is Angstrom 
Linux, running on a beagleboard.

Original issue reported on code.google.com by [email protected] on 5 Oct 2010 at 5:12

not valid intrinsic's dot4_neon()

The function dot4_neon() uses the intrinsics API, which the project does not 
handle correctly; it also collides with the signature declared in math_neon.h.

I just commented it out:

http://gitorious.org/vjaquez-misc/math-neon/commit/3ca3102732e0786350486b52329f07392554bd97?format=patch

Original issue reported on code.google.com by [email protected] on 25 Mar 2011 at 12:02

asinf_c() does not seem to be giving correct results

What steps will reproduce the problem?
1. Call asinf_c(x) 
2. Call system asinf(x) with same x
3. Compare the two results

What is the expected output? What do you see instead?
When x=-0.9193184972, the system call returns 0.20978982746601104736 and 
asinf_c() returns 0.48538914322853088379
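
The comparison in the steps above can be sketched as a small sweep that reports the largest absolute difference between two float functions. This is only an illustration: max_abs_diff(), demo_reference() and demo_candidate() are hypothetical names, and the two demo functions are stand-ins so the sketch builds without math-neon or libm.

```c
#include <stddef.h>

typedef float (*unary_fn)(float);

/* Evaluate both functions at `steps` evenly spaced points in [lo, hi] and
 * return the largest absolute difference. If worst_x is not NULL it receives
 * the input at which that difference occurred (lo if they never differ).
 * NaN differences are ignored by the comparison below. */
float max_abs_diff(unary_fn candidate, unary_fn reference,
                   float lo, float hi, int steps, float *worst_x)
{
    float worst = 0.0f;
    if (worst_x != NULL)
        *worst_x = lo;
    for (int i = 0; i < steps; ++i) {
        float t = (steps > 1) ? (float)i / (float)(steps - 1) : 0.0f;
        float x = lo + (hi - lo) * t;
        float d = candidate(x) - reference(x);
        if (d < 0.0f)
            d = -d;
        if (d > worst) {
            worst = d;
            if (worst_x != NULL)
                *worst_x = x;
        }
    }
    return worst;
}

/* Stand-in functions (assumptions, for demonstration only). */
float demo_reference(float x) { return 2.0f * x; }
float demo_candidate(float x) { return 2.0f * x + 0.5f; }
```

With math-neon and <math.h> available, the call would look like max_abs_diff(asinf_c, asinf, -1.0f, 1.0f, 2001, &x), assuming asinf_c takes and returns a float.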

What version of the product are you using? On what operating system?
Trunk version on iPhone simulator.

Please provide any additional information below.

Thanks,
Mike

Original issue reported on code.google.com by [email protected] on 4 Nov 2010 at 2:06

impl. of sqrtf_neon_hpf()


sqrtf_neon_hpf() first computes the inverse of the square root, and then the 
reciprocal, i.e.

t = 1/sqrt(x)
r = 1/t

It might be easier/faster to compute the inverse of the square root and then 
multiply by the original value, i.e.

t = 1/sqrt(x)
r = x * t
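
A minimal C sketch of the second form, relying on sqrt(x) = x * (1/sqrt(x)) for x > 0. It is not the code of sqrtf_neon_hpf(): the bit-trick seed below merely stands in for a hardware reciprocal square root estimate, and rsqrt_estimate() and sqrtf_via_rsqrt() are hypothetical names. Note that x = 0 needs special handling, since 0 * (1/sqrt(0)) is 0 * infinity.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for finite x > 0: an integer bit-trick seed
 * refined by three Newton-Raphson steps. Stand-in for a hardware
 * estimate; not taken from math-neon. */
static float rsqrt_estimate(float x)
{
    uint32_t bits;
    float y;

    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);      /* rough initial guess */
    memcpy(&y, &bits, sizeof y);

    for (int i = 0; i < 3; ++i)
        y = y * (1.5f - 0.5f * x * y * y); /* y <- y * (3 - x*y*y) / 2 */
    return y;
}

/* sqrt(x) computed as x * (1/sqrt(x)), avoiding a second reciprocal. */
float sqrtf_via_rsqrt(float x)
{
    if (!(x > 0.0f))
        return 0.0f;  /* assumption: zero, negative and NaN inputs give 0 */
    return x * rsqrt_estimate(x);
}
```

The multiply replaces the reciprocal step (r = 1/t), which is the saving the suggestion is after; infinite inputs are not handled in this sketch.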

Original issue reported on code.google.com by [email protected] on 15 Sep 2011 at 11:53

frexpf

I had trouble getting correct values from your frexpf algorithm. When I 
switched to the algorithm shown at 
http://code.metager.de/source/xref/sdcc/sdcc/device/lib/frexpf.c I had better 
luck. Honestly, I am not sure whether I just messed up my implementation or 
whether your algorithm is wrong. I am just suggesting you take a second look at it.
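
For cross-checking, here is a minimal sketch of the standard frexpf contract: it returns m with 0.5 <= |m| < 1 and stores e such that x = m * 2^e. It reproduces neither the math-neon code nor the SDCC code linked above; frexpf_sketch() is a hypothetical name, and the sketch assumes IEEE 754 single precision. The subnormal branch will not work where subnormals are flushed to zero, as NEON arithmetic does.

```c
#include <stdint.h>
#include <string.h>

/* Split x into mantissa m (0.5 <= |m| < 1) and exponent e, x = m * 2^e. */
float frexpf_sketch(float x, int *e)
{
    uint32_t bits;
    uint32_t exp_field;
    int adjust = 0;

    memcpy(&bits, &x, sizeof bits);
    exp_field = (bits >> 23) & 0xffu;

    if (exp_field == 0xffu) {              /* infinity or NaN: pass through */
        *e = 0;
        return x;
    }
    if (exp_field == 0u) {
        if ((bits & 0x007fffffu) == 0u) {  /* +0 or -0 */
            *e = 0;
            return x;
        }
        x *= 16777216.0f;                  /* subnormal: scale by 2^24 */
        adjust = 24;
        memcpy(&bits, &x, sizeof bits);
        exp_field = (bits >> 23) & 0xffu;
    }

    *e = (int)exp_field - 126 - adjust;    /* bias 127, minus 1 for [0.5, 1) */
    bits = (bits & 0x807fffffu) | 0x3f000000u; /* keep sign and fraction, set exponent to -1 */
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For example, 8.0f splits into 0.5f with exponent 4, and -3.0f into -0.75f with exponent 2.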

Original issue reported on code.google.com by [email protected] on 7 May 2014 at 9:26

The input value is not properly read by the inline asm code

What steps will reproduce the problem?
1. call sinf_neon(PI/2) and cosf_neon(0)

What is the expected output? What do you see instead?
I am testing sin(PI/2) and cos(0) and I expect to see 1 as result for both. By 
default I see 0, but if I change the code to properly load the function 
parameter it works.

What version of the product are you using? On what operating system?
I am using this code on iPad/iPhone 3GS so Cortex-A8 (iOS 4.2 Beta 2 SDK, 
LLVM-GCC or GCC 4.2).


Taking sinf_neon() as an example: I need to add this code before the hfp 
variant of the function is called, or at the top of the hfp variant, for 
sinf_neon() to produce the correct result:

    asm volatile ("vdup.f32 d0, %[xInput]   \n\t"
                  :
                  :[xInput] "r" (x)
                  :
                  );

because the default
    asm volatile ("vdup.f32 d0, r0");

is not able to load the input value correctly.
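
A minimal sketch of the same idea with both the input and the result bound as named operands, so nothing depends on which register the compiler happens to use for x. A likely reason the hard-coded r0 fails is that x is only guaranteed to be in r0 at a non-inlined soft-float call boundary; after inlining, or under a hard-float ABI (where x arrives in s0), it can live elsewhere. neon_roundtrip() is a hypothetical name, the ARM branch assumes a 32-bit GCC or Clang build with NEON enabled, and the fallback exists only so the sketch builds on other targets.

```c
/* Hypothetical demo: duplicate x into d0 and read the low lane back. */
float neon_roundtrip(float x)
{
#if defined(__arm__) && defined(__ARM_NEON__)
    float out;
    __asm__ volatile(
        "vdup.32 d0, %[in]   \n\t"  /* d0 = {x, x}, from the register chosen for x */
        "vmov    %[out], s0  \n\t"  /* out = low lane of d0 */
        : [out] "=r"(out)
        : [in] "r"(x)
        : "q0"                      /* d0 is the low half of q0 */
    );
    return out;
#else
    return x;  /* non-ARM fallback, assumption for portability only */
#endif
}
```

On either path neon_roundtrip(1.5f) is meant to return 1.5f; the ARM path has not been run here, so treat it as a starting point.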

Original issue reported on code.google.com by [email protected] on 13 Oct 2010 at 9:47
