math-neon's Issues

Won't compile on ARM64

ARM changed the NEON instruction set for 64-bit targets, so none of the inline
assembly compiles there. The code needs to be guarded with the __arm64__ define
(__aarch64__ on GCC and non-Apple Clang).
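A minimal sketch of such a guard follows. The macro name
MATH_NEON_HAVE_ARMV7_ASM and the helper math_neon_asm_enabled() are
hypothetical and not part of math-neon; only the compiler-predefined macros
(__aarch64__, __arm64__, __arm__, __ARM_NEON__) are real.

```c
/* Decide at compile time whether the ARMv7 NEON inline assembly may be built.
 * MATH_NEON_HAVE_ARMV7_ASM is a hypothetical name chosen for this sketch. */
#if defined(__aarch64__) || defined(__arm64__)
#  define MATH_NEON_HAVE_ARMV7_ASM 0   /* 64-bit ARM: ARMv7 asm will not assemble */
#elif defined(__arm__) && defined(__ARM_NEON__)
#  define MATH_NEON_HAVE_ARMV7_ASM 1   /* 32-bit ARM with NEON enabled */
#else
#  define MATH_NEON_HAVE_ARMV7_ASM 0   /* any other target */
#endif

#if MATH_NEON_HAVE_ARMV7_ASM
/* The ARMv7 inline-assembly variants (the *_neon_hfp functions) would be
 * compiled only inside a block like this one. */
#endif

/* Hypothetical helper: reports which path was selected for this build. */
int math_neon_asm_enabled(void)
{
    return MATH_NEON_HAVE_ARMV7_ASM;
}
```

On a 64-bit ARM build the assembly is skipped entirely, so callers would fall
back to the plain C variants.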

Original issue reported on code.google.com by [email protected] on 13 Feb 2014 at 2:03

Simple makefile for building math_debug

I put together a simple makefile for building math_debug:

http://gitorious.org/vjaquez-misc/math-neon/commit/14ba470caad37c33cf7245be69efc9a1366d8f99?format=patch

Original issue reported on code.google.com by [email protected] on 25 Mar 2011 at 11:59

Operands access

Architecture: Xilinx Zynq (ARM Cortex-A9)
Compiler: arm-xilinx-eabi-gcc (Sourcery CodeBench Lite 2012.09-105) 4.7.2
Arguments: -Wall -O0 -g3 -c -fmessage-length=0 -I../../cpu0_bsp/ps7_cortexa9_0/include

The following warnings appear:
math_sinf.c:123:1: warning: control reaches end of non-void function [-Wreturn-type]
math_sinf.c:111:1: warning: control reaches end of non-void function [-Wreturn-type]

Also the function sinf_neon() does not return the correct value. However, the 
following code behaves correctly:

float sinf_neon_rms(float x)
{
    asm volatile (

        "vld1.32                d3, [%1]                                \n\t"   //d3 = {invrange, range}
        "vdup.f32               d0, %3                                  \n\t"   //d0 = {x, x}
        "vabs.f32               d1, d0                                  \n\t"   //d1 = {ax, ax}

        "vmul.f32               d2, d1, d3[0]                           \n\t"   //d2 = d1 * d3[0]
        "vcvt.u32.f32           d2, d2                                  \n\t"   //d2 = (int) d2
        "vmov.i32               d5, #1                                  \n\t"   //d5 = 1
        "vcvt.f32.u32           d4, d2                                  \n\t"   //d4 = (float) d2
        "vshr.u32               d7, d2, #1                              \n\t"   //d7 = d2 >> 1
        "vmls.f32               d1, d4, d3[1]                           \n\t"   //d1 = d1 - d4 * d3[1]

        "vand.i32               d5, d2, d5                              \n\t"   //d5 = d2 & d5
        "vclt.f32               d18, d0, #0                             \n\t"   //d18 = (d0 < 0.0)
        "vcvt.f32.u32           d6, d5                                  \n\t"   //d6 = (float) d5
        "vmls.f32               d1, d6, d3[1]                           \n\t"   //d1 = d1 - d6 * d3[1]
        "veor.i32               d5, d5, d7                              \n\t"   //d5 = d5 ^ d7
        "vmul.f32               d2, d1, d1                              \n\t"   //d2 = d1*d1 = {x^2, x^2}

        "vld1.32                {d16, d17}, [%2]                        \n\t"   //q8 = {p7, p3, p5, p1}
        "veor.i32               d5, d5, d18                             \n\t"   //d5 = d5 ^ d18
        "vshl.i32               d5, d5, #31                             \n\t"   //d5 = d5 << 31
        "veor.i32               d1, d1, d5                              \n\t"   //d1 = d1 ^ d5

        "vmul.f32               d3, d2, d2                              \n\t"   //d3 = d2*d2 = {x^4, x^4}
        "vmul.f32               q0, q8, d1[0]                           \n\t"   //q0 = q8 * d1[0] = {p7x, p3x, p5x, p1x}
        "vmla.f32               d1, d0, d2[0]                           \n\t"   //d1 = d1 + d0*d2 = {p5x + p7x^3, p1x + p3x^3}
        "vmla.f32               d1, d3, d1[0]                           \n\t"   //d1 = d1 + d3*d0 = {...., p1x + p3x^3 + p5x^5 + p7x^7}

        "vmov.f32               %0, s3                                  \n\t"   //x = s3 (result moved to the output register)
        : "=r"(x)
        : "r"(__sinf_rng), "r"(__sinf_lut), "r"(x)
        : "q0", "q1", "q2", "q3", "q8", "q9"
        );

    return x;
}

Original issue reported on code.google.com by [email protected] on 27 Jun 2013 at 9:18

Compilation errors

What steps will reproduce the problem?
1. Compile math_acosf.c
2. Compile math_vec2.c
3. Compile math_vec4.c

What is the expected output? What do you see instead?
There are errors in the functions dot4_neon_hfp, dot2_neon_hfp, and
acosf_neon_hfp. The compiler error is "expected string literal before ')'
token", and it points to what appears to be a missing register string at the
end of the asm block. I do not know ARM NEON assembly, and I am rusty on
assembly in general, so I am still trying to work out how to fix it.
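One construct that would plausibly produce this diagnostic is a clobber list
that ends in a comma with no string after it. The sketch below is an assumption
about the cause, not a confirmed reading of the project's source. It uses an
empty asm body so it builds with GCC or Clang on any target, and the function
name barrier_passthrough() is hypothetical.

```c
/* Returns its argument unchanged after an empty inline-asm statement. */
int barrier_passthrough(int v)
{
    /* Malformed: the clobber list ends with a comma, so the compiler still
     * expects one more string literal before the closing parenthesis:
     *
     *     __asm__ volatile ("" : "+r"(v) : : "memory", );
     *
     * Well-formed: every comma in the clobber list is followed by a string. */
    __asm__ volatile ("" : "+r"(v) : : "memory");
    return v;
}
```

If the affected asm blocks do end this way, removing the trailing comma or
supplying the missing register name would be the fix.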


What version of the product are you using? On what operating system?
I'm using the only code I've been able to find in the SVN repository. The OS is
Angstrom Linux, running on a BeagleBoard.



Original issue reported on code.google.com by [email protected] on 5 Oct 2010 at 5:12

Invalid use of intrinsics in dot4_neon()

The function dot4_neon() uses the intrinsics API, which the project does not
handle correctly. It also conflicts with the signature declared in math_neon.h.

I just commented it out:

http://gitorious.org/vjaquez-misc/math-neon/commit/3ca3102732e0786350486b52329f07392554bd97?format=patch

Original issue reported on code.google.com by [email protected] on 25 Mar 2011 at 12:02

asinf_c() does not seem to be giving correct results

What steps will reproduce the problem?
1. Call asinf_c(x) 
2. Call system asinf(x) with same x
3. Compare the two results

What is the expected output? What do you see instead?
When x = -0.9193184972, the system asinf() returns 0.20978982746601104736 and
asinf_c() returns 0.48538914322853088379.
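A comparison like this can be run over a whole input range rather than a
single point. The following is a minimal sketch. The names sweep_compare,
sweep_result, demo_reference and demo_candidate are hypothetical, and the two
demo functions are stand-ins so the sketch builds without libm or math-neon.

```c
#include <stddef.h>

typedef float (*unary_fn)(float);

typedef struct {
    float max_err;  /* largest absolute difference seen */
    float worst_x;  /* input at which it first occurred */
} sweep_result;

/* Evaluate both functions at steps+1 evenly spaced points in [lo, hi] and
 * record the largest absolute difference. NaN differences are ignored,
 * because comparisons against NaN are always false. */
sweep_result sweep_compare(unary_fn candidate, unary_fn reference,
                           float lo, float hi, size_t steps)
{
    sweep_result r = { 0.0f, lo };
    if (steps == 0)
        steps = 1;
    for (size_t i = 0; i <= steps; ++i) {
        float x = lo + (hi - lo) * ((float)i / (float)steps);
        float d = candidate(x) - reference(x);
        if (d < 0.0f)
            d = -d;
        if (d > r.max_err) {
            r.max_err = d;
            r.worst_x = x;
        }
    }
    return r;
}

/* Stand-in functions for demonstration only. */
float demo_reference(float x) { return x * x; }
float demo_candidate(float x) { return x * x + 0.25f; }
```

With math_neon.h and math.h included, and assuming asinf_c() takes and returns
a float, the check from this report would be
sweep_compare(asinf_c, asinf, -1.0f, 1.0f, 2000).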

What version of the product are you using? On what operating system?
Trunk version on iPhone simulator.

Please provide any additional information below.

Thanks,
Mike


Original issue reported on code.google.com by [email protected] on 4 Nov 2010 at 2:06

Implementation of sqrtf_neon_hfp()


sqrtf_neon_hfp() first computes the reciprocal square root and then takes the
reciprocal of that result, i.e.

t = 1/sqrt(x)
r = 1/t

It might be easier and faster to compute the reciprocal square root and then
multiply it by the original value, i.e.

t = 1/sqrt(x)
r = x * t
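The two variants can be compared in portable C. This is only a sketch:
rsqrt_approx() is a hypothetical stand-in for the NEON reciprocal square root
estimate (a bit-level initial guess refined by Newton-Raphson steps), not the
project's code, and the other function names are hypothetical too.

```c
#include <stdint.h>
#include <string.h>

/* Approximate 1/sqrt(x) for finite x > 0. Starts from a bit-level guess and
 * refines it with y <- y * (1.5 - 0.5 * x * y * y). */
float rsqrt_approx(float x)
{
    uint32_t bits;
    float y;
    memcpy(&bits, &x, sizeof bits);
    bits = 0x5f3759dfu - (bits >> 1);
    memcpy(&y, &bits, sizeof y);
    for (int i = 0; i < 3; ++i)
        y = y * (1.5f - 0.5f * x * y * y);
    return y;
}

/* Current scheme: t = 1/sqrt(x), r = 1/t (costs a second reciprocal). */
float sqrt_via_reciprocal(float x)
{
    return 1.0f / rsqrt_approx(x);
}

/* Proposed scheme: t = 1/sqrt(x), r = x * t (costs one multiply). */
float sqrt_via_multiply(float x)
{
    /* Guard x == 0: a hardware estimate returns +inf there, and 0 * inf
     * would be NaN instead of 0. */
    if (x == 0.0f)
        return 0.0f;
    return x * rsqrt_approx(x);
}
```

The multiply form needs the explicit zero check, which is the main trade-off
against the saved reciprocal.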

Original issue reported on code.google.com by [email protected] on 15 Sep 2011 at 11:53

frexpf

I had trouble getting correct values when following your frexpf algorithm. When
I switched to the algorithm shown at
http://code.metager.de/source/xref/sdcc/sdcc/device/lib/frexpf.c
I had better luck. I am honestly not sure whether I messed up my implementation
or whether your algorithm is wrong. I am just suggesting that you take a second
look at it.
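For reference when checking results, here is a minimal bit-level sketch of
frexpf for IEEE 754 single precision. It is neither the project's algorithm
nor the SDCC one, and the name frexpf_bits is hypothetical. It returns a
mantissa m with 0.5 <= |m| < 1 and stores e such that x == m * 2^e.

```c
#include <stdint.h>
#include <string.h>

float frexpf_bits(float x, int *exp_out)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);

    uint32_t mant = bits & 0x007fffffu;       /* 23 fraction bits */
    int e = (int)((bits >> 23) & 0xffu);      /* biased exponent  */

    *exp_out = 0;
    if (e == 0xff || (bits & 0x7fffffffu) == 0)
        return x;                             /* inf, NaN, or +-0: unchanged */

    if (e == 0) {
        /* Subnormal: shift the fraction left until the implicit bit is set,
         * lowering the exponent once per shift. */
        e = 1;
        while (!(mant & 0x00800000u)) {
            mant <<= 1;
            --e;
        }
        mant &= 0x007fffffu;
    }

    *exp_out = e - 126;                       /* bias 127, minus 1 for [0.5, 1) */

    /* Keep the sign, force the exponent field to 126 so |result| is in [0.5, 1). */
    bits = (bits & 0x80000000u) | (126u << 23) | mant;
    memcpy(&x, &bits, sizeof x);
    return x;
}
```

For example, 8.0f splits into mantissa 0.5 with exponent 4, and -3.0f into
mantissa -0.75 with exponent 2.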

Original issue reported on code.google.com by [email protected] on 7 May 2014 at 9:26

The input value is not properly read by the inline asm code

What steps will reproduce the problem?
1. Call sinf_neon(PI/2) and cosf_neon(0)

What is the expected output? What do you see instead?
I am testing sin(PI/2) and cos(0) and I expect to see 1 as result for both. By 
default I see 0, but if I change the code to properly load the function 
parameter it works.

What version of the product are you using? On what operating system?
I am using this code on an iPad and an iPhone 3GS, so Cortex-A8 (iOS 4.2 Beta 2
SDK, LLVM-GCC or GCC 4.2).


Taking sinf_neon() as an example: I need to add the following code before the
hfp variant of the function is called, or at the top of the hfp variant, for
sinf_neon() to produce the correct result:

    asm volatile ("vdup.f32 d0, %[xInput]   \n\t"
                  :
                  :[xInput] "r" (x)
                  :
                  );

because the default
    asm volatile ("vdup.f32 d0, r0");

is not able to load the input value correctly.


Original issue reported on code.google.com by [email protected] on 13 Oct 2010 at 9:47
