Comments (10)

Trass3r commented on May 24, 2024

Reduced version:

import core.sys.posix.stdio;
struct File
{
    private struct Impl
    {
        FILE * handle = null;
    }
    private Impl * p;

}

// Specialization for strings - a very frequent case
void writeln(T...)(T args)
{
    // "%.*s" expects an int precision, so cast the length explicitly
    fprintf(.stdout.p.handle, "%.*s\n",
            cast(int) args[0].length, args[0].ptr);
}

extern(C) void std_stdio_static_this()
{
    // stdout
    __gshared File.Impl stdoutImpl;
    stdoutImpl.handle = core.stdc.stdio.stdout;
    .stdout.p = &stdoutImpl;
}
shared static this()
{
    std_stdio_static_this();
}

File stdout;

void main()
{
    char[4096*2] buff = 0x30;
    writeln(buff);
}

from ldc.

dansanduleac commented on May 24, 2024

I ran this (it produces native assembly, so a .s file):

f=$1

# ldc2 compiled using llvm-3.0, binaries also belong to llvm-3.0 installation
time ldc2 "$f.d" -output-ll
time llvm-as "$f.ll"
time llvm-ld -native "$f.bc"

The first two steps were really fast, but the last step took about 3.5s, so most of the time goes into compiling the LLVM bitcode into native assembly.

Running ldc2 for the full compilation takes 8s (much more?) for some reason, and adding the verbose flag doesn't really help. All I see is the final linking pass, which is really fast (/usr/bin/gcc i49.o -o i49 -Xlinker -L/usr/local/lib -Xlinker -lphobos-ldc -ldl -lpthread -lm -m64).

dnadlinger commented on May 24, 2024

Seems like the LLVM instruction scheduler goes crazy on the generated LLVM IR – almost all the time is spent there, and the emitted code is horrible: It unrolls the loop into an endless series of movs, instead of just emitting a rep stos like DMD does…

@dansanduleac: You can get really detailed log output for the LDC »glue code« part with -vv. In this case it isn't really helpful, but it would at least tell you that the time is spent after the modules are passed to LLVM for codegen.

dnadlinger commented on May 24, 2024

Reduced:

alias char[4096*2] Big;

void foo(Big big) {}

void main() {
    Big buf = 0x30;
    foo(buf);
}

dnadlinger commented on May 24, 2024

… and some more:

define x86_stdcallcc i32 @foo([8192 x i8] %big_arg) {
entry:
  ret i32 1
}

define x86_stdcallcc i32 @_Dmain({ i64, { i64, i8* }* } %unnamed) {
entry:
  %buf = alloca [8192 x i8], align 1
  %tmp4 = load [8192 x i8]* %buf
  %tmp5 = call x86_stdcallcc i32 @foo([8192 x i8] %tmp4)
  ret i32 %tmp5
}

Seems like parameter passing is the issue – but why?

dnadlinger commented on May 24, 2024

According to Duncan Sands at #llvm, we should generate i32 @foo([8192 x i8]* byval %big_arg) instead, i.e. pass a pointer to the array and mark that parameter byval.

Trass3r commented on May 24, 2024

What's the purpose of [8192 x i8] as a parameter type then, if structs and arrays are to be passed via pointer + byval anyway?
I can't really make sense of http://llvm.org/docs/LangRef.html#byval

dnadlinger commented on May 24, 2024

@Trass3r: Apparently, it is there for cases like complex numbers, where you really want to have a pair/… of registers.
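
As a rough illustration of that case, here is a minimal D sketch of a small aggregate passed and returned by value. The `Complex` struct and the `add` function are hypothetical names made up for this example, and whether the two fields actually end up in a pair of registers depends on the target ABI and on how the compiler lowers the call.

```d
import std.stdio : writeln;

// Hypothetical small aggregate: two doubles, the kind of value
// a backend may want to keep in a pair of registers.
struct Complex
{
    double re;
    double im;
}

// Takes and returns the aggregate by value; no pointers involved.
Complex add(Complex a, Complex b)
{
    return Complex(a.re + b.re, a.im + b.im);
}

void main()
{
    auto sum = add(Complex(1.0, 2.0), Complex(3.0, 4.0));
    assert(sum.re == 4.0 && sum.im == 6.0);
    writeln(sum.re, " ", sum.im);
}
```

For something this small, copying the fields directly is cheap; the 8192-byte array from the reduced test case is the opposite extreme.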

dansanduleac commented on May 24, 2024

Apparently we need to pass around pointer types (e.g. [8192 x i8]* byval) in order to use the byval attribute (which looks like it might act in a copy-on-write way, but nevertheless does what it should). I managed to make a patch that fixes this issue and also doesn't allocate any global memory (which I was afraid of when using a pointer value in LLVM). I will submit a pull request in a bit :)
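
For reference, the source-level behaviour that any lowering has to preserve can be sketched in D as follows. This is only an illustration of the language semantics, not the patch itself; `byValue` and `byRef` are hypothetical helper names. Static arrays are value types in D, so the callee must see its own copy, while `ref` explicitly shares the caller's storage.

```d
import std.stdio : writeln;

alias Big = char[4096 * 2];

// By value: the callee works on its own copy of the 8192 bytes.
void byValue(Big big)
{
    big[0] = 'x'; // changes only the callee's copy
}

// By reference: the callee writes into the caller's buffer.
void byRef(ref Big big)
{
    big[0] = 'y';
}

void main()
{
    Big buf = '0'; // every element is initialised to '0'

    byValue(buf);
    assert(buf[0] == '0'); // caller's buffer is untouched

    byRef(buf);
    assert(buf[0] == 'y'); // caller's buffer was modified

    writeln(buf[0]);
}
```

However the argument is passed at the IR level, the first assertion has to keep holding: a write inside the callee must never show up in the caller's buffer.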

dansanduleac commented on May 24, 2024

With this fix, the druntime benchmarks also seem to have become much faster (aabench/string now takes 0.31s instead of 3.68s on 64-bit OSX with phobos as a shared library)!
