graalvm / sulong

Obsolete repository. Moved to oracle/graal.

License: Other

Python 1.04% Java 58.73% C 32.87% C++ 6.74% Makefile 0.15% Objective-C 0.02% ANTLR 0.25% LLVM 0.19%

sulong's Issues

Current Parser Limitations

Our current parser has three limitations:

The parser is based on the textual format of LLVM IR, which can change between minor versions. Our parser thus supports only LLVM IR versions 3.2-3.4.
The parser uses Xtext, which is not easy to integrate into our mx tool. I tried to integrate it once, but it seems that a few components in the parser depend on paths relative to the Eclipse project.
The parser is quite slow and memory hungry. It provides more features than we need, since it was originally used in an Eclipse LLVM IR editor.

I think that we should replace the current textual parser with a binary parser. With a binary parser, we would have stability guarantees for one major plus one minor version. We could also include the binary parser in our mx tool, since it presumably would not rely on external tools. A binary parser would also be faster.
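As a small illustration of the binary format, LLVM bitcode files start with the four magic bytes 'B', 'C', 0xC0, 0xDE, which lets a front end tell binary bitcode apart from textual IR before choosing a parser. The following is a minimal sketch; the class and method names are hypothetical and not part of Sulong.

```java
// Hypothetical helper, not part of Sulong: distinguishes binary LLVM bitcode
// from textual LLVM IR by checking the magic bytes 'B' 'C' 0xC0 0xDE.
final class BitcodeMagic {
    private static final int[] MAGIC = {0x42, 0x43, 0xC0, 0xDE};

    /** Returns true if the data starts with the LLVM bitcode magic number. */
    static boolean looksLikeBitcode(byte[] data) {
        if (data == null || data.length < MAGIC.length) {
            return false;
        }
        for (int i = 0; i < MAGIC.length; i++) {
            // Mask to compare the byte as an unsigned value.
            if ((data[i] & 0xFF) != MAGIC[i]) {
                return false;
            }
        }
        return true;
    }
}
```

A launcher could call looksLikeBitcode on the first bytes of an input file and fall back to the textual parser otherwise.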

Standalone su link tool

We need a tool for linking su files that isn't part of mx, so it can be shipped as a command with GraalVM. Should probably be implemented in Java with a shell script launcher.

mx vm doesn't put Sulong or NFI on the classpath

When I run mx vm in Sulong, I think I should get a JVM that has everything in the project on the classpath and ready to use, but this doesn't seem to be the case. I get Graal and Truffle, but no sign of Sulong, its dependencies, or the NFI.

Also, should I get some options like -Dsulong.DynamicNativeLibraryPath?

The reason I'm looking at this is that we use (abuse?) mx vm -version to get a command line from which we can run JRuby+Truffle. This works great for graal-core and other such projects, but it doesn't also give us what we need to run Sulong.

Or is this not what mx vm is supposed to do?

Support rdtsc instruction in inline assembler (multiple return registers)

Support the following inline assembly snippet on amd64 (as used by the argon2 benchmark):

static uint64_t rdtsc(void) {
    uint64_t rax, rdx;
    __asm__ __volatile__("rdtsc" : "=a"(rax), "=d"(rdx) : :);
    return (rdx << 32) | rax;
}

To reproduce, run argon2 benchmark: mx su-bench-argon2

-- line 1 col 2: invalid InlineAssembly
Exception in thread "main" java.lang.AssertionError: STRUCT
    at com.oracle.truffle.llvm.parser.factories.NodeFactoryFacadeImpl.createInlineAssemblerExpression(NodeFactoryFacadeImpl.java:388)
    at com.oracle.truffle.llvm.parser.impl.LLVMVisitor.visitFunctionCall(LLVMVisitor.java:742)
    at com.oracle.truffle.llvm.parser.impl.LLVMVisitor.visitFunctionCall(LLVMVisitor.java:1036)
    at com.oracle.truffle.llvm.parser.impl.LLVMVisitor.visitNamedMiddleInstruction(LLVMVisitor.java:782)
    at com.oracle.truffle.llvm.parser.impl.LLVMVisitor.visitMiddleInstruction(LLVMVisitor.java:703)
    at com.oracle.truffle.llvm.parser.impl.LLVMVisitor.visitInstruction(LLVMVisitor.java:691)
    at com.oracle.truffle.llvm.parser.impl.LLVMVisitor.visitBasicBlock(LLVMVisitor.java:670)
    at com.oracle.truffle.llvm.parser.impl.LLVMVisitor.getFunctionBlockStatements(LLVMVisitor.java:536)
    at com.oracle.truffle.llvm.parser.impl.LLVMVisitor.visitFunction(LLVMVisitor.java:516)
    at com.oracle.truffle.llvm.parser.impl.LLVMVisitor.visit(LLVMVisitor.java:297)
    at com.oracle.truffle.llvm.parser.impl.LLVMVisitor.getMain(LLVMVisitor.java:263)
    at com.oracle.truffle.llvm.LLVM.parseString(LLVM.java:254)
    at com.oracle.truffle.llvm.LLVM$1.lambda$parse$1(LLVM.java:142)
    at com.oracle.truffle.llvm.SulongLibrary.readContents(SulongLibrary.java:88)
    at com.oracle.truffle.llvm.LLVM$1.parse(LLVM.java:137)
    at com.oracle.truffle.llvm.nodes.impl.base.LLVMLanguage.parse(LLVMLanguage.java:101)
    at com.oracle.truffle.api.TruffleLanguage$LanguageImpl.eval(TruffleLanguage.java:574)
    at com.oracle.truffle.api.vm.PolyglotEngine.evalImpl(PolyglotEngine.java:565)
    at com.oracle.truffle.api.vm.PolyglotEngine.eval(PolyglotEngine.java:532)
    at com.oracle.truffle.api.vm.PolyglotEngine.eval(PolyglotEngine.java:469)
    at com.oracle.truffle.llvm.LLVM.evaluateFromSource(LLVM.java:312)
    at com.oracle.truffle.llvm.LLVM.executeMain(LLVM.java:290)
    at com.oracle.truffle.llvm.LLVM.main(LLVM.java:238)

Error when following README.md Getting Started

I get the following error when I follow the readme "Getting Started" section. Do I need to install this dependency locally?

Do you have a recommended way to install this dependency on Mac OSX?

Thank you.

Exception in thread "main" java.lang.AssertionError: java.io.IOException: java.lang.UnsatisfiedLinkError: libgfortran.so.3
    at com.oracle.truffle.llvm.LLVM.evaluateFromSource(LLVM.java:144)
    at com.oracle.truffle.llvm.LLVM.executeMain(LLVM.java:123)
    at com.oracle.truffle.llvm.LLVM.main(LLVM.java:99)
Caused by: java.io.IOException: java.lang.UnsatisfiedLinkError: libgfortran.so.3
    at com.oracle.truffle.api.TruffleLanguage$AccessAPI.eval(TruffleLanguage.java:588)
    at com.oracle.truffle.api.impl.Accessor.eval(Accessor.java:163)
    at com.oracle.truffle.api.vm.PolyglotEngine$SPIAccessor.eval(PolyglotEngine.java:1177)
    at com.oracle.truffle.api.vm.PolyglotEngine.evalImpl(PolyglotEngine.java:541)
    at com.oracle.truffle.api.vm.PolyglotEngine.access$300(PolyglotEngine.java:106)
    at com.oracle.truffle.api.vm.PolyglotEngine$2.compute(PolyglotEngine.java:525)
    at com.oracle.truffle.api.vm.ComputeInExecutor.run(ComputeInExecutor.java:93)
    at com.oracle.truffle.api.vm.ComputeInExecutor.perform(ComputeInExecutor.java:83)
    at com.oracle.truffle.api.vm.PolyglotEngine.eval(PolyglotEngine.java:528)
    at com.oracle.truffle.api.vm.PolyglotEngine.eval(PolyglotEngine.java:464)
    at com.oracle.truffle.llvm.LLVM.evaluateFromSource(LLVM.java:141)
    ... 2 more
Caused by: java.lang.UnsatisfiedLinkError: libgfortran.so.3
    at com.oracle.graal.truffle.hotspot.nfi.HotSpotNativeFunctionInterface.getLibraryHandle(HotSpotNativeFunctionInterface.java:71)
    at com.oracle.graal.truffle.hotspot.nfi.HotSpotNativeFunctionInterface.getLibraryHandle(HotSpotNativeFunctionInterface.java:39)
    at com.oracle.truffle.llvm.nativeint.NativeLookup.getNativeFunctionHandles(NativeLookup.java:93)
    at com.oracle.truffle.llvm.nativeint.NativeLookup.getLibraryHandles(NativeLookup.java:82)
    at com.oracle.truffle.llvm.nativeint.NativeLookup.uncachedGetNativeFunctionHandle(NativeLookup.java:171)
    at com.oracle.truffle.llvm.nativeint.NativeLookup.getNativeHandle(NativeLookup.java:151)
    at com.oracle.truffle.llvm.nodes.impl.base.LLVMContext.getNativeHandle(LLVMContext.java:74)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNode$LLVMUnresolvedCallNode.executeGeneric(LLVMCallNode.java:122)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallUnboxNodeFactory$LLVMI32CallUnboxNodeGen.executeI32(LLVMCallUnboxNodeFactory.java:188)
    at com.oracle.truffle.llvm.nodes.impl.vars.LLVMWriteNodeFactory$LLVMWriteI32NodeGen.executeVoid(LLVMWriteNodeFactory.java:189)
    at com.oracle.truffle.llvm.nodes.impl.others.LLVMBlockNode$LLVMBlockNoControlFlowNode.executeVoid(LLVMBlockNode.java:105)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMFunctionBodyNode.executeGeneric(LLVMFunctionBodyNode.java:49)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMFunctionStartNode.execute(LLVMFunctionStartNode.java:63)
    at com.oracle.graal.truffle.OptimizedCallTarget.callProxy(OptimizedCallTarget.java:443)
    at com.oracle.graal.truffle.OptimizedCallTarget.callRoot(OptimizedCallTarget.java:317)
    at com.oracle.graal.truffle.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:303)
    at com.oracle.graal.truffle.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:291)
    at com.oracle.graal.truffle.OptimizedCallTarget.callDirect(OptimizedCallTarget.java:206)
    at com.oracle.graal.truffle.OptimizedDirectCallNode.callProxy(OptimizedDirectCallNode.java:69)
    at com.oracle.graal.truffle.OptimizedDirectCallNode.call(OptimizedDirectCallNode.java:60)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMGlobalRootNode.executeProgram(LLVMGlobalRootNode.java:104)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMGlobalRootNode.execute(LLVMGlobalRootNode.java:84)
    at com.oracle.graal.truffle.OptimizedCallTarget.callProxy(OptimizedCallTarget.java:443)
    at com.oracle.graal.truffle.OptimizedCallTarget.callRoot(OptimizedCallTarget.java:317)
    at com.oracle.graal.truffle.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:303)
    at com.oracle.graal.truffle.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:291)
    at com.oracle.graal.truffle.OptimizedCallTarget.call(OptimizedCallTarget.java:199)
    at com.oracle.truffle.llvm.nodes.impl.base.LLVMMainFunctionReturnValueRootNode$LLVMMainFunctionReturnNumberRootNode.execute(LLVMMainFunctionReturnValueRootNode.java:81)
    at com.oracle.graal.truffle.OptimizedCallTarget.callProxy(OptimizedCallTarget.java:443)
    at com.oracle.graal.truffle.OptimizedCallTarget.callRoot(OptimizedCallTarget.java:317)
    at com.oracle.graal.truffle.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:303)
    at com.oracle.graal.truffle.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:291)
    at com.oracle.graal.truffle.OptimizedCallTarget.call(OptimizedCallTarget.java:199)
    at com.oracle.truffle.api.TruffleLanguage$AccessAPI.eval(TruffleLanguage.java:584)
    ... 12 more

Introduce Basic Block nodes

Each function in LLVM IR has a list of basic blocks that form the CFG for a function. Each basic block ends with a terminator instruction that specifies which block should be executed next, e.g., a conditional branch is a terminator instruction that continues execution with a block depending on its condition.

Currently, Sulong maps each instruction of each basic block to its own node in a flat hierarchy. Each instruction returns an index to the node to be executed next. Most statements are wrapped in an LLVMWrappedStatementNode that returns the next statement by default.

This approach has the following disadvantages:

it is difficult to track the control flow from the Truffle nodes and in IGV
it is difficult to associate successor block indices from the bitcode file to the successor indices of the Truffle nodes

I think that we should implement a LLVMBasicBlock, which has an array of LLVMNode and a LLVMTerminatorInstruction which returns the successor index. The LLVMBlockNode then executes an array of LLVMBasicBlock instead of the (wrapped) instructions. Doing so does not flatten the hierarchy, makes control flow implicit, and lets successor indices correspond to the indices of the LLVM IR file.
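The proposed structure can be sketched in plain Java as follows. This is only an illustration under assumptions: BasicBlock and BlockRunner stand in for the proposed LLVMBasicBlock and LLVMBlockNode, Runnable and IntSupplier stand in for LLVMNode and LLVMTerminatorInstruction, and the RETURN sentinel is invented for the example.

```java
import java.util.function.IntSupplier;

// Sketch only: illustrative stand-ins, not Sulong classes.
final class BasicBlock {
    private final Runnable[] statements;  // non-terminator instructions
    private final IntSupplier terminator; // yields the successor block index

    BasicBlock(Runnable[] statements, IntSupplier terminator) {
        this.statements = statements;
        this.terminator = terminator;
    }

    /** Runs all statements in order, then asks the terminator for the successor. */
    int execute() {
        for (Runnable statement : statements) {
            statement.run();
        }
        return terminator.getAsInt();
    }
}

final class BlockRunner {
    static final int RETURN = -1; // assumed sentinel meaning "leave the function"

    /** Dispatch loop over basic blocks; indices match the block order of the function. */
    static void run(BasicBlock[] blocks) {
        int index = 0; // the entry block
        while (index != RETURN) {
            index = blocks[index].execute();
        }
    }
}
```

Because the dispatch loop works on whole blocks, the successor index returned by a terminator is the index of a basic block rather than of an individual instruction node.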

Interop can't handle TruffleObject as LLVMAddress

(#195 needed to see this problem)

My Ruby C extension API implementation has this code:

static void *ruby_cext;

__attribute__((constructor))
void truffle_ruby_load() {
  ruby_cext = truffle_import("ruby_cext");
}

The import gives a TruffleObject, and the assignment to ruby_cext tries to execute LLVMAddressStoreNode, which executes LLVMAddressReadNode, which tries to cast the TruffleObject to LLVMAddress.

java.lang.ClassCastException: com.oracle.graal.truffleom.a.c cannot be cast to com.oracle.truffle.llvm.types.LLVMAddress
    at com.oracle.truffle.llvm.nodes.impl.vars.LLVMReadNodeFactory$LLVMAddressReadNodeGen.executePointee(LLVMReadNodeFactory.java:442)
    at com.oracle.truffle.llvm.nodes.impl.memory.LLVMStoreNodeFactory$LLVMAddressStoreNodeGen.executeVoid(LLVMStoreNodeFactory.java:358)
    at com.oracle.truffle.llvm.nodes.impl.base.LLVMBasicBlockNode.executeGetSuccessorIndex(LLVMBasicBlockNode.java:63)
    at com.oracle.truffle.llvm.nodes.impl.others.LLVMBlockNode$LLVMBlockControlFlowNode.executeGeneric(LLVMBlockNode.java:71)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMFunctionStartNode.execute(LLVMFunctionStartNode.java:63)
    at com.oracle.graal.truffle.OptimizedCallTarget.callProxy(OptimizedCallTarget.java:503)
    at com.oracle.graal.truffle.OptimizedCallTarget.callRoot(OptimizedCallTarget.java:365)
    at com.oracle.graal.truffle.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:351)
    at com.oracle.graal.truffle.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:339)
    at com.oracle.graal.truffle.OptimizedCallTarget.callDirect(OptimizedCallTarget.java:254)
    at com.oracle.graal.truffle.OptimizedDirectCallNode.callProxy(OptimizedDirectCallNode.java:70)
    at com.oracle.graal.truffle.OptimizedDirectCallNode.call(OptimizedDirectCallNode.java:61)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNode$LLVMFunctionCallChain.doDirect(LLVMCallNode.java:296)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNodeFactory$LLVMFunctionCallChainNodeGen$DirectNode_.execute(LLVMCallNodeFactory.java:195)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNodeFactory$LLVMFunctionCallChainNodeGen$BaseNode_.acceptAndExecute(LLVMCallNodeFactory.java:113)
    at com.oracle.truffle.api.dsl.internal.SpecializationNode.uninitialized(SpecializationNode.java:407)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNodeFactory$LLVMFunctionCallChainNodeGen$UninitializedNode_.execute(LLVMCallNodeFactory.java:153)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNodeFactory$LLVMFunctionCallChainNodeGen.executeDispatch(LLVMCallNodeFactory.java:74)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNode$LLVMFunctionCallChainStartNode.executeGeneric(LLVMCallNode.java:238)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNode$LLVMUnresolvedCallNode.executeGeneric(LLVMCallNode.java:153)
    at com.oracle.truffle.llvm.nodes.base.LLVMExpressionNode.executeVoid(LLVMExpressionNode.java:44)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallUnboxNode$LLVMVoidCallUnboxNode.executeVoid(LLVMCallUnboxNode.java:128)
    at com.oracle.truffle.llvm.nodes.impl.others.LLVMStaticInitsBlockNode.execute(LLVMStaticInitsBlockNode.java:61)
    at com.oracle.graal.truffle.OptimizedCallTarget.callProxy(OptimizedCallTarget.java:503)
    at com.oracle.graal.truffle.OptimizedCallTarget.callRoot(OptimizedCallTarget.java:365)
    at com.oracle.graal.truffle.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:351)
    at com.oracle.graal.truffle.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:339)
    at com.oracle.graal.truffle.OptimizedCallTarget.call(OptimizedCallTarget.java:244)
    at com.oracle.truffle.llvm.LLVM$1.lambda$1(LLVM.java:116)
    at com.oracle.truffle.llvm.SulongLibrary.readContents(SulongLibrary.java:88)
    at com.oracle.truffle.llvm.LLVM$1.parse(LLVM.java:102)
    at com.oracle.truffle.llvm.nodes.impl.base.LLVMLanguage.parse(LLVMLanguage.java:98)
    at com.oracle.truffle.api.TruffleLanguage$Env.parse(TruffleLanguage.java:436)
    at org.jruby.truffle.language.loader.FeatureLoader.ensureCExtImplementationLoaded(FeatureLoader.java:262)

To reproduce, build JRuby master, set SULONG_DIR, and run JRUBY_OPTS=-Xtruffle.exceptions.print_java=true jt test cexts.

If you want to modify the C extension API implementation, it's in truffle/src/main/c/cext. Compile with bin/jruby bin/jruby-cext-c truffle/src/main/c/cext.

Commit Messages and PR descriptions

Since the project is steadily growing, I propose a stricter approach to commit messages and PR descriptions to increase maintainability and speed up reviewing. I have the following suggestions:

Commit messages

Each commit (message) should contain (and describe) one logical change.

PRs:

Each PR should address one feature or change.
Self-explanatory changes such as updates of the Graal version can stay unexplained.
Other changes should at least describe (1) why the change or feature is needed, (2) how the change or feature is implemented, and optionally (3) what further implications the change has. Personally, I prefer the GitHub description field for describing a change in more detail. Alternatively, this could also be part of a commit message.

Both PR titles and commit messages (first two bullets copied from here):

Write the summary line and description of what you have done in the imperative mood, that is, as if you were commanding someone. Start the line with "Fix", "Add", "Change" instead of "Fixed", "Added", "Changed".
Don't end the summary line with a period - it's a title and titles don't end with a period.
Each title and commit message should start with a capital letter.

Access to non-native variable arguments

@mrigger @grimmerm @chrisseaton
For Ruby C-extensions, we would like a way to access variadic function arguments without moving them to native. For instance, there is

VALUE rb_ary_new_from_args(long n, ...);

which creates a Ruby array from the arguments.

Using stdarg.h and va_list, va_start, etc would move everything to native, and it seems difficult to do differently.

VALUE rb_ary_new_from_args(long n, ...) {
  va_list args;
  VALUE array = rb_ary_new_capa(n);

  va_start(args, n);
  for (int i = 0; i < n; i++) {
    rb_ary_store(array, i, va_arg(args, VALUE)); // va_arg produces a native address here.
  }
  va_end(args);

  return array;
}

Should we add an intrinsic to get an argument directly from the Object[] arguments?
Maybe truffle_get_arg(int i), returning an Object?
Or truffle_get_args, returning an Object[]?

Call sulong from simplelanguage

I want to be able to use the polyglot feature of the Truffle API to be able to call sulong from, say, simplelanguage.

How would I go about doing this?

How can we install Dragonegg on Mac?

We can install GCC 4.8 via Homebrew

brew tap homebrew/versions
brew install gcc48 --with-fortran

But this doesn't seem to include GCC plugin headers and so building Dragonegg fails.

Inconsistent Use of the NodeFactoryFacade

The parser should not be aware of concrete Truffle node implementations. To achieve this, an interface NodeFactoryFacade exists that controls the creation of nodes. With the NodeFactoryFacade, one should be able to replace the existing node implementations.

Still, the LLVMVisitor class which is responsible for the Truffle AST creation sometimes directly instantiates Truffle nodes, e.g., for literals.

Where to profile branches?

In #211, we removed the branch profiling in switch, conditional branches, and select, and instead started to profile only the successor bytecode indices probability at a certain instruction (see LLVMBlockNode). We thought that it would suffice to only profile the successor selection.

In #215 we discovered a slowdown in benchmarks due to this change, but @lukasstadler figured out that the performance difference could be explained by the missing deopt on dead branches. However, as I commented later on that PR, switches still have wrong branch probabilities.

In #235 the question came up whether or not we want to profile the comparison nodes. @gilles-duboscq pointed out that the decision where to profile is part of a larger design question. So where should we profile?

I want to point out that the current approach probably does not work that well. To illustrate this, I analyzed a small C program:

volatile int test = 3;

int asdf(int val) {
    switch (val) {
        case 0: return test / test;
        case 1: return test * test;
        case 2: return test + test;
        case 3: return test - test;
        case 4: return test & test;
        case 5: return test | test;
        case 6: return test % test;
        case 7: return test ^ test;
        case 8: return test / test;
        case 9: return test * test;
        case 10: return test + test;
    }
    return -1;
}

int main() {
    for (int i = 0; i < 10; i++) {
        asdf(i);
    }
    for (int i = 0; i < 1000000; i++) {
        asdf(10);
    }
}

The switch statement maps to the following in LLVM IR:

  switch i32 %val, label %45 [
    i32 0, label %1
    i32 1, label %5
    i32 2, label %9
    i32 3, label %13
    i32 4, label %17
    i32 5, label %21
    i32 6, label %25
    i32 7, label %29
    i32 8, label %33
    i32 9, label %37
    i32 10, label %41
  ]

We currently map the switch statement to a LLVMSwitchNode that does not do any branch profiling. The switch node computes the next successor index that is immediately used in the LLVMBlockNode. In this node we inject the branch probabilities, since the "real branching" happens there.

If we look at the Graal graphs for this function, we see an if-cascade for the switch. Each if node has a wrong branch probability of 0.5, since we did not inject the probabilities. The result of this if-cascade is a merge node, which is used for the execution of the successor. Then, we have another if-cascade for the block that more or less duplicates the previous if-cascade to execute the next instruction based on the successor. For this if-cascade we injected the right branch probabilities. After the PathDuplication phase, Graal decides to merge both if-cascades into one. However, instead of taking the injected branch probabilities, Graal takes the wrong 0.5 probabilities. Effectively, Graal did not use our probabilities.

I have not looked into the select and br instructions yet, but this raises the question: should we decide where to inject branch probabilities based on what Graal does with them, or should we inject branch probabilities for all the if statements? I would rather go with the second approach, since then we do not have to look at Graal graphs to know whether the branch probabilities are actually used. Of course, this approach also has drawbacks: it increases the maintenance burden and the size of the project, since we need to profile more branches.
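To make the second approach concrete, here is a minimal sketch of a per-switch profile that counts how often each successor is taken and derives a probability from the counts. The class name SuccessorProfile is hypothetical. In a Truffle node the resulting value would presumably be handed to CompilerDirectives.injectBranchProbability at each comparison; that call is left out here so the sketch stays self-contained.

```java
// Sketch only: a plain-Java successor profile, not the Sulong implementation.
final class SuccessorProfile {
    private final long[] counts; // one counter per successor (cases plus default)
    private long total;

    SuccessorProfile(int successorCount) {
        this.counts = new long[successorCount];
    }

    /** Records that the given successor was taken once. */
    void record(int successor) {
        counts[successor]++;
        total++;
    }

    /** Observed probability of a successor; uniform while nothing has been recorded. */
    double probability(int successor) {
        if (total == 0) {
            return 1.0 / counts.length;
        }
        return (double) counts[successor] / total;
    }
}
```

For the C program above, recording the ten warm-up calls and the one million calls with the value 10 gives case 10 a probability very close to 1, which is what the if-cascade for the switch should ideally carry.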

Create easy-to-manage package for LLVM bitcode

Languages which can load extensions, like Ruby, are often built around a model of .so files being on a load path and being dynamically loaded. I'd like to be able to provide something similar for Sulong.

So what I propose is a linker command that produces a .su file - a Sulong library file - which is a zip file with LLVM bitcode in it, and a manifest that specifies which libraries it depends on.

Then where Ruby normally allows you to load a .so file, we do exactly the same thing with our .su files.

Linking bitcode files into a .su and specifying which libraries to load when that library is loaded would look like this:

$ mx su-link test.su -lz test.ll

You could then directly run the test.su file:

$ mx su-run test.su

Or you can load your Sulong library like you would a native library:

$ mx su-run -ltest main.ll

Sulong would automatically detect the difference between a native library and a Sulong library, and in the latter case it would just load all the dependent libraries and then all the bitcode in the zip file.

Why not just link everything into a single big bitcode file for your library? Because we need somewhere to list dependent libraries. If a Ruby extension needs zlib then I need to say that somewhere.
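A minimal sketch of what such a linker step could produce, using only java.util.zip. The entry name "libs" and its one-library-per-line layout are assumptions made for illustration; the actual manifest format is not specified here.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Sketch only: packs bitcode files plus a list of dependent libraries into a zip.
final class SuArchiveSketch {
    /** Returns the bytes of a zip archive holding a "libs" entry and all bitcode files. */
    static byte[] link(List<String> libraries, Map<String, byte[]> bitcodeFiles) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            // Assumed manifest: one dependent library name per line.
            zip.putNextEntry(new ZipEntry("libs"));
            zip.write(String.join("\n", libraries).getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
            // One zip entry per bitcode file, stored under its file name.
            for (Map.Entry<String, byte[]> file : bitcodeFiles.entrySet()) {
                zip.putNextEntry(new ZipEntry(file.getKey()));
                zip.write(file.getValue());
                zip.closeEntry();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }
}
```

A loader would read the "libs" entry first, load those libraries, and then parse every remaining entry as bitcode.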

I've started working on this in master...chrisseaton:su-files

Let me know if you think I'm heading down the wrong path.

Failing double to unsigned long cast

The following program (adapted from a GCC test case) fails with Sulong but is executed correctly by Clang:

#include <stdlib.h>

unsigned long foo(double d) { return (unsigned long)d; }

int main(void) {
  double d;
  unsigned long l;
  d = 9223372036854775808.7;
  l = 1LL << 63;
  if (foo(l) != 9223372036854775808) {
    abort();
  }
  return 0;
}
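The issue does not state the cause. One plausible source of such failures on the JVM (an assumption) is that Java has no unsigned 64-bit type and that the (long) cast of a double saturates at Long.MAX_VALUE, so values of 2^63 and above need separate handling in both conversion directions. A minimal sketch with hypothetical helper names:

```java
// Sketch only: unsigned 64-bit conversions on top of Java's signed long.
final class UnsignedConversions {
    private static final double TWO_POW_63 = 0x1p63;

    /**
     * Converts a double in [0, 2^64) to the bit pattern of the unsigned 64-bit value.
     * Inputs outside that range are not handled meaningfully here.
     */
    static long doubleToUnsignedLong(double d) {
        if (d < TWO_POW_63) {
            return (long) d; // fits into the signed range
        }
        // Shift into the signed range, convert, then set the top bit again.
        return (long) (d - TWO_POW_63) ^ Long.MIN_VALUE;
    }

    /** Converts an unsigned 64-bit bit pattern to the nearest double. */
    static double unsignedLongToDouble(long bits) {
        if (bits >= 0) {
            return (double) bits;
        }
        // Halve while keeping the lowest bit sticky so rounding stays correct, then double.
        long half = (bits >>> 1) | (bits & 1);
        return (double) half * 2.0;
    }
}
```

With these helpers, the round trip for 1LL << 63 in the program above yields the bit pattern of 9223372036854775808 again.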

On Stack Replacement

Sulong does not implement On Stack Replacement (OSR). Since Sulong cannot directly reuse the LoopNode provided by the Truffle framework, we have to implement our own version of it.

File, Stdout, and Stderr Print Options for Debug Information

Currently, we have some debug options which print information about the program (see mx su-options):

                            sulong.Debug (default = false) Turns debugging on/off
         sulong.PrintPerformanceWarnings (default = false) Prints performance warnings
                        sulong.PrintASTs (default = false) Prints the Truffle ASTs for the parsed functions
             sulong.PrintNativeCallStats (default = false) Outputs stats about native call site frequencies
         sulong.PrintNativeAnalysisStats (default = false) Outputs the results of the lifetime analysis (if enabled)

If enabled, these options currently print to System.err and System.out. If the program prints at the same time, this debug output is often not useful, since it is mixed with the program output. It would thus be useful if the options accepted more than boolean values and were more configurable. I suggest the following option values:

false: not enabled
stdout: print to System.out
stderr: print to System.err
*: interpret as file path (multiple options with the same file path should be supported)
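The suggested values could be resolved along the following lines. This is a sketch under assumptions: the class DebugOutput is hypothetical, a disabled option is represented by null, and files are opened in append mode and shared so that several options can name the same path.

```java
import java.io.FileNotFoundException;
import java.io.FileOutputStream;
import java.io.PrintStream;
import java.util.HashMap;
import java.util.Map;

// Sketch only: maps an option value to the stream the debug output should go to.
final class DebugOutput {
    // One shared stream per file path, so options with the same path do not clash.
    private final Map<String, PrintStream> files = new HashMap<>();

    /** Returns the target stream for an option value, or null if the option is disabled. */
    PrintStream resolve(String value) {
        switch (value) {
            case "false":
                return null;
            case "stdout":
                return System.out;
            case "stderr":
                return System.err;
            default:
                // Anything else is interpreted as a file path.
                return files.computeIfAbsent(value, DebugOutput::open);
        }
    }

    private static PrintStream open(String path) {
        try {
            return new PrintStream(new FileOutputStream(path, true), true);
        } catch (FileNotFoundException e) {
            throw new IllegalArgumentException("cannot open debug output file: " + path, e);
        }
    }
}
```

Note that with exactly these four rules a value of "true" would be treated as a file name, so the existing boolean spelling would need its own decision.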

Lower the Travis gate execution time

Currently, a Travis gate run takes around 30 to 40 minutes (1:20 to 1:40 total time). I think that it should be shorter to not become a development bottleneck on days on which several people actively develop and submit PRs.

Most of the time is spent in the mx su-travis1 and mx su-travis2 steps. Both steps are supposed to mainly execute test cases. However, a major portion is actually spent compiling the native (modified HotSpot) project. We have to build it since we have to execute the test cases with Graal. We directly rely on the Graal Native Function Interface for executing native calls and thus cannot use a standard JVM. Breaking the test cases into several independent commands does not scale, since we have to compile the native project for each of these steps.

One idea to solve this is to use a Travis cache (or another cache) for the native project. Instead of building the project, we could directly download the built native project. This would reduce the build time and would also allow us to break the test cases into smaller parts. One drawback of this solution is that we have to upload a new version whenever the native project is updated. If nobody has a better idea, I will still try to follow this approach.

Fix function comparisons of functions with a weak attribute

Consider the following C program:

extern void foo () __attribute__((weak));

int main() {
    if (foo) {
        foo ();
    }
    return 0;
}

When the function is compiled to LLVM IR, the if condition results in br i1 icmp ne (void (...)* @foo, void (...)* null), label %2, label %3. Sulong currently performs function comparisons on the basis of function indices. The null function has function index 0. Since @foo has some other function index, the comparison evaluates to true and the program attempts to call foo, which has not been defined.

The same is also true for the following program, which should not abort since both functions are null:

#include <stdlib.h>

volatile extern int test1() __attribute__((weak));
volatile extern int test2() __attribute__((weak));

int main() {
    if (test1 != test2) {
        abort();
    }
}

To fix this, we have to check in the function compare node if the function has been loaded. Since this is expensive, we can construct a polymorphic inline cache for that.
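The check itself can be sketched as follows, leaving out the inline cache. The names are hypothetical, and the isLoaded predicate stands in for whatever lookup tells Sulong that a function has a definition.

```java
import java.util.function.IntPredicate;

// Sketch only: compares function indices while treating undefined functions as null.
final class FunctionCompareSketch {
    static final int NULL_INDEX = 0; // the null function has function index 0

    /** An undefined (for example unresolved weak) function behaves like the null function. */
    static int effectiveIndex(int functionIndex, IntPredicate isLoaded) {
        return isLoaded.test(functionIndex) ? functionIndex : NULL_INDEX;
    }

    /** Equality as the compare node would need to compute it. */
    static boolean equal(int left, int right, IntPredicate isLoaded) {
        return effectiveIndex(left, isLoaded) == effectiveIndex(right, isLoaded);
    }
}
```

With this rule both examples behave as expected: an undefined foo compares equal to null, and two undefined weak functions compare equal to each other.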

Lookup of Native Functions with Exceeding Polymorphic Inline Cache Limit

When an indirect call in Sulong exceeds the polymorphic inline cache limit, we perform the function lookup every time. This is not a problem for Truffle functions, but it is for native functions: whenever we get a native handle, we also install the call stub for it, which eventually results in a bailout exception. To circumvent this problem, we currently have a cache for native function lookups. The current implementation has other deficiencies as well, since we create the call target every time in this case.

To efficiently support indirect calls with native targets it would be nice if we could extend the Graal Native Function Interface to be able to specify the parameters beforehand, and then pass the function pointer as one of the arguments.

Constructor attribute crashes when in an `.su` file

#include <stdio.h>

__attribute__((constructor))
static void on_load() {
  printf("on_load\n");
}

int main(int argc, char **argv) {
  printf("main\n");
}

$ mx --vm server su-clang -S -emit-llvm -o test.ll test.c
$ mx --vm server su-run test.ll
on_load
main
$ mx --vm server su-link test.ll -o test.su
$ mx --vm server su-run test.su
Exception in thread "main" java.lang.IllegalStateException: could not find function @on_load
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNode$LLVMFunctionCallChain.getNativeCallTarget(LLVMCallNode.java:277)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNode$LLVMFunctionCallChain.getIndirectCallTarget(LLVMCallNode.java:264)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNodeFactory$LLVMFunctionCallChainNodeGen$BaseNode_.createNext(LLVMCallNodeFactory.java:125)
    at com.oracle.truffle.api.dsl.internal.SpecializationNode$InsertionEvent2.call(SpecializationNode.java:678)
    at com.oracle.truffle.api.dsl.internal.SpecializationNode$InsertionEvent2.call(SpecializationNode.java:1)
    at com.oracle.truffle.api.nodes.Node.atomic(Node.java:515)
    at com.oracle.truffle.api.dsl.internal.SpecializationNode.uninitialized(SpecializationNode.java:403)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNodeFactory$LLVMFunctionCallChainNodeGen$UninitializedNode_.execute(LLVMCallNodeFactory.java:153)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNodeFactory$LLVMFunctionCallChainNodeGen.executeDispatch(LLVMCallNodeFactory.java:74)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNode$LLVMFunctionCallChainStartNode.executeGeneric(LLVMCallNode.java:238)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNode$LLVMUnresolvedCallNode.executeGeneric(LLVMCallNode.java:153)
    at com.oracle.truffle.llvm.nodes.base.LLVMExpressionNode.executeVoid(LLVMExpressionNode.java:44)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallUnboxNode$LLVMVoidCallUnboxNode.executeVoid(LLVMCallUnboxNode.java:128)
    at com.oracle.truffle.llvm.nodes.impl.others.LLVMStaticInitsBlockNode.execute(LLVMStaticInitsBlockNode.java:61)
    at com.oracle.graal.truffle.OptimizedCallTarget.callProxy(OptimizedCallTarget.java:477)
    at com.oracle.graal.truffle.OptimizedCallTarget.callRoot(OptimizedCallTarget.java:364)
    at com.oracle.graal.truffle.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:350)
    at com.oracle.graal.truffle.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:338)
    at com.oracle.graal.truffle.OptimizedCallTarget.call(OptimizedCallTarget.java:243)
    at com.oracle.truffle.llvm.LLVM$1.lambda$parse$1(LLVM.java:108)
    at com.oracle.truffle.llvm.SulongLibrary.readContents(SulongLibrary.java:88)
    at com.oracle.truffle.llvm.LLVM$1.parse(LLVM.java:99)
    at com.oracle.truffle.llvm.nodes.impl.base.LLVMLanguage.parse(LLVMLanguage.java:97)
    at com.oracle.truffle.api.TruffleLanguage$AccessAPI.eval(TruffleLanguage.java:564)
    at com.oracle.truffle.api.impl.Accessor.eval(Accessor.java:164)
    at com.oracle.truffle.api.vm.PolyglotEngine$SPIAccessor.eval(PolyglotEngine.java:1178)
    at com.oracle.truffle.api.vm.PolyglotEngine.evalImpl(PolyglotEngine.java:532)
    at com.oracle.truffle.api.vm.PolyglotEngine.access$300(PolyglotEngine.java:104)
    at com.oracle.truffle.api.vm.PolyglotEngine$2.compute(PolyglotEngine.java:516)
    at com.oracle.truffle.api.vm.ComputeInExecutor.run(ComputeInExecutor.java:93)
    at com.oracle.truffle.api.vm.ComputeInExecutor.perform(ComputeInExecutor.java:83)
    at com.oracle.truffle.api.vm.PolyglotEngine.eval(PolyglotEngine.java:519)
    at com.oracle.truffle.api.vm.PolyglotEngine.eval(PolyglotEngine.java:454)
    at com.oracle.truffle.llvm.LLVM.evaluateFromSource(LLVM.java:228)
    at com.oracle.truffle.llvm.LLVM.executeMain(LLVM.java:210)
    at com.oracle.truffle.llvm.LLVM.main(LLVM.java:172)

Escaping Function Pointers from Sulong to Native Code

It can happen that programs pass Sulong function pointers to a native function. Since a Sulong function pointer is not a function that can be called by the native side, we currently throw an error and exit the program.

Passing a Sulong function pointer to the native side is not uncommon, since the C standard library has functions that expect function pointers that are usually implemented by the user, e.g., the qsort function.
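
As an illustration of the pattern described above, here is a minimal sketch of a program fragment that hands a user-defined comparator to qsort. The names compare_ints and sort_ints are made up for this example; under Sulong, compare_ints would be a Sulong function whose pointer escapes to the native qsort.

```c
#include <stdlib.h>

/* Comparator that qsort calls back into. When the program runs on Sulong,
 * this is a Sulong function, not a native one. */
static int compare_ints(const void *a, const void *b) {
    int x = *(const int *)a;
    int y = *(const int *)b;
    return (x > y) - (x < y); /* avoids overflow of x - y */
}

/* Sorts an int array in ascending order by passing compare_ints
 * to the qsort implementation of the C standard library. */
void sort_ints(int *values, size_t count) {
    qsort(values, count, sizeof values[0], compare_ints);
}
```

If qsort is executed natively, the call in sort_ints is exactly the situation in which a Sulong function pointer reaches native code.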

Untrapped floating point exception support

With C99 it's possible to handle untrapped floating point exceptions in C:

#include <math.h>
#include <fenv.h>
#include <stdio.h>

extern void abort(void);

volatile double zero = 0.0;

int main(void) {
  feclearexcept(FE_ALL_EXCEPT);
  double result = 1.0 / zero;
  if (!fetestexcept(FE_DIVBYZERO)) {
    puts("did not raise div by zero exception!");
    abort();
  }
  printf("excp: %f\n", result);
  return 0;
}

Sulong currently fails this and other test cases (e.g., see gcc.dg/c99-math* in the GCC test case suite).

To support them in Sulong, we probably need to intrinsify the functions that handle and also those that can raise such exceptions. Also, the arithmetic floating point operations need to set a flag when such floating point exceptions arise.

Links:

Intrinsify the atexit C library function

The C library function atexit allows a programmer to register a function that is called before a program terminates. We need to intrinsify this function in Sulong since we cannot simply pass a Truffle function to the native side (see #57). The implementation would be conceptually similar to the constructor and destructor attributes (see #214). In the LLVM test suite there is a test case test-suite-3.2.src/SingleSource/UnitTests/2003-05-14-AtExit.c that could validate the implementation.
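
For reference, a minimal sketch of the behavior an atexit intrinsic would have to reproduce. The handler and function names are hypothetical. The C standard requires that registered functions are called in reverse order of their registration at normal program termination.

```c
#include <stdio.h>
#include <stdlib.h>

static void first_registered(void) {
    puts("first registered, runs last");
}

static void second_registered(void) {
    puts("second registered, runs first");
}

/* Registers both handlers; returns 0 on success and -1 if atexit
 * reports a failure (atexit returns nonzero on failure). */
int register_exit_handlers(void) {
    if (atexit(first_registered) != 0) {
        return -1;
    }
    if (atexit(second_registered) != 0) {
        return -1;
    }
    return 0;
}
```

A program that calls register_exit_handlers() from main and then returns normally should print the message of second_registered before that of first_registered.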

longjmp/setjmp Not Supported

We cannot directly call the native implementations of longjmp/setjmp since they implement non-local jumps which manipulate the stack. We will have to implement them through Java intrinsifications.

I refrained from implementing them since they would probably make the interpreter more complicated. However, since longjmp/setjmp can be used to implement exceptions, we still have to support them in the long term.
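
To make the exception use case concrete, here is a small sketch of how C code typically emulates try/catch with setjmp and longjmp. The names error_handler, checked_divide, and try_divide are made up for this example.

```c
#include <setjmp.h>

/* Jump target that plays the role of the innermost "catch" block. */
static jmp_buf error_handler;

/* "Throws" on division by zero by jumping back to the last setjmp. */
static int checked_divide(int a, int b) {
    if (b == 0) {
        longjmp(error_handler, 1);
    }
    return a / b;
}

/* Returns 0 and stores the quotient in *result on success,
 * or returns -1 if checked_divide jumped back here. */
int try_divide(int a, int b, int *result) {
    if (setjmp(error_handler) != 0) {
        return -1; /* reached via longjmp */
    }
    *result = checked_divide(a, b);
    return 0;
}
```

The longjmp in checked_divide unwinds the frame of checked_divide without returning through it, which is the non-local control transfer that an intrinsification would have to model.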

Fortran Benchmark Runner

Currently, the mx commands provide functions to execute benchmarks with Sulong, GCC, and Clang.

The commands still lack Fortran support. It is necessary to implement Fortran specific parts both in the GCC benchmark command and for Sulong.

Conditional Phi Write Nodes in Switch

Conditional phi writes occur in conditional branches of a basic block, when a successor of the basic block refers to it in a phi function.

We currently execute the phi writes conditionally in the br instruction, but unconditionally in the indirectbr and switch instructions.

I am not sure whether cases can occur where this is a problem. Probably, we should extend the test suite with more complicated control flow structure tests to find out if we can break the current implementation.
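
A possible starting point for such a test is sketched below. It is only an assumption that this input exercises the relevant code path: whether the merge of result becomes a phi function that is fed by the switch depends on the compiler and the optimization passes used. The function name classify is made up.

```c
/* Each case assigns a different value to result. After optimization,
 * the merge points typically become phi functions whose incoming
 * edges originate from the switch and from the fall-through case. */
int classify(int x) {
    int result = -1;
    switch (x) {
    case 0:
        result = 10;
        break;
    case 1:
        result = 20;
        /* fall through */
    case 2:
        result += 1;
        break;
    default:
        break;
    }
    return result;
}
```

With this definition, classify(0) yields 10, classify(1) yields 21, classify(2) yields 0, and any other argument yields -1.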

Duplication of Compilation Logic on the mx and Java side

Currently, the logic for compiling a file written in C, C++, or Fortran to LLVM IR or machine code is once on the Java side and once on the mx side. In Java, the test framework uses classes like GCC and opt. In mx, there are equivalent functions like compileWithGCC and opt.

It would be nice to remove this duplication of logic to ease maintenance when, e.g., adopting new versions of LLVM or GCC. It would probably be best to remove the Java part, since mx already takes care of downloading the dependencies. The Java part could then have an mx class from which the test framework can compile source files.

Automatically install external dependencies such as gcc-4.6

Currently, the Sulong mx commands assume that some external programs such as gcc-4.6 are already installed on the machine (see here). Instead, the mx script should automatically install these, if they are not installed yet. For that, it could simply use the system's package manager such as apt-get or yum on Linux.

Documentation for truffle.h

truffle.h is the most important API that Sulong provides. I think that the function names are mostly self-descriptive, even for people that did not use the interop API before.

However, I think it would be important to explain the bigger context of the file, i.e., how interop roughly works, how one can use the file with an example, what Sulong will do with the function calls, and a rough overview of the functions provided.

Documenting the file will allow people who use Truffle for the first time to quickly understand interop.

Fix Arithmetic Operations for 80 Bit Floats

Sulong implements 80 bit floating point numbers in a class LLVM80BitFloat since Java itself does not provide them.

Still, arithmetic operations such as subtraction, addition, multiplication, remainder, and division are implemented using double precision. Besides fixing them to use 80 bit float precision, it would be good for performance to somehow treat them special during compilation.
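
The following sketch shows one way a C program can observe the difference between double and extended precision. The function name is made up, and the result depends on the platform: on x86-64 Linux, long double is usually the 80 bit x87 format, while on other platforms it may be identical to double.

```c
#include <float.h>

/* Returns 1 if adding LDBL_EPSILON to 1.0 is visible in long double
 * arithmetic but disappears once the sum is rounded to double, and
 * 0 otherwise. The volatile qualifiers keep the compiler from folding
 * the computation or keeping values in wider registers. */
int long_double_has_extra_precision(void) {
    volatile long double one = 1.0L;
    volatile long double sum = one + LDBL_EPSILON;
    volatile double rounded = (double)sum;
    return sum != one && rounded == 1.0;
}
```

An implementation that performs the addition in double precision would presumably lose the added LDBL_EPSILON, so a check of this kind could serve as a regression test once the arithmetic is fixed.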

Support for 80 bit floats in native calls

Currently, Sulong does not provide native interoperability for 80 bit floats. The Graal Native Function Interface (NFI) does not handle them, since there are no 80 bit float primitives in Java. Probably, we have to extend the NFI to correctly handle passing and alignment of them.

Document installation procedure for Ubuntu 16.04

The README currently has instructions for installation on Ubuntu, but they don't work for the latest 16.04 LTS release. In particular, gcc-4.6 isn't available in Ubuntu 16.04. There is likely a way to use the older packages from the previous LTS release (14.04) or perhaps a PPA. To make installation easier, it'd be helpful to have a documented set of steps on how to do so.

Life Time Analysis Execution Time

Implementing a life time analysis of virtual registers gave a huge speedup on many benchmarks. However, due to its naive implementation, it also significantly increased the gate time (from 25 to 45 minutes on my machine).

Add support for Signal handlers

We currently ignore signal handler registrations. I think it would not be too hard to handle them on the Java side with sun.misc.SignalHandler. For example, you can do the following in Java:

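import sun.misc.Signal;
import sun.misc.SignalHandler;

public class SignalDemo {
    public static void main(String[] args) {
        // Print the name and number of every signal that is delivered.
        SignalHandler sh = new SignalHandler() {
            @Override
            public void handle(Signal sig) {
                System.out.println(sig.getName() + " " + sig.getNumber());
            }
        };
        Signal.handle(new Signal("HUP"), sh);
        Signal.handle(new Signal("INT"), sh);
        Signal.handle(new Signal("TERM"), sh);
        // Keep the process alive so that signals can be delivered.
        while (true) {
            try {
                Thread.sleep(500);
            } catch (InterruptedException e) {
                e.printStackTrace();
            }
        }
    }
}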

Signal.handle registers the Signal handler. So when you, e.g., use kill -INT <process id> in Linux, then instead of terminating, the handle method is called. In Sulong we can intrinsify the C signal function and remember the function to then call it (with the signal number as an argument) when a signal occurs.
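
On the C side, the program that such an intrinsic would have to support could look like the following sketch. The names last_signal, record_signal, and raise_and_observe_sigint are made up for this example.

```c
#include <signal.h>

/* Written by the handler, read by normal code. */
static volatile sig_atomic_t last_signal = 0;

/* Handler with the signature that the C signal function expects. */
static void record_signal(int signal_number) {
    last_signal = signal_number;
}

/* Registers record_signal for SIGINT, raises SIGINT in the current
 * process, and returns the signal number that the handler saw.
 * Returns -1 if the registration failed. */
int raise_and_observe_sigint(void) {
    if (signal(SIGINT, record_signal) == SIG_ERR) {
        return -1;
    }
    raise(SIGINT); /* the handler runs before raise returns */
    return (int)last_signal;
}
```

An intrinsified signal would have to remember record_signal and invoke it with the signal number, so that raise_and_observe_sigint returns SIGINT instead of leaving last_signal untouched.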

Make LLVMNodes serializable

Since we started to add files that are loaded as dynamic libraries (in the mx.sulong/libs directory) start-up performance got worse since these files have to be loaded in addition to the main program. The situation will get worse if we substitute more functions of the standard library.

As a way to improve start-up performance we could let LLVMNode implement Serializable and store the (uninitialized) Truffle nodes on the disk. If the C files were not modified and the node implementations did not change we could load the Truffle nodes from the disk instead of having to run the parser to parse the bitcode file and create the Truffle ASTs.

We could also store the global symbols of each serialized file. With this information we would only have to load those shared library files that we actually need.

Names for Mx Commands and Options

The Sulong mx commands use lowercase names prefixed by su- and (inconsistently) use - as a word delimiter. Examples are su-run and su-tests-gcc.

The Sulong options are currently implemented by System options prefixed by llvm-. They also use - as word delimiter. Examples are llvm-opt-valueprofiling and llvm-print-asts.

A typical command on the command line looks (for me) like this: mx su-run -Dgraal.Dump=Truffle -Dllvm-print-asts=true test.ll.

Since the naming should change as infrequently as possible, it would be good to make a permanent decision about the naming conventions.

Fix argument conversion for indirect native calls

LLVMAddress should be converted to long during an indirect native call.

#include <string.h>

int main(void) {
        char test[512];
        static void *(*const volatile memset_v)(void *, int, size_t) = &memset;
        memset_v(test, 0, 512);
        return 0;
}

$ mx su-clang -S -emit-llvm -o memset-test.ll memset-test.c
$ mx su-run -ea -esa memset-test.ll
Exception in thread "main" java.lang.AssertionError: java.io.IOException: java.lang.AssertionError: memset[long, int, long] expected arg 0 to be java.lang.Long, not com.oracle.truffle.llvm.types.LLVMAddress
    at com.oracle.truffle.llvm.LLVM.evaluateFromSource(LLVM.java:274)
    at com.oracle.truffle.llvm.LLVM.executeMain(LLVM.java:249)
    at com.oracle.truffle.llvm.LLVM.main(LLVM.java:203)
Caused by: java.io.IOException: java.lang.AssertionError: memset[long, int, long] expected arg 0 to be java.lang.Long, not com.oracle.truffle.llvm.types.LLVMAddress
    at com.oracle.truffle.api.TruffleLanguage$LanguageImpl.eval(TruffleLanguage.java:585)
    at com.oracle.truffle.api.vm.PolyglotEngine.evalImpl(PolyglotEngine.java:565)
    at com.oracle.truffle.api.vm.PolyglotEngine.eval(PolyglotEngine.java:532)
    at com.oracle.truffle.api.vm.PolyglotEngine.eval(PolyglotEngine.java:469)
    at com.oracle.truffle.llvm.LLVM.evaluateFromSource(LLVM.java:271)
    ... 2 more
Caused by: java.lang.AssertionError: memset[long, int, long] expected arg 0 to be java.lang.Long, not com.oracle.truffle.llvm.types.LLVMAddress
    at com.oracle.graal.truffle.hotspot.nfi.HotSpotNativeFunctionHandle.checkArgs(HotSpotNativeFunctionHandle.java:119)
    at com.oracle.graal.truffle.hotspot.nfi.HotSpotNativeFunctionHandle.call(HotSpotNativeFunctionHandle.java:85)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNode$LLVMFunctionCallChain$1.execute(LLVMCallNode.java:295)
    at com.oracle.graal.truffle.OptimizedCallTarget.callProxy(OptimizedCallTarget.java:220)
    at com.oracle.graal.truffle.OptimizedCallTarget.callRoot(OptimizedCallTarget.java:213)
    at com.oracle.graal.truffle.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:204)
    at com.oracle.graal.truffle.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:188)
    at com.oracle.graal.truffle.OptimizedCallTarget.callDirect(OptimizedCallTarget.java:172)
    at com.oracle.graal.truffle.OptimizedDirectCallNode.callProxy(OptimizedDirectCallNode.java:70)
    at com.oracle.graal.truffle.OptimizedDirectCallNode.call(OptimizedDirectCallNode.java:61)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNode$LLVMFunctionCallChain.doDirect(LLVMCallNode.java:305)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNodeFactory$LLVMFunctionCallChainNodeGen$DirectNode_.execute(LLVMCallNodeFactory.java:195)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNodeFactory$LLVMFunctionCallChainNodeGen$BaseNode_.acceptAndExecute(LLVMCallNodeFactory.java:113)
    at com.oracle.truffle.api.dsl.internal.SpecializationNode.uninitialized(SpecializationNode.java:407)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNodeFactory$LLVMFunctionCallChainNodeGen$UninitializedNode_.execute(LLVMCallNodeFactory.java:153)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNodeFactory$LLVMFunctionCallChainNodeGen.executeDispatch(LLVMCallNodeFactory.java:74)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNode$LLVMFunctionCallChainStartNode.executeGeneric(LLVMCallNode.java:247)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallNode$LLVMUnresolvedCallNode.executeGeneric(LLVMCallNode.java:152)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMCallUnboxNodeFactory$LLVMAddressCallUnboxNodeGen.executeGeneric(LLVMCallUnboxNodeFactory.java:435)
    at com.oracle.truffle.llvm.nodes.impl.vars.LLVMWriteNodeFactory$LLVMWriteAddressNodeGen.executeVoid(LLVMWriteNodeFactory.java:386)
    at com.oracle.truffle.llvm.nodes.impl.base.LLVMBasicBlockNode.executeGetSuccessorIndex(LLVMBasicBlockNode.java:86)
    at com.oracle.truffle.llvm.nodes.impl.others.LLVMBlockNode$LLVMBlockControlFlowNode.executeGeneric(LLVMBlockNode.java:77)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMFunctionStartNode.execute(LLVMFunctionStartNode.java:112)
    at com.oracle.graal.truffle.OptimizedCallTarget.callProxy(OptimizedCallTarget.java:220)
    at com.oracle.graal.truffle.OptimizedCallTarget.callRoot(OptimizedCallTarget.java:213)
    at com.oracle.graal.truffle.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:204)
    at com.oracle.graal.truffle.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:188)
    at com.oracle.graal.truffle.OptimizedCallTarget.callDirect(OptimizedCallTarget.java:172)
    at com.oracle.graal.truffle.OptimizedDirectCallNode.callProxy(OptimizedDirectCallNode.java:70)
    at com.oracle.graal.truffle.OptimizedDirectCallNode.call(OptimizedDirectCallNode.java:61)
    at com.oracle.truffle.llvm.nodes.impl.func.LLVMGlobalRootNode.execute(LLVMGlobalRootNode.java:93)
    at com.oracle.graal.truffle.OptimizedCallTarget.callProxy(OptimizedCallTarget.java:220)
    at com.oracle.graal.truffle.OptimizedCallTarget.callRoot(OptimizedCallTarget.java:213)
    at com.oracle.graal.truffle.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:204)
    at com.oracle.graal.truffle.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:188)
    at com.oracle.graal.truffle.OptimizedCallTarget.call(OptimizedCallTarget.java:166)
    at com.oracle.truffle.llvm.nodes.impl.base.LLVMMainFunctionReturnValueRootNode$LLVMMainFunctionReturnNumberRootNode.execute(LLVMMainFunctionReturnValueRootNode.java:81)
    at com.oracle.graal.truffle.OptimizedCallTarget.callProxy(OptimizedCallTarget.java:220)
    at com.oracle.graal.truffle.OptimizedCallTarget.callRoot(OptimizedCallTarget.java:213)
    at com.oracle.graal.truffle.OptimizedCallTarget.callBoundary(OptimizedCallTarget.java:204)
    at com.oracle.graal.truffle.OptimizedCallTarget.doInvoke(OptimizedCallTarget.java:188)
    at com.oracle.graal.truffle.OptimizedCallTarget.call(OptimizedCallTarget.java:166)
    at com.oracle.truffle.api.TruffleLanguage$LanguageImpl.eval(TruffleLanguage.java:581)
    ... 6 more

Support C, C++, Fortran, and other languages directly by the interpreter

Currently, the interpreter supports only the textual bitcode format (.ll), and will instead soon support the binary bitcode format (.bc). To execute C programs or programs written in other languages, one has to manually compile them to bitcode which one can currently do with mx. For example, one would use mx su-clang -S -emit-llvm -o test.ll test.c to compile a C file and then execute it with mx su-run test.ll which in turn invokes the LLVM class.

It would be nice to support C and other LLVM-based languages directly by the LLVM class, so one can execute them with a single mx su-run test.c.
To support both .ll and .bc we can convert between the two formats using llvm-as and llvm-dis. To compile C and C++ files, we can use clang and opt as we currently do in compileWithClangOpt in the mx script.
Having this functionality would bring more of the compilation logic from mx to the Java side (see #82), would benefit the interop use case to directly support C and files of other languages, and would also simplify the use of Sulong.

Email address for security suggestion for sulong

I have a small suggestion regarding sulong which is security related. Please don't be alarmed, it's just a suggestion, but I thought it might be better to communicate by email rather than post this as a ticket.

Can you give me an email address where I can email?

Add support for __attribute__((destructor))

The GNU C extensions support attributes, with which one can implement constructors and destructors for C:

#include <stdio.h>

int main() {
    puts("hello world!");
}

__attribute__((constructor))
void constr() {
    puts("constr");
}

__attribute__((destructor))
void destr() {
    puts("constr");
}

Using mx su-clang -S -emit-llvm -o test.ll test.c to compile, one gets the following bitcode:

@.str = private unnamed_addr constant [13 x i8] c"hello world!\00", align 1
@.str1 = private unnamed_addr constant [7 x i8] c"constr\00", align 1
@llvm.global_ctors = appending global [1 x { i32, void ()* }] [{ i32, void ()* } { i32 65535, void ()* @constr }]
@llvm.global_dtors = appending global [1 x { i32, void ()* }] [{ i32, void ()* } { i32 65535, void ()* @destr }]

define i32 @main() nounwind uwtable {
  %1 = call i32 @puts(i8* getelementptr inbounds ([13 x i8]* @.str, i32 0, i32 0))
  ret i32 0
}

declare i32 @puts(i8*)

define void @constr() nounwind uwtable {
  %1 = call i32 @puts(i8* getelementptr inbounds ([7 x i8]* @.str1, i32 0, i32 0))
  ret void
}

define void @destr() nounwind uwtable {
  %1 = call i32 @puts(i8* getelementptr inbounds ([7 x i8]* @.str1, i32 0, i32 0))
  ret void
}

The constructor and destructor functions are stored in the variables @llvm.global_ctors and @llvm.global_dtors.

Currently, we only support constructors. A suitable beginner task would be to also add destructors based on the current constructor implementation. Also see #174 for implementation hints.

Make failures of unsupported instructions lazy

Currently, the parser throws exceptions for unsupported LLVM IR instructions. This prevents execution of larger libraries, where functions with unsupported instructions might never be executed. To partially support these libraries, we have to make the failures lazy, i.e., throw an exception in a node and not in the parser.

Branch Probability Injection

We communicate branch probability information to Graal, but it is not actually exploited in its BranchProbabilityNode since in Sulong's use cases, the branch probability injection in the end applies to a ConditionalNode in Graal.

Lookup of Native Symbols

Currently, we apply a hack for looking up native symbols other than functions, such as C's stdout file. We access private methods of Graal's Native Function Interface (NFI) in LLVMContext via reflection. The call eventually resolves to dlsym on the native side, which works for all symbols, not just functions.

The NFI currently does not expose this functionality. It would be best if we extended the NFI's API to be able to get a native address of a symbol.

Graal Native Function Interface Performance Problem

Currently, native function calls are really slow. With the Graal Foreign Function Interface they should actually be very fast, since there should not be any overhead between the Truffle and native side, just the argument conversion and a direct call to the native side. [1]

Instead of being lowered to a call stub, the Graal graph contains a CompilerToVM.executeInstalledCode invoke node, even after the last compiler phase.

[1] An efficient native function interface for Java

Test Case Generation: Fuzzers and other Compiler-Testing Tools

It becomes more and more difficult to find (unknown) bugs with test cases from the existing test suites. I think that we should start to include compiler fuzzers and other compiler-testing tools as a more structured approach to find bugs in the Sulong interpreter. One example of such a tool is Csmith.

I think that adding such tools to our CI testing and fixing existing bugs would be a suitable task for someone who wants to get started with Sulong.

Fix linking of extern variables

Currently, linking of extern symbols is broken when the external symbol is defined in one of the LLVM IR files (see #340). The current lookup logic (see LLVMVisitor) is as follows:

if ("external".equals(linkage)) {
    long getNativeSymbol = nativeLookup.getNativeHandle(globalVarName);
    LLVMAddress nativeSymbolAddress = LLVMAddress.fromLong(getNativeSymbol);
    return factoryFacade.createLiteral(nativeSymbolAddress, LLVMBaseType.ADDRESS);
} else {
    Object findOrAllocateGlobal = findOrAllocateGlobal(globalVariable);
    assert findOrAllocateGlobal != null;
    return factoryFacade.createLiteral(findOrAllocateGlobal, LLVMBaseType.ADDRESS);
}

By default, external symbols are looked up by using the Graal NFI, while they should first be looked up in the bitcode files.

It is not enough to add a condition that checks if there is already an entry for a global variable (&& !globalVars.containsKey(globalVariable)), since space for all global variables is initially allocated, even if the symbol is looked up later and the allocated global variable space is not used. Even if the redundant allocation were fixed inside the LLVMVisitor, it would not be enough to find global extern variables that are defined in other files.

To fix this issue, we have to traverse all bitcode files, determine the unresolved extern variables and resolve them (with the Graal NFI). We could either keep patch addresses and patch them after constructing the ASTs (with one traversal) or first determine the unresolved global variables, and then construct the right references in the visitor (with two traversals).
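
The two-traversal variant can be sketched as follows. This is only an illustrative model in C, not Sulong code: struct global_symbol, native_lookup_fn, no_native_symbols, and resolve_symbols are hypothetical names, and the native lookup hook stands in for the Graal NFI.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical record for a global variable seen in some bitcode file. */
struct global_symbol {
    const char *name;
    int is_definition; /* 1 if a bitcode file defines it, 0 if extern */
    void *address;     /* set for definitions, filled in for externs */
};

/* Hypothetical hook for the native lookup (stands in for the NFI). */
typedef void *(*native_lookup_fn)(const char *name);

/* Native lookup that never finds anything; handy for experiments. */
void *no_native_symbols(const char *name) {
    (void)name;
    return NULL;
}

/* First traversal: search all collected symbols for a definition. */
static struct global_symbol *find_definition(struct global_symbol *symbols,
                                             size_t count, const char *name) {
    for (size_t i = 0; i < count; i++) {
        if (symbols[i].is_definition && strcmp(symbols[i].name, name) == 0) {
            return &symbols[i];
        }
    }
    return NULL;
}

/* Second traversal: bind each extern reference to a bitcode definition
 * if one exists and fall back to the native lookup otherwise.
 * Returns the number of references that remain unresolved. */
size_t resolve_symbols(struct global_symbol *symbols, size_t count,
                       native_lookup_fn native_lookup) {
    size_t unresolved = 0;
    for (size_t i = 0; i < count; i++) {
        if (symbols[i].is_definition) {
            continue;
        }
        struct global_symbol *definition =
            find_definition(symbols, count, symbols[i].name);
        symbols[i].address = definition != NULL
            ? definition->address
            : native_lookup(symbols[i].name);
        if (symbols[i].address == NULL) {
            unresolved++;
        }
    }
    return unresolved;
}
```

The point of the sketch is the lookup order: definitions from the bitcode files take precedence, and the native side is only consulted for symbols that no file defines.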

Tests for Value Profiling

Currently, no tests for value profiling are implemented. It would be nice to have some tests where a value profiling speculation fails after compilation. It's especially important to have these tests for floating point numbers on corner cases such as NaNs and +/-0.
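
The corner cases are tricky because numeric equality and bit-level identity disagree for these values. The sketch below demonstrates this in C. The helper names are made up, and the code assumes that double is a 64 bit IEEE 754 type.

```c
#include <stdint.h>
#include <string.h>

/* Compares the raw bit patterns of two doubles. */
int same_bits(double a, double b) {
    uint64_t bits_a, bits_b;
    memcpy(&bits_a, &a, sizeof bits_a);
    memcpy(&bits_b, &b, sizeof bits_b);
    return bits_a == bits_b;
}

/* +0.0 and -0.0 compare equal with ==, but their sign bits differ. */
int zeros_equal_but_distinct(void) {
    volatile double positive_zero = 0.0;
    volatile double negative_zero = -0.0;
    return positive_zero == negative_zero
        && !same_bits(positive_zero, negative_zero);
}

/* A NaN is unequal to itself with !=, but has identical bits. */
int nan_unequal_but_identical(void) {
    volatile double zero = 0.0;
    double nan_value = zero / zero;
    return nan_value != nan_value && same_bits(nan_value, nan_value);
}
```

A value profile that checks its cached value with == would therefore treat -0.0 as a hit for a cached +0.0 and would never hit for a cached NaN, which is why tests for exactly these inputs seem worthwhile.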

Graal NFI cannot handle functions that return <= 16 byte structs by value

Currently, we cannot successfully call native functions that return structs by value.

Consider the following function:

#include <stdio.h>

struct a {
    double x;
    double y;
};

struct a test() {
    struct a t = {3, 4};
    return t;
}

int main() {
    struct a asdf = test();
    /* x and y are doubles, so they have to be printed with %f */
    printf("%f %f\n", asdf.x, asdf.y);
}

When compiling the program to an executable the function test looks as follows:

00000000004004e0 <test>:
  4004e0:       55                      push   %rbp
  4004e1:       48 89 e5                mov    %rsp,%rbp
  4004e4:       0f 10 05 dd 00 00 00    movups 0xdd(%rip),%xmm0        # 4005c8 <_IO_stdin_used+0x8>
  4004eb:       0f 29 45 e0             movaps %xmm0,-0x20(%rbp)
  4004ef:       0f 29 45 f0             movaps %xmm0,-0x10(%rbp)
  4004f3:       f2 0f 10 45 f0          movsd  -0x10(%rbp),%xmm0
  4004f8:       f2 0f 10 4d f8          movsd  -0x8(%rbp),%xmm1
  4004fd:       5d                      pop    %rbp
  4004fe:       c3                      retq   
  4004ff:       90                      nop

The struct returned by test is passed in %xmm0 and %xmm1. When executing this program with Sulong, it works since test is not a native call. However, when calling functions in the standard library that return such structs (e.g., in complex.h), Sulong crashes:

#include <stdio.h>
#include <complex.h>
#include <tgmath.h>

int main() {
    double complex z2 = pow(I, 2);  
    printf("pow(I, 2) = %.1f%+.1fi\n", creal(z2), cimag(z2));
}

The AMD64 ABI (pages 18-22) specifies that, for structs that are returned in memory, the caller has to provide space through a hidden first argument. However, small structs that consist only of float or double fields and fit into two eightbytes (16 bytes), such as struct a above, are returned in %xmm0 and %xmm1.

The Graal Native Function Interface currently does not support functions that return multiple values and thus has to be extended.

Code is too large errors since recent Graal version with Argon2

Argon2 works fine with older Graal versions, but triggers code is too large errors with recent Graal versions.

Argon2 Debug branch: https://github.com/lxp/sulong/tree/argon2-debug

Old, working Graal version:

$ mx sversions
f65a828c424782db0551cab83471de46a49a5888  sulong
a9f5ed2e5289293198de31435377b76028c0c401  graal-core
fbb6bb30803df787c07b1c8131789c94acfc2761  truffle
$ mx su-tests-argon2 -Dgraal.TraceTruffleCompilation=true
[...]
[truffle] opt done         @fill_block_with_xor <opt>                                  |ASTSize   14880/14880 |Time 19526(6132+13394)ms |DirectCallNodes I    0/D  134 |GraalNodes 15053/31597 |CodeSize       137225 |Source /home/david/graalvm/sulong/projects/com.oracle.truffle.llvm.test/argon2/phc-winner-argon2/test.su@dc1216ad8495636fd6a5fa4185a38a01e515a68a_ref.ll:1 
[truffle] opt done         @fill_block <opt>                                           |ASTSize   14871/14871 |Time 19460(6866+12594)ms |DirectCallNodes I    0/D  133 |GraalNodes 15024/31477 |CodeSize       136955 |Source /home/david/graalvm/sulong/projects/com.oracle.truffle.llvm.test/argon2/phc-winner-argon2/test.su@dc1216ad8495636fd6a5fa4185a38a01e515a68a_ref.ll:1 
[...]

Still working, but already large code size:

$ mx sversions
f65a828c424782db0551cab83471de46a49a5888  sulong
e8656a2a6674f662ba2568e10327c55153d61f12  graal-core
534a0391a33938b5d20b362c3f4988e90672bf0c  truffle
$ mx su-tests-argon2 -Dgraal.TraceTruffleCompilation=true
[...]
[truffle] opt done         @fill_block_with_xor <opt>                                  |ASTSize   14880/14880 |Time 25975(6091+19884)ms |DirectCallNodes I    0/D  134 |GraalNodes 19120/76991 |CodeSize       631612 |Source /home/david/graalvm/sulong/projects/com.oracle.truffle.llvm.test/argon2/phc-winner-argon2/test.su@dc1216ad8495636fd6a5fa4185a38a01e515a68a_ref.ll:1 
[truffle] opt done         @fill_block <opt>                                           |ASTSize   14871/14871 |Time 24168(4752+19416)ms |DirectCallNodes I    0/D  133 |GraalNodes 19086/76777 |CodeSize       631817 |Source /home/david/graalvm/sulong/projects/com.oracle.truffle.llvm.test/argon2/phc-winner-argon2/test.su@dc1216ad8495636fd6a5fa4185a38a01e515a68a_ref.ll:1 
[...]

Most recent version, Code size too large:

$ mx sversions
f65a828c424782db0551cab83471de46a49a5888  sulong
daaace2f5a3d961da5eca169e32408ab5b66e2c9  graal-core
bd163128ec958b97ebc68b33ac5b4fae376a37b5  truffle
$ mx su-tests-argon2 -Dgraal.TraceTruffleCompilation=true
[truffle] opt fail         @fill_block_with_xor                                        |Reason jdk.vm.ci.code.BailoutException: Code installation failed: code is too large 
jdk.vm.ci.code.BailoutException: Code installation failed: code is too large
    at jdk.vm.ci.hotspot.HotSpotCodeCacheProvider.installCode(HotSpotCodeCacheProvider.java:140)
    at com.oracle.graal.compiler.target.Backend.createInstalledCode(Backend.java:163)
    at com.oracle.graal.truffle.TruffleCompiler.compileMethodHelper(TruffleCompiler.java:207)
    at com.oracle.graal.truffle.TruffleCompiler.compileMethod(TruffleCompiler.java:159)
    at com.oracle.graal.truffle.GraalTruffleRuntime.doCompile0(GraalTruffleRuntime.java:465)
    at com.oracle.graal.truffle.GraalTruffleRuntime.doCompile(GraalTruffleRuntime.java:451)
    at com.oracle.graal.truffle.GraalTruffleRuntime$1.run(GraalTruffleRuntime.java:483)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
    at com.oracle.graal.compiler.CompilerThread.run(CompilerThread.java:51)

[truffle] opt fail         @fill_block                                                 |Reason jdk.vm.ci.code.BailoutException: Code installation failed: code is too large 
jdk.vm.ci.code.BailoutException: Code installation failed: code is too large
    at jdk.vm.ci.hotspot.HotSpotCodeCacheProvider.installCode(HotSpotCodeCacheProvider.java:140)
    at com.oracle.graal.compiler.target.Backend.createInstalledCode(Backend.java:163)
    at com.oracle.graal.truffle.TruffleCompiler.compileMethodHelper(TruffleCompiler.java:207)
    at com.oracle.graal.truffle.TruffleCompiler.compileMethod(TruffleCompiler.java:159)
    at com.oracle.graal.truffle.GraalTruffleRuntime.doCompile0(GraalTruffleRuntime.java:465)
    at com.oracle.graal.truffle.GraalTruffleRuntime.doCompile(GraalTruffleRuntime.java:451)
    at com.oracle.graal.truffle.GraalTruffleRuntime$1.run(GraalTruffleRuntime.java:483)
    at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
    at java.util.concurrent.FutureTask.run(FutureTask.java:266)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
    at java.lang.Thread.run(Thread.java:745)
    at com.oracle.graal.compiler.CompilerThread.run(CompilerThread.java:51)

With this Graal version compilation also requires a lot of heap space.
