Giter Club home page Giter Club logo

uranium's Introduction

Uranium


Uranium is a semantic analysis framework. It is used to implement the semantic analysis stage of a compiler or interpreter: typing, building lexical and dynamic scopes, verifying and deriving other properties of interest at compile time.

Uranium works in a manner reminiscent of attribute grammars: it provides support for the computation of attributes on the nodes of an AST (Abstract Syntax Tree) - or really any set of objects, for that matter. These attributes may be interdependent, and a large part of Uranium's value-added is that it lets you specify these dependencies and how to compute attribute values from the value of dependencies, then automatically manages the propagation of dependencies.

Uranium is the perfect complement to Autumn, my parsing library.

You can see an example of Uranium in action in the demo language Sigh (SemanticAnalysis.java, SemanticAnalysisTests.java), and here is a code review of that code.

Key Concepts

(Rough draft for now.)

  • An Attribute is a (node, name) pair. One of the objective of the framework is to assign values to attributes, which may be dependent on the value of other attributes.

  • The Reactor is the central workhorse of the framework: it is responsible to compute the value of attributes from a set of user-defined specifications.

  • You can supply basic attribute values directly by using Reactor#set.

  • You can also specify rules (Rule) to compute attribute values dynamically by using Reactor#rule.

  • When you have registered all your basic values and rules, you can run the reactor with Reactor#run. This automatically propagates attribute values and trigger dynamic value computation through the registered rules.

  • Rules are able to dynamically register new rules. This is sometimes necessary because the node to be used for an attribute can only be determined dynamically.

  • The initial set of values and rules are typically registered during a single-pass tree walk. I recommend using the abstract Walker class from my library norswap-utils. Its reflective implementations are particularly useful for prototyping.

  • The other values/rules that can be registered through this pass are computed by Reactor#run and are those we have called "dynamic" above.

  • The nature of semantic analysis is to check for potential errors (SemanticError). Uranium makes it easy to signal errors while not disrupting the computational flow. This lets the compiler derive as much information as possible.

  • If an error precludes the computation of an attribute value, the error becomes the attribute value. However, that value is not propagate to rules depending on the attribute. Instead, the error is propgated: new errors are generated for the attributes computed by the dependent rules, signifying they can't be computed because of the original error.

  • You can inspect the result of a Reactor run by using the AttributeTreeFormatter class. It is meant for tree-like structures and required a norswap-utils Walker to walk the tree.

Be sure to check out the javadoc pages for Rule and Reactor which contains crucial information on how to use the framework.

Your implementation of semantic analysis can be unit-tested using the UraniumTestFixture class.

uranium's People

Contributors

martin-danhier avatar norswap avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar  avatar

Watchers

 avatar  avatar  avatar  avatar  avatar

uranium's Issues

Use System.identityHashCode in Attribute#hashCode

  • Attribute#equals uses object identity to compares its fields, so its unecessary (and in fact, potentially counter-productive) to compute the full hashCode of its fields in the hashCode method.
  • A student stumbled upon an issue where the hashCode of the Attribute changed because he made a change to the underyling AST. While this is not strictly a bug (AST should be immutable), it would have been avoided by using identityHashCode.
  • This will improve performance a lot: the current implem of SighNode#hashCode uses reflection and is very inefficient.

Feature request: objectives of nested rules

Hello,

While I'm at it, I will also address one last issue i had while implementing Sigh.
It's not really a bug, more like a feature request concerning nested rules.

I often had to use nested rules in my implementations.

For example in binary expressions:

// Find the types of operands
R.rule()
    .using(node.left.attr("type"), node.right.attr("type"))
    .by(r -> {
        // Retrieve their declarations
        DeclarationNode leftDecl  = ((Type) r.get(0)).decl();
        DeclarationNode rightDecl = ((Type) r.get(1)).decl();

        // Retrieve their scopes
        R.rule()
            .using(leftDecl.attr("scope"), rightDecl.attr("scope"))
            .by(r2 -> {
                Scope leftScope  = r2.get(0);
                Scope rightScope = r2.get(1);

                // Lookup the operator in the scopes
                FunOverrideList leftOverloads  = leftScope .lookupLocalFun(node.operator.toString());
                FunOverrideList rightOverloads = rightScope.lookupLocalFun(node.operator.toString());

                // Get dependencies (types of all functions in both overload lists)
                // [...]

                // Get the types of the dependencies
                R.rule(node.attr("opType"), node.attr("type"))
                    .using(dependencies)
                    .by(r -> {
                        // Check overloads to find one that matches the types of the arguments
                        // [...]

                        // Only then, we can set the type of the binary operation
                        r.set(0, funType);
                        r.set(1, funType.getReturnType());
                    });
            });
    });

Here, the first two rules are declared without any exports because they don't set any attribute. Trying to add an attribute in the rule() call will create an error because the attributes are not actually set in the rule itself, which only creates a new rule for later.
Thus, only the third one knows it wants to compute the type of the operation.

However, if one of the dependencies of the first rules cannot be provided, for example the type of one of the operands, Uranium will report a missing attribute error because it doesn't know that the rule aims to compute that particular attribute.

For example:

return 5 + x

Would return two errors:

  1. Could not resolve 'x'
  2. missing attribute (BinaryExpression(5 + x) :: type)

The way to prevent this would be to add the binary expression in the errorFor call that reports the resolution error on x. However, this requires each node to take its parents into account when reporting an error, which is not ideal.

Would it be possible to add a new feature to Uranium that adds the ability to "tell" a rule that an attribute will be computed by a nested rule ? This way, when the type attribute is missing on the operand, it won't report an error on the operation since it knows that it requires the operand to be computed.

R.rule()
    .using(node.left.attr("type"), node.right.attr("type"))
    .objective(node.attr("type"))
    .by(r -> {
        // [...] Nested rules
    })

Please note that if you agree with the idea of such a feature, I would be glad to help implementing it.

Thank you for your time,

Martin Danhier

Add unit tests

The framework is currently not unit tested.

The de-facto testing suite for Uranium is the Sigh demo language.
Obviously, that's not ideal.

Unit tests would also help us avoid regressions.
e.g., this bug needs a regression test

A rule with equivalent dependencies is enqueued too many times

Hello again,

Thank you for merging my commit !

I think I found another bug while testing the update on my Sigh implementation.
It also occurs when two dependencies are the same (it's actually at the same place than the other one, and was introduced in the update):

// Retrieve their declarations
DeclarationNode leftDecl = left.decl();
DeclarationNode rightDecl = right.decl();

// Retrieve their scopes
R.rule()
    .using(leftDecl.attr("scope"), rightDecl.attr("scope"))
    .by(r2 -> {
        Scope leftScope = r2.get(0);
        Scope rightScope = r2.get(1);

        // [...] find operators in the scopes and set the expression type (other rules are created)
    });

According to my analysis, here is what happens:

  • In the Reactor.register function, the for loop iterates over dependencies (here two iterations)
  • For each dependency, the rule.supply function is called
  • In the rule.supply, for each dependency, the attribute is checked to avoid the problem of the previous issue. Here, both attributes match since they are equivalent, so the code inside the if is ran for each dependency.
  • Each time a non-null attribute is set, the unsatisfied counter decrements, and thus reaches 0 when storing the value in the second dependency. Since the counter is 0, the rule is enqueued.
  • But then, another iteration executes in the reactor.register function, which calls the rule.supply, which also runs two iterations to check the value. For both iterations, since the counter is 0, the rule is enqueued again.
  • At the end, the rule was thus enqueued three times instead of once.
  • This causes the rule code to be executed several time even though no redefinition was intended, and in the end triggers attribute redefinition in sub rules which crash the program with "attempting to redefine: ..."

Maybe the register function (and other functions iterating over dependencies to supply attributes, like supplyToDependents) should stop iterating over dependencies if unsatisfied is 0 ?

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.