norswap / uranium Goto Github PK

Semantic analysis framework

License: BSD 3-Clause "New" or "Revised" License

Java 92.76% Kotlin 7.24%

uranium's Introduction

Uranium

Install
Javadoc (updates daily) / Mirror (might need loading)

Uranium is a semantic analysis framework. It is used to implement the semantic analysis stage of a compiler or interpreter: typing, building lexical and dynamic scopes, verifying and deriving other properties of interest at compile time.

Uranium works in a manner reminiscent of attribute grammars: it provides support for the computation of attributes on the nodes of an AST (Abstract Syntax Tree) - or really any set of objects, for that matter. These attributes may be interdependent, and a large part of Uranium's value-added is that it lets you specify these dependencies and how to compute attribute values from the value of dependencies, then automatically manages the propagation of dependencies.

Uranium is the perfect complement to Autumn, my parsing library.

You can see an example of Uranium in action in the demo language Sigh (SemanticAnalysis.java, SemanticAnalysisTests.java), and here is a code review of that code.

Key Concepts

(Rough draft for now.)

An Attribute is a (node, name) pair. One of the objective of the framework is to assign values to attributes, which may be dependent on the value of other attributes.
The Reactor is the central workhorse of the framework: it is responsible to compute the value of attributes from a set of user-defined specifications.
You can supply basic attribute values directly by using Reactor#set.
You can also specify rules (Rule) to compute attribute values dynamically by using Reactor#rule.
When you have registered all your basic values and rules, you can run the reactor with Reactor#run. This automatically propagates attribute values and trigger dynamic value computation through the registered rules.
Rules are able to dynamically register new rules. This is sometimes necessary because the node to be used for an attribute can only be determined dynamically.
The initial set of values and rules are typically registered during a single-pass tree walk. I recommend using the abstract Walker class from my library norswap-utils. Its reflective implementations are particularly useful for prototyping.
The other values/rules that can be registered through this pass are computed by Reactor#run and are those we have called "dynamic" above.
The nature of semantic analysis is to check for potential errors (SemanticError). Uranium makes it easy to signal errors while not disrupting the computational flow. This lets the compiler derive as much information as possible.
If an error precludes the computation of an attribute value, the error becomes the attribute value. However, that value is not propagate to rules depending on the attribute. Instead, the error is propgated: new errors are generated for the attributes computed by the dependent rules, signifying they can't be computed because of the original error.
You can inspect the result of a Reactor run by using the AttributeTreeFormatter class. It is meant for tree-like structures and required a norswap-utils Walker to walk the tree.

Be sure to check out the javadoc pages for Rule and Reactor which contains crucial information on how to use the framework.

Your implementation of semantic analysis can be unit-tested using the UraniumTestFixture class.

uranium's People

Contributors

Stargazers

Watchers

Forkers

mounzera yoyomanhihi marielonfils dpcodebiz

uranium's Issues

Use System.identityHashCode in Attribute#hashCode

Attribute#equals uses object identity to compares its fields, so its unecessary (and in fact, potentially counter-productive) to compute the full hashCode of its fields in the hashCode method.
A student stumbled upon an issue where the hashCode of the Attribute changed because he made a change to the underyling AST. While this is not strictly a bug (AST should be immutable), it would have been avoided by using identityHashCode.
This will improve performance a lot: the current implem of SighNode#hashCode uses reflection and is very inefficient.

Feature request: objectives of nested rules

Hello,

While I'm at it, I will also address one last issue i had while implementing Sigh.
It's not really a bug, more like a feature request concerning nested rules.

I often had to use nested rules in my implementations.

For example in binary expressions:

// Find the types of operands
R.rule()
    .using(node.left.attr("type"), node.right.attr("type"))
    .by(r -> {
        // Retrieve their declarations
        DeclarationNode leftDecl  = ((Type) r.get(0)).decl();
        DeclarationNode rightDecl = ((Type) r.get(1)).decl();

        // Retrieve their scopes
        R.rule()
            .using(leftDecl.attr("scope"), rightDecl.attr("scope"))
            .by(r2 -> {
                Scope leftScope  = r2.get(0);
                Scope rightScope = r2.get(1);

                // Lookup the operator in the scopes
                FunOverrideList leftOverloads  = leftScope .lookupLocalFun(node.operator.toString());
                FunOverrideList rightOverloads = rightScope.lookupLocalFun(node.operator.toString());

                // Get dependencies (types of all functions in both overload lists)
                // [...]

                // Get the types of the dependencies
                R.rule(node.attr("opType"), node.attr("type"))
                    .using(dependencies)
                    .by(r -> {
                        // Check overloads to find one that matches the types of the arguments
                        // [...]

                        // Only then, we can set the type of the binary operation
                        r.set(0, funType);
                        r.set(1, funType.getReturnType());
                    });
            });
    });

Here, the first two rules are declared without any exports because they don't set any attribute. Trying to add an attribute in the rule() call will create an error because the attributes are not actually set in the rule itself, which only creates a new rule for later.
Thus, only the third one knows it wants to compute the type of the operation.

However, if one of the dependencies of the first rules cannot be provided, for example the type of one of the operands, Uranium will report a missing attribute error because it doesn't know that the rule aims to compute that particular attribute.

For example:

return 5 + x

Would return two errors:

Could not resolve 'x'
missing attribute (BinaryExpression(5 + x) :: type)

The way to prevent this would be to add the binary expression in the errorFor call that reports the resolution error on x. However, this requires each node to take its parents into account when reporting an error, which is not ideal.

Would it be possible to add a new feature to Uranium that adds the ability to "tell" a rule that an attribute will be computed by a nested rule ? This way, when the type attribute is missing on the operand, it won't report an error on the operation since it knows that it requires the operand to be computed.

R.rule()
    .using(node.left.attr("type"), node.right.attr("type"))
    .objective(node.attr("type"))
    .by(r -> {
        // [...] Nested rules
    })

Please note that if you agree with the idea of such a feature, I would be glad to help implementing it.

Thank you for your time,

Martin Danhier

Add unit tests

The framework is currently not unit tested.

The de-facto testing suite for Uranium is the Sigh demo language.
Obviously, that's not ideal.

Unit tests would also help us avoid regressions.
e.g., this bug needs a regression test

A rule with equivalent dependencies is enqueued too many times

Hello again,

Thank you for merging my commit !

I think I found another bug while testing the update on my Sigh implementation.
It also occurs when two dependencies are the same (it's actually at the same place than the other one, and was introduced in the update):

// Retrieve their declarations
DeclarationNode leftDecl = left.decl();
DeclarationNode rightDecl = right.decl();

// Retrieve their scopes
R.rule()
    .using(leftDecl.attr("scope"), rightDecl.attr("scope"))
    .by(r2 -> {
        Scope leftScope = r2.get(0);
        Scope rightScope = r2.get(1);

        // [...] find operators in the scopes and set the expression type (other rules are created)
    });

According to my analysis, here is what happens:

In the Reactor.register function, the for loop iterates over dependencies (here two iterations)
For each dependency, the rule.supply function is called
In the rule.supply, for each dependency, the attribute is checked to avoid the problem of the previous issue. Here, both attributes match since they are equivalent, so the code inside the if is ran for each dependency.
Each time a non-null attribute is set, the unsatisfied counter decrements, and thus reaches 0 when storing the value in the second dependency. Since the counter is 0, the rule is enqueued.
But then, another iteration executes in the reactor.register function, which calls the rule.supply, which also runs two iterations to check the value. For both iterations, since the counter is 0, the rule is enqueued again.
At the end, the rule was thus enqueued three times instead of once.
This causes the rule code to be executed several time even though no redefinition was intended, and in the end triggers attribute redefinition in sub rules which crash the program with "attempting to redefine: ..."

Maybe the register function (and other functions iterating over dependencies to supply attributes, like supplyToDependents) should stop iterating over dependencies if unsatisfied is 0 ?

norswap / uranium Goto Github PK

uranium's Introduction

Uranium

Key Concepts

uranium's People

Contributors

Stargazers

Watchers

Forkers

uranium's Issues

Use System.identityHashCode in Attribute#hashCode

Feature request: objectives of nested rules

Add unit tests

A rule with equivalent dependencies is enqueued too many times

Recommend Projects

React

Vue.js

Typescript

TensorFlow

Django

Laravel

D3

Recommend Topics

javascript

web

server

Machine learning

Visualization

Game

Recommend Org

Facebook

Microsoft

Google

Alibaba

D3

Tencent