Giter Club home page Giter Club logo

libscript's People

Contributors

strandfield avatar

Stargazers

 avatar  avatar  avatar  avatar  avatar

Watchers

 avatar

libscript's Issues

Static local variables

Introduction

Static local variables are variables declared inside blocks with the static keyword.

void foo()
{
  static int n = 0;
  int i = 0;
  ++n;
  ++i;
}

Unlike other variables, static variables are initialized only once when the declaration is reached for the first time (except for a few particular cases like zero or constant initialization); the initialization part of the declaration is skipped if control reaches the block again. This means that, in the body of a function, the variable keep its state between multiple calls to the function. If the function in the above example is called twice, the value of n after the second execution will be 2 while the value of i at the end of the function body will always be 1.

Unlike global variables, static variables keep the name local (and thus limit access) and are less concerned by initialization order problems (have a look at C++ Static Initialization Order Fiasco). See also this relevant paragraph on cppreference.com.

Current status

Currently, libscript does not support static local variables. The keyword static was even silently ignored until recently : commit 0a06efe now makes the library print an error message when they are used; i.e. the following code

int counter()      
{
  static int n = 0;
  return ++n; 
} 

produces the following error message.

[error](C142) Not implemented : Static variables not implemented yet

Implementing the feature

A static variable can easily be implemented using two variables:

  • the variable itself
  • a boolean flag indicating whether the variable was initialized

Both of these need to be stored in a persistent way, and the flag must be initialized at compilation-time to false.

I think the library already has all the tools necessary to implement the feature. We can simply use the flag to either evaluate the initialization and push the result to the stack or to only push the existing value to the stack. Upon leaving the block, the value is popped from the stack but not destroyed.

A first implementation could use this technique, but in the future we may need a dedicated program::Statement for this in order not to lose too much information about the semantic of the program.

A tests has already been written for the feature and just needs to be uncommented.

Redesign default function arguments

Introduction

Let's start by a brief history of default arguments handling in the library.

In the very first implementation of default arguments, the functions themselves were responsible for checking they were called with the correct number of arguments; and if some were missing, they had to construct them using the default values. This was quickly deemed unacceptable as it implied that functions with default arguments were slower than other functions because they had to check if some arguments were missing, construct the missing ones and destruct them on returning.

In the second implementation, the burden was moved to the caller function. It is now the caller responsability to ensure that it can provides all arguments. The called function (i.e. the callee) now holds a list of default arguments that can be used by the caller if needed. This means that the callee no longer has some parameters marked as 'optional': a function that declares n parameters is always called with n arguments. The processing of default arguments was moved from run time to compile time !

But things are not good enough yet...

The problems

As of @a796ad9, default function arguments are one of the very few things that cannot be set in Function at construction time through the FunctionBuilder class.

This means that they can only be added after the Function object is created. And since we cannot know in advance if a function is going to have default arguments or not, all functions must provide the mechanism to add default arguments. Currently this translates into a DefaultArgumentList data member in each FunctionImpl.

This is still not acceptable because:

  • behavior may change at runtime depending on when the default arguments are added
  • functions that cannot have default arguments (e.g. operators, conversions) pay the price for this feature.

Now this is not as bad as it may seem.
First, default arguments are very likely to be declared right after the function is created and thus before it is ever used.
Second, the DefaultArgumentList has been designed to be the size of a std::unique_ptr, so the space overhead is small.

Regardless, the design and implementations are still immature and need to be improved.

There are two main problems currently:

  1. To limit the space overhead, no std::vector is created if there are no default arguments; and in such case Function::defaultArguments() throws an exception. (This could be easily fix without incurring any space overhead)

  2. The order in which the default arguments are stored hasn't been well justified.
    Currently if we have

int foo(int a, int b = 0, char c = 'c'); 

the default arguments are stored as [0, 'c'].
But if you call Function::addDefaultArgument(), a push_back() is made, meaning that we should write:

Function foo = ...;
foo.addDefaultArgument(engine.newInt(0));
foo.addDefaultArgument(engine.newChar('c'));

Now this works because no type checking is performed yet.
But technically we are in an invalid state before the second call to addDefaultArgument().

Since all this behavior are not documented yet, the feature is barely usable.

What needs to be done

  1. It would be interesting to have some data on how often default arguments are used. This would help choosing between time or space overhead.

  2. Ordering of default arguments should be clearly defined and rationales should be documented.

  3. Default arguments must be set at construction time, not after. This will reinforce the idea that Function are immutable objects and improve consistency; and if point 1) tells us it is interesting to distinguish functions that have default args and those that haven't, well such thing should naturally be done before the function is constructed.

  4. Regardless of the results of 1), experiment removing the feature for operators and conversion functions (and thus the space overhead).

Ideally, decisions should be backed up with benchmarks (although this may be hard to notice a difference) and any new code should be documented.

Make the overload resolution algorithms more generic

Introduction

The OverloadResolution is a central piece of the library. Its job is, given a set of candidate functions and a list of argument types, to find the function that matches the arguments best. Note that such function may not exist, or several functions may be equally good. In such case we say that overload resolution fails, and the program cannot compile (in the second case we also say that the call is ambiguous). The overload resolution mechanism implemented in the library is really similar to its C++ counterpart; the latter is described thoroughly in the C++ standard and a very good summary is available on cppreference.com (see Overload Resolution).

The overload resolution algorithms have two primary inputs:

  • a set of functions
  • an ordered list of types

While the first bullet is always passed as a std::vector<Function>, the second one may currently take several forms:

  • a std::vector<Type> (obviously)
  • a std::vector<Value>
  • a std:vector<std::shared_ptr<program::Expression>>

Why 3 different kinds of inputs ? Because std::vector<Type> is not sufficient ! Consider the following expression:

f("Hello", {1, 2});

During compilation, the compiler will process the arguments, perform name resolution to get a list of f, and perform overload resolution to decide which f to call. The problem is that while the first argument has a type (which is String), the second one does not (it is a form of list-initialization). It will be used to initialize a function parameter. For example, given

void f(const String & str, const Point & pt) { ... }

The second argument will be used to initialize a Point. But we can only know that after overload resolution completes; and in such case we cannot provide an input type to the overload resolution algorithms.

But there is more to that... In fact, OverloadResolution may take a third input: an implicit object type. Let's see why it's needed.

class Foo
{
  /* ... some code ... */
  
  void bar()
  {
    g(1, 2, 3);
  }
};

In this code fragment, the call to g() in bar() will be processed as follow:

  • processing the arguments
  • performing name resolution to get a list of g
  • overload resolution

The key part here is that name resolution may return a list containing:

  • free functions
  • static member functions
  • non-static member functions

In the two first cases, the actual argument list will be [1, 2, 3], but in the last one it will be [this, 1, 2, 3]: that is, the object itself is implictly inserted in the argument list. We cannot add this to the argument list because we cannot know in advance if it will be needed, but we must provide it in case it is needed; hence the third input for OverloadResolution.

Current implementation details

Currently the implicit object argument can only be provided as a std::shared_ptr<Expression>, the set of functions is always a std::vector<Function>, and the argument list is provided in any of the three forms mentionned earlier.

There are two main algorithms: one with the implicit object and one without.

The variety of argument list is handled by having an ORInputs class that can handle all three forms. This class uses a switch each time it is used to compute its result depending on the actual type of the input.

Problems

I find the current solution inelegant, and it has several limitations:
1) It currently does not handle all combinations of inputs (e.g. the implicit object must be a std::shared_ptr<program::Expression>).
2) Every time we need to access the argument list, the ORInputs performs a switch (which will always give the same result during the entire session).
3) Because necessary informations are exposed through ORInputs, the various treatments for each kind of input are interleaved in the switch of ORInputs's methods; which make things harder to maintain.
4) It requires a special case to handle overload resolution of constructors (partly because of 1)).

What could be done

I would like to switch to a template-based implementation. There would still be two implementations of the algorithm (one with the implicit object and the other without), but the inputs would be parameterized.

Instead of being exposed through ORInputs, the necessary informations would be provided by free functions. During template instantiation, overload resolution would pick the correct functions depending on the input types. This would effectively replace the run-time switch-based approach by a compile-time overload-resolution-based approach.

The ORInputs would still be needed though, to store the different kind of inputs, but it would no longer perform any processing.

There might be some performance benefits with this template-based approach; though it is very likely to be hidden by the more time-consuming operations performed by the algorithms. Having some benchmarks would be interesting.

Anyway, having something that's compile-time and uses templates is more C++ish ๐Ÿ˜Š. Besides, using overload-resolution to enhance the OverloadResolution class is satisfying ๐Ÿ˜‹.

The project doesn't build

I tried to build the project but i get

error: 'numeric_limits' is not a member of 'std'
  133 |   : index(std::numeric_limits<size_t>::max())

Specialize function prototypes

Introduction

The Prototype class represents a function prototype, or signature; that is, its list of parameters and its return type.

Currently this class is not derived and handles all cases that can occur. Its data members are the following.

Type mReturnType;
int mCapacity;
Type *mBegin;
Type *mEnd;
Type mParams[4];

Basically a prototype holds a return type and two pointers indicating the beginning and end of a sequence of parameter types.

The Prototype class is not used much within the library, but every Function holds a prototype. To avoid many dynamic memory allocations, the class also implements a "small-prototypes" optimisation that allows it to store up to 4 parameters without making any additional memory allocation. This number was choosen empirically without much data to support the decision.

The problems

This unique class that is supposed to fit them all is unlikely to be optimal for the variety of cases:

  • binary operators and conversion functions always have 2 parameters
  • destructors always have a single parameter (the implicit object type)
  • default, copy and move constructors have respectively 1 or 2 parameters
  • other functions can have zero, one or many more parameters

After using libscript for some time in the Yasl project, I was able to gather some data (see commit 4b9a3c6). The current statistics show that functions have on average 1.73 parameters (including the implicit object parameter this). All this means that space is often wasted.

Besides, the Prototype class is designed to allow adding new parameters (it acts roughly as a std::vector) but it is mostly accessed through const &.
After a function has been constructed, its prototype cannot change, and so the capacity field becomes useless.
It would be better if prototypes could implement modification functionnalities only before the Function is created (i.e. in the FunctionBuilder class).

Things to do

1) Redefine Prototype to be a non-owning non-polymorphic class holding only a return type and a begin and end.

2) Introduce derived types that implement actual prototypes, e.g. OperatorPrototype, CastPrototype, ...

3) Experiment with mutable and non-mutable prototypes that would be used respectively during construction and after construction is completed.

4) "Small-prototypes" optimisation are okay, but gathering some more data would be interesting to choose an optimal size for these.

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    ๐Ÿ–– Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. ๐Ÿ“Š๐Ÿ“ˆ๐ŸŽ‰

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google โค๏ธ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.