strandfield / libscript Goto Github PK
View Code? Open in Web Editor NEWA C++ scripting library
License: MIT License
A C++ scripting library
License: MIT License
Introduction
Static local variables are variables declared inside blocks with the static
keyword.
void foo()
{
static int n = 0;
int i = 0;
++n;
++i;
}
Unlike other variables, static variables are initialized only once when the declaration is reached for the first time (except for a few particular cases like zero or constant initialization); the initialization part of the declaration is skipped if control reaches the block again. This means that, in the body of a function, the variable keep its state between multiple calls to the function. If the function in the above example is called twice, the value of n
after the second execution will be 2
while the value of i
at the end of the function body will always be 1
.
Unlike global variables, static variables keep the name local (and thus limit access) and are less concerned by initialization order problems (have a look at C++ Static Initialization Order Fiasco). See also this relevant paragraph on cppreference.com.
Current status
Currently, libscript
does not support static local variables. The keyword static
was even silently ignored until recently : commit 0a06efe now makes the library print an error message when they are used; i.e. the following code
int counter()
{
static int n = 0;
return ++n;
}
produces the following error message.
[error](C142) Not implemented : Static variables not implemented yet
Implementing the feature
A static variable can easily be implemented using two variables:
Both of these need to be stored in a persistent way, and the flag must be initialized at compilation-time to false
.
I think the library already has all the tools necessary to implement the feature. We can simply use the flag to either evaluate the initialization and push the result to the stack or to only push the existing value to the stack. Upon leaving the block, the value is popped from the stack but not destroyed.
A first implementation could use this technique, but in the future we may need a dedicated program::Statement
for this in order not to lose too much information about the semantic of the program.
A tests has already been written for the feature and just needs to be uncommented.
Introduction
Let's start by a brief history of default arguments handling in the library.
In the very first implementation of default arguments, the functions themselves were responsible for checking they were called with the correct number of arguments; and if some were missing, they had to construct them using the default values. This was quickly deemed unacceptable as it implied that functions with default arguments were slower than other functions because they had to check if some arguments were missing, construct the missing ones and destruct them on returning.
In the second implementation, the burden was moved to the caller function. It is now the caller responsability to ensure that it can provides all arguments. The called function (i.e. the callee) now holds a list of default arguments that can be used by the caller if needed. This means that the callee no longer has some parameters marked as 'optional': a function that declares n
parameters is always called with n
arguments. The processing of default arguments was moved from run time to compile time !
But things are not good enough yet...
The problems
As of @a796ad9, default function arguments are one of the very few things that cannot be set in Function
at construction time through the FunctionBuilder
class.
This means that they can only be added after the Function
object is created. And since we cannot know in advance if a function is going to have default arguments or not, all functions must provide the mechanism to add default arguments. Currently this translates into a DefaultArgumentList
data member in each FunctionImpl
.
This is still not acceptable because:
Now this is not as bad as it may seem.
First, default arguments are very likely to be declared right after the function is created and thus before it is ever used.
Second, the DefaultArgumentList
has been designed to be the size of a std::unique_ptr
, so the space overhead is small.
Regardless, the design and implementations are still immature and need to be improved.
There are two main problems currently:
To limit the space overhead, no std::vector
is created if there are no default arguments; and in such case Function::defaultArguments()
throws an exception. (This could be easily fix without incurring any space overhead)
The order in which the default arguments are stored hasn't been well justified.
Currently if we have
int foo(int a, int b = 0, char c = 'c');
the default arguments are stored as [0, 'c']
.
But if you call Function::addDefaultArgument()
, a push_back()
is made, meaning that we should write:
Function foo = ...;
foo.addDefaultArgument(engine.newInt(0));
foo.addDefaultArgument(engine.newChar('c'));
Now this works because no type checking is performed yet.
But technically we are in an invalid state before the second call to addDefaultArgument()
.
Since all this behavior are not documented yet, the feature is barely usable.
What needs to be done
It would be interesting to have some data on how often default arguments are used. This would help choosing between time or space overhead.
Ordering of default arguments should be clearly defined and rationales should be documented.
Default arguments must be set at construction time, not after. This will reinforce the idea that Function
are immutable objects and improve consistency; and if point 1) tells us it is interesting to distinguish functions that have default args and those that haven't, well such thing should naturally be done before the function is constructed.
Regardless of the results of 1), experiment removing the feature for operators and conversion functions (and thus the space overhead).
Ideally, decisions should be backed up with benchmarks (although this may be hard to notice a difference) and any new code should be documented.
Introduction
The OverloadResolution
is a central piece of the library. Its job is, given a set of candidate functions and a list of argument types, to find the function that matches the arguments best. Note that such function may not exist, or several functions may be equally good. In such case we say that overload resolution fails, and the program cannot compile (in the second case we also say that the call is ambiguous). The overload resolution mechanism implemented in the library is really similar to its C++ counterpart; the latter is described thoroughly in the C++ standard and a very good summary is available on cppreference.com (see Overload Resolution).
The overload resolution algorithms have two primary inputs:
While the first bullet is always passed as a std::vector<Function>
, the second one may currently take several forms:
std::vector<Type>
(obviously)std::vector<Value>
std:vector<std::shared_ptr<program::Expression>>
Why 3 different kinds of inputs ? Because std::vector<Type>
is not sufficient ! Consider the following expression:
f("Hello", {1, 2});
During compilation, the compiler will process the arguments, perform name resolution to get a list of f
, and perform overload resolution to decide which f
to call. The problem is that while the first argument has a type (which is String
), the second one does not (it is a form of list-initialization). It will be used to initialize a function parameter. For example, given
void f(const String & str, const Point & pt) { ... }
The second argument will be used to initialize a Point
. But we can only know that after overload resolution completes; and in such case we cannot provide an input type to the overload resolution algorithms.
But there is more to that... In fact, OverloadResolution
may take a third input: an implicit object type. Let's see why it's needed.
class Foo
{
/* ... some code ... */
void bar()
{
g(1, 2, 3);
}
};
In this code fragment, the call to g()
in bar()
will be processed as follow:
g
The key part here is that name resolution may return a list containing:
In the two first cases, the actual argument list will be [1, 2, 3]
, but in the last one it will be [this, 1, 2, 3]
: that is, the object itself is implictly inserted in the argument list. We cannot add this
to the argument list because we cannot know in advance if it will be needed, but we must provide it in case it is needed; hence the third input for OverloadResolution
.
Current implementation details
Currently the implicit object argument can only be provided as a std::shared_ptr<Expression>
, the set of functions is always a std::vector<Function>
, and the argument list is provided in any of the three forms mentionned earlier.
There are two main algorithms: one with the implicit object and one without.
The variety of argument list is handled by having an ORInputs
class that can handle all three forms. This class uses a switch
each time it is used to compute its result depending on the actual type of the input.
Problems
I find the current solution inelegant, and it has several limitations:
1) It currently does not handle all combinations of inputs (e.g. the implicit object must be a std::shared_ptr<program::Expression>
).
2) Every time we need to access the argument list, the ORInputs
performs a switch
(which will always give the same result during the entire session).
3) Because necessary informations are exposed through ORInputs
, the various treatments for each kind of input are interleaved in the switch
of ORInputs
's methods; which make things harder to maintain.
4) It requires a special case to handle overload resolution of constructors (partly because of 1)).
What could be done
I would like to switch to a template-based implementation. There would still be two implementations of the algorithm (one with the implicit object and the other without), but the inputs would be parameterized.
Instead of being exposed through ORInputs
, the necessary informations would be provided by free functions. During template instantiation, overload resolution would pick the correct functions depending on the input types. This would effectively replace the run-time switch
-based approach by a compile-time overload-resolution-based approach.
The ORInputs
would still be needed though, to store the different kind of inputs, but it would no longer perform any processing.
There might be some performance benefits with this template-based approach; though it is very likely to be hidden by the more time-consuming operations performed by the algorithms. Having some benchmarks would be interesting.
Anyway, having something that's compile-time and uses templates is more C++ish ๐. Besides, using overload-resolution to enhance the OverloadResolution
class is satisfying ๐.
I tried to build the project but i get
error: 'numeric_limits' is not a member of 'std'
133 | : index(std::numeric_limits<size_t>::max())
Introduction
The Prototype
class represents a function prototype, or signature; that is, its list of parameters and its return type.
Currently this class is not derived and handles all cases that can occur. Its data members are the following.
Type mReturnType;
int mCapacity;
Type *mBegin;
Type *mEnd;
Type mParams[4];
Basically a prototype holds a return type and two pointers indicating the beginning and end of a sequence of parameter types.
The Prototype
class is not used much within the library, but every Function
holds a prototype. To avoid many dynamic memory allocations, the class also implements a "small-prototypes" optimisation that allows it to store up to 4 parameters without making any additional memory allocation. This number was choosen empirically without much data to support the decision.
The problems
This unique class that is supposed to fit them all is unlikely to be optimal for the variety of cases:
After using libscript
for some time in the Yasl project, I was able to gather some data (see commit 4b9a3c6). The current statistics show that functions have on average 1.73 parameters (including the implicit object parameter this
). All this means that space is often wasted.
Besides, the Prototype
class is designed to allow adding new parameters (it acts roughly as a std::vector
) but it is mostly accessed through const &
.
After a function has been constructed, its prototype cannot change, and so the capacity
field becomes useless.
It would be better if prototypes could implement modification functionnalities only before the Function
is created (i.e. in the FunctionBuilder
class).
Things to do
1) Redefine Prototype
to be a non-owning non-polymorphic class holding only a return type and a begin
and end
.
2) Introduce derived types that implement actual prototypes, e.g. OperatorPrototype
, CastPrototype
, ...
3) Experiment with mutable and non-mutable prototypes that would be used respectively during construction and after construction is completed.
4) "Small-prototypes" optimisation are okay, but gathering some more data would be interesting to choose an optimal size for these.
A declarative, efficient, and flexible JavaScript library for building user interfaces.
๐ Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
An Open Source Machine Learning Framework for Everyone
The Web framework for perfectionists with deadlines.
A PHP framework for web artisans
Bring data to life with SVG, Canvas and HTML. ๐๐๐
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
Some thing interesting about web. New door for the world.
A server is a program made to process requests and deliver data to clients.
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
Some thing interesting about visualization, use data art
Some thing interesting about game, make everyone happy.
We are working to build community through open source technology. NB: members must have two-factor auth.
Open source projects and samples from Microsoft.
Google โค๏ธ Open Source for everyone.
Alibaba Open Source for everyone
Data-Driven Documents codes.
China tencent open source team.