Comments (25)
Maybe that readme is a little sparse on information when it comes to D#. And you're right, there's hardly any documentation on D# itself.
Look, I initially designed D# as a "test" of Flame's capabilities. I needed a front-end to ensure that Flame can handle real programs, so I built one. Initially, I started off with a subset of C#, because that was Flame's original implementation language. That subset slowly expanded to include most C# features, with minor syntactical differences (which broke syntactic backward compatibility from the get-go) and various flavors of syntactic sugar to make my life easier, such as
public this(set int X, set int Y);
instead of C#'s
public Vector(int X, int Y)
{
this = default(Vector);
this.X = X;
this.Y = Y;
}
Eventually, I decided that D# was "ready" for its trial by fire: to become the implementation language of the core Flame libraries.
Then, as I was rewriting those libraries in D#, I noticed some things in C# that bothered me. So I decided to break semantic backward compatibility, as well. Here are some of the highlights:
- Static classes can't implement interfaces, which is an annoyance when implementing compiler passes: most passes fit the bill of a `static class`, but they must implement an interface. The typical workaround involves writing a lot of unsatisfying boilerplate code. So I figured I could make all `static class` entities singletons, and that's worked out well so far. As a bonus, I can now store references to singleton instances in variables. For example: `var stmt = EmptyStatement;`
- `csc` and `mcs` always seem to get their value type total initialization analysis wrong whenever I use auto-properties. So the D# compiler avoids the issue entirely by automatically inserting a `this = default(T);` in value type constructors for you.
- I often found myself iterating over multiple collections at the same time, so I created "multiple `foreach`" statements, which function more or less like a "zip" followed by a `foreach`. Additionally, collection elements of `foreach` statements on arrays are mutable, so ArraysExample.ds is a thing.
- The `const` attribute can be applied to methods and constructors to mark them as pure. I'd like to have the compiler verify function purity in the future, but for now it's more of a hint to the optimizer.
- Delegates have a well-defined type, so `var f = DoSomething;` (where `DoSomething` is a function) is perfectly legal in D#. The back-end is responsible for converting function types, such as `int(int, int)`, to `Func<int, int, int>`.
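For readers more familiar with C#, here is a rough sketch of what the "multiple `foreach`" and mutable array element features stand in for. It is plain C#, not D#; the class and method names are made up for illustration, and it assumes the multiple `foreach` stops at the shorter collection the way `Zip` does, which the comment above does not spell out.

```csharp
using System;
using System.Linq;

public static class MultiForeachSketch
{
    // Plain C# stand-in for iterating two collections in lockstep:
    // "zip" the sequences, then foreach over the pairs.
    // Enumerable.Zip stops as soon as the shorter sequence ends.
    public static int Dot(int[] xs, int[] ys)
    {
        int sum = 0;
        foreach (var pair in xs.Zip(ys, (x, y) => new { X = x, Y = y }))
        {
            sum += pair.X * pair.Y;
        }
        return sum;
    }

    // C# foreach variables are read-only, so updating array elements
    // in place needs an explicit index loop.
    public static void DoubleAll(int[] values)
    {
        for (int i = 0; i < values.Length; i++)
        {
            values[i] *= 2;
        }
    }
}
```

Calling `MultiForeachSketch.Dot(new[] { 1, 2, 3 }, new[] { 4, 5, 6 })` pairs the elements up position by position, which is the boilerplate the D# feature is meant to hide.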
Now, I don't claim that D# is the programming language of the future. It's not revolutionary: I haven't implemented any shiny new programming paradigms like EC#, Boo or Nemerle have. D# is the result of a number of incremental tweaks to C#. Frankly, it's just a useful tool that I made and still use every day. I'm just a hobbyist who built their own programming language, and I'm really not aiming for world domination with D# here.
My main project is Flame, and I have a Flame front-end for D#. Ergo, I program in D#. But I'm not fundamentally opposed to switching to some other programming language, say EC#, as the main implementation language for Flame, as long as I can bootstrap the core Flame libraries with a Flame-based compiler for that language (such as `fecs`), and it gives me the tools I need, like singleton objects.
Most, if not all, of D#'s distinguishing features compared to .NET languages such as Boo and Nemerle can be attributed to its use of Flame as a back-end. These include:
- Speed and program size. Flame's `-O3` optimization level typically improves both execution times and the size of the output program. It constructs an SSA control-flow graph to implement various intraprocedural optimizations, and uses inlining and scalar replacement of aggregates to reduce the overhead of abstraction mechanisms. Whenever a function becomes unreachable due to inlining (or some other optimization), it is removed from the executable, thus reducing code size. (I'm actually working on implementing exception handling support for `-O3` right now.)
- Static linking. Flame can compile library projects down to IR, and then statically link the IR files with an executable project. The result is a single `*.exe` that contains the minimal set of required entities, and can be optimized interprocedurally.
- Multiple target platforms. Architecturally, there's nothing tying D# to .NET. The CLR back-end may be the only reliable back-end at this time, but there are a number of experimental Flame back-ends, which could mature and become stable. I'm mostly thinking of `-platform wasm`.
I understand that these pros are secondary concerns: picking a productive language should be the primary concern. But they are relevant to compiler writers, and the compiler framework is my main project.
from flame.
Could you give an example of "total initialization analysis wrong" with autoproperties?
It strikes me that D# could be backward compatible with C#, or nearly so. How close is your D# (Flame) compiler to running arbitrary C# code?
I'd like to make a proposal: combine D# and EC# into one language, with EC# and LeMP as the front-end, and Flame as the backend.
Currently EC# compiles to C# - I don't know if Flame's architecture could allow that, but I can imagine a "frankenstein" might still be useful, where Flame is used in the LeMP single-file generator for the sole purpose of semantic analysis, so that semantic errors can be reported in the original source file. I wanted to use Roslyn for that purpose, so that the C# error messages would match the EC# error messages, but ... whatever. Anything that works is good.
Whether it's worthwhile to have a version of EC# that cannot compile to C# - or even a version that does compile to C# but does not preserve the high-level code structure - I'm not sure; the fact that users aren't "married" to EC# is one of its main selling points.
In the long run, though, it's attractive to be able to compile EC# to Wasm and C++. It might even be useful in the short run: I could adapt my LLLPG-based LES parser so it can be used by C++ programs, assuming the C++ back-end produces predictable and "consumable" code (i.e. code that is easily used by existing C++ programs).
Obviously, you're not seeking world domination, but you'd like some users, wouldn't you?
Have you ever tried to do Visual Studio integration? To do a D# project type?
from flame.
Sorry, did I say "sole purpose of semantic analysis"? That was wrong. A single-file generator can't do complete semantic analysis because it doesn't have access to the whole program. But it could offer limited error detection and limited code completion / intellisense, which Flame should be able to provide.
from flame.
About that total initialization analysis: I vaguely recall this not compiling under good old `csc`, because the compiler fails to understand that X and Y are backed by fields, and should thus count toward total initialization. Turning X and Y into fields makes the program compile, and inserting a `this = default(Vector);` works as well.
public struct Vector
{
public Vector(int x, int y)
{
X = x;
Y = y;
}
public int X { get; private set; }
public int Y { get; private set; }
}
EDIT: this example seems to compile fine under `mcs`, though.
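For reference, a minimal sketch of the usual plain-C# workaround. Chaining the constructor to `this()` zero-initializes the whole struct before the property setters run, which has the same effect as the `this = default(Vector);` mentioned above. As far as I know, compilers before C# 6 rejected the unchained version with a "backing field must be fully assigned" error, while newer compilers accept both forms; treat that version detail as an assumption rather than something established in this thread.

```csharp
using System;

public struct Vector
{
    // ': this()' runs the implicit parameterless constructor first, which
    // zero-initializes every field, including the compiler-generated
    // backing fields of the auto-properties below.
    public Vector(int x, int y) : this()
    {
        X = x;
        Y = y;
    }

    public int X { get; private set; }
    public int Y { get; private set; }
}
```

With this in place, `new Vector(3, 4)` yields a value whose `X` is 3 and whose `Y` is 4, with no manual field declarations needed.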
I'm open to the idea of building a front-end for a unified D#/EC#. Right now, Flame doesn't offer built-in support for `async`/`await`, and the D# front-end doesn't do type parameter inference/lambda type inference. These are hard-to-implement features, but there are no architectural roadblocks here that I can see.
I've ported my fair share of C# classes to D#. In my experience, porting a program from D# to C# - or vice-versa - can be done with a few minor tweaks. Perhaps a small number of EC# macros can be used to "lower" D# to EC# directly?
You can already get errors and warnings without writing the output to disk with the `-fsyntax-only` option. So `fecs file.ecs -fsyntax-only -Weverything` gets you the diagnostics you want if `file.ecs` is the only input file. I must admit that I haven't tried doing code completion or Visual Studio plugins yet - and given that I don't use Visual Studio anymore, my interest in the latter has waned somewhat. Would it be entirely unreasonable to just pass the other source files (they're C#, and therefore we can handle them, right?) to the compiler and then have it analyze everything?
Really, I don't know if Flame is suitable as a code analysis tool for IDEs. Firstly, I don't know about performance: Flame was designed as a compiler framework, not as a high-performance semantic analysis framework. On the other hand, it does try to cache as much information as possible, so perhaps it really is fast enough once everything has been analyzed. But my greatest concern is that Flame doesn't expect entities to go away. Frankly, I haven't the faintest idea of how to elegantly handle the scenario where the programmer deletes a method or a type. Properly writing a tool that analyzes your code as you type sounds like a lot of work, and I'm mostly interested in compiling things.
So my answer boils down to this:
- if you want to use Flame to compile something, then that's super easy and all you'll need is a front-end.
- if you want to use Flame in a command-line tool that figures out what's wrong with your code, then that's fairly easy. A front-end and some driver program code should suffice.
- if you want to use Flame as a code analysis engine in an IDE, then all bets are off. It'd probably take a fair amount of work to get this working properly.
It's been a while since I've used the C++ back-end, so bitrot may have set in already, but I designed the C++ output to be as close to hand-written code as possible. The generated code should be fairly straightforward to use. Here be dragons, however, as you'll only have a fairly limited set of C++ standard library plugs at your disposal (fortunately, you can always just write your own), instead of the .NET framework's rich set of assemblies.
from flame.
I'm sorry if my previous response scared you off. I really didn't mean to. Re-reading it now, it dawns upon me that my response was actually fairly ambiguous. I'll try to succinctly re-phrase it point-by-point here:
- I'd like to unify D# and EC#.
- Creating a Flame front-end for a unified D#/EC# should be easy for the feature set that is already supported by the D# front-end, and doable for "fancy" C# features such as `async`/`await`, lambda type inference, and generic parameter inference.
- Flame decouples front-ends from back-ends, so a D#/EC# compiler will be able to produce CLR assemblies, WebAssembly and C++ output.
- Flame-based compilers can be told to generate a bunch of useful errors and warnings, which can easily be intercepted. Displaying these diagnostics in a GUI is just as feasible as printing them to the command-line.
- Building a linter/static analysis tool - like Clang's static analyzer - for D#/EC# should be fairly easy. Flame's IR can be used to analyze the program and look for bugs, like null pointer dereferences. Perhaps these analyses can be simplified significantly by converting the entire program to SSA form, which Flame can do out-of-the-box. Additionally, the IR retains enough information about the source code to accurately highlight the location of any potential bugs that the static analysis tool finds.
- I've never really done IDE integration. I don't know what the performance requirements are, and I don't know what information the IDE feeds to the language plug-in. Right now, I use Atom to write D# code, and rely on `dsc` for error and warning messages. I'd welcome IDE integration for D#/EC# and wouldn't mind adding features to Flame in order to make that possible, but I don't use Visual Studio myself (because my main OS is not Windows), so I probably won't personally be developing a Visual Studio plug-in any time soon.
Does that work for you?
from flame.
Well, let me respond to your first message first. To me, a basic IDE experience is critical. The two most annoying parts of using EC# right now are the lack of code completion (symbol lookup isn't too bad, I can press Ctrl+Comma to look up the symbol under the cursor, although it will take me to the plain C# version), and the fact that errors are only shown in the C# output (several times I've forgotten I was editing the output rather than the input, and then of course whatever I fixed breaks again next time I save the EC# file). I'm not quite sure what to do about those problems - it's a design flaw that VS allows users to edit the output file (with no warning) and then overwrite it accidentally. And even if Flame were suitable as a code completion engine, I would have to rewrite the Visual Studio extension - changing it from a COM component into a MEF component - in order to be able to add code completion features and red squiggly underlines. A royal pain in the butt, but ... worthwhile, if Flame were ready.
But then I realized, if you haven't implemented lambdas yet, or async-await, then certainly it wouldn't support `dynamic` either. Pretty much all real-life software uses lambdas, most software relies on generic type inference, and I'd guess that at least half of software uses `dynamic` or `async`... and lots of people use LINQ. So we're quite far from having something that real devs would consider using. Certainly I wouldn't consider trying to "sell" a standalone EC# compiler until it can at least compile itself (Loyc.*.dll + *.exe). Clearly, right now, the easiest path to doing that is to use Roslyn as the back-end. Roslyn is "bulletproof", and also well-known, so to announce that EC# is built on Roslyn might garner far more interest.
Of course, eventually I want to do the "Loyc" thing and have multiple backends like C++ and Wasm, but maybe it's not the thing I should do right now.
I think many people that use C# use it for its excellent IDE features. That's true for me - if I didn't care about the IDE experience, possibly that would have tipped the scales three years ago, when I decided to continue working on Enhanced C# instead of joining the D camp. Remember that language we were talking about doing? The idea of doing a good IDE plus a "Learnable Programming" debugger for it makes me salivate.
from flame.
Let's see, what do you need for an IDE? I haven't studied how Roslyn works, but here are my thoughts.
- It isn't really required to "delete" members in the IDE. You don't need to respond instantaneously to changes to the program, so I think I'd do this:
  - Pre-parse all source files, run LeMP on them, and run any initial information-gathering that does not rely on other source files. This step is "embarrassingly" parallel and easily done on multiple cores.
  - Finish type resolution for the whole assembly (figure out what types are referred to by method signatures, fields, etc.). This is probably parallelizable too.
  - In the current source file only, gather information about local variables. Once this is done, you're ready to answer code completion queries.
  - When the user makes any changes to a file, start a two-second timer. Each time a key is pressed, reset the timer to 2 seconds. When it expires, reprocess the current file (reparse, run LeMP, etc.) and if the user hasn't modified the file in the meantime, discard the resolved type information of the entire assembly and redo type resolution from scratch. It's somewhat expensive, but it's only a fraction of the entire compilation process. NOTE: you don't really have to discard the old information until the new information is ready. That way, code completion will keep working during the rebuild process. A simple way to save CPU time is to detect when the signatures in the current file have not changed - i.e. when the user is changing a method body and not any method signatures, you can skip the assembly-wide process and only reprocess detailed info about local variables in the current file.
- If there are multiple projects that depend on each other, they should keep largely independent state. A sequence of two-second timers could be used to allow new information to cascade down through all the dependent projects, without stressing the CPU too badly.
- In the presence of LeMP, the source file doesn't quite correspond to the file you actually have to analyze - Flame would be analyzing an "expanded" version of the file. Actually I think a "fixup" step may be needed, because LeMP macros will generally produce a mixture of synthetic nodes (with no source location) and real nodes (with source locations). The fixup step would scan the syntax tree looking for synthetic nodes, and "guess" a source code location for each one based on source code locations associated with parents and children of the synthetic node. But maybe this process could be deferred until an error occurs.
- The IDE responds to things like `Foo.Bar(x, y).` with a pop-up list. To implement this, Flame would need to be able to answer three questions: (1) Where are we in the source code? (i.e. give me an object that we can use to make Code Completion queries); (2) What is the type of the expression `Foo.Bar(x, y)` at this location; and (3) given this type, what should the code completion list look like?
- Similarly, when typing `Foo.Bar(` the questions are "what signature(s) are associated with `Foo.Bar` at this location in the source code?"
- The next most common request is "go to the definition of `Bar` in `Foo.Bar`". For code completion you only need to know what members exist; for "go to definition" you also need to know where all the definitions are.
- Member search (Ctrl+Comma) is an easier version of the above, in which the current context doesn't matter.
- "Find all references" and "find callers of this method" requires all the method bodies and their full type resolution - or at least you need to find all method bodies that might contain the relevant symbol. That's okay, speed is not as important for this feature as for the "`.`" popup list.
- The most common "refactor" (if you can even call it that) is "insert using statement". That one is pretty easy so I don't think I need to say more.
- The most common and important real refactor is a rename. With LeMP this is a bit tricky, since it's necessary to change the original source file, not the postprocessed form. The postprocessed form will usually contain the correct source locations to change, though. So you can change those original locations, then run LeMP again and compare the output to the expected output. If they don't match, we could show a "diff" window that shows how the rename went wrong and asks the user whether to keep the new version or abort the process.
- The second most important refactor is "generate method" and "generate constructor" (or "generate type" if the specified type does not exist.) Sounds pretty straightforward except for the need to modify the original file rather than the postprocessed form. So again, I think we can take a "guess and check" approach, asking the user to confirm/reject the operation if it didn't produce the desired result.
- Differences between full compilation and partial compilation for IDE support:
  - No codegen is needed.
  - Error tolerance is important. No source code error can be allowed to prevent code analysis from running to completion, and errors in one file must not cause any trouble in an unrelated file.
  - Duplicate members (including duplicate methods, fields and entire classes) should be considered perfectly acceptable; "go to definition" should find all duplicates.
  - Except when doing a "find all references", you could discard the method bodies of all methods outside the current source file to save memory. Or cache X files. It's not worth managing a cache in the initial implementation, though. Note that `LNode`s are designed to use memory efficiently.
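The two-second timer described in the list above is a standard debounce. A minimal sketch using `System.Threading.Timer` follows; `ReanalysisDebouncer` and its members are hypothetical names for illustration, not part of Flame, LeMP or any IDE API, and a real editor extension would probably also need to marshal the callback back onto the right thread.

```csharp
using System;
using System.Threading;

// Hypothetical helper: invokes 'reanalyze' once the user has been idle
// for 'delay'. Every key press pushes the deadline back.
public sealed class ReanalysisDebouncer : IDisposable
{
    private readonly Timer timer;
    private readonly TimeSpan delay;

    public ReanalysisDebouncer(Action reanalyze, TimeSpan delay)
    {
        this.delay = delay;
        // Start disarmed: the timer only fires after OnKeyPressed arms it.
        this.timer = new Timer(_ => reanalyze(), null,
            Timeout.InfiniteTimeSpan, Timeout.InfiniteTimeSpan);
    }

    // Call on every edit. Re-arming replaces any pending deadline, so the
    // callback runs once per burst of typing rather than once per key.
    public void OnKeyPressed()
    {
        timer.Change(delay, Timeout.InfiniteTimeSpan);
    }

    public void Dispose()
    {
        timer.Dispose();
    }
}
```

An editor would create one instance per open file, for example with `TimeSpan.FromSeconds(2)`, and call `OnKeyPressed()` from its text-changed handler. The cascade across dependent projects could chain several of these, each armed by the previous one's callback.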
from flame.
So, how much does working on an IDE interest you? I can think of three IDEs that run on both Windows and Linux: Geany (written in C), Eclipse (Java) and Xamarin Studio (C#). Since it's C#, Xamarin Studio seems like the obvious thing to add code completion to - I don't know if it's designed to support "third-party" languages but ... well, it probably is.
from flame.
Actually, Flame does support lambdas. It just doesn't do lambda type inference, because that's the front-end's responsibility. The details of the type inference algorithm are specific to the source language, so the middle-end can't - and shouldn't - do that.
I chose not to implement type-inferred lambdas in the D# front-end because they are an ugly exception to the way expressions are typed in (most) imperative languages, and at the time I decided that doing so anyway would be more trouble than it was worth, at least for my own use-cases. Conversely, the micron front-end infers all types, so lambdas (i.e. local `let`-bindings that take at least one parameter) are always type-inferred there.
My point is, though, that there's absolutely nothing stopping us from doing just that, and I do plan on implementing lambda type inference/generic parameter inference in a unified D#/EC# front-end. Likewise, I don't think implementing `async`/`await` will be fun, but Roslyn is open-source, so we can always just look at how they do things. So I'm optimistic that we'll attain feature parity with Roslyn sooner rather than later.
Besides, once I get started on building a D#/EC# front-end, then I at least want to be able to compile both Flame and Loyc, and in doing so move from a partially bootstrapping compiler to a fully bootstrapping compiler.
Which brings us to IDE support. I currently use MonoDevelop as my go-to C# IDE, which is basically the same as Xamarin Studio. I think. I once tried to create a D# project plug-in for MonoDevelop, but documentation was somewhat sparse, and I encountered mad girlfriend bugs: the plug-in obviously didn't do what it was supposed to do, but the IDE said that everything was fine. I lacked the patience to debug the whole thing, and eventually settled on my current work-flow.
Now, most of the features you've described really don't sound like things that will be handled by Flame per se. After all, Flame provides a common middle-end and ships with a number of back-ends, but neither of those are super important features for IDE support. Flame's IR may prove useful when looking things up in method bodies, but most IDE-related functionality is source language, and therefore front-end, specific.
That being said, supporting the scenarios you described would shape the front-end in a certain way, which I am more than willing to do. I also wouldn't mind implementing certain IDE-specific features such as figuring out what the type of the expression at the cursor's location is, or retrieving a type's location in the source code. On the other hand, I'm not sure if I'm up for coding UI-related things from scratch. That has always proven to be a debugging nightmare in my experience.
For something completely different, I was pondering what the implementation details of a unified D#/EC# front-end would be. At first, I considered updating an existing front-end (either the `dsc` or `fecs` front-end), but in my opinion there are some valid technical objections against doing that:
- The D# front-end is kind of buggy and doesn't use a number of Flame features.
- Flame's current EC# front-end is an F# project, which precludes bootstrapping, and, from my understanding, F# is not your favorite programming language.
So I was thinking that maybe we should just create a new front-end from scratch in C#, and then rely on EC#'s backward compatibility with C# to compile said new front-end, once it matures. Thoughts?
from flame.
Yes. Maybe I just have a learning disability or what, but I found F# to have poor usability - unintuitive syntax plus unintuitive error messages. Anyway we should have a "dogfood" compiler (that "eats" itself) written in EC#. Would you agree with me that EC# should be the official name, because in the long run Google will find it more easily? 'EC#' isn't ambiguous (it's not used by anything else) and insofar as Google ignores the '#', EC and "Enhanced C" are both more unique than "D" which Google matches with words like "I'd".
For getting actual users I think we'll need VS integration, and for Linux/OSX we need Monodevelop. So, how about I be in charge of VS integration and you'll be in charge of Monodevelop integration?
I can agree in principle about most of the features you've added to D# compared to C#.
- I am most skeptical about the auto-properties thing - C# definite assignment analysis doesn't fail in trivial cases so you should figure out exactly what problem you were having.
- I agree with automatically choosing a `Func<>` type in `var` declarations.
- Marking functions as pure shouldn't be officially supported until the feature is properly done and carefully thought out... but EC# supports arbitrary non-keyword attributes so you can use `pure` rather than `const`.
- Static classes implementing interfaces sounds useful; and more generally I think any `static` member of a class should be eligible to be part of the implementation of an interface for that class. Could you give more details on the exact semantics you want / have implemented?
- A `foreach` with a mutable variable is useful but potentially breaks backward compatibility - perhaps we should define a new `for` loop instead; this is what I have in mind:
for ($x in list) { Console.WriteLine($"list[$(x#)] = $x"); }
The `$` would be for consistency with pattern matching (`match`) which uses `$` already; it has the advantage of highlighting places where variables are created in a more lightweight manner than `var`. Generally if you write `for ($name in list)` you'd also be defining `name#` which holds the current list index, as well as `name` which is an actual variable that caches `list[name#]`, and if you write `name = value` it would be implemented as `name = list[name#] = value`. I don't think "zipping" is worthy of a whole language feature; instead we could support tuple deconstruction so you can write:
for (($x, $y) in list1.Zip(list2)) { Console.WriteLine($"list1[$(##)] = $x and list2[$(##)] = $y"); }
and also general pattern deconstruction:
List<Point> points = ...; for ((X: $x, Y: $y) in points) { Console.WriteLine($"points[$(##)] = ($x, $y)"); }
It would be harder to support mutable loop variables in this case, but possible. Zip is already defined in Loyc.Essentials, but it could be enhanced to return a mutable list struct in case the two inputs both implement `IList<T>`. In this case the list item itself could be called `#` -- this is consistent with how three different LeMP macros already work; `#` generally means "the current thing". Logically, then, `##` would be the index of the current list item. `for` could revoke mutability in case it is used on a type that has `GetEnumerator` and no mutable indexer, so that `foreach` is never needed, and would exist mainly for backward compatibility.
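To make the proposed lowering concrete, here is a sketch of what `for ($x in list) { if (x < 0) x = 0; }` might expand to in plain C#, following the rule `name = list[name#] = value` stated above. The proposed syntax is not implemented anywhere as far as this thread shows, and since `x#` is not a legal C# identifier the sketch spells it `xIndex`; both the class name and that spelling are illustrative assumptions.

```csharp
using System;
using System.Collections.Generic;

public static class ForLoweringSketch
{
    // Hand-written expansion of the hypothetical loop
    //     for ($x in list) { if (x < 0) x = 0; }
    public static void ClampNegatives(IList<int> list)
    {
        // 'xIndex' plays the role of 'x#', the current list index.
        for (int xIndex = 0; xIndex < list.Count; xIndex++)
        {
            // 'x' is a real local variable that caches list[x#].
            int x = list[xIndex];
            if (x < 0)
            {
                // 'x = value' is lowered to 'x = list[x#] = value', so the
                // write reaches the list and the cached copy stays in sync.
                x = list[xIndex] = 0;
            }
        }
    }
}
```

The double assignment is the important design point: later reads of `x` inside the loop body see the new value without another indexer call.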
Oh, I must ask - if you start changing D# into EC#, can you architect it to lower EC# to plain C# with perfect preservation of semantics? This is a key feature that sets EC# apart from competing languages like Nemerle and F#. (Addendum) The implication here, I think, is that we need to keep the method bodies in the form of Loyc trees for a long time, and keep the "type tree" (types, namespaces and method signatures) linked back to the Loyc tree, so that it will be straightforward to output plain C#.
from flame.
By the way, have you tried Enhanced C#'s `matchCode` and `quote`? It's fantastic for manipulating Loyc trees. For me they have reduced the cognitive burden of writing macros substantially. I haven't studied your Flame IR but you could consider making a `matchFlame` macro modeled after `matchCode` for manipulating your own IR, and a `quoteFlame` for generating your IR.
from flame.
Addendum - I didn't fully digest everything you said in your previous message, so let me add some comments in response.
> So I was thinking that maybe we should just create a new front-end from scratch in C#, and then rely on EC#'s backward compatibility with C# to compile said new front-end, once it matures. Thoughts?
Yes, I agree. At the same time as we write a new front-end, we could start laying some of the foundation for a multi-language front-end - by avoiding any C#-specific features in the low levels, and by implementing a "parameterized space tree" - the "file system" concept I was telling you about earlier.
> most IDE-related functionality is source language, and therefore front-end, specific.
That's not necessarily true. There is already an engine called ctags designed to support certain IDE features for many languages (sadly I've never had occasion to use ctags). IDE functionality may not be part of what you think of as Flame, but certainly it can (and should) be generalized across languages. Perhaps a small language-specific module is needed to understand incomplete code like `Foo (x, y => y.` - but that's potentially a very small amount of code, especially if the front-end parser is designed to parse incomplete statements intelligently, and if multiple front ends implement a common interface for IDE features.
> coding UI-related things from scratch
We'll be modifying existing IDEs to avoid doing anything from scratch. IDEs have built-in UIs for code completion; the challenge is just figuring out how to invoke them and to install various event handlers in the editor. For syntax highlighting, for VS that's already done, and for Monodevelop you can probably just tell it to use C# highlighting.
I'm sorry your effort to make a D# plug-in didn't work out, but could we try a different tactic? To begin with, EC# is currently used in Visual Studio as a single-file generator. So, could you investigate adding a LeMP SFG to MonoDevelop? Xamarin Studio already supports T4 templates (TextTemplatingFileGenerator) in a VS-compatible way, and Google found this file which might be the implementation of that. Perhaps you can copy & modify this code to make one for LeMP and LLLPG. This would let VS solutions that use LeMP work seamlessly in MonoDevelop/XS ... except Loyc.sln, apparently, which (in XS 5.9.4) just seems to build the solution forever without producing any errors (Build|Stop is greyed out) and for some reason LeMP.StdMacros is labeled "Invalid Configuration Mapping". Bleh.
from flame.
> Would you agree with me that EC# should be the official name, because in the long run Google will find it more easily?
Sure. Though I may use 'D#/EC#' in the future to differentiate `fecs` from the new EC# front-end.
> I am most skeptical about the auto-properties thing - C# definite assignment analysis doesn't fail in trivial cases so you should figure out exactly what problem you were having.
Allow me to move the goalpost here just a little bit and manually desugar those auto-properties. This doesn't compile under `mcs`:
public struct Vector2
{
public Vector2(double X, double Y)
{
this.X = X;
this.Y = Y;
}
private double x;
private double y;
public double X { get { return x; } set { x = value; } }
public double Y { get { return y; } set { y = value; } }
}
Besides, why is total initialization of value types mandatory while total initialization of reference types is optional? Having the compiler insert initialization code makes replacing `class` by `struct` a painless transition.
> Marking functions as pure shouldn't be officially supported until the feature is properly done and carefully thought out... but EC# supports arbitrary non-keyword attributes so you can use `pure` rather than `const`.
Agreed.
D# also supports a special syntax for attributes that are compiler intrinsics, which are understood by the middle-end linker and the back-ends. Any chance of getting support for that in EC#? Here's an example of a function that is implemented as a WebAssembly import. (Ignore the `module` thing for now, more on that later.)
public module spectest
{
/// <summary>
/// Prints an integer to standard output.
/// </summary>
[[import]]
public void print(int Value);
}
Static classes implementing interfaces sounds useful; and more generally I think any static member of a class should be eligible to be part of the implementation of an interface for that class. Could you give more details on the exact semantics you want / have implemented?
Right now, the following:
public static class Foo : IFoo
{
public static int Bar()
{
return 4;
}
}
desugars to:
public class Foo : IFoo
{
private Foo() { }
// Backing field for the lazily created singleton instance
private static Foo instance_value;
// Actual, real-deal static member
public static Foo Instance
{
get
{
// Not thread-safe, I know. This could easily be done in
// a static constructor, and I plan on re-implementing it
// like that in the EC# front-end.
if (instance_value == null) instance_value = new Foo();
return instance_value;
}
}
public int Bar()
{
return 4;
}
}
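The thread-safe variant mentioned in the comment above could look roughly like the sketch below. It uses a static readonly field initializer rather than an explicit static constructor, and the IFoo interface with a single Bar method is an assumption made only to keep the example self-contained.

```csharp
// Hypothetical interface, assumed for illustration.
public interface IFoo
{
    int Bar();
}

public sealed class Foo : IFoo
{
    // The CLR runs a type's static initialization at most once,
    // so no explicit locking is needed to create the instance.
    private static readonly Foo instance_value = new Foo();

    private Foo() { }

    public static Foo Instance
    {
        get { return instance_value; }
    }

    public int Bar()
    {
        return 4;
    }
}
```

Callers would then write IFoo foo = Foo.Instance; and every access to Foo.Instance returns the same object.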
Member static methods/properties of instance types are implemented as members of a nested Static_Singleton singleton class. I initially planned on implementing 'static inheritance', where public class BigNum : IComparable<BigNum>, static IComparer<BigNum> would be legal, but I've kind of changed my mind on that lately, because I'm not sure if it adds much value compared to implementing IComparer<BigNum> in a separate singleton class.
Ironically, I eventually ended up re-implementing the old C# static class behavior, as module, because I needed actual static classes for (1) extension methods and (2) certain back-ends which don't have reference types yet (like the WebAssembly back-end). So maybe we should really just keep the old static class semantics, and just use something like public class object Foo to create singletons.
I'm also not sure if a lexical macro can reliably lower singleton entities to regular classes. We can't just do a find-and-replace, because method and variable names take precedence over class names in.
I don't think "zipping" is worthy of a whole language feature; instead we could support tuple deconstruction
I actually find multiple foreach to be an elegant and useful construct most of the time. It gets rid of the cruft that is indexing when applying some kind of operation to an entire array. I also really don't see why it should be replaced by an explicit 'zip' call followed by tuple deconstruction, because:
- Multiple foreach is implemented efficiently. Generating equivalent code for an explicit 'zip' followed by tuple deconstruction requires zealous function inlining and scalar replacement of aggregates. The only upside to this is that it makes Flame's -O3 look good on benchmarks.
- Multiple foreach was easy to implement, as it is really just a generalization of regular foreach.
- I don't think multiple foreach interferes with existing syntax.
- It reduces the cognitive burden that is associated with looping over multiple arrays, both for the code's author, and for any readers.
Here's an example of a multiple foreach in Flame:
public static IExpression[] VisitAll(INodeVisitor Visitor, IExpression[] Values)
{
var results = new IExpression[Values.Length];
foreach (var output in results, var input in Values)
{
output = Visitor.Visit(input);
}
return results;
}
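For comparison, a hand-written plain C# equivalent of such a loop might look like the sketch below. The generic MapAll helper and its names are hypothetical, and stopping at the shorter array is an assumption based on the "zip" analogy rather than a statement about what the D# compiler actually generates.

```csharp
using System;

public static class MultiForeachSketch
{
    // Hand-lowered form of:
    //   foreach (var output in results, var input in values)
    //       output = f(input);
    public static TOut[] MapAll<TIn, TOut>(Func<TIn, TOut> f, TIn[] values)
    {
        var results = new TOut[values.Length];

        // Assumed zip-like semantics: iterate up to the shorter length.
        int count = Math.Min(results.Length, values.Length);
        for (int i = 0; i < count; i++)
        {
            // Writing through the index is what assigning to the
            // mutable loop variable 'output' amounts to.
            results[i] = f(values[i]);
        }
        return results;
    }
}
```

The indexed loop needs no tuple allocation or enumerator objects, which is the efficiency argument made in the list above.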
Can you provide an example where mutable loop variables are backwards incompatible? The usual suspect - assigning a value to an iteration variable - makes mcs report a compiler error, so that won't get us in trouble:
test.cs(10,13): error CS1656: Cannot assign to `item' because it is a `foreach iteration variable'
you could consider making a matchFlame macro modeled after matchCode for manipulating your own IR, and a quoteFlame for generating your IR.
Mmmhh. quoteFlame sounds a whole lot like an embedded __asm statement, which shouldn't be too hard to implement, and may also be quite useful. I'm not sure if matchFlame is useful for macros, because the order of optimizations is important, and macros get evaluated before the pass pipeline is invoked. It certainly is worth looking into, though. A macro that generates Flame API code which builds an IR tree at run-time for a given snippet of code may also be useful.
There is already an engine called ctags designed to support certain IDE features for many languages
Sure, but that all depends on which features you want. As far as I can tell, ctags implements one thing: "go to definition". That's a tremendously helpful feature that's more or less language independent, but more advanced features, such as refactors and accurate code completion, simply can't be done by the middle-end. Furthermore, it's entirely possible to have Flame search a parsed (and analyzed) project and map type/method/field/property names to their point of definition, because that information is - as you have suggested - clearly language-independent.
and if multiple front ends implement a common interface for IDE features.
A common interface for IDE features would certainly be useful, and could also decouple source language-specific logic from the IDE itself. I am in favor of that.
So, could you investigate adding a LeMP SFG to MonoDevelop?
All right, but I want to get the EC# front-end working first.
from flame.
I think I'll create a separate repository for the (new) EC# Flame front-end. What do you want to call it? We could just re-use the 'fecs' name, which has already been taken, but is conveniently pronounceable, or we could go with 'ecsc'.
from flame.
Yes, I knew that normal properties couldn't be initialized that way, only autoproperties (the latter is supported because there's no way to refer to the backing field.)
why is total initialization of value types mandatory while total initialization of reference types is optional
From a theoretical perspective it may not make sense, but from an implementation perspective it does, because value types (which often exist on the stack) actually need to have every member assigned, but reference types do not because the heap is zeroed before it is allocated to specific objects.
I originally planned to support this exact syntax as a way to invoke lexical macros:
[[import]]
public void print(int Value);
But I punted on it, realizing I could use normal attributes by writing a macro for #fn rather than the attribute itself. This, however, is not a good long-term solution... (edited) it doesn't scale well, because each new macro separately scans for attributes, and more importantly, it's hard to support different macros modifying the same construct (if two macros change a method in response to two different attributes, LeMP has to report an ambiguity error, as it doesn't know how to combine the changes). I don't know what the right solution is.
C# already supports several back-end-specific attributes, such as [Conditional] and [Serializable] (the back end has to specially recognize it, since it's converted from an attribute into a one-bit flag in the assembly). Why don't you want to continue this pattern for other "intrinsics"?
See #27 re: static interfaces.
I'm not sure if matchFlame is useful for macros, because the order of optimizations is important, and macros get evaluated before the pass pipeline is invoked
I don't think you understand - matchFlame would not be used for macros, rather it would be a macro that is used for pattern-matching your IR inside your backends. You could just use match for that purpose, but a special-purpose macro might work better. Likewise quoteFlame would not be like an __asm statement, more a way to quote an __asm statement. Both of these would require you to define a compact DSL to represent your IR. Anyway, I don't really know what I'm talking about since I haven't worked with Flame.
I thought ctags also provided (inaccurate) code completion but I never used it, so, not sure. I'm glad we're in agreement about trying to do IDE features in a way that is as language-independent as possible.
I want to get the EC# front-end working first.
I'm confused. In my mind, LeMP is literally an EC# front-end that works already, and a single-file generator already exists, so it should be easy to port to MonoDevelop. But maybe in your mind, D# is the closest thing to an EC# front end. Which reminds me, you didn't address my earlier question:
if you start changing D# into EC#, can you architect it to lower EC# to plain C# with perfect preservation of semantics?
from flame.
Can you provide an example where mutable loop variables are backwards incompatible?
Yes. In order to support a mutable loop variable in foreach, you wouldn't be able to use IEnumerator anymore. I was assuming that instead you would use the indexer (list[index]) and a potentially hidden index variable. There is an alternative - you could implement mutation by setting the Current property of the enumerator - but most enumerators don't support that. It doesn't feel right to use MoveNext/Current and then suddenly switch to a completely different approach if the user mutates the loop variable. I think that would surprise users. So, assuming your foreach always uses an indexer and Count (where available), there is a theoretical possibility that it is not backward compatible, since Count and the indexer are not guaranteed to behave the same way as the enumerator. Also, performance will change; the indexer may be faster or slower than the enumerator depending on the circumstances. In case of AList, the enumerator is faster in theory (O(1), whereas the indexer is O(log N)) although I don't know about practice.
I want EC# to be strict about backward compatibility - if a given C# program compiles under EC#, it should behave identically, so to me it seems better if foreach doesn't support mutability at all, or if it supports mutability in a conservative way, by setting the Current property. That's why I propose a new for loop for the new functionality.
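To make the theoretical incompatibility concrete, here is a contrived sketch of a collection whose enumerator and indexer disagree. SparseList and LoweringDemo are hypothetical names invented for this illustration; nothing like them appears in Loyc or Flame.

```csharp
using System.Collections;
using System.Collections.Generic;

// Hypothetical collection: the enumerator skips null slots,
// while Count and the indexer expose every slot.
public sealed class SparseList : IEnumerable<string>
{
    private readonly List<string> slots = new List<string>();

    public void Add(string item) { slots.Add(item); }
    public int Count { get { return slots.Count; } }
    public string this[int index] { get { return slots[index]; } }

    public IEnumerator<string> GetEnumerator()
    {
        foreach (var s in slots)
            if (s != null) yield return s;
    }

    IEnumerator IEnumerable.GetEnumerator() { return GetEnumerator(); }
}

public static class LoweringDemo
{
    // What standard C# foreach does: MoveNext/Current.
    public static int VisitViaEnumerator(SparseList list)
    {
        int visited = 0;
        foreach (var item in list) visited++;
        return visited;
    }

    // What an indexer-based lowering of foreach would do instead.
    public static int VisitViaIndexer(SparseList list)
    {
        int visited = 0;
        for (int i = 0; i < list.Count; i++)
        {
            var item = list[i];
            visited++;
        }
        return visited;
    }
}
```

For a list holding "a", null and "b", the enumerator-based loop visits two items while the indexer-based loop visits three, so silently switching lowering strategies would change the behavior of an existing program.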
I'm kind of on the fence about multi-foreach since it's neither something I would use a lot, nor is it needed (given Zip) but on the other hand, it's not hard to implement. By nature, EC# can't be one of those "small core" languages where the functionality is in libraries (including macro libraries). Still, I do like to be careful about adding features to the core language. So I feel a tension. But ... I guess I can ... accept ... having it in the language.
from flame.
From a theoretical perspective it may not make sense, but from an implementation perspective it does, because value types (which often exist on the stack) actually need to have every member assigned, but reference types do not because the heap is zeroed before it is allocated to specific objects.
Actually, if I'm not mistaken, stack objects are zeroed out in verifiable code, which is what C# compilers produce. That's exactly what init means in the IL declaration .locals init (int32 V_0). Total initialization is required because C# compilers routinely optimize x = new X(...); to X..ctor(ref x, ...);. This optimization is fairly fragile, by the way. For example, I can trick mcs into doing the following:
using System;
public struct Vector2
{
public Vector2(ref Vector2 Other)
{
this = default(Vector2);
this.X = Other.X;
this.Y = Other.Y;
}
public double X;
public double Y;
}
public static class Program
{
public static void Main()
{
Vector2 vec;
vec.X = 4;
vec.Y = 5;
// Insert unsafe optimization here, by emitting
// a direct `call` to `Vector2::.ctor` instead of
// a `newobj` for `Vector2::.ctor`.
vec = new Vector2(ref vec);
// Actually manages to print "0\n0\n". ಠ_ಠ
Console.WriteLine(vec.X);
Console.WriteLine(vec.Y);
}
}
C# already supports several back-end-specific attributes, such as [Conditional] and [Serializable] (the back end has to specially recognize it, since it's converted from an attribute into a one-bit flag in the assembly). Why don't you want to continue this pattern for other "intrinsics"?
That's a really good point. I actually did that in the past (and still do for some intrinsics). At some point, however, I realized that this whole approach was actually a huge hack. These downsides quickly started to outweigh whatever upsides remained:
- The attribute class has to be defined in an external library, for it to be usable by both the compiler and the program that is being compiled. The C# compiler circumvents that problem by putting the attribute class in the .NET framework class library, but we don't have that luxury. Also, developers shouldn't have to juggle libraries for things that the compiler understands completely.
- It's really hard to tell what is an intrinsic attribute and what isn't. Comparing the attribute class' name to a known string is not very reliable, because any EC# programmer can unwittingly define their own hypothetical CompilerRuntime.ImportAttribute, which will then be interpreted as a compiler intrinsic, even though that was not the programmer's intention.
- Encoding compiler intrinsics as attributes ties the compiler's version to the runtime library's version. Changes to the runtime library will require changes to the compiler. That's a versioning disaster waiting to happen.
Yes. In order to support a mutable loop variable in foreach, you wouldn't be able to use IEnumerator anymore.
The D# compiler is pretty conservative here: loop variables are only mutable when looping over an array. C# compilers already optimize array foreach by lowering them to for loops. The D# compiler simply takes that one step further and lets the programmer modify the loop variable derived from that array. This is a bit of a special case, I know, but initializing and modifying arrays is the main use case for mutable loop variables.
Likewise quoteFlame would not be like an __asm statement, more a way to quote an __asm statement. Both of these would require you to define a compact DSL to represent your IR.
Oh, I see. Yeah, that could definitely be useful, especially for the "lowering" passes.
But maybe in your mind, D# is the closest thing to an EC# front end.
What I meant is that I want to create a Flame front-end for EC# first. That'd enable the IDE plug-in to perform semantic analysis on EC# code.
Which reminds me, you didn't address my earlier question:
if you start changing D# into EC#, can you architect it to lower EC# to plain C# with perfect preservation of semantics?
I don't know if I can. But I'll try, that's for sure.
Update: I created an ecsc (enhanced C# compiler) repository for a Flame EC# front-end.
from flame.
Actually, if I'm not mistaken, stack objects are zeroed out in verifiable code
IIRC, init is a flag that all locals are to be zeroed, and I was surprised to learn (years ago) that the verifier requires this rather than doing a static assignment analysis. Then the JIT tries to detect and eliminate the double-initializations. So yeah, you're right - on the matter of struct initialization, C# seems to follow the way the CLR might have worked rather than the way it does work.
Perhaps another reason why DAA is done for structs and not classes is that structs (unlike classes) tend to be small, so it was felt more reasonable to require all members to be explicitly initialized.
Wow, your code causes the same behavior in csc (0 0) even in a Debug build. I checked the IL - right at the start the constructor has
ldarg.0
initobj .../Vector2
Maybe this is required by the verifier too, but it makes C#'s DAA seem positively pointless.
Even so, I have to say, I don't think this change to the C# language is worth making. Have you heard of the "point" system the C# team uses (or used to use?) for adding features? I'd like to treat EC# as having a similar point system, except with a lower threshold for additions, and with the threshold modulated by implementation difficulty - e.g. supporting underscore literals like 1_000_000 is incredibly easy and so has a very low threshold. I suppose if the C# team had modulated their thresholds by difficulty/complexity of the feature, C# 2.0 would have already had underscored and binary literals...
Re: attributes for intrinsics, the C# team mentioned "Compile-time only attributes" in one of their Design Notes.
The attribute class has to be defined in an external library, for it to be usable by both the compiler and the program that is being compiled.
That's not actually true, e.g. you can define class ExtensionAttribute in your own .NET 2 assembly in order to use extension methods. It's potentially hazardous though. My memory sucks, but I think I might have once had a compatibility problem where my .NET 3.5 assembly had a problem consuming a .NET 2 assembly.
It's really hard to tell what is an intrinsic attribute and what isn't.
True. To address this, I've been using the convention of using a lowercase first letter both for macros themselves, and for attributes that macros recognize.
Encoding compiler intrinsics as attributes ties the compiler's version to the runtime library's version.
At this point I want to point out that using attribute syntax doesn't necessarily mean that the attribute has to exist at runtime, even though it has been done that way in the past. None of the LeMP-specific attributes exist at runtime.
loop variables are only mutable when looping over an array
Ouch - I don't use arrays very much, so I don't think the feature should be constrained in that way.
Um, about repos, I was thinking of (A) combining Flame and the Loyc libraries in one repo, with Flame being a subtree (in the same way that LoycCore is a subtree of this repo), and (B) defining a 'ecsharp' or 'Loyc' "organization" on github and putting the official version of our code there. Anyway, there's no hurry, I'll be busy for a while modifying the parser and finishing basic support for extracting 'sequence expressions' out of if statements, loops, etc.
I don't know if I asked this before but why is it called 'Flame'?
from flame.
I'd like to treat EC# as having a similar point system, except with a lower threshold for additions, and with the threshold modulated by implementation difficulty
I'd argue that automatic initialization of structs is very easy to implement: simply insert a this = default(T); statement. In fact, that's a lot easier to implement than forcing manual total initialization on the programmer, because said rule in turn requires flow analysis to ensure that the struct initialization paradigm is respected.
I've been using the convention of using a lowercase first letter both for macros themselves, and for attributes that macros recognize.
Do you think that simply using [#import] is acceptable? That way, there's no ambiguity, and no separate attributes. I'm okay with this:
public static class spectest
{
/// <summary>
/// Prints an integer to standard output.
/// </summary>
[#import]
public void print(int Value);
}
I was thinking of (A) combining Flame and the Loyc libraries in one repo, with Flame being a subtree
Do you mean, like, in a separate repository? If so, then the ecsc repository might be a good candidate, because it uses both Loyc and Flame libraries.
and (B) defining a 'ecsharp' or 'Loyc' "organization" on github and putting the official version of our code there.
I wouldn't mind transferring ecsc to a Loyc/ecsharp "organization." Not so sure about the Flame repository itself, though, because it's more of a compiler construction kit, not just the back-end for the EC# compiler (I'm not modifying Flame in any way to compile EC#). Moving it to an ecsharp organization might send the wrong message. It's also kind of my baby, and I'm not entirely sure if I'm ready to let go of it yet.
I don't know if I asked this before but why is it called 'Flame'?
Good question. I had to call it something, so I figured 'Flame' would be a cool name. It's no acronym; I consider those to be fairly fickle, especially when used in compilers. Just look at GCC (GNU Compiler Collection now, originally GNU C Compiler) and LLVM (formerly known as the Low Level Virtual Machine). Plus, 'Flame' also keeps the all caps shouting to a minimum.
from flame.
Yes, it's easy to implement. However, I think many programmers would actively oppose this change and among those that aren't opposed, it's a minor cognitive burden having to keep track of this difference between C# and EC#. That gives it, in a sense, negative points that offset the positive points from making this the default behavior. Now how many positive points does it earn? Well, most people don't write structs very often, and among those that do, this is a change that is only beneficial when the constructor intentionally didn't initialize all members in a way that the DAA understands. So the benefit I see here is tiny - so small that the overall value is zero or negative.
Yes, putting a # at the front of an attribute makes it very clear that it isn't a runtime attribute, so that's good. I have been a bit indecisive about where to use # - it's a small burden to type it and a noisy character visually, so I don't think all macros should use it. For example if someone uses contract attributes a lot, the many #s might look cluttered...
[#ensures(# >= 0)] double Sqrt([#requires(# >= 0)] double x) => ...;
The same could potentially be argued of #import. Conventionally all attributes are capitalized so I think lowercase is already a strong enough hint.
from flame.
I think many programmers would actively oppose this change and among those that aren't opposed, it's a minor cognitive burden having to keep track of this difference between C# and EC#.
I've never seen any real justification for the current struct initialization paradigm. I think it's ugly because it's based on a few special cases - any initialization logic that depends on a separate method to initialize some part of a struct is simply out of luck - and I don't really see why anyone would oppose fixing that in a perfectly backward-compatible manner. Any additional cognitive burden would be caused by EC# making our lives easier in a way that C# does not. And isn't that kind of the point of EC#?
I have been a bit indecisive about where to use # - it's a small burden to type it and a noisy character visually, so I don't think all macros should use it.
Right, but I don't expect #import to suddenly pop up everywhere. Like P/Invoke, it's a feature for library writers, so it doesn't have to be pretty. Anyway, this is a moot point, since I realized that I can just re-use the extern keyword for this purpose. I might use # as a prefix for some esoteric compiler intrinsic attributes in the future, though.
On a different note, I've gotten to the point where I'd like to implement binary operator resolution in ecsc, and the C# (binary) operator resolution algorithm that Roslyn uses turns out to be ridiculously complicated, as evidenced by BinaryOperatorOverloadResolution.cs. Would you mind if I copied and adapted that file for use in ecsc? Roslyn is Apache licensed, which should be compatible with ecsc's MIT license.
from flame.
And isn't that kind of the point of EC#?
Yes. And if DAA errors were something I encountered on a daily or weekly basis, I would agree with you... Okay, maybe we could compromise? Like for every difference between C# and EC#, we print a warning that is on by default and you'd have to turn it off somehow. So you'd do the DAA and detect that a variable is unassigned, but print "C# requires field Foo to be fully assigned before control is returned to the caller" as a warning that can then be suppressed.
Sure, copy whatever code you want from Roslyn - and I applaud your initiative. By the way, my understanding is that I don't need to change the license of ecsharp to incorporate code licensed MIT, Apache or whatever... but traditionally Loyc has been LGPL licensed. This hasn't bought me anything so far and did attract the wrath of an ignoramus so I wonder if I should change to some other license. Is there a particular reason you picked MIT?
append: yeah, extern makes sense.
from flame.
Like for every difference between C# and EC#, we print a warning that is on by default and you'd have to turn it off somehow. So you'd do the DAA and detect that a variable is unassigned, but print "C# requires field Foo to be fully assigned before control is returned to the caller" as a warning that can then be suppressed.
That works for me. Perhaps we can group warnings for EC# extensions as -pedantic, so they can be enabled or disabled all at once.
Is there a particular reason you picked MIT?
I didn't want to discourage people who were thinking about using the Flame libraries because of a licensing issue. The GPL, and, to a lesser degree, the LGPL, can complicate things for developers who decide to license their work under the MIT license. That's a significant group of users that I'd rather not alienate. I also don't buy the "evil corporations will steal your code" argument - for example, LLVM is licensed under a permissive open source license, and lots of corporations contribute code to LLVM even though they technically don't have to - so I don't see what the advantages there are to the (L)GPL.
But really, picking a license is a personal choice - it's your copyright, after all. And licensing is also kind of a practical detail; if people are truly invested in Loyc, then they can just ask you to give them a different license.
from flame.
Just to let you know, I've been lazy for the past three weeks. Partly it's because the Bureau of Immigration required me to spend four (!) days in Manila, but also I've just been spending time with family, reading news and writing. I am still working on the algorithm for eliminating #runSequence expressions and the :: quick-binding operator.
from flame.
Sorry it took me three days to respond. Life's kind of intervened for me as well here. My exams are coming up soon now, and I've also been busy changing Flame's underlying data structures for attributes (flat sequences were silly, so I switched to tiny hashmaps) and names (strings were a bad idea to begin with). ecsc is slowly making some progress, too.
I see that you kicked off a discussion on roslyn. Awesome. Anyway, I'm sure that the roslyn folks will give EC# the attention it deserves.
from flame.