Include Comments in the AST (RomanYankovsky/DelphiAST #39)

Comments (17)

RomanYankovsky commented on June 12, 2024

Comments are ignored by the lexer, so the parser doesn't have this information. But I think it is possible to implement. I'll take a look.

from delphiast.

uschuster commented on June 12, 2024

I've made the first changes for this issue in my fork, but note the TODO in the log message.
uschuster@e6e7c22

RomanYankovsky commented on June 12, 2024

@uschuster unfortunately, that's much more complicated...

Try to parse this code and take a look at the syntax tree:

unit commenttest;

interface

var
  Int1 {MyFavoriteInt}, Int2: Integer;

implementation

procedure TestProc;
begin
  Int1 {That's my favorite int } := Int2 * {mul} 2;
end;

end.

But it is a good start. Great!

P.S. To save you time, this is the syntax tree for the code above.

<?xml version="1.0"?>
<UNIT line="1" col="1" name="commenttest">
  <INTERFACE line="3" col="1">
    <VARIABLES line="5" col="1">
      <VARIABLE>
        <NAME line="6" col="3" value="Int1"/>
        <TYPE line="6" col="31" name="Integer"/>
      </VARIABLE>
      <VARIABLE>
        <NAME line="6" col="25" value="Int2"/>
        <TYPE line="6" col="31" name="Integer"/>
      </VARIABLE>
    </VARIABLES>
  </INTERFACE>
  <IMPLEMENTATION line="8" col="1">
    <METHOD line="10" col="1" name="TestProc" kind="procedure">
      <STATEMENTS end_line="13" begin_line="12" end_col="1" begin_col="3">
        <ASSIGN line="12" col="3">
          <LHS>
            <COMMENT end_line="12" type="Borland" begin_line="12" end_col="32" value="{That&apos;s my favorite int }" begin_col="8"/>
          </LHS>
          <RHS>
            <EXPRESSION line="12" col="37">
              <MUL line="12" col="42">
                <COMMENT end_line="12" type="Borland" begin_line="12" end_col="48" value="{mul}" begin_col="44"/>
                <LITERAL line="12" col="50" type="numeric" value="2"/>
              </MUL>
            </EXPRESSION>
          </RHS>
        </ASSIGN>
      </STATEMENTS>
    </METHOD>
  </IMPLEMENTATION>
</UNIT>

uschuster commented on June 12, 2024

@RomanYankovsky Ah, I see. I just tried to add comments into a separate child list of TSyntaxNode, but for some constructs the comments got lost. I think using a separate list for all comments and trying to attach the comments at the end could be the way to go.
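
To make the "separate list" idea concrete, here is a minimal sketch of the first half of it: collecting every comment with its position, independent of the syntax tree. It is written in Python rather than Delphi purely for brevity, and it is not DelphiAST code. The names `Comment` and `collect_comments` are hypothetical, and the scanner assumes simplified lexical rules (it treats compiler directives such as `{$...}` like ordinary comments).

```python
from dataclasses import dataclass


@dataclass
class Comment:
    kind: str  # "brace", "paren_star" or "line"
    text: str  # comment text including its delimiters
    line: int  # 1-based line of the first character
    col: int   # 1-based column of the first character


def collect_comments(source: str) -> list[Comment]:
    """Scan Pascal-like source and return all comments as one flat list."""
    comments = []
    i, line, col = 0, 1, 1
    n = len(source)
    while i < n:
        kind = None
        if source[i] == "'":
            # String literal: skip it so braces inside strings are not comments.
            # A doubled quote ('') is simply seen as two adjacent literals.
            end = source.find("'", i + 1)
            end = n - 1 if end == -1 else end
        elif source[i] == "{":
            end = source.find("}", i + 1)
            end = n - 1 if end == -1 else end
            kind = "brace"
        elif source.startswith("(*", i):
            end = source.find("*)", i + 2)
            end = n - 1 if end == -1 else end + 1
            kind = "paren_star"
        elif source.startswith("//", i):
            end = source.find("\n", i)
            end = n - 1 if end == -1 else end - 1
            kind = "line"
        else:
            end = i  # ordinary character, consumed one at a time
        chunk = source[i:end + 1]
        if kind is not None:
            comments.append(Comment(kind, chunk.rstrip("\r"), line, col))
        # Advance the line/column counters past the consumed chunk.
        newlines = chunk.count("\n")
        if newlines:
            line += newlines
            col = len(chunk) - chunk.rfind("\n")
        else:
            col += len(chunk)
        i = end + 1
    return comments
```

Run on the `commenttest` unit above, this finds all three comments, including `{MyFavoriteInt}`, which is missing from the syntax tree shown earlier. Attaching the collected comments to nodes would be a separate, later pass.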

barbalion commented on June 12, 2024

Without comments the AST is incomplete and useless for my purposes :(
So this issue is important. Please consider fixing it.

Wosi commented on June 12, 2024

It'd be nice to access the comments for different reasons. But comments are not part of an AST. They live next to it. So there should be a separate list like @uschuster has already mentioned.

vintagedave commented on June 12, 2024

Why should it be a separate list? What's wrong with having nodes that
represent comments inside the syntax tree? A comment is syntactically
valid - its syntax is such that its contents are ignored by the compiler.
But it's still valid code.

RomanYankovsky commented on June 12, 2024

@vintagedave maybe I'm missing something, but can you please show me a sample of a correct syntax tree for the code below? I just can't imagine how to do this.

unit commenttest;

interface

var
  Int1 {MyFavoriteInt}, Int2: Integer;

implementation

procedure TestProc;
begin
  Int1 {That's my favorite int } := Int2 * {mul} 2;
end;

end.

RomanYankovsky commented on June 12, 2024

@uschuster worked on that, but never made a pull request. Did he finish his effort?

Wosi commented on June 12, 2024

Take a look at the example made by @RomanYankovsky. Comments can be nested anywhere. What should the AST for that code look like, in your opinion?
Of course you can squeeze the comments into the AST anyway, but does the result look good? Is it still easy to gather information from the AST?

Let's do some extreme things.

  {comment1} MyObject{comment2}.{comment3}Prop1 //Comment4
                                         .{comment5}Method1(
                                          {comment6}6,nil{comment7}, 'hello'{comment8} + 'world' // comment9
                                          ){comment10}.SubProp1 := //comment11
                          {comment12}'value' + {comment13} + IntToStr(14{comment14});

If you want to represent this valid code in an AST that includes comments, you would have to change the whole structure. I think if you put comments into the abstract syntax tree you would end up with a concrete syntax tree, which is much harder to get information from.

vintagedave commented on June 12, 2024

That is a good point - abstract vs concrete.

Is it possible - or should it be possible - to reconstruct the original
code, exactly as it was, from the syntax tree? (Even ignoring comments?)

vintagedave commented on June 12, 2024

Roman, all I can think of is that you end up with nodes all over
the place: embedded anywhere. And that might not be ideal.

barbalion commented on June 12, 2024

I would suggest introducing a new property, similar to attributes: the syntax tree node would get a property Comments: TNodeList (serialized to XML elements).
A comment would be related to the preceding or the following node (arguably, two properties, CommentsBefore and CommentsAfter, would fit better).
For your last example, this would look like this:

Alexander

Wosi commented on June 12, 2024

@barbalion How do you decide whether a comment is after or before a syntax node? And will there be standalone comments?

What about this code?

unit Basics;

interface
  function IntToStr(Value: integer): string;
  function StrToInt(const Str: string): integer;

implementation
// Converts an integer to a string
function IntToStr(Value: integer): string;
begin
  // ...
end;

// Converts a string to an integer
function StrToInt(const Str: string): integer;
begin
  //....
end;

end.

How would the method headers be represented in the AST? Would they be standalone comments? Would they be part of the function nodes in commentsbefore nodes? Or would the first header be a standalone comment (as a child of implementation) while the second one appears in the commentsafter section of the first function node? And how would you figure out what to do?
Linking comments to their context seems fuzzy to me. Sometimes comments appear before the commented code, sometimes after. And sometimes it's really weird, like this one:

type TMyObject = class // this class should only
private                // be used by the basic 
  FName: string;       // code libraries like
  FSize: integer;      // Lib1, Lib2 and LibOld
end;

Which of these comments is part of the typedeclaration node? Which is part of the private, field, name or type node?
After answering these questions: are we happy with the resulting AST? Would it be easy for you to get the comments out of the AST and do something with them? Or would it be easier to have a list of comments including their source positions, maybe with references to the syntax nodes before and after each comment?

uschuster commented on June 12, 2024

I haven't had time in the past months and won't have any for at least one or two more months. I am not yet satisfied with the implementation. My last problem was incorrect position information for different statements, which is why the Clang-like attempt to attach comments to nodes failed.

barbalion commented on June 12, 2024

> @barbalion How do you decide that a comment is after or before a syntax node?

This is a tricky thing. But there are two options:

  Make Before and After the same (the After of the previous node is the same as the Before of the following one).

  Add some heuristics to guess the right one.

> And will there be stand alone comments?

I would answer no.

> What's about this code?

I would say that this would look like this:
… Note some duplications. You can avoid them if you apply some heuristics (like "is there an empty line before?"), but you can also leave it this way.

> How would the method headers be represented in AST? Would they be stand alone comments? Would they be part of the function nodes in commentsbefore nodes? Or would the first header be a stand alone comment (as a child of implementation) while the second one appears in the commentsafter section of the first function node? And how would you figure out what to do?

You didn't get my idea. I'm proposing to add a separate property to the node. This property will represent comments, but the comments themselves will not create a node. To give you an example: look at the METHOD node. You can see that the name doesn't create a node; it's an attribute of the METHOD node. The idea is to make comments similar to these attributes. The only reason I put CommentBefore and CommentAfter into XML elements is that there could be several comments on one node, and there is no way to put multiple values into a single XML attribute.

Arguably, you can put all comments into a single XML attribute without separating them. For example:

// Converts an integer ...

// ... to a string

function IntToStr(Value: integer): string;

The METHOD node here has two comments. But for practical use we can consider them one big multiline comment and put it into an XML attribute:

(An XML attribute supports multiline values.) But in this case we lose the information about each comment's start and end (and that's bad).
So, in other words, my suggestion is really to keep comments out of the AST, but at the same time link them to the nodes (like attributes).
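
A minimal sketch of this proposal, written in Python for brevity and not taken from DelphiAST: comments become properties of existing nodes instead of nodes of their own. All names (`Node`, `Comment`, `attach`, `blank_lines`) are hypothetical, and the sketch only handles comments that sit on their own lines between sibling nodes.

```python
from dataclasses import dataclass, field


@dataclass
class Comment:
    text: str
    line: int  # 1-based line of the comment (assumed to fit on one line)


@dataclass
class Node:
    name: str
    first_line: int
    last_line: int
    comments_before: list = field(default_factory=list)
    comments_after: list = field(default_factory=list)


def attach(nodes, comments, blank_lines=frozenset(), use_heuristic=False):
    """Link comments to sibling nodes as properties, creating no new nodes."""
    ordered = sorted(nodes, key=lambda n: n.first_line)
    for c in comments:
        # Comments inside a node's span would belong to its children,
        # which this flat sketch does not model, so they are skipped.
        if any(n.first_line <= c.line <= n.last_line for n in ordered):
            continue
        prev = next((n for n in reversed(ordered) if n.last_line < c.line), None)
        nxt = next((n for n in ordered if n.first_line > c.line), None)
        if use_heuristic and prev is not None and nxt is not None:
            # Heuristic: a blank line on one side separates the comment
            # from that neighbour, so only the other neighbour gets it.
            if c.line - 1 in blank_lines:
                prev = None
            elif c.line + 1 in blank_lines:
                nxt = None
        # Without the heuristic the comment is recorded on both sides.
        if prev is not None:
            prev.comments_after.append(c)
        if nxt is not None:
            nxt.comments_before.append(c)
```

With `use_heuristic=False`, the comment between the two functions of the `Basics` unit shows up both in the first function's `comments_after` and in the second function's `comments_before`, which corresponds to the first option listed above and to the duplication it implies. With the blank-line heuristic enabled it ends up only on the second function.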

> Linking comments to their context seems fuzzy to me. Sometimes comments appear before the commented code, sometimes after. And sometimes its really weird like this one:
>
> type TMyObject = class // this class should only
> private                // be used by the basic
>   FName: string;       // code libraries like
>   FSize: integer;      // Lib1, Lib2 and LibOld
> end;
>
> Which of these comments is part of the typedeclaration node? Which is part of the private, field, name or type node?
> After answering these questions - Are we happy with the resulting AST? Would it be easy for you to get the comments out of the AST and do something with them?
> Or would it be easier to have a list of comments including their source position and maybe having references to the syntax nodes before and behind the comment?

As I said: there will be no AST nodes for comments, only an additional property on existing nodes.

Alexander

RomanYankovsky commented on June 12, 2024

I added a TPasSyntaxTreeBuilder.Comments property. It stores all comments in a separate list. Please give it a try. See 25eb2ac
