
comment_parser's People

Contributors

anurbol, dexpota, ivan-magda, jeanralphaviles, lukaskaufmannrelaxdays, priv-kweihmann

comment_parser's Issues

Suggest add an option to ignore special encoding characters

Hi, this tool works well in many cases, but I found two problems.

  1. Encoding problem

If a file contains characters in a different encoding, e.g., Chinese characters and ½, the extract_comments method raises an exception.

Locally, I added errors='ignore' to the following open() call. With that change, the special characters above are skipped and the rest of each comment is still extracted.

def extract_comments(filename, mime=None):
    with open(filename, 'r', errors='ignore') as code: 

So I think this could be offered as an option, letting users decide whether to ignore such characters.
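A minimal sketch of how such an option could look. The helper name `read_source` and the `encoding='utf-8'` default are assumptions for illustration and are not part of comment_parser; the `errors` argument is simply forwarded to Python's built-in `open()`.

```python
def read_source(filename, encoding='utf-8', errors='strict'):
    """Read a source file, letting the caller choose how decode errors are handled.

    Hypothetical helper, not part of comment_parser. `errors` is passed
    straight to the built-in open(): 'strict' raises UnicodeDecodeError,
    'ignore' drops undecodable bytes, 'replace' substitutes U+FFFD.
    """
    with open(filename, 'r', encoding=encoding, errors=errors) as code:
        return code.read()
```

The returned text could then be passed to extract_comments_from_str together with a MIME type, so the decision to ignore bad bytes stays with the caller.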

  2. Complex string

The tool throws an exception when parsing this Java file. The cause may be the complex string on line 99.

Thanks for the tool, it has helped me a lot. I hope it keeps getting better.

C# Support

Hey,

Is there any support for C# in mind? I'm looking into processing some C# and JavaScript files in order to create some documentation.

If you don't mind I would like to help.

Thanks in advance!

PHP Support?

Hello,

Do you have any plans to support extracting comments from PHP files?

Many thanks
Rob

Cannot handle strings containing "/*"

Hi, I ran into the following error:

Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/comment_parser/comment_parser.py", line 99, in extract_comments_from_str
    return parser.extract_comments(code)
  File "/usr/local/lib/python3.6/dist-packages/comment_parser/parsers/c_parser.py", line 66, in extract_comments
    raise common.UnterminatedCommentError()
comment_parser.parsers.common.UnterminatedCommentError

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python3.6/dist-packages/comment_parser/comment_parser.py", line 74, in extract_comments
    return extract_comments_from_str(code.read(), mime)
  File "/usr/local/lib/python3.6/dist-packages/comment_parser/comment_parser.py", line 101, in extract_comments_from_str
    raise ParseError(str(e))
comment_parser.comment_parser.ParseError

After some analysis and testing, I found that it is caused by the "/*" in this statement:

        assertEquals("{\"Version\":\"2008-10-17\",\"Id\":\"Policy4324355464\",\"Statement\":[{\"Sid\":\"Stmt456464646477\",\"Action\":[\"s3:GetObject\"],\"Effect\":\"Allow\",\"Resource\":"
                + "[\"arn:aws:s3:::mybucket/some/path/*\"],\"Principal\":{\"AWS\":[\"*\"]}}]}", endpoint.getConfiguration().getPolicy());
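One common way to avoid this class of failure is to match string literals and comments with a single regular expression, so that a "/*" inside a string is consumed as part of the string token. The sketch below illustrates that idea only; it is not comment_parser's implementation, the names `TOKEN` and `extract_c_style_comments` are made up here, and it ignores language-specific literals such as Java text blocks.

```python
import re

# One alternation that matches string literals and comments alike. Because
# strings are consumed as whole tokens, a "/*" inside a string never gets
# the chance to start a comment.
TOKEN = re.compile(
    r'''
      "(?:\\.|[^"\\])*"     # double-quoted string, honoring backslash escapes
    | '(?:\\.|[^'\\])*'     # character literal
    | //[^\n]*              # line comment
    | /\*.*?\*/             # block comment (non-greedy)
    ''',
    re.VERBOSE | re.DOTALL,
)


def extract_c_style_comments(code):
    """Return the raw text of every comment that lies outside a string literal."""
    return [m.group(0) for m in TOKEN.finditer(code)
            if m.group(0).startswith('/')]
```

Applied to a statement like the one above, the string containing "/some/path/*" is skipped as a unit and only real comments are returned.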

Status of the project

The README states that this project might see more features and language support soon, and there are a number of PRs that would add a lot of value to the library. Still, they have not been merged or commented on in quite a while.

So what is the status of the project? Can one expect new language support at some point in the future, or should the project be considered feature-frozen?

What is the best way to reactivate it, add features and missing support, and ultimately get things into a release?

Python MIME type not recognised on Mac

In comment_parser.py, the MIME_MAP dictionary only recognises "text/x-python" for Python. On Mac, the MIME type is reported as "text/x-script.python" (at least by the magic module). This ought to be handled too.
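One possible way to handle this is to normalize alternative MIME strings before the lookup. This is a sketch under assumptions: `MIME_ALIASES` and `normalize_mime` are hypothetical names, and only the single alias reported above is included.

```python
# Hypothetical alias table: maps MIME strings that libmagic may report onto
# the canonical string used for lookup. Only this one entry comes from the
# report above; extend as needed.
MIME_ALIASES = {
    'text/x-script.python': 'text/x-python',
}


def normalize_mime(mime):
    """Return the canonical MIME type for `mime`, or `mime` itself if unknown."""
    return MIME_ALIASES.get(mime, mime)
```

The lookup in MIME_MAP would then use the normalized value, so existing keys stay unchanged.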

Javascript parsing fails with quoted path.

If a quoted path contains a /*, the parser fails with UnterminatedCommentError().
I found this when parsing a CDK file containing a policy statement like:

new iam.PolicyStatement({
        effect: iam.Effect.ALLOW,
        actions: [
          "logs:CreateLogGroup",
          "logs:CreateLogStream",
          "logs:PutLogEvents"
        ],
        resources: [`arn:aws:logs:${environment.region}:${environment.account}:log-group/*`]
      })

It would be nice if quotes were tracked, so the parser could tell whether a /* is inside a string rather than the start of a comment.
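A rough sketch of what tracking quotes could look like, written as a small character scanner. It is an illustration under assumptions, not the library's parser: the function name `find_comments` is made up, backtick template literals are treated like ordinary strings (nested backticks inside `${...}` are not handled), and JavaScript regex literals are ignored.

```python
def find_comments(code, quotes='"\'`'):
    """Return (line_number, text) pairs for // and /* */ comments outside strings."""
    comments = []
    i, line, n = 0, 1, len(code)
    while i < n:
        ch = code[i]
        if ch in quotes:
            # Inside a string: skip to the matching quote, honoring escapes.
            quote = ch
            i += 1
            while i < n and code[i] != quote:
                if code[i] == '\\':
                    i += 1  # move onto the escaped character
                if i < n and code[i] == '\n':
                    line += 1
                i += 1
            i += 1  # step over the closing quote
        elif code.startswith('//', i):
            end = code.find('\n', i)
            end = n if end == -1 else end
            comments.append((line, code[i + 2:end]))
            i = end
        elif code.startswith('/*', i):
            end = code.find('*/', i + 2)
            if end == -1:
                raise ValueError('unterminated block comment at line %d' % line)
            text = code[i + 2:end]
            comments.append((line, text))
            line += text.count('\n')
            i = end + 2
        else:
            if ch == '\n':
                line += 1
            i += 1
    return comments
```

With this approach the `log-group/*` inside the template literal is skipped as string content, and only comments that start outside a string are reported.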

libmagic database for golang?

Hey, sorry to post a non-issue like this, I just wanted to ask: Do you have a libmagic database for recognizing golang code?

Basically, I noticed that you support a text/x-go MIME type here (and work at Google, haha), so I thought it might be worth asking. I've looked around online but haven't managed to find one, and unfortunately I don't think I know Go well enough to write my own.

Anyway, no worries if not, sorry again for the noise

Infinite Loop

ff = '''
private String extractFileName(Header header) {
if (header != null) {
String value = header.getValue();
int start = value.indexOf(FILENAME_HEADER_PREFIX);
if (start != -1) {
value = value.substring(start + FILENAME_HEADER_PREFIX.length());
int end = value.indexOf('\"');
if (end != -1) {
return value.substring(0, end);
}
}
}
return null;
}'''

xx = comment_parser.extract_comments_from_str(ff,mime='text/x-java-source')

The script does not seem to be able to process this method. Specifically, the problematic instruction is the one in bold.
Do you have any workarounds?

c++ extraction fails

Hi! When I tried this tool to extract C++ comments from the attached file, it only extracted 9 comments. Obviously, there are more than 9 comments.
Expression.txt

Make libmagic an optional dependency

I use extract_comments_from_str only, and I know the mimetype and supply it always. However it's still complaining about "failed to find libmagic".

Can you make it import and call libmagic only if it's actually needed?
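A sketch of the lazy-import idea, assuming MIME detection goes through the python-magic package and its `magic.from_file(..., mime=True)` call. The helper name `resolve_mime` is hypothetical and this is not how comment_parser is currently structured.

```python
def resolve_mime(filename, mime=None):
    """Return `mime` if the caller supplied it; otherwise ask libmagic.

    Hypothetical helper, not comment_parser's actual code.
    """
    if mime is not None:
        return mime  # libmagic is never imported on this path
    try:
        import magic  # python-magic, imported only when detection is needed
    except ImportError as exc:
        raise RuntimeError(
            'MIME detection needs python-magic and libmagic; '
            'pass mime= explicitly to avoid this dependency') from exc
    return magic.from_file(filename, mime=True)
```

Callers that always pass a MIME type, as described above, would then never trigger the import.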

Mime Error

When running your tool on one of my script files, I got the following error:
AttributeError: 'str' object has no attribute 'decode'

If I remove the decode call, I then get the following error:
comment_parser.comment_parser.UnsupportedError: Unsupported MIME type

I am using Python 3.5.2
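The first error suggests that a value expected to be bytes was already a str. Assuming the failing call decodes the MIME string returned by the magic module (an assumption, since the traceback is not shown), a tolerant conversion could look like the sketch below; `to_text` is a hypothetical helper name.

```python
def to_text(value, encoding='utf-8'):
    """Return `value` as str, decoding only when it is actually bytes."""
    if isinstance(value, bytes):
        return value.decode(encoding)
    return value
```

This only addresses the AttributeError. The second error would still appear if the detected MIME type is not one the library supports.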

Extract comments from string?

Is there any way to extract comments from a string? I tried this:

comment_parser.extract_comments(a_list, mime='text/x-java-source')

However, I got:

~/anaconda3/envs/lib/python3.6/site-packages/comment_parser/comment_parser.py in extract_comments(filename, mime)
     78     except common.Error as exception:
     79         raise ParseError(str(exception))
---> 80     return parser.extract_comments(filename)
     81 
     82 

~/anaconda3/envs/lib/python3.6/site-packages/comment_parser/parsers/c_parser.py in extract_comments(filename)
     73         return comments
     74     except OSError as exception:
---> 75         raise common.FileError(str(exception))
     76 

FileError: [Errno 36] File name too long:

I guess it would be very handy to be able to extract the comments from a string. In my case, a_list is a list that contains the code as strings.
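A small sketch of how a list of code strings could be processed once a string-based extractor is available. The helper `extract_from_snippets` is hypothetical; the extractor is passed in as an argument so the sketch does not depend on any particular library version.

```python
def extract_from_snippets(snippets, extract, mime):
    """Run `extract` on each code string and collect the results per snippet."""
    return [extract(code, mime=mime) for code in snippets]


# Possible usage, assuming comment_parser is installed and exposes
# extract_comments_from_str as shown in the "Infinite Loop" report:
#   from comment_parser import comment_parser
#   results = extract_from_snippets(
#       a_list, comment_parser.extract_comments_from_str, 'text/x-java-source')
```

Passing the list itself as the first argument of extract_comments does not work because that function expects a file name, which is consistent with the FileError above.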
