langproc-cw's People

Contributors

dwrchyngqxs, fiwo735, johnwickerson, jpnock, sanjitraman, saturn691, simon-staal

langproc-cw's Issues

Memory leaks in compiler skeleton

TL;DR (relevant for students doing their coursework):

  • The -fsanitize=address -static-libasan flags don't detect most of the memory leaks in the current skeleton. Remove them and run your program under valgrind instead to see whether you're leaking memory.
  • The compiler skeleton currently leaks 4 blocks of memory. This will be fixed in an upcoming PR, but any extra blocks are on you.

Full story
For context, I've been working on making some improvements to the skeleton compiler we're providing, and I came across some peculiar behaviour when verifying that I'm not leaking any memory.

For starters, the -fsanitize=address -static-libasan flags that we compile with seem to be quite bad at detecting leaks. I don't think they instrument the files generated by the parser and lexer, which is where most of the memory allocation happens (I confirmed this by intentionally adding memory leaks to the parser, which the sanitizer did not detect at all). A better test is to use valgrind (with the aforementioned flags disabled, as they don't play nicely with it), which produces the following output when run on the current version of main:

root@3ac80c3969e9:/workspaces/langproc-cw# valgrind ./bin/c_compiler -S compiler_tests/_example/example.c -o /bin/output/test.s
// ...
==60672== HEAP SUMMARY:
==60672==     in use at exit: 16,930 bytes in 4 blocks
==60672==   total heap usage: 46 allocs, 42 frees, 102,014 bytes allocated
==60672== 
==60672== LEAK SUMMARY:
==60672==    definitely lost: 0 bytes in 0 blocks
==60672==    indirectly lost: 0 bytes in 0 blocks
==60672==      possibly lost: 0 bytes in 0 blocks
==60672==    still reachable: 16,930 bytes in 4 blocks
==60672==         suppressed: 0 bytes in 0 blocks
==60672== Rerun with --leak-check=full to see details of leaked memory
==60672== 
==60672== For lists of detected and suppressed errors, rerun with: -s
==60672== ERROR SUMMARY: 0 errors from 0 contexts (suppressed: 0 from 0)

Taking a closer look at what exactly is being leaked (using --leak-check=full --show-leak-kinds=all), we can see:

==60833== 8 bytes in 1 blocks are still reachable in loss record 1 of 4
==60833==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==60833==    by 0x1295D6: yyalloc(unsigned long) (lexer.yy.cpp:2346)
==60833==    by 0x128D89: yyensure_buffer_stack() (lexer.yy.cpp:2045)
==60833==    by 0x125F8D: yylex() (lexer.yy.cpp:838)
==60833==    by 0x1238ED: yyparse() (parser.tab.cpp:1108)
==60833==    by 0x124B78: ParseAST(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (parser.y:204)
==60833==    by 0x121FD1: Parse(CommandLineArguments&) (compiler.cpp:10)
==60833==    by 0x122794: main (compiler.cpp:51)
==60833== 
==60833== 64 bytes in 1 blocks are still reachable in loss record 2 of 4
==60833==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==60833==    by 0x1295D6: yyalloc(unsigned long) (lexer.yy.cpp:2346)
==60833==    by 0x1285C8: yy_create_buffer(_IO_FILE*, int) (lexer.yy.cpp:1884)
==60833==    by 0x125FC9: yylex() (lexer.yy.cpp:840)
==60833==    by 0x1238ED: yyparse() (parser.tab.cpp:1108)
==60833==    by 0x124B78: ParseAST(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (parser.y:204)
==60833==    by 0x121FD1: Parse(CommandLineArguments&) (compiler.cpp:10)
==60833==    by 0x122794: main (compiler.cpp:51)
==60833== 
==60833== 472 bytes in 1 blocks are still reachable in loss record 3 of 4
==60833==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==60833==    by 0x4B2A64D: __fopen_internal (iofopen.c:65)
==60833==    by 0x4B2A64D: fopen@@GLIBC_2.2.5 (iofopen.c:86)
==60833==    by 0x124AB0: ParseAST(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (parser.y:198)
==60833==    by 0x121FD1: Parse(CommandLineArguments&) (compiler.cpp:10)
==60833==    by 0x122794: main (compiler.cpp:51)
==60833== 
==60833== 16,386 bytes in 1 blocks are still reachable in loss record 4 of 4
==60833==    at 0x4848899: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==60833==    by 0x1295D6: yyalloc(unsigned long) (lexer.yy.cpp:2346)
==60833==    by 0x128624: yy_create_buffer(_IO_FILE*, int) (lexer.yy.cpp:1893)
==60833==    by 0x125FC9: yylex() (lexer.yy.cpp:840)
==60833==    by 0x1238ED: yyparse() (parser.tab.cpp:1108)
==60833==    by 0x124B78: ParseAST(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> >) (parser.y:204)
==60833==    by 0x121FD1: Parse(CommandLineArguments&) (compiler.cpp:10)
==60833==    by 0x122794: main (compiler.cpp:51)

Long story short, 3 of these are caused by flex and the other one is caused by not fclose()ing yyin. My PR (pending write access to the repo) will include fixes for both, but until then it's worth being aware of this.
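
For anyone wanting to guard against the unclosed yyin in their own code before the fix lands, here is a minimal sketch of one possible approach, not necessarily what the PR will do. It wraps the FILE* in a std::unique_ptr with a custom deleter so that fclose() runs on every exit path. FileCloser, FilePtr, OpenSource and ParseFile are hypothetical names made up for this illustration; the commented-out lines show roughly where the flex/bison calls would go in a real ParseAST. yylex_destroy() is the function flex generates for releasing the scanner's buffers, which I'd assume covers the three flex blocks, but I haven't verified that against this skeleton.

```cpp
#include <cstdio>
#include <memory>
#include <string>

// Deleter that closes a FILE* when the owning unique_ptr is destroyed.
struct FileCloser {
    void operator()(std::FILE* f) const noexcept {
        if (f != nullptr) {
            std::fclose(f);
        }
    }
};

// Hypothetical alias: a FILE* that closes itself.
using FilePtr = std::unique_ptr<std::FILE, FileCloser>;

// Hypothetical helper: opens a source file for reading.
// Returns an empty pointer if the file cannot be opened.
FilePtr OpenSource(const std::string& path) {
    return FilePtr(std::fopen(path.c_str(), "r"));
}

// Hypothetical stand-in for ParseAST. Returns false if the file
// could not be opened, true otherwise.
bool ParseFile(const std::string& path) {
    FilePtr file = OpenSource(path);
    if (!file) {
        return false;
    }

    // In the real skeleton (assumption), the parsing would go here:
    //   yyin = file.get();
    //   yyparse();
    //   yylex_destroy();  // generated by flex; frees the scanner's buffers

    return true;  // FileCloser calls fclose() as `file` goes out of scope
}
```

The point of the wrapper is that an early return or an exception thrown while parsing still closes the file, which a manual fclose() at the end of the function would miss.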

AST visualisation

I've started working on an AST visualisation in the form of a tree graph. Before I commit to the development, I wanted to get your feedback about the approach below, especially if you have some experience with similar tasks:

  1. Create parser_visualizer.y, which has the same grammar as parser_full.y, with each rule building a generic tree composed of TreeNode (declared in ast_node.hpp; it would hold no information other than a name and children).
  2. Write a C++ driver that parses the input program using parser_visualizer.y, then walks the tree and outputs a very simple .dot file.
  3. Call the Graphviz CLI (or similar) to create a graph, using built-in arguments to make it look nice.
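
To make steps 1 and 2 more concrete, here is a rough sketch of what the generic node and the .dot output could look like. Everything in it is an assumption for illustration only: the TreeNode layout, AddChild, WriteDotNode and ToDot are hypothetical names, and it assumes node names contain no double quotes (labels are not escaped).

```cpp
#include <memory>
#include <ostream>
#include <sstream>
#include <string>
#include <utility>
#include <vector>

// Hypothetical generic node: nothing but a name and children.
struct TreeNode {
    std::string name;
    std::vector<std::unique_ptr<TreeNode>> children;

    explicit TreeNode(std::string n) : name(std::move(n)) {}

    // Appends a child and returns a reference to it so calls can be chained.
    TreeNode& AddChild(std::string child_name) {
        children.push_back(std::make_unique<TreeNode>(std::move(child_name)));
        return *children.back();
    }
};

// Writes `node` and its subtree in pre-order, one "nK" identifier per node.
// `next_id` is a running counter; the return value is the id given to `node`.
int WriteDotNode(const TreeNode& node, std::ostream& out, int& next_id) {
    const int id = next_id++;
    out << "  n" << id << " [label=\"" << node.name << "\"];\n";
    for (const auto& child : node.children) {
        const int child_id = WriteDotNode(*child, out, next_id);
        out << "  n" << id << " -> n" << child_id << ";\n";
    }
    return id;
}

// Returns the whole tree as the text of a Graphviz digraph.
std::string ToDot(const TreeNode& root) {
    std::ostringstream out;
    out << "digraph AST {\n";
    int next_id = 0;
    WriteDotNode(root, out, next_id);
    out << "}\n";
    return out.str();
}
```

Writing the result of ToDot to a file such as ast.dot and running `dot -Tpng ast.dot -o ast.png` would then cover step 3. Numbering the nodes with a counter rather than using their names as identifiers keeps repeated names (e.g. several identical statement nodes) distinct in the graph.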

I've also considered and found issues with other approaches/aspects:

  • Walk the already existing AST (parsed with ParseAST in compiler.cpp): no easy way to walk since we removed the generic branches_, hence each node would require overriding a base virtual function like Walk in order to handle all named children.
  • Put the C++ visualiser logic in compiler.cpp: the logic might be non-trivial, adding to the existing complexity of understanding that file and making it look scarier + separation is usually nice.
  • Use C++ Graphviz module to manipulate the graphs directly in C++: looks less portable if we ever wanted to change the graphing + more cumbersome to implement it (Graphviz CLI seems easier to use).
  • Use GCC/Clang to generate .dot files: their AST is different to ours + it'd be hard to match the C version + that option is not very well documented/developed (to my surprise, but maybe that's due to insufficient googling).

(SUGGESTION) Add instructions to attach GDB to `.vscode/launch.json`

Hi @Fiwo735, it's @saturn691, the guy who wrote the Python script.

In my opinion, one of the best debugging tools for this CW has been attaching GDB to VSCode. GDB on the command line is OK but nowhere near as powerful, though you may disagree with me here. Perhaps add a few instructions to the debugging section to make everyone aware of VSCode's integrated debugger?
