syntactic-neural-code-completion

An adaptation of the paper "A Syntactic Neural Model for General-Purpose Code Generation" to code completion.

To visualise the AST of a .java file:

./scripts/sh_compile_and_visualise.sh -v -- ./test/Example.java

Tests/Debug:

python ./test/test_compute_action_sequence.py --max-num-file=10 ../../corpus-features/jsoup/
python ./test/test_tensorise_sequence.py --max-num-file=10 ../../corpus-features/jsoup/
python ./test/test_compute_grammar.py --max-num-file=10 ../../corpus-features/jsoup/
python ./test/test_compute_vocabulary.py --max-num-file=10 ./test
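
The test scripts above refer to action sequences and a grammar. In the paper this project adapts, an AST is produced by a sequence of actions in depth-first, left-to-right order: ApplyRule expands a non-terminal with a grammar rule, and GenToken emits a terminal token. The sketch below only illustrates that idea and is not this repository's implementation. The tuple-based tree encoding, the function name `action_sequence` and the node labels in the example are assumptions made for illustration.

```python
from typing import List, Tuple


def action_sequence(tree) -> List[Tuple[str, str]]:
    """Flatten a tree into (action, argument) pairs, depth-first and left to right.

    A tree is (node_type, children) for an internal node,
    or (node_type, token) for a leaf holding a terminal token.
    """
    node_type, payload = tree
    if isinstance(payload, str):
        # Leaf: the terminal token is generated directly.
        return [("GenToken", payload)]
    # Internal node: record which production expands this node type.
    rule = f"{node_type} -> {' '.join(child[0] for child in payload)}"
    actions = [("ApplyRule", rule)]
    for child in payload:
        actions.extend(action_sequence(child))
    return actions


# Hypothetical mini-tree for a declaration like `int a`.
example = ("VARIABLE", [("TYPE", "int"), ("NAME", "a")])
actions = action_sequence(example)
for action in actions:
    print(action)
```

Collecting the distinct ApplyRule arguments over a corpus gives a set of grammar rules, and collecting the GenToken arguments gives a token vocabulary, which is one plausible reading of what the grammar and vocabulary tests check.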

Compute data:

python train.py --compute-data\
                --saved-data-dir="./data"\
                --train-data-dir="../corpus-features"\
                --log-file="./logs/training.log"\
                --tensorboard-logs-path="./logs_tensorboard"\
                --max-num-files 50

Train:

python train.py --model='v2'\
                --save-dir="./trained_models"\
                --saved-data-dir="./data/250"\
                --log-file='./logs/training.log'\
                --tensorboard-logs-path="./logs_tensorboard"\
                --max-num-epochs 100\
                --patience 5
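
The `--max-num-epochs` and `--patience` flags suggest early stopping. The sketch below shows the conventional meaning of those two settings, assuming training stops once the monitored validation loss has not improved for `patience` consecutive epochs. Whether `train.py` monitors loss or accuracy is not stated here, and the function `run_with_patience` is hypothetical.

```python
def run_with_patience(valid_losses, max_num_epochs, patience):
    """Simulate early stopping over a list of per-epoch validation losses.

    Returns (epochs_run, best_epoch), with best_epoch zero-based.
    """
    best_loss = float("inf")
    best_epoch = -1
    epochs_without_improvement = 0
    epochs_run = 0
    for epoch, loss in enumerate(valid_losses[:max_num_epochs]):
        epochs_run = epoch + 1
        if loss < best_loss:
            # New best: this is where a "best model" checkpoint would be saved.
            best_loss, best_epoch = loss, epoch
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                break  # no improvement for `patience` epochs in a row
    return epochs_run, best_epoch


# Made-up losses, purely to exercise the logic.
losses = [0.9, 0.7, 0.8, 0.75, 0.71, 0.6]
print(run_with_patience(losses, max_num_epochs=100, patience=2))
print(run_with_patience(losses, max_num_epochs=100, patience=5))
```

With a small patience the run stops before the later improvement at the last epoch is reached, which is the usual trade-off between training time and the risk of stopping too early.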

Hyper-parameter search:

python hyper_parameter_search.py --model='v2'\
                                  --save-dir="./trained_models"\
                                  --saved-data-dir="./data/250"\
                                  --log-file='./logs/training.log'\
                                  --log-file-hyperparams='./logs'\
                                  --tensorboard-logs-path="./logs_tensorboard"\
                                  --max-num-epochs 5\
                                  --patience 10

Evaluate:

python evaluate.py --trained-model="trained_models/RNNModel-2020-03-05-15-24-55_best_model.bin"\
                   --saved-data-dir="./data/250"\
                   --model="v1"\
                   --qualitative

Compute training data statistics:

python read_training_data.py --train-data-dir="../corpus-features"

Best models trained:

v1 RNNModel-2020-03-05-15-24-55_best_model.bin
trained on 5000 training files, --max-num-epochs 400 --patience 10
train_data:  Loss 0.0043, Acc 0.827
valid_data:  Loss 0.0057, Acc 0.806
seen_test_data:  Loss 0.0056, Acc 0.808
unseen_test_data:  Loss 0.0066, Acc 0.778

v2
trained on 5000 training files

Wiki of internals

  • A Node has two important fields: .type (the type of the node in our own AST) and .contents (the Java symbol type).
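
As a rough illustration of the two fields, here is a minimal sketch of such a node. Only the field names `type` and `contents` come from the description above. The `children` field, the `describe` helper and the example values are assumptions, and the real class may look quite different.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    type: str       # type of the node in the project's own AST
    contents: str   # Java symbol type (placeholder strings are used below)
    children: List["Node"] = field(default_factory=list)  # assumed field


def describe(node: Node, depth: int = 0) -> List[str]:
    """Return one indented line per node, showing both fields."""
    lines = [f"{'  ' * depth}{node.type} [{node.contents}]"]
    for child in node.children:
        lines.extend(describe(child, depth + 1))
    return lines


# Hypothetical values, for illustration only.
root = Node("VARIABLE", "SYMBOL_VAR", [Node("TYPE", "SYMBOL_TYPE")])
print("\n".join(describe(root)))
```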

Special things in the AST

  • In a declaration such as int a, b, c, each variable has its own VARIABLE node with its own TYPE child, and all three TYPE nodes are connected to the same PRIMITIVE_TYPE node.
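
The shared PRIMITIVE_TYPE node means this part of the structure is a graph rather than a strict tree. A minimal sketch of that shape, using integer node ids and a parent-to-child edge list, might look as follows. The encoding, the edge direction and the function `build_declaration` are assumptions for illustration. Only the node labels come from the description above.

```python
def build_declaration(names, primitive="int"):
    """Build nodes and edges for a declaration like `int a, b, c`.

    Returns (nodes, edges): nodes maps id -> label, edges is a list of
    (parent_id, child_id) pairs.
    """
    nodes = {}
    edges = []

    def add(label):
        node_id = len(nodes)
        nodes[node_id] = label
        return node_id

    # Created once and shared by every variable in the declaration.
    primitive_id = add(f"PRIMITIVE_TYPE({primitive})")
    for name in names:
        variable_id = add(f"VARIABLE({name})")
        type_id = add("TYPE")  # each variable gets its own TYPE child
        edges.append((variable_id, type_id))
        edges.append((type_id, primitive_id))
    return nodes, edges


nodes, edges = build_declaration(["a", "b", "c"])
type_parents = [parent for parent, child in edges if child == 0]
print(len(nodes), "nodes;", len(type_parents), "TYPE nodes point at node 0")
```

Any traversal over such a structure needs to account for nodes with more than one parent, for example by tracking visited ids.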

Contributors

andreimargeloiu
