karthikrajkumar / demo

This project forked from yijunyu/demo

ICSE 2019 Demo

Home Page: https://gitpod.io#https://github.com/yijunyu/demo

License: BSD 2-Clause "Simplified" License

Shell 0.12% CMake 0.02% Dockerfile 0.01% Java 0.89% Ruby 0.01% Objective-C 0.56% Swift 0.01% CSS 0.01% PowerShell 0.01% Batchfile 0.01% Groovy 0.01% C# 13.17% C 78.39% C++ 4.37% Makefile 0.24% DIGITAL Command Language 0.01% Roff 0.01% Perl 0.12% Python 2.10% 1C Enterprise 0.01%

demo's Introduction

Flattened Abstract Syntax Trees

This repository provides a demonstration of the deep learning package for classifying the code parsed by the fast utility. See also the Visual Studio Code Extension.

You can run fast on your own machine as a Docker container, but here you don't even need that: all binary and Python dependencies are provided, including the trained models and the pre-trained embeddings.

To reproduce the results, all you need to do is allow the Gitpod app to access your GitHub account, so that the commands can run on a remote server of your own.

Usage of fAST in Deep Learning for Algorithm Classification

Examples of algorithms in Java and C++ are provided to test the deep learning tool for algorithm classification. Once your Gitpod machine is running, it launches the following command:

run.sh datasets/github_java_10/4/1.java

You will see the predicted probability distribution over the class labels: a correctly classified label is shown in blue, and a misclassified label is shown in red.

To understand why, click the HTML file "datasets/github_java_10/4/1.html" and use the Preview button in the top-right corner of the tab to see the visualisation in a split pane. The colours on the tokens indicate which parts of the code received the most attention from the classification algorithm.

To run another example, type:

run.sh datasets/github_java_10/4/3.java
run.sh datasets/github_cs_10/4/1.cs
run.sh datasets/github_cpp_10/4/1.cpp

These examples show that even though the model was trained on Java programs, it usually also works well when applied to other programming languages such as C# or C++. We call this feature "Cross-Language Algorithm Classification" [Bui et al., SANER'19].
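To run several examples in one go, the commands above can be wrapped in a small loop. The following is a minimal sketch and not part of the demo: `classify_samples` and the `RUNNER` variable are hypothetical names introduced here, and `RUNNER` exists only so that the loop can be dry-run (for example with `RUNNER=echo`) outside the Gitpod workspace.

```shell
#!/bin/sh
# Sketch: classify several sample files with the demo's run.sh.
# RUNNER is a hypothetical override (not provided by the demo);
# it defaults to run.sh, the command used in the examples above.
RUNNER="${RUNNER:-run.sh}"

# classify_samples FILE...
# Prints a header line for each file, then runs the classifier on it.
# A failing run is reported on stderr and the loop continues.
classify_samples() {
  for f in "$@"; do
    echo "== $f"
    "$RUNNER" "$f" || echo "run failed for $f" >&2
  done
}
```

For example, `classify_samples datasets/github_java_10/4/3.java datasets/github_cs_10/4/1.cs datasets/github_cpp_10/4/1.cpp` runs one Java, one C# and one C++ sample in sequence.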

Usage of the fAST utility

cd datasets

# print the command line options and arguments
fast
# convert a C++ file into its protobuf representation
fast tensorflow-1.0.1/tensorflow/cc/saved_model/loader_test.cc tensorflow-1.0.1/tensorflow/cc/saved_model/loader_test.cc.pb
# convert a Java file into its flatbuffers representation
fast RxJava-1.2.9/src/test/java/rx/ErrorHandlingTests.java RxJava-1.2.9/src/test/java/rx/ErrorHandlingTests.java.fbs
# convert a flatbuffers representation back to C#
fast corefx-1.0.4/src/System.IO.IsolatedStorage/ref/System.IO.IsolatedStorage.cs.fbs corefx-1.0.4/src/System.IO.IsolatedStorage/ref/System.IO.IsolatedStorage.cs
# slice a program
fast -S -G RxJava-1.2.9/src/test/java/rx/ErrorHandlingTests.java RxJava-1.2.9/src/test/java/rx/ErrorHandlingTests-ggnn.fbs
# diff two programs
fast -D github_java_10/4/1.java github_java_10/4/3.java
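The conversion commands above name the output by appending `.pb` or `.fbs` to the input path. The sketch below applies the same naming to a list of files. It assumes, as those examples suggest, that fast selects the output format from the output file's extension. `convert_files` and the `FAST` variable are hypothetical names introduced here, not part of fAST; `FAST` exists only to allow a dry run such as `FAST=echo`.

```shell
#!/bin/sh
# Sketch: convert many source files using the "<input>.<format>" naming
# shown in the examples above.
# FAST is a hypothetical override (not a fAST feature); it defaults to
# the fast command itself.
FAST="${FAST:-fast}"

# convert_files FORMAT FILE...
# FORMAT is the output extension used in the examples: pb or fbs.
convert_files() {
  if [ "$#" -lt 2 ]; then
    echo "usage: convert_files FORMAT FILE..." >&2
    return 1
  fi
  format="$1"
  shift
  for src in "$@"; do
    # Same call shape as above: fast <input> <input>.<format>
    "$FAST" "$src" "$src.$format"
  done
}
```

From the datasets directory, `convert_files pb tensorflow-1.0.1/tensorflow/cc/saved_model/*.cc` would write one `.cc.pb` file next to each C++ source.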

Usage of fAST in Bug Localisation

cd usr/bin

java -cp /workspace/demo/usr/config:/workspace/demo/usr/config/lic:/workspace/demo/usr/lib/ConCodeSe-1.0.0.jar com.concodese.ConCodeSeJettyServerStarter SERVER_PORT=8081

You can call fAST from any directory once Docker is installed:

alias fast='docker run -v $PWD:/e yijun/fast'

Reference and Applications

Yijun Yu. "fAST: Flattening Abstract Syntax Trees for Efficiency". In: 41st ACM/IEEE International Conference on Software Engineering, 25-31 May 2019, Montreal, Canada, ACM and IEEE. demo, paper, poster

Deep Learning

Nghi D. Q. Bui, Yijun Yu, Lingxiao Jiang. "Learning Cross-Language API Mappings with Little Knowledge", In the 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), Tallinn, Estonia, 26-30 August 2019.

Nghi D. Q. Bui, Yijun Yu, Lingxiao Jiang. "Bilateral Dependency Neural Networks for Cross-Language Algorithm Classification", In the 26th edition of the IEEE International Conference on Software Analysis, Evolution and Reengineering, Research Track, Hangzhou, China, February 24-27, 2019. GGNN, DTBCNN

Nghi D. Q. Bui, Lingxiao Jiang, and Yijun Yu. "Cross-Language Learning for Program Classification Using Bilateral Tree-Based Convolutional Neural Networks", In the proceedings of the 32nd AAAI Conference on Artificial Intelligence (AAAI) Workshop on NLP for Software Engineering, New Orleans, Louisiana, USA, 2018. Bi-TBCNN

Miltiadis Allamanis, Marc Brockschmidt, Mahmoud Khademi. "Learning to Represent Programs with Graphs", In: 6th International Conference on Learning Representations (ICLR), 2018. GGNN

Y. Li, D. Tarlow, M. Brockschmidt, R. Zemel. "Gated graph sequence neural networks", In: 4th International Conference on Learning Representations (ICLR), 2016.

Lili Mou, Ge Li, Lu Zhang, Tao Wang, Zhi Jin: "Convolutional Neural Networks over Tree Structures for Programming Language Processing". In: AAAI 2016: 1287-1293. TBCNN, datasets/pku_cpp_104/

Parsing

M. L. Collard and J. I. Maletic, "srcML 1.0: Explore, Analyze, and Manipulate Source Code," 2016 IEEE International Conference on Software Maintenance and Evolution (ICSME), Raleigh, NC, 2016, pp. 649-649. srcML

Parr, T. J. and Quong, R. W. 1995. "ANTLR: a predicated-LL(k) parser generator". Softw. Pract. Exper. 25, 7 (Jul. 1995), 789-810. ANTLR

Slicing

Hakam W. Alomari, Michael L. Collard, Jonathan I. Maletic, Nouh Alhindawi and Omar Meqdadi. "srcSlice: very efficient and scalable forward static slicing". Software: Evolution and Process, 26(11):931-961, November 2014.

Diffing

Jean-Rémy Falleri, Floréal Morandat, Xavier Blanc, Matias Martinez, and Martin Monperrus. 2014. "Fine-grained and accurate source code differencing". In Proceedings of the 29th ACM/IEEE international conference on Automated software engineering (ASE '14). ACM, New York, NY, USA, 313-324. GumTreeDiff

Yijun Yu, Thein Thun Tun, and Bashar Nuseibeh, "Specifying and detecting meaningful changes in programs," In: Proc. of the 26th IEEE/ACM Conference on Automated Software Engineering, pp. 273-282, 2011. MCT

Bug Localisation

Tezcan Dilshener, Michel Wermelinger, Yijun Yu: "Locating bugs without looking back". Automated Software Engineering 25(3): 383-434 (2018). ConCodeSe
