jsdiffer's People

Contributors

mkshiblu

jsdiffer's Issues

Debug replacement type for vue runtime snippet

v1

/**
 * Runtime helper for checking keyCodes from config.
 * exposed as Vue.prototype._k
 * passing in eventKeyName as last argument separately for backwards compat
 */
function checkKeyCodes (
    eventKeyCode,
    key,
    builtInAlias,
    eventKeyName
) {
    var keyCodes = config.keyCodes[key] || builtInAlias;
    if (keyCodes) {
        if (Array.isArray(keyCodes)) {
            return keyCodes.indexOf(eventKeyCode) === -1
        } else {
            return keyCodes !== eventKeyCode
        }
    } else if (eventKeyName) {
        return hyphenate(eventKeyName) !== key
    }
}

v2

function isKeyNotMatch (expect, actual) {
    if (Array.isArray(expect)) {
        return expect.indexOf(actual) === -1
    } else {
        return expect !== actual
    }
}

/**
 * Runtime helper for checking keyCodes from config.
 * exposed as Vue.prototype._k
 * passing in eventKeyName as last argument separately for backwards compat
 */
function checkKeyCodes (
    eventKeyCode,
    key,
    builtInKeyCode,
    eventKeyName,
    builtInKeyName
) {
    var mappedKeyCode = config.keyCodes[key] || builtInKeyCode;
    if (builtInKeyName && eventKeyName && !config.keyCodes[key]) {
        return isKeyNotMatch(builtInKeyName, eventKeyName)
    } else if (mappedKeyCode) {
        return isKeyNotMatch(mappedKeyCode, eventKeyCode)
    } else if (eventKeyName) {
        return hyphenate(eventKeyName) !== key
    }
}

diff

image

-- Mappings

Parent Mapper Mappings

image

Here, on the parent mapper, the replacement type ArgumentReplacedWithVariable is incorrect. The statements themselves, however, should still be matched.

Match nested function declarations

Unlike Java, JavaScript allows functions and other code to reside directly in a source file without being inside a class. A whole program can be written as a script.

To match the statements of such a script, the whole file can be thought of as a top-level function declaration, or a container. Similarly, nested function declarations can be matched recursively, which preserves the bottom-up approach.

Therefore, one way to match statements inside any container (a source file, a function declaration (FD), or a class declaration (CD)) would be a recursive procedure that matches the statements in order to map the containers.
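As a rough illustration of the recursive idea, the sketch below treats a file and a function declaration uniformly as containers. The container shape (`name`, `statements`, `functions`) and the exact-text statement matching are simplifying assumptions made for this example, not the miner's actual model.

```javascript
// Hypothetical container shape: { name, statements: string[], functions: Container[] }.
// A source file and a function declaration are both modeled as containers.
function matchContainers(oldC, newC) {
  const result = { container: newC.name, matched: [], removed: [], added: [], children: [] };

  // Bottom-up: map nested function declarations that share a name first.
  for (const oldFn of oldC.functions) {
    const newFn = newC.functions.find((f) => f.name === oldFn.name);
    if (newFn) result.children.push(matchContainers(oldFn, newFn));
  }

  // Naive statement matching by identical text (a real matcher would allow replacements).
  const remaining = [...newC.statements];
  for (const stmt of oldC.statements) {
    const i = remaining.indexOf(stmt);
    if (i === -1) {
      result.removed.push(stmt);
    } else {
      result.matched.push(stmt);
      remaining.splice(i, 1);
    }
  }
  result.added = remaining;
  return result;
}

const before = {
  name: "a.js",
  statements: ["var y;"],
  functions: [{ name: "A", statements: ["alert('Hello');"], functions: [] }],
};
const after = {
  name: "a.js",
  statements: ["var y;", "var z;"],
  functions: [{ name: "A", statements: ["alert('Hi');"], functions: [] }],
};
const fileMapping = matchContainers(before, after);
```

The same function handles the file-level statements and the body of `A`, which is the point of using one container abstraction.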

Write integration tests for refactorings

Because of the differences between Java and JavaScript, especially in function signature matching, a previously working refactoring detection often stops working or returns a wrong mapping. It is best to write integration tests to ensure that the mappings are correct.

Support all JavaScript statements

Related to #6: for prototyping, we only supported a small but most relevant subset of statements, such as FunctionDeclarations and IfStatements. We should support parsing and loading the remaining statements, such as While and ForLoop, as we go along.

Support regular expression pattern

A regular expression literal consists of a pattern enclosed between slashes, as follows:

let re = /ab+c/;

The miner could treat this as a string literal.
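A minimal sketch of that idea, assuming a Babel-style AST in which `/ab+c/` is a `RegExpLiteral` node with `pattern` and `flags` fields. The node is built by hand here, and `collectStringLiteral` is a hypothetical helper name.

```javascript
// Records string literals, and treats a regular expression literal as one of them.
function collectStringLiteral(node, stringLiterals) {
  if (node.type === "StringLiteral") {
    stringLiterals.push(node.value);
  } else if (node.type === "RegExpLiteral") {
    // Store the regex source text alongside ordinary string literals.
    stringLiterals.push(`/${node.pattern}/${node.flags}`);
  }
  return stringLiterals;
}

// Hand-built node for: let re = /ab+c/;
const regexNode = { type: "RegExpLiteral", pattern: "ab+c", flags: "" };
const literals = collectStringLiteral(regexNode, []);
```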

Support Try Statement and Catch Clause

In RefactoringMiner (RM), the catch clause is treated as a child statement of the try statement's parent. It is therefore not part of the try statement itself, which is how the AST represents it by default.
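One possible way to reshape the AST accordingly is sketched below. It assumes a Babel/ESTree-style `TryStatement` with `block`, `handler`, and `finalizer` fields; `flattenTry` is a hypothetical helper, and leaving the `finalizer` on the try node is a choice made only for this example.

```javascript
// Returns the statements to append to the parent's child list:
// the try statement without its catch clause, followed by the catch clause.
function flattenTry(tryNode) {
  const { handler, ...tryWithoutCatch } = tryNode;
  return handler ? [tryWithoutCatch, handler] : [tryWithoutCatch];
}

// Hand-built node for: try { } catch (error) { }
const tryNode = {
  type: "TryStatement",
  block: { type: "BlockStatement", body: [] },
  handler: {
    type: "CatchClause",
    param: { type: "Identifier", name: "error" },
    body: { type: "BlockStatement", body: [] },
  },
  finalizer: null,
};
const siblings = flattenTry(tryNode);
```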

Fix static nature of Argumentizer (CodeAfterReplacement) and rename it

The caching nature of Argumentizer is bug-prone, and the class should be renamed.

Basically, it holds the argumentized string after replacement and is used in many places. This is similar to the CodeFragementAfterReplacement variable of the node.

Take special care when resetting it and when handling the references to it.

Detect Extract Function Refactoring

Functions with the same signature will have a body mapper that contains the sets of matched and unmatched statements. For now, the signature can be taken to be the function name alone: in JavaScript, duplicate function names in the same scope are uncommon, because only the last declaration takes effect. It is therefore safe to assume that almost all code uses a unique name for each function within a scope.

Other functions in the same file with different names are considered potentially removed (old code, left-hand side) or potentially added (new code, right-hand side).

When a function is extracted, it is called from the original function. For each pair of same-name functions (which are mapped using FunctionBodyMapper), we need to check the function calls in the right-hand-side code and see whether any of them invokes one of the added functions. If so, we perform another matching between the added function and the body mapper.
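The candidate search described above could be sketched as follows. The function model (`name` plus a list of called function names) and `findExtractCandidates` are hypothetical and stand in for the real body mappers; the second matching step is not shown.

```javascript
// Hypothetical model: each function is { name, calls: string[] }.
function findExtractCandidates(oldFunctions, newFunctions) {
  const oldNames = new Set(oldFunctions.map((f) => f.name));
  // Functions that exist only on the right-hand side are potentially added.
  const added = new Set(
    newFunctions.filter((f) => !oldNames.has(f.name)).map((f) => f.name)
  );

  const candidates = [];
  for (const newFn of newFunctions) {
    if (!oldNames.has(newFn.name)) continue; // only functions mapped by name
    for (const callee of new Set(newFn.calls)) {
      if (added.has(callee)) candidates.push({ from: newFn.name, extracted: callee });
    }
  }
  return candidates;
}

// Loosely modeled on the checkKeyCodes example (v1 vs. v2).
const v1 = [{ name: "checkKeyCodes", calls: ["isArray", "indexOf", "hyphenate"] }];
const v2 = [
  { name: "isKeyNotMatch", calls: ["isArray", "indexOf"] },
  { name: "checkKeyCodes", calls: ["isKeyNotMatch", "isKeyNotMatch", "hyphenate"] },
];
const extractCandidates = findExtractCandidates(v1, v2);
```

Each candidate pair would then be handed to the second matching step between the added function and the existing body mapper.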

Parse Function Declaration as Object Property

From jQuery's QUnit:

    queueHook: function (hook, hookName) {
        var promise,
            test = this;
        return function runHook() {
            config.current = test;
            if (config.notrycatch) {
                callHook();
                return;
            }
            try {
                callHook();
            } catch (error) {
                test.pushFailure(hookName + " failed on " + test.testName + ": " +
                    (error.message || error), extractStacktrace(error, 0));
            }

            function callHook() {
                promise = hook.call(test.testEnvironment, test.assert);
                test.resolvePromise(promise, hookName);
            }
        };
    },

Here queueHook is a property whose value is a function, so it can be called like any other function.

For example:

obj.queueHook(10, "dsa");

Remove simple named Arguments for replacements

While parsing, we need to make sure that things that are replaced separately (literals, variables, etc.) are not duplicated in the arguments or anywhere else.

expect.indexOf(actual) === -1

Here, since actual is a simple name, the result should be

identifiers = [expect, actual]

and arguments should not contain anything, since simple names are replaced later.

The corresponding filter in Java:

private void processArgument(Expression argument) {
		if(argument instanceof SuperMethodInvocation ||
				argument instanceof Name ||
				argument instanceof StringLiteral ||
				argument instanceof BooleanLiteral ||
				(argument instanceof FieldAccess && ((FieldAccess)argument).getExpression() instanceof ThisExpression) ||
				(argument instanceof ArrayAccess && invalidArrayAccess((ArrayAccess)argument)) ||
				(argument instanceof InfixExpression && invalidInfix((InfixExpression)argument)))
			return;
		this.arguments.add(argument.toString());
		if(current.getUserObject() != null) {
			AnonymousClassDeclarationObject anonymous = (AnonymousClassDeclarationObject)current.getUserObject();
			anonymous.getArguments().add(argument.toString());
		}
	}
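A JavaScript-side counterpart of this filter might look like the sketch below. It assumes Babel-style node types and covers only the simple-name, literal, and `this.x` cases; the array-access and infix checks of the Java version are omitted. The `text` field holding each node's source text is a hypothetical convenience.

```javascript
// Babel-style node types that are already recorded separately (identifiers, literals).
const SEPARATELY_RECORDED = new Set(["Identifier", "StringLiteral", "BooleanLiteral"]);

function shouldRecordArgument(arg) {
  if (SEPARATELY_RECORDED.has(arg.type)) return false;
  // this.x mirrors the FieldAccess-on-ThisExpression case of the Java filter.
  if (arg.type === "MemberExpression" && arg.object.type === "ThisExpression") return false;
  return true;
}

function collectArguments(callNode) {
  return callNode.arguments.filter(shouldRecordArgument).map((arg) => arg.text);
}

// expect.indexOf(actual): the only argument is a simple name, so nothing is recorded.
const indexOfCall = {
  type: "CallExpression",
  arguments: [{ type: "Identifier", name: "actual", text: "actual" }],
};
// f(a + 1): a binary expression is not a simple name, so it is recorded.
const sumCall = {
  type: "CallExpression",
  arguments: [{ type: "BinaryExpression", text: "a + 1" }],
};
const recorded = [collectArguments(indexOfCall), collectArguments(sumCall)];
```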

Support Function Expression

var loggingCallback = function( callback ) {
			if ( objectType( callback ) !== "function" ) {
				throw new Error(
					"QUnit logging methods require a callback function as their first parameters."
				);
			}

			config.callbacks[ key ].push( callback );
		};

Implement Inline Method Refactoring

Similar to inline method refactoring there is a UML operation body mapper constructor for the removed operation to detect inline refactoring

Add flexibility to input API for testing

We need APIs for testing code between two directories, two code snippets, etc. The current code is too messy for utilizing the underlying directories by Git.

[Bug] Fix variables and object creations not coming from JS

return new Address(host, port);

The JSON returns empty lists for object creations and variables:

{
    "type": "BlockStatement",
    "text": "{",
    "statements": [{
            "text": "this.host = host;",
            "identifiers": [null, "host", "host"],
            "numericLiterals": [],
            "stringLiterals": [],
            "infixOperators": ["="],
            "prefixOperators": [],
            "postfixOperators": [],
            "variableDeclarations": [],
            "functionInvocations": [],
            "constructorInvocations": [],
            "objectCreations": [],
            "arguments": [],
            "loc": {
                "start": 496,
                "end": 513,
                "startLine": 26,
                "endLine": 26,
                "startColumn": 4,
                "endColumn": 21
            },
            "type": "ExpressionStatement"
        }, {
            "text": "this.port = port;",
            "identifiers": [null, "port", "port"],
            "numericLiterals": [],
            "stringLiterals": [],
            "infixOperators": ["="],
            "prefixOperators": [],
            "postfixOperators": [],
            "variableDeclarations": [],
            "functionInvocations": [],
            "constructorInvocations": [],
            "objectCreations": [],
            "arguments": [],
            "loc": {
                "start": 518,
                "end": 535,
                "startLine": 27,
                "endLine": 27,
                "startColumn": 4,
                "endColumn": 21
            },
            "type": "ExpressionStatement"
        }
    ],
    "loc": {
        "start": 490,
        "end": 537,
        "startLine": 25,
        "endLine": 28,
        "startColumn": 29,
        "endColumn": 1
    }
}

The variables should hold port, and the object creations should hold the Address creation.

Support Statements which are directly inside the file

In RefactoringMiner, statements are always assumed to be inside methods. Field initializations and static initializer blocks are not matched in the same way as the statements inside a method.

In JavaScript, however, we have found that most of the evaluation projects mentioned in RefDiff (Vue, jQuery, etc.) have statements directly in the file rather than within a function declaration.

Vue Runtime
https://github.com/vuejs/vue/blob/9edcc6b6c7612efc15e4bfc5079279533190a2f2/dist/vue.runtime.js

/*!
 * Vue.js v2.5.4
 * (c) 2014-2017 Evan You
 * Released under the MIT License.
 */
(function (global, factory) {
	typeof exports === 'object' && typeof module !== 'undefined' ? module.exports = factory() :
	typeof define === 'function' && define.amd ? define(factory) :
	(global.Vue = factory());
}(this, (function () { 'use strict';

/*  */

var emptyObject = Object.freeze({});


//`.....more`


}, 0);

/*  */

return Vue$3;

})));

Here almost the whole program is written inside the function expression that is passed as the factory argument.

jQuery QUnit:

https://github.com/jquery/jquery/blob/ecd8ddea33dc40ae2a57e4340be03faf2ba2f99b/external/qunit/qunit.js

It starts with this snippet:
( function( global ) {

var QUnit = {};

var Date = global.Date;
var now = Date.now || function() {
	return new Date().getTime();
};

var setTimeout = global.setTimeout;
var clearTimeout = global.clearTimeout;

Snippet from Chart.js

https://github.com/chartjs/Chart.js/blob/35dcfe00b1ae7199f8ed6c3748a72f4700c9876d/src/scales/scale.time.js

/* global window: false */
'use strict';

var moment = require('moment');
moment = typeof(moment) === 'function' ? moment : window.moment;

module.exports = function(Chart) {

	var helpers = Chart.helpers;
	var interval = {
		millisecond: {
			size: 1,
			steps: [1, 2, 5, 10, 20, 50, 100, 250, 500]
		},
		second: {
			size: 1000,
			steps: [1, 2, 5, 10, 30]
		},
		minute: {
			size: 60000,
			steps: [1, 2, 5, 10, 30]
		},
		hour: {
			size: 3600000,
			steps: [1, 2, 3, 6, 12]
		},
		day: {
			size: 86400000,
			steps: [1, 2, 5]
		},
		week: {
			size: 604800000,
			maxStep: 4
		},
		month: {
			size: 2.628e9,
			maxStep: 3
		},
		quarter: {
			size: 7.884e9,
			maxStep: 4
		},
		year: {
			size: 3.154e10,
			maxStep: false
		}
	};

	var defaultConfig = {
		position: 'bottom',

		time: {
			parser: false, // false == a pattern string from http://momentjs.com/docs/#/parsing/string-format/ or a custom callback that converts its argument to a moment
			format: false, // DEPRECATED false == date objects, moment object, callback or a pattern string from http://momentjs.com/docs/#/parsing/string-format/
			unit: false, // false == automatic or override with week, month, year, etc.
			round: false, // none, or override with week, month, year, etc.
			displayFormat: false, // DEPRECATED
			isoWeekday: false, // override week start day - see http://momentjs.com/docs/#/get-set/iso-weekday/
			minUnit: 'millisecond',

			// defaults to unit's corresponding unitFormat below or override using pattern string from http://momentjs.com/docs/#/displaying/format/
			displayFormats: {
				millisecond: 'h:mm:ss.SSS a', // 11:20:01.123 AM,
				second: 'h:mm:ss a', // 11:20:01 AM
				minute: 'h:mm:ss a', // 11:20:01 AM
				hour: 'MMM D, hA', // Sept 4, 5PM
				day: 'll', // Sep 4 2015
				week: 'll', // Week 46, or maybe "[W]WW - YYYY" ?
				month: 'MMM YYYY', // Sept 2015
				quarter: '[Q]Q - YYYY', // Q3
				year: 'YYYY' // 2015
			},
		},
		ticks: {
			autoSkip: false
		}
	};

Semantic UI

https://github.com/Semantic-Org/Semantic-UI/blob/1b48f527eb73d6bc4b1af2e94d52f51c32cec3c3/dist/components/popup.js

/*!
 * # Semantic UI 2.2.10 - Popup
 * http://github.com/semantic-org/semantic-ui/
 *
 *
 * Released under the MIT license
 * http://opensource.org/licenses/MIT
 *
 */

;(function ($, window, document, undefined) {

"use strict";

window = (typeof window != 'undefined' && window.Math == Math)
  ? window
  : (typeof self != 'undefined' && self.Math == Math)
    ? self
    : Function('return this')()
;

$.fn.popup = function(parameters) {
  var
    $allModules    = $(this),
    $document      = $(document),
    $window        = $(window),
    $body          = $('body'),

    moduleSelector = $allModules.selector || '',

    hasTouch       = (true),
    time           = new Date().getTime(),
    performance    = [],

    query          = arguments[0],
    methodInvoked  = (typeof query == 'string'),
    queryArguments = [].slice.call(arguments, 1),

    returnedValue
  ;

Design Ideas for Supporting Direct Statement Matching

So far we have come up with the following approaches:

1. Container Approach

In this design proposal, containers are entities holding three things, which together can model a file, a function, or even a class declaration:

List<Statement> statementList;
List<UMLOperation> operations;
List<UMLClass> classList;

  • Files and any declarations can be thought of as containers.
    - A common abstraction that would essentially replace UMLClass and UMLOperation.
  • UMLModel would hold a list of containers (i.e. source files) and perform a container diff (similar to the class diff), applied recursively to classList and operations.
  • Instead of UMLOperationBodyMapper, we would have a ContainerBodyMapper, or better, a ContainerBodyMapper superclass for the common diff logic and an operation mapper for more customized function matching if needed.
  • Since ContainerBodyMapper would have almost identical code to the operation body mapper, it would also match statements, so statements declared directly in files would be matched as well.

Cons

  • Will need a lot of modification to UMLModel, ClassDiff, and of course the heart of RefactoringMiner, the UMLOperationBodyMapper class. This may create bugs initially, since the body mapper has a lot of logic that assumes statements come from a function.
  • Classes don't have statements. (A solution could be to move these three things into a ContainerBody and have containers return it via getBody().)

2. Extending UMLClass Approach

  • Proposed by Dr. Tsantalis: extend UMLClass and add a new attribute

List<AbstractStatement>

Then extend UMLClassDiff and add a method to process the statements, processStatements(),

and override process() to hook in the execution of processStatements().

This should allow us to match the statements. It will require writing independent statement matching code just for the statements that are declared immediately inside a file.

Pros:

  • Keeping the existing function matching algorithm untouched will preserve the accuracy
  • Should not require as much restructuring as approach 1

Cons:

  • Will probably introduce duplicated code or logic for the same statement matching if not done carefully.

3. Considering UMLAttribute as Statement

  • We could investigate how feasible it would be to employ a technique similar to UMLAttribute matching for these statements.

Parse returned function invocation

Case: 1

From JQuery,

config.queue.shift()();

Here shift() is returning a function which is then invoked

Case: 2
From react-native

(function() {
... codes

})();

Here in case 2, the function is declared and immediately invoked using the parenthesis around it. (Self-invoking function)

From W3School's definition-

Self-Invoking Functions
Function expressions can be made "self-invoking".

A self-invoking expression is invoked (started) automatically, without being called.

Function expressions will execute automatically if the expression is followed by ().

You cannot self-invoke a function declaration.

You have to add parentheses around the function to indicate that it is a function expression:

In the AST we get these as call expressions. In case 2, the callee (the expression before the call parentheses) is a parenthesized expression containing an unnamed function.

We need to determine the function name for this OperationInvocation.
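One conceivable naming scheme is sketched below. It assumes Babel-style nodes (with `ParenthesizedExpression` nodes present in the tree), and the placeholder names `<anonymous>`, `<computed>`, and `<unknown>`, as well as the `()` suffix for returned functions, are invented for this example rather than taken from the miner.

```javascript
// Derives a display name for the callee of a CallExpression.
function calleeName(callee) {
  switch (callee.type) {
    case "Identifier":
      return callee.name;
    case "MemberExpression":
      return callee.computed ? "<computed>" : callee.property.name;
    case "CallExpression":
      // Case 1: the callee is itself a call whose result is invoked.
      return calleeName(callee.callee) + "()";
    case "ParenthesizedExpression":
      // Case 2: unwrap the parentheses around a function expression.
      return calleeName(callee.expression);
    case "FunctionExpression":
      return callee.id ? callee.id.name : "<anonymous>";
    default:
      return "<unknown>";
  }
}

// Hand-built node for: config.queue.shift()();
const shiftCall = {
  type: "CallExpression",
  callee: {
    type: "CallExpression",
    callee: {
      type: "MemberExpression",
      object: { type: "MemberExpression" },
      property: { type: "Identifier", name: "shift" },
    },
    arguments: [],
  },
  arguments: [],
};
// Hand-built node for: (function () { })();
const selfInvoking = {
  type: "CallExpression",
  callee: {
    type: "ParenthesizedExpression",
    expression: { type: "FunctionExpression", id: null, params: [] },
  },
  arguments: [],
};
const names = [calleeName(shiftCall.callee), calleeName(selfInvoking.callee)];
```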

Efficiently Parse AST

With the J2V8 library for JavaScript-to-Java bindings, we can obtain a V8Object in Java, but it is of little practical use for our purpose: recursively visiting the nodes of the AST through it is not straightforward, is probably inefficient, and might increase the chance of memory leaks.

1. Java-JS approach
One approach taken so far is to parse the source code in JavaScript and return it as JSON, from which the Java side can find variable declarations, function declarations, etc. This hurts performance, since even for simple code the JSON serialization generates a huge string.

For a js file with simple code like this,

var y;

function ARenamed(){
	alert('Hello');
}

It is converted and returned to Java code represented by the following JSON -

{
	"type": "File",
	"start": 0,
	"end": 47,
	"loc": {
		"start": {
			"line": 1,
			"column": 0
		},
		"end": {
			"line": 5,
			"column": 1
		}
	},
	"range": [
		0,
		47
	],
	"program": {
		"type": "Program",
		"start": 0,
		"end": 47,
		"loc": {
			"start": {
				"line": 1,
				"column": 0
			},
			"end": {
				"line": 5,
				"column": 1
			}
		},
		"range": [
			0,
			47
		],
		"sourceType": "script",
		"interpreter": null,
		"body": [
			{
				"type": "VariableDeclaration",
				"start": 0,
				"end": 6,
				"loc": {
					"start": {
						"line": 1,
						"column": 0
					},
					"end": {
						"line": 1,
						"column": 6
					}
				},
				"range": [
					0,
					6
				],
				"declarations": [
					{
						"type": "VariableDeclarator",
						"start": 4,
						"end": 5,
						"loc": {
							"start": {
								"line": 1,
								"column": 4
							},
							"end": {
								"line": 1,
								"column": 5
							}
						},
						"range": [
							4,
							5
						],
						"id": {
							"type": "Identifier",
							"start": 4,
							"end": 5,
							"loc": {
								"start": {
									"line": 1,
									"column": 4
								},
								"end": {
									"line": 1,
									"column": 5
								},
								"identifierName": "y"
							},
							"range": [
								4,
								5
							],
							"name": "y"
						},
						"init": null
					}
				],
				"kind": "var"
			},
			{
				"type": "FunctionDeclaration",
				"start": 8,
				"end": 47,
				"loc": {
					"start": {
						"line": 3,
						"column": 0
					},
					"end": {
						"line": 5,
						"column": 1
					}
				},
				"range": [
					8,
					47
				],
				"id": {
					"type": "Identifier",
					"start": 17,
					"end": 25,
					"loc": {
						"start": {
							"line": 3,
							"column": 9
						},
						"end": {
							"line": 3,
							"column": 17
						},
						"identifierName": "ARenamed"
					},
					"range": [
						17,
						25
					],
					"name": "ARenamed"
				},
				"generator": false,
				"async": false,
				"params": [],
				"body": {
					"type": "BlockStatement",
					"start": 27,
					"end": 47,
					"loc": {
						"start": {
							"line": 3,
							"column": 19
						},
						"end": {
							"line": 5,
							"column": 1
						}
					},
					"range": [
						27,
						47
					],
					"body": [
						{
							"type": "ExpressionStatement",
							"start": 30,
							"end": 45,
							"loc": {
								"start": {
									"line": 4,
									"column": 1
								},
								"end": {
									"line": 4,
									"column": 16
								}
							},
							"range": [
								30,
								45
							],
							"expression": {
								"type": "CallExpression",
								"start": 30,
								"end": 44,
								"loc": {
									"start": {
										"line": 4,
										"column": 1
									},
									"end": {
										"line": 4,
										"column": 15
									}
								},
								"range": [
									30,
									44
								],
								"callee": {
									"type": "Identifier",
									"start": 30,
									"end": 35,
									"loc": {
										"start": {
											"line": 4,
											"column": 1
										},
										"end": {
											"line": 4,
											"column": 6
										},
										"identifierName": "alert"
									},
									"range": [
										30,
										35
									],
									"name": "alert"
								},
								"arguments": [
									{
										"type": "StringLiteral",
										"start": 36,
										"end": 43,
										"loc": {
											"start": {
												"line": 4,
												"column": 7
											},
											"end": {
												"line": 4,
												"column": 14
											}
										},
										"range": [
											36,
											43
										],
										"extra": {
											"rawValue": "Hello",
											"raw": "'Hello'"
										},
										"value": "Hello"
									}
								]
							}
						}
					],
					"directives": []
				}
			}
		],
		"directives": []
	},
	"comments": [],
	"tokens": [
		...
	]
}
  • The tokens[] array in the JSON is the main point of interest in this approach, since it basically represents the AST (Program) in a JSON format that we need to reparse in Java.

This approach has been taken by RefDiff which I suspect is one of the main reasons for their performance issues.

2. Native JS Approach for the Parsing part only
An alternative approach could be to parse the source code fully in JS and load the result into a database. The Java thread could wait until the JS thread returns. This could greatly increase performance and opens the door to parsing multiple JS files at the same time using multiple threads or processes. Additionally, it improves debuggability and gives more flexibility for visiting nodes.

However, this requires the additional effort of maintaining and running a multiple code environment.

Parse expressions

For each statement containing an expression, such as if (d == 1), we need to extract the variables, literals, operators, etc. participating in the expression. The expression could also contain function calls. Using a visitor, we can extract such information, similar to what is done here:
https://github.com/tsantalis/RefactoringMiner/blob/master/src/gr/uom/java/xmi/decomposition/AbstractExpression.java

For composite statements we have a list of expressions belonging to the composite statements. More details can be found in the paper.
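A minimal visitor along these lines is sketched below, assuming Babel-style node types. The field names of the result (`identifiers`, `numericLiterals`, and so on) follow the JSON shown in the issues above, while `analyzeExpression` itself is a hypothetical helper and the hand-built node stands in for a parsed expression.

```javascript
// Walks an expression node and records what participates in it.
function analyzeExpression(node, info = {
  identifiers: [], numericLiterals: [], stringLiterals: [],
  infixOperators: [], functionInvocations: [],
}) {
  if (!node || typeof node.type !== "string") return info; // not an AST node
  switch (node.type) {
    case "Identifier": info.identifiers.push(node.name); break;
    case "NumericLiteral": info.numericLiterals.push(String(node.value)); break;
    case "StringLiteral": info.stringLiterals.push(node.value); break;
    case "BinaryExpression":
    case "LogicalExpression":
    case "AssignmentExpression":
      info.infixOperators.push(node.operator); break;
    case "CallExpression":
      info.functionInvocations.push(
        node.callee.type === "Identifier" ? node.callee.name : "<expression>"
      );
      break;
  }
  // Visit child nodes generically: any object or array member may hold nodes.
  for (const value of Object.values(node)) {
    if (Array.isArray(value)) value.forEach((child) => analyzeExpression(child, info));
    else if (value && typeof value === "object") analyzeExpression(value, info);
  }
  return info;
}

// Hand-built node for the condition in: if (d == 1)
const condition = {
  type: "BinaryExpression",
  operator: "==",
  left: { type: "Identifier", name: "d" },
  right: { type: "NumericLiteral", value: 1 },
};
const conditionInfo = analyzeExpression(condition);
```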

Support Function Declarations inside Function declarations

function A(){

  let x = function(){

};

   function B(){

  }
}

Here we know that Operation A has-

functionDeclarations = [B]
anonymousFunctions = [A.1]

We match the anonymous functions; however, we still need a way to match the functionDeclarations.

Allow replacement of variable declaration kinds

JavaScript variables can be declared using the var, let, or const keyword. These kinds could therefore be replaced with each other if the replacement reduces the edit distance between two statements. This is similar to the type replacement in the findReplacementWithExactMatching method of the original miner.

However, in type replacement, it seems if the number of types (unique or nonunique) in both statements is the same, unequal types in the same index position are not considered for replacements (strings1, strings2).

Snippet from findReplacementWithExactMatching

	private void removeCommonTypes(Set<String> strings1, Set<String> strings2, List<String> types1, List<String> types2) {
		if(types1.size() == types2.size()) {
			Set<String> removeFromIntersection = new LinkedHashSet<String>();
			for(int i=0; i<types1.size(); i++) {
				String type1 = types1.get(i);
				String type2 = types2.get(i);
				if(!type1.equals(type2)) {
					removeFromIntersection.add(type1);
					removeFromIntersection.add(type2);
				}
			}
			Set<String> intersection = new LinkedHashSet<String>(strings1);
			intersection.retainAll(strings2);
			intersection.removeAll(removeFromIntersection);
			strings1.removeAll(intersection);
			strings2.removeAll(intersection);
		}
		else {
			removeCommonElements(strings1, strings2);
		}
	}
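For the declaration kinds themselves, the check could be as simple as the sketch below. It works on statement text, uses a plain Levenshtein distance, and the helper names and the shape of the returned record are hypothetical; it is not the replacement logic of the original miner.

```javascript
const DECLARATION_KINDS = ["var", "let", "const"];

// Classic Levenshtein distance with a rolling row.
function editDistance(a, b) {
  let prev = Array.from({ length: b.length + 1 }, (_, j) => j);
  for (let i = 1; i <= a.length; i++) {
    const curr = [i];
    for (let j = 1; j <= b.length; j++) {
      const cost = a[i - 1] === b[j - 1] ? 0 : 1;
      curr[j] = Math.min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost);
    }
    prev = curr;
  }
  return prev[b.length];
}

function declarationKind(statement) {
  return DECLARATION_KINDS.find((kind) => statement.startsWith(kind + " ")) || null;
}

// Returns a replacement record only if swapping the kind reduces the edit distance.
function findKindReplacement(statement1, statement2) {
  const kind1 = declarationKind(statement1);
  const kind2 = declarationKind(statement2);
  if (!kind1 || !kind2 || kind1 === kind2) return null;
  const replaced = kind2 + statement1.slice(kind1.length);
  const before = editDistance(statement1, statement2);
  const after = editDistance(replaced, statement2);
  return after < before ? { before: kind1, after: kind2 } : null;
}

const kindReplacement = findKindReplacement("var addresses = [];", "let addresses = [];");
```

Here the two statements differ only in the keyword, so replacing `var` with `let` brings the distance down to zero and the replacement is reported.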

Future Optional Improvements and optimization

  1. Upgrade Node.js to the latest version to support newer syntax in our JavaScript-side program (such as using import instead of the older require for importing a module in Node).
  2. Organize the directory structure in Gradle by moving the JS code from resources to a JS directory or into a separate Java library project.
  3. Optimize the source-to-JSON parsing on the JS side by improving and caching function namespaces, possibly by replacing recursion with an iterative traversal of the AST.
  4. Use multithreading in Java to parse groups of files separately, since we get an independent composite JSON for each file. Alternatively, we could use asynchronous programming on the Node.js side to batch-process all the files, which might be faster.
  5. Use two threads to build the two UML models independently. This is a simple and reasonable optimization, since the UML models of the two versions are created independently.

Parse and Represent Object Expression

Object creation / literal

var x = { }; // empty obj literal, creates an empty object to x
var x = { age: 10 } // object creation with property

The following code creates an object with three properties and the keys are "foo", "age" and "baz". The values of these keys are a string "bar", the number 42, and another object.

let object = {
  foo: 'bar',
  age: 42,
  baz: {myProp: 12}
}

property assignments

let a = 'foo', 
    b = 42,
    c = {};

let o = { 
  a: a,
  b: b,
  c: c
}

With ECMAScript 2015, there is a shorter notation available to achieve the same:

let a = 'foo', 
    b = 42, 
    c = {};

// Shorthand property names (ES2015)
let o = {a, b, c}

Method definitions
A property of an object can also refer to a function or a getter or setter method.

let o = {
  property: function (parameters) {},
  get property() {},
  set property(value) {}
}

In ECMAScript 2015, a shorthand notation is available, so that the keyword "function" is no longer necessary.

// Shorthand method names (ES2015)
let o = {
  property(parameters) {},
}

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/Object_initializer
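To represent these forms on the miner side, each property could be classified as in the sketch below. It assumes Babel-style nodes, where `ObjectExpression.properties` holds `ObjectProperty` nodes (with a `shorthand` flag) and `ObjectMethod` nodes (with `kind` being "method", "get", or "set"). The output shape and `describeProperties` are hypothetical.

```javascript
function describeProperties(objectExpression) {
  return objectExpression.properties.map((p) => {
    if (p.type === "SpreadElement") return { name: null, kind: "spread" };
    const name = p.key.type === "Identifier" ? p.key.name : String(p.key.value);
    if (p.type === "ObjectMethod") return { name, kind: p.kind }; // method / get / set
    const isFunction =
      p.value.type === "FunctionExpression" || p.value.type === "ArrowFunctionExpression";
    if (isFunction) return { name, kind: "function" };
    return { name, kind: p.shorthand ? "shorthand" : "value" };
  });
}

// Hand-built node for: { a, age: 42, property: function () {}, get size() {} }
const objectNode = {
  type: "ObjectExpression",
  properties: [
    { type: "ObjectProperty", shorthand: true,
      key: { type: "Identifier", name: "a" }, value: { type: "Identifier", name: "a" } },
    { type: "ObjectProperty", shorthand: false,
      key: { type: "Identifier", name: "age" }, value: { type: "NumericLiteral", value: 42 } },
    { type: "ObjectProperty", shorthand: false,
      key: { type: "Identifier", name: "property" }, value: { type: "FunctionExpression", id: null } },
    { type: "ObjectMethod", kind: "get", key: { type: "Identifier", name: "size" } },
  ],
};
const propertyKinds = describeProperties(objectNode);
```

Properties classified as "function", "method", "get", or "set" are the ones whose bodies would need their own statement matching.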

Match Array Creation with Object Creation

Statement 1: Creates an array of length count (e.g. 10)

var addresses = new Array(count);

Statement 2: Creates an empty array

let addresses = [];

Related code snippet from findReplacementWithExactMatching, which matches statements where an array creation is replaced with a data structure creation:

//check if array creation is replaced with data structure creation
		if(creationCoveringTheEntireStatement1 != null && creationCoveringTheEntireStatement2 != null &&
				variableDeclarations1.size() == 1 && variableDeclarations2.size() == 1) {
			VariableDeclaration v1 = variableDeclarations1.get(0);
			VariableDeclaration v2 = variableDeclarations2.get(0);
			String initializer1 = v1.getInitializer() != null ? v1.getInitializer().getString() : null;
			String initializer2 = v2.getInitializer() != null ? v2.getInitializer().getString() : null;
			if(v1.getType().getArrayDimension() == 1 
					&& v2.getType().containsTypeArgument(v1.getType().getClassType()) &&
					creationCoveringTheEntireStatement1.isArray() && !creationCoveringTheEntireStatement2.isArray() &&
					initializer1 != null && initializer2 != null &&
					initializer1.substring(initializer1.indexOf("[")+1, initializer1.lastIndexOf("]")).equals(initializer2.substring(initializer2.indexOf("(")+1, initializer2.lastIndexOf(")")))) {
				r = new ObjectCreationReplacement(initializer1, initializer2,
						creationCoveringTheEntireStatement1, creationCoveringTheEntireStatement2, ReplacementType.ARRAY_CREATION_REPLACED_WITH_DATA_STRUCTURE_CREATION);
				replacementInfo.addReplacement(r);
				return replacementInfo.getReplacements();
			}
			if(v2.getType().getArrayDimension() == 1 && v1.getType().containsTypeArgument(v2.getType().getClassType()) &&
					!creationCoveringTheEntireStatement1.isArray() && creationCoveringTheEntireStatement2.isArray() &&
					initializer1 != null && initializer2 != null &&
					initializer1.substring(initializer1.indexOf("(")+1, initializer1.lastIndexOf(")")).equals(initializer2.substring(initializer2.indexOf("[")+1, initializer2.lastIndexOf("]")))) {
				r = new ObjectCreationReplacement(initializer1, initializer2,
						creationCoveringTheEntireStatement1, creationCoveringTheEntireStatement2, ReplacementType.ARRAY_CREATION_REPLACED_WITH_DATA_STRUCTURE_CREATION);
				replacementInfo.addReplacement(r);
				return replacementInfo.getReplacements();
			}
		}

Since Array is an object in JavaScript and both statements create objects, they should be matched, in my opinion. During 1-1-1-1 replacements, however, the count will not be matched.
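A check that treats both forms as array creations might look like the following sketch. It assumes Babel/ESTree-style initializer nodes built by hand; `isArrayCreation` and `bothCreateArrays` are hypothetical helpers, and the call form `Array(n)` without `new` is not handled.

```javascript
// True for `[]`-style literals and for `new Array(...)`.
function isArrayCreation(init) {
  if (!init) return false;
  if (init.type === "ArrayExpression") return true;
  return (
    init.type === "NewExpression" &&
    init.callee.type === "Identifier" &&
    init.callee.name === "Array"
  );
}

function bothCreateArrays(init1, init2) {
  return isArrayCreation(init1) && isArrayCreation(init2);
}

// Initializer of: var addresses = new Array(count);
const newArrayInit = {
  type: "NewExpression",
  callee: { type: "Identifier", name: "Array" },
  arguments: [{ type: "Identifier", name: "count" }],
};
// Initializer of: let addresses = [];
const literalInit = { type: "ArrayExpression", elements: [] };
const arraysMatch = bothCreateArrays(newArrayInit, literalInit);
```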

Match Arguments which are function Declarations

In RefactoringMiner, arguments are represented as Strings. In JavaScript, however, an argument can be anything, such as a function declaration or another expression. For example, in the popular Vue.js:

/*!
 * Vue.js v2.5.0
 * (c) 2014-2017 Evan You
 * Released under the MIT License.
 */
(function (global, factory) {
	typeof exports === 'object' && typeof module !== 'undefined' ? module.exports = factory() :
	typeof define === 'function' && define.amd ? define(factory) :
	(global.Vue = factory());
}(this, (function () { 'use strict';

/*  */

// these helpers produces better vm code in JS engines due to their
// explicitness and function inlining
function isUndef (v) {
  return v === undefined || v === null
}


// LOTS OF OTHER CODE
//--------
//-----


})));
  • The parentheses () enclosing everything represent an ExpressionStatement.
  • Inside the expression statement, an unnamed function with two parameters (global and factory) is declared.
  • This unnamed function is immediately invoked with two arguments: this, and another unnamed function wrapped in parentheses.
  • Inside this function argument, the actual program (starting with 'use strict';) is written.

So a different representation for arguments is necessary; otherwise, for this project we would probably miss all the refactorings inside the actual code.

Treat SwitchCase as a statement of SwitchStatement

 switch(x) { 
    case 10:
        int p = 1; 
    break; 
}

In RefactoringMiner, the switch(x) composite has 3 children (case 10:, int p = 1;, and break;).
In JS we need to do the same: instead of following the AST default of treating case 10: as a composite, we should handle it as a single-line leaf statement.
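A possible flattening step is sketched below, assuming a Babel/ESTree-style `SwitchStatement` whose `cases` are `SwitchCase` nodes with `test` and `consequent` fields. The `leaf` flag and `switchChildren` are hypothetical names used only for illustration.

```javascript
// Turns each SwitchCase into a leaf and lifts its consequent statements
// so that they become direct children of the switch composite.
function switchChildren(switchNode) {
  const children = [];
  for (const switchCase of switchNode.cases) {
    // switchCase.test is null for `default:`.
    children.push({ type: "SwitchCase", test: switchCase.test, leaf: true });
    children.push(...switchCase.consequent);
  }
  return children;
}

// Hand-built node for: switch (x) { case 10: let p = 1; break; }
const switchNode = {
  type: "SwitchStatement",
  discriminant: { type: "Identifier", name: "x" },
  cases: [
    {
      type: "SwitchCase",
      test: { type: "NumericLiteral", value: 10 },
      consequent: [{ type: "VariableDeclaration", kind: "let" }, { type: "BreakStatement" }],
    },
  ],
};
const flattened = switchChildren(switchNode);
```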

Support various patterns of JS function declarations

JavaScript has several ways to declare a function. These could have a direct effect on how we determine the fully qualified name of a function. Currently, I defined the fully qualified name of a function as follows -

[File]|[outer_function].....[function_name]

For example, if function y() is inside function x(), which is in file f.js, the fully qualified name of y() is f|x.y.
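
The naming scheme can be sketched as a small helper. The function name `qualifiedName` and its parameters are hypothetical, and dropping the file extension is an assumption taken from the f.js example above.

```javascript
// Build "[File]|[outer_function]...[function_name]".
function qualifiedName(fileName, enclosingFunctions, functionName) {
  // Drop the final extension so that "f.js" becomes "f".
  const file = fileName.replace(/\.[^./]+$/, '');
  return `${file}|${[...enclosingFunctions, functionName].join('.')}`;
}

const nested = qualifiedName('f.js', ['x'], 'y');
const topLevel = qualifiedName('f.js', [], 'x');
```

Here `nested` is `f|x.y` and `topLevel` is `f|x`.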

1 Typical

function x() {
}

2 Function Expression (anonymous function assigned to b)

function x() {
   var b = function () {
   }
}

This situation creates a problem: b could be assigned a different value later, so we cannot simply say that the name of the inner function is b. For such anonymous function expressions, we have to find or generate an identifier, or some other way to refer to them, since refactoring operations could be performed inside them. Moreover, this is a very common pattern in JS.

For now, we can skip case 2: the function has no name, so it is not subject to rename function refactoring.

3 Function Expression (named)

function x() {
   var b = function f1 () {
   }
}

In this case, we have to consider the function for rename function detection.

https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Operators/function
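
The difference between cases 2 and 3 is also visible at run time through the standard `name` property, which may help when deciding what identifier to use. This is only an illustration; a static analyzer would read the same information from the AST (the function node's own id versus the id of the variable it is assigned to).

```javascript
function x() {
  var anonymous = function () {};
  var named = function f1() {
    // f1 is in scope only here, inside its own body.
    return typeof f1;
  };
  return {
    inferredName: anonymous.name,  // inferred from the variable (ES2015+)
    declaredName: named.name,      // the function's own name
    visibleInside: named(),
    visibleOutside: typeof f1,     // f1 is not bound in x's scope
  };
}

const names = x();
```

The anonymous function reports the variable name `anonymous`, while the named expression reports `f1`, a name that is not visible from the enclosing function.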

4 Constructor Function

// Constructor function for Person objects
function Person(first, last, age, eye) {
  this.firstName = first;
  this.lastName = last;
  this.age = age;
  this.eyeColor = eye;
}

// Create two Person objects
var myFather = new Person("John", "Doe", 50, "blue");
var myMother = new Person("Sally", "Rally", 48, "green");

///myFather.age

5 Object Property

Here the object property (i.e. the method) is get.

from - dist/vue.runtime.common.js (commit a08feed8c410b89fa049fdbd6b9459e2d858e912)

var supportsPassive = false;
if (inBrowser) {
  try {
    var opts = {};
    Object.defineProperty(opts, 'passive', ({
      get: function get () {
        /* istanbul ignore next */
        supportsPassive = true;
      }
    })); // https://github.com/facebook/flow/issues/285
    window.addEventListener('test-passive', null, opts);
  } catch (e) {}
}

Refactor ProcessStatement and ProcessExpression in JavaScript

Although the current implementation of statement and expression processing makes it easy to add a new AST node, it duplicates a lot of information. For example, the pretty-printed toString() of a node is common to all leaf statements but is duplicated for every node; it should be extracted into a common function. Many more nodes share common logic like this.

This is not a priority right now but should be fixed in the future.

Decide whether to use Binding info from Babel

The Java RefactoringMiner does not use any binding information. However, Babel automatically provides binding information, such as the link from a function invocation to its declaration.

We need to test -

  • Does Babel use all the files to resolve bindings? RefactoringMiner ideally should work with only the modified, added, or removed files.

  • The absence of bindings is a strength of RefactoringMiner. On the other hand, Babel bindings are already available and may be more accurate than manually matching an invocation with its declaration.

Parse and represent statements

  • To match the body of a function, which is made up of a list of single or block statements, the statements need to be loaded from the source code and represented on the Java side.

  • One way to match them is a specialized string-matching approach (i.e. the technique described in RefactoringMiner 2.0), which uses the composite design pattern: each leaf node represents a single statement, and each composite node represents a block statement, which may or may not contain other block or single statements.
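
A minimal sketch of that composite structure, written in JavaScript for brevity even though the real representation lives on the Java side. The class names `LeafStatement` and `CompositeStatement` are illustrative, not jsdiffer's actual classes.

```javascript
class LeafStatement {
  constructor(text) {
    this.text = text;
  }
  leaves() {
    return [this];
  }
}

class CompositeStatement {
  constructor(header, children = []) {
    this.header = header;      // e.g. "if (keyCodes)"
    this.children = children;  // leaves or nested composites
  }
  // Collect every leaf below this node, depth first, in source order.
  leaves() {
    return this.children.flatMap((child) => child.leaves());
  }
}

const body = new CompositeStatement('{', [
  new LeafStatement('var keyCodes = config.keyCodes[key] || builtInAlias;'),
  new CompositeStatement('if (keyCodes)', [
    new LeafStatement('return keyCodes !== eventKeyCode'),
  ]),
]);

const leafTexts = body.leaves().map((leaf) => leaf.text);
```

Because leaves and composites answer the same `leaves()` call, a matcher can compare leaf statements first and then move on to the composites that contain them.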

Handle Default Value of Function Parameters

In JavaScript, function parameters can have default values -

function withDefaults(a, b = 5, c = b, d = go(), e = this, 
                      f = arguments, g = this.value) {
  return [a, b, c, d, e, f, g]
}

As seen from the above example, the default value could be an expression too.

Default Parameters Scope

If default parameters are defined for one or more parameters, then a second scope (Environment Record) is created, specifically for the identifiers within the parameter list. This scope is a parent of the scope created for the function body.

This means that functions and variables declared in the function body cannot be referred to from default value parameter initializers; attempting to do so throws a run-time ReferenceError.

It also means that variables declared inside the function body using var will mask parameters of the same name, instead of the usual behavior of duplicate var declarations having no effect.

The following function will throw a ReferenceError when invoked, because the default parameter value does not have access to the child scope of the function body:

function f(a = go()) { // Throws a `ReferenceError` when `f` is invoked.
  function go() { return ':P' }
}

...and this function will print undefined because variable var a is hoisted only to the top of the scope created for the function body (and not the parent scope created for the parameter list):
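
A minimal function with that behavior, written here to return the value instead of printing it so that it can be checked. The default-valued callback `b` closes over the parameter `a`, not over the `var a` declared in the body:

```javascript
function f(a, b = () => a) {
  var a = 1;   // a new `a` in the body scope, initialized from the parameter
  return b();  // `b` still reads the parameter-scope `a`
}

const withoutArgument = f();  // undefined
const withArgument = f(5);    // 5
```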

From the description above, we must take special care when handling default parameters. Although most default parameters are likely to be constant values (as in C#), ideally we should analyze the default-value expression and match by expression.

Operation Diff in JSRMiner

During operation diff, we could perhaps treat default-value matching as the 2nd round of unmatched parameter matching.

Currently, we first match parameters by same name and default value (the default value is currently a string; later it will be an expression). Then the following 'round-based' matching is applied to the unmatched parameter list. Note that I made default-value matching the 2nd round, which is type matching in the Java version. It's not clear to me whether making index position the 2nd round would yield better accuracy.

/**
 * For the unmatched parameters, try matching by name, default value, index in parent in rounds
 */
protected void tryMatchRemovedAndAddedParameters() {
    matchParametersWithSameName();
    matchParametersWithSameDefaultValue();
    matchParametersWithSameIndexPosition();
}
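
The rounds can be sketched end to end as follows. This is a JavaScript approximation under assumptions, not the Java implementation: parameters are plain `{ name, defaultValue, index }` records, and the function name `matchParameters` is hypothetical. The sample data is the unmatched parameters of the two checkKeyCodes versions from the Vue runtime snippet.

```javascript
function matchParameters(removed, added) {
  const pairs = [];
  // One predicate per round: name, then default value, then index position.
  const rounds = [
    (p, q) => p.name === q.name,
    (p, q) => p.defaultValue !== undefined && p.defaultValue === q.defaultValue,
    (p, q) => p.index === q.index,
  ];
  let left = [...removed];
  const right = [...added];
  for (const sameBy of rounds) {
    const stillLeft = [];
    for (const p of left) {
      const i = right.findIndex((q) => sameBy(p, q));
      if (i === -1) {
        stillLeft.push(p);
        continue;
      }
      pairs.push([p.name, right[i].name]);
      right.splice(i, 1);  // a matched parameter cannot be matched again
    }
    left = stillLeft;
  }
  return { pairs, removed: left, added: right };
}

const result = matchParameters(
  [{ name: 'builtInAlias', defaultValue: undefined, index: 2 }],
  [
    { name: 'builtInKeyCode', defaultValue: undefined, index: 2 },
    { name: 'builtInKeyName', defaultValue: undefined, index: 4 },
  ]
);
```

In this example neither the name nor the default value round finds a match, so builtInAlias is paired with builtInKeyCode by index and builtInKeyName remains as an added parameter. Swapping the order of the last two predicates is enough to try the index-first variant.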

More Details
https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Functions/Default_parameters

Detect Rename function

Background

  • In JavaScript, a function's signature can be considered to be its name only.
  • If multiple functions with the same name are declared in the same scope, regardless of the number of parameters, only the last declaration takes effect: declarations are hoisted, and the last one overwrites the earlier ones.

For example, the output of the following code snippet would be

World
World

This is because the last declared function A is hoisted and is the one called both times.

A(10, 20);
A(5);

function A(p1, p2) {
    alert('Hello')
} 

function A(p3) {
    alert('World')
}
  • There is no method overloading as in Java; however, any function can be passed any number of arguments.
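
Both points can be checked with a self-contained variant of the snippet that returns values instead of calling alert. The wrapper function `demo` and the helper `count` are added here only for illustration.

```javascript
function demo() {
  // Both calls run before either declaration appears in the source.
  const results = [A(10, 20), A(5)];

  function A(p1, p2) { return 'Hello'; }
  function A(p3) { return 'World'; }  // replaces the first A

  // Extra or missing arguments are not an error; `arguments` holds what was passed.
  function count() { return arguments.length; }

  return { results, counts: [count(), count(1, 2, 3)] };
}

const hoisted = demo();
```

Both entries of `hoisted.results` are 'World', and `count` accepts zero or three arguments even though it declares no parameters.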

Technique

  • Find added, modified and deleted files using git or a directory

  • Compare function names in the same file and see if only the name has changed (Round 1)

  • Detect changes in other files (next round)

  • Compare body (next round)

  • Compare refactoring in the body? (next round)

  • In general, detecting lower-level refactorings before higher-level ones should allow more refactorings to be found, since empirical studies have found that lower-level refactorings are more prevalent than higher-level ones.
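
Round 1 can be sketched as follows, under the simplifying assumption that each function in a file is reduced to a `{ name, body }` record and that "only the name has changed" means the body text is identical. The function name `detectRenames` and the sample name isNil are hypothetical.

```javascript
function detectRenames(before, after) {
  const beforeNames = new Set(before.map((f) => f.name));
  const afterNames = new Set(after.map((f) => f.name));
  // Functions whose name exists on only one side are the removed/added candidates.
  const removed = before.filter((f) => !afterNames.has(f.name));
  const added = after.filter((f) => !beforeNames.has(f.name));

  const renames = [];
  for (const oldFn of removed) {
    // Round 1: only the name changed, so the bodies are textually identical.
    const i = added.findIndex((newFn) => newFn.body === oldFn.body);
    if (i !== -1) {
      renames.push({ from: oldFn.name, to: added[i].name });
      added.splice(i, 1);
    }
  }
  return renames;
}

const renames = detectRenames(
  [
    { name: 'isUndef', body: 'return v === undefined || v === null' },
    { name: 'keep', body: 'return 1' },
  ],
  [
    { name: 'isNil', body: 'return v === undefined || v === null' },
    { name: 'keep', body: 'return 1' },
  ]
);
```

Later rounds would relax the identical-body condition, for example by comparing the mapped statements of the two bodies.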

Represent source code in Java for analysis

To perform refactoring detection, it is necessary to represent the source code in a preferred format.

  • One way to do this is to represent the source code using the composite pattern, where each node represents a code element.

  • The technical complexity includes parsing the source files from Java with the help of the J2V8 library from EclipseSource. Though its documentation is a bit dated, it is actively maintained and more performant than other JS engines for Java. This bridge to the JS code adds overhead, such as having to release memory properly.

Alternatively, we could parse the source code in a JS script and store the resulting tokens in a database. These tokens could then be represented in Java. This would also allow us to cache the results, since commits are immutable.

Implement statement matcher for Function

To detect refactorings one of the first steps is to match statements between two versions. For now, we are focusing on the statements inside of a function.

Note that in JavaScript a function can be assigned to a variable (i.e. a FunctionExpression) or simply declared, and either form can appear inside another function (i.e. nested functions), so we need to match the bodies of nested functions too.

Code to look into: OperationBodyMapper, and how RefactoringMiner handles anonymous classes, lambdas, etc.

Match Composite Statements

The expressions of composite statements have a format similar to single (leaf) statements, so they can be matched the same way. The variables, literals, etc. need to be extracted from them as well, so these need to move into a common class (such as CodeFragment).

The current code needs refactoring: the variables, literals, and other code elements appearing in a statement or expression should be pulled up to CodeFragment, the superclass of Expression and Statement.
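
A minimal sketch of the pulled-up hierarchy, in JavaScript for brevity. The class names come from the text above; the fields, the constructor shape, and the method `sharesVariablesWith` are hypothetical.

```javascript
class CodeFragment {
  constructor(text, { variables = [], literals = [] } = {}) {
    this.text = text;
    this.variables = variables;  // identifiers appearing in the fragment
    this.literals = literals;    // literal values appearing in the fragment
  }
  // Defined once here, so leaf statements and composite expressions
  // can be compared the same way.
  sharesVariablesWith(other) {
    return this.variables.some((v) => other.variables.includes(v));
  }
}

class Expression extends CodeFragment {}
class Statement extends CodeFragment {}

// Condition of the composite `if (keyCodes)` and a leaf declaration.
const condition = new Expression('keyCodes', { variables: ['keyCodes'] });
const declaration = new Statement(
  'var keyCodes = config.keyCodes[key] || builtInAlias;',
  { variables: ['keyCodes', 'config', 'key', 'builtInAlias'] }
);

const related = condition.sharesVariablesWith(declaration);
```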
