
xtree's Introduction

Less is More: Minimizing Code Reorganization using XTREE

                                                                           
                                                                           
                                        __.....__           __.....__      
                                    .-''         '.     .-''         '.    
                      .|  .-,.--.  /     .-''"'-.  `.  /     .-''"'-.  `.  
   ____     _____   .' |_ |  .-. |/     /________\   \/     /________\   \ 
  `.   \  .'    / .'     || |  | ||                  ||                  | 
    `.  `'    .' '--.  .-'| |  | |\    .-------------'\    .-------------' 
      '.    .'      |  |  | |  '-  \    '-.____...---. \    '-.____...---. 
      .'     `.     |  |  | |       `.             .'   `.             .'  
    .'  .'`.   `.   |  '.'| |         `''-...... -'       `''-...... -'    
  .'   /    `.   `. |   / |_|                                              
 '----'       '----'`'-'                                                   

 
              _{\ _{\{\/}/}/}__
             {/{/\}{/{/\}(\}{/\} _
            {/{/\}{/{/\}(_)\}{/{/\}  _
         {\{/(\}\}{/{/\}\}{/){/\}\} /\}
        {/{/(_)/}{\{/)\}{\(_){/}/}/}/}
       _{\{/{/{\{/{/(_)/}/}/}{\(/}/}/}
      {/{/{\{\{\(/}{\{\/}/}{\}(_){\/}\}
      _{\{/{\{/(_)\}/}{/{/{/\}\})\}{/\}
     {/{/{\{\(/}{/{\{\{\/})/}{\(_)/}/}\}
      {\{\/}(_){\{\{\/}/}(_){\/}{\/}/})/}
       {/{\{\/}{/{\{\{\/}/}{\{\/}/}\}(_)
      {/{\{\/}{/){\{\{\/}/}{\{\(/}/}\}/}
       {/{\{\/}(_){\{\{\(/}/}{\(_)/}/}\}
         {/({/{\{/{\{\/}(_){\/}/}\}/}(\}
          (_){/{\/}{\{\/}/}{\{\)/}/}(_)
            {/{/{\{\/}{/{\{\{\(_)/}
             {/{\{\{\/}/}{\{\\}/}
              {){/ {\/}{\/} \}\}
              (_)  \.-'.-/
          __...--- |'-.-'| --...__
   _...--"   .-'   |'-.-'|  ' -.  ""--..__
 -"    ' .  . '    |.'-._| '  . .  '   jro
 .  '-  '    .--'  | '-.'|    .  '  . '
          ' ..     |'-_.-|
  .  '  .       _.-|-._ -|-._  .  '  .
              .'   |'- .-|   '.
  ..-'   ' .  '.   `-._.-´   .'  '  - .
   .-' '        '-._______.-'     '  .
        .      ~,
    .       .   |.   .    ' '-.

Submission

Submitted to Information and Software Technology. ARXIV Link: https://arxiv.org/abs/1609.03614v3

Cite As

@misc{1609.03614,
  author  = {Rahul Krishna and Tim Menzies and Lucas Layman},
  title   = {Less is More: Minimizing Code Reorganization using XTREE},
  year    = {2016},
  journal = {Information and Software Technology, submitted},
  eprint  = {arXiv:1609.03614},
}

Authors

Data

Latex Source

Source Code

License

This is free and unencumbered software released into the public domain.

Anyone is free to copy, modify, publish, use, compile, sell, or distribute this software, either in source code form or as a compiled binary, for any purpose, commercial or non-commercial, and by any means.

(BTW, it would be great to hear from you if you are using this material. But that is optional.)

In jurisdictions that recognize copyright laws, the author or authors of this software dedicate any and all copyright interest in the software to the public domain. We make this dedication for the benefit of the public at large and to the detriment of our heirs and successors. We intend this dedication to be an overt act of relinquishment in perpetuity of all present and future rights to this software under copyright law.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.

For more information, please refer to http://unlicense.org

xtree's People

Contributors

rahlk

xtree's Issues

sorry to fuck up your beautiful dir structure..

... had troubles last submission where the files were in subdirs (stupid build scripts on the journal's side)

so not now, cause i am editing, but the next time you get to this paper's source:

  • no subdirs
  • no unused files

t

Reviewer Tasks (For Rahul)

  1. R1: Does the reached results indicate similar direction?
    **Task: (a) Change section 1.2 to section 2; (b) Refer to content from papers 1, 2, 3; (c) Note the use of developers to validate the prioritization, point to section 3 to say why this is not a good idea.**

Response: You are quite correct that we didn't discuss enough related work on code smell prioritization; please see our new Section 2.1.

  1. Questions regarding defect prediction:

    2.1. R1: Why is defect prediction good to support decisions on code reorganization?
    Task: (a) Add a section on defect prediction; (b) Remark on the relationship between fault proneness and faults in software systems. Use papers 4 and 6.

You're quite correct that we supplied insufficient information on defect prediction and its value to code smells. Please see the new Section 3 in the paper.

The next few issues raise points similar to those addressed with Reviewer 1.

2.2. R2: What does it then mean to “reduce defects in our data sets”?
2.3. R2: In case four new modules are developed, is there then also a log history that shows how the total number of modules in the system correlates with the total number of defects? Are we talking about the total number of defects or only the number of “infected modules”?
2.4 R2: What’s the connection between whether a smell is bad or not and the threshold values of the various code metrics?

In the above, this reviewer is raising points similar to Reviewer 1's. Thanks to these issues, we have added the new Section 3 to address them.

Also, especially for reviewer2, we add the following:

  • The paper http://dl.acm.org/citation.cfm?id=2821501 explicitly claims that there is a connection between threshold values for static code attributes and bad smells.

  • Note that we have also added this note to the new Section 3.

  1. R2: In section 4: “It can be difficult to judge the effects of removing bad smells. Code that is reorganized cannot be assessed just by a rerun of the test suite ...". Isn’t the point about removing bad smells to improve the code without changing the behavior (refactorings)? Why cannot a test suite be run before and after if the behavior is not changed?
    Task: Pretty obvious why. Still, make clear.

  2. R2: Figure 4. This is a good example of researcher bias. Your example is the most favorable example from your point of view. It is one of only two of the eight data sets that improved on both pd and pf.
    Task: Stress the importance of tuning.

  3. R2: The authors state that if there are no historical records of defects, the results of this paper can be used as a guide (which results?). It is referred to Table 8 in the abstract but there is no Table 8. Is it meant to be Figure 8? In case, it’s very hard to understand how that figure could be used. Or is it Figure 1 as stated in the conclusion, but how that Figure 1 compensate for a lack of historical records?
    Task: Explain Fig 8. Note the relevance in case of lack of historical data.

  4. R2: What's the difference between tool, method, and framework?
    Task: Find and remove instances appropriately

  5. R2: What is the relationship between the different code metrics in RQ2?
    Task: More examples

  6. R2: What is the selection criterion for different methods in RQ1?
    Task: There's no selection criterion per se. But try rewording

  7. R2: What is the idea of the following statement? Is the point that a log history of defects has shown that modules with more than 100 loc have more defects (per lines of code?) than smaller modules, and then the action is to reduce the size of that module?

    “This code reorganization will start with some initial code base that is changed to a new code base. For example, if the bad smell is loc > 100 and a code module has 500 lines of code, we reason optimistically that we can change that code metric to 100. Using the secondary verification oracle, we then predict the number of defects in new.”

    Task: Again, pretty obvious. Maybe reword the statement?
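The quoted passage describes a two-step loop: optimistically cap the offending metric at its threshold, then re-score the changed module with a secondary verification oracle. A minimal sketch of that loop, where `apply_plan`, `oracle`, and the toy threshold/rule are all illustrative assumptions rather than XTREE's actual API:

```python
# Hedged sketch of "optimistic plan, then verify": cap a smelly metric at
# its threshold, then re-predict defects with a secondary oracle.
THRESHOLDS = {"loc": 100}          # bad smell from the example: loc > 100

def apply_plan(module):
    """Optimistically set each smelly metric to its threshold value."""
    planned = dict(module)
    for metric, cap in THRESHOLDS.items():
        if planned.get(metric, 0) > cap:
            planned[metric] = cap
    return planned

def oracle(module):
    """Stand-in secondary verification oracle (any defect predictor)."""
    return module["loc"] > 250     # toy rule, for illustration only

old = {"loc": 500}                 # the 500-line module from the example
new = apply_plan(old)              # loc optimistically capped at 100
print(oracle(old), oracle(new))    # defect predicted before the plan, not after
```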

Reviewers Comments

Comments from the editors and reviewers


Editor

  • Evolve the paper to make clear the defect model, hypothesis, limitations, and readability
  • Prepare a letter of changes explaining all modifications made.

Reviewer 1

  • How the results of this work contribute to future research activities in the area?

You're quite correct: there is no future work section in the paper. Our research shows that it is potentially naive to explore thresholds in static code attributes in isolation from each other. Our work clearly demonstrates how changing one thing necessitates changing something else. So, for future work, we recommend that researchers look for tools that recommend changes to sets of code attributes. Candidate techniques in this area might include: 1. association rule learning; 2. thresholds generated across synthesized dimensions, e.g., PCA; 3. techniques that cluster data and look for deltas between the clusters. (Note that we offer XTREE as an example of the third point.)

Additionally, scalable solutions. After this, we will look at applications beyond static code attributes. For example, we have been looking at sentiment analysis in Stack Overflow exchanges to learn the dialog patterns that most select for SO entries.
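The third candidate technique above (cluster the data, then look for deltas between clusters) can be sketched roughly as follows. The centroid-difference approach and all names here are illustrative assumptions, not the paper's algorithm:

```python
# Hypothetical sketch: cluster modules by their static code metrics, then
# report the metric deltas that would move a "defective" cluster toward a
# "cleaner" one.
from statistics import mean

def centroid(rows):
    """Mean value of each metric across a cluster."""
    return [mean(col) for col in zip(*rows)]

def deltas(bad_cluster, good_cluster, names):
    """Metric changes that would move the bad cluster toward the good one."""
    diff = {}
    for name, b, g in zip(names, centroid(bad_cluster), centroid(good_cluster)):
        if b != g:
            diff[name] = g - b   # recommended change (negative = decrease)
    return diff

# Toy data: rows are [loc, cbo] per module.
bad  = [[520, 9], [480, 11]]
good = [[110, 6], [90, 8]]
print(deltas(bad, good, ["loc", "cbo"]))
```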

  • How can practitioners benefit from the results achieved in the paper?
    Thank you for that note. We have added text to the end of Section 1 to address this issue.

  • What are the main limitations of the work?

  • Further discussion at the end indicating how this work complements the body of knowledge about code smells. Does the reached results indicate a similar direction? contradictory? Does This work somehow confirm the results of other studies or points to new directions?
    @rahlk: see — An approach to prioritize code smells for refactoring

  • Why is defect-proneness good to support the decision on code reorganization and/or how it could complement other characteristics.
    See: http://link.springer.com/article/10.1007%2Fs10664-015-9361-0

  • Could we use XTREE as a strategy to prioritize the payment of technical debt items?

  • Is there any reason for not using the GQM template to state the goal of the study and also to define null and alternative hypotheses for the research questions?

Reviewer 2

  • What’s the connection between whether a smell is bad or not and the threshold values of the various code metrics?
  • What does it then mean to “reduce defects in our data sets”? Is it the total number of defects in the data set (system?) or the number of “infected modules” in the data set?
  • What is the idea of the following statement? Is the point that a log history of defects has shown that modules with more than 100 loc have more defects (per lines of code?) than smaller modules, and then the action is to reduce the size of that module?

“This code reorganization will start with some initial code base that is changed to a new code base. For example, if the bad smell is loc > 100 and a code module has 500 lines of code, we reason optimistically that we can change that code metric to 100. Using the secondary verification oracle, we then predict the number of defects in new.”

  • In case four new modules are developed, is there then also a log history that shows how the total number of modules in the system correlates with the total number of defects? Are we talking about total number of defects or only the number of “infected modules”?
  • The RQs in the introduction are at a very low level. Actually, I wouldn’t call them research questions because they are too internal.
  • RQ1 is “which of the methods is most accurate?” without presenting the methods. They are just referred to in three other papers that probably describe them. What are your selection criteria for comparing the tool (or framework or system as you also call it) XTREE with exactly these methods?
  • What’s the difference between tool, framework, and method in this paper?
  • How does this work related to the work by Arcelli Fontana, in particular: Arcelli Fontana, F., Mäntylä, M.V., Zanoni, M. et al. Comparing and experimenting machine learning techniques for code smell detection, Empir Software Eng (2016) 21: 1143. doi:10.1007/s10664-015-9378-4?
  • RQ5 is about the relationship between several code metrics. Improving on one metric may cause degradation on another one. This issue could have been much more spelled out with good examples in the paper.
  • There is one example that reducing LOC may increase coupling. Sure, if you split a module into several smaller modules, or move code from one large module to other, smaller modules, usually the overall coupling will increase. But isn’t that an avoidable consequence that will occur implicitly in the process of module splitting or moving code between modules? The way it’s formulated now gives the impression that the programmers should follow the recommendation of increasing the coupling, that is as if it’s a conscious action.
  • For an outsider, it is difficult to follow the structure and argumentation of much of the paper.
  • In Section 4: “It can be difficult to judge the effects of removing bad smells. Code that is reorganized cannot be assessed just by a rerun of the test suite since such reorganizations may not change the system behavior (e.g., refactorings).” Isn’t the point about removing bad smells to improve the code without changing the behavior (refactorings)? Why cannot a test suite be run before and after if the behavior is not changed?
  • Figure 4 shows the effect of tuning, and the authors write: “The rows marked with a * in Figure 4 show data sets whose performance was improved remarkably by these techniques. For example, in poi, the recall increased by 4% while the false alarm rate dropped by 21%.” This is a good example of researcher bias. Your example is the most favorable example from your point of view. It is one of only two of the eight data sets that improved on both pd and pf.
  • The authors state that if there are no historical records of defects, the results of this paper can be used as a guide (which results?). It is referred to Table 8 in the abstract but there is no Table 8. Is it meant to be Figure 8? In case, it’s very hard to understand how that figure could be used. Or is it Figure 1 as stated in the conclusion, but how that Figure 1 compensate for a lack of historical records?
  • Section 2 is too obvious, so much so that most of Section 2 should be deleted. Figure 1 as part of a related work section may remain.
  • It is stated that this paper reports a case study. I know that within the SW community it is common to use the term case study to denote just a demonstration of an example performed by the researchers. But in more mature disciplines, which we should aim to become a member of, “case study” has a certain meaning (e.g. Yin 2003). It would mean an evaluation in a real software development context. This is not the case here. See also the work by Runeson and Host 2009 on case studies in software engineering.
  • Regarding case study, what kind of work remains before the proposed tool could be used by practitioners?

Fix

end of section 2.2 says

Only sections 3.2.1, 3.2.2, and two-fifths of the results in Figure 6 contain material found in prior papers.

but we have no 3.2.1 or 3.2.2

please fix

Reviewer comments: Dr. M

Reviewer 1

  • Is there any reason for not using the GQM template to state the goal of the study and also to define null and alternative hypotheses for the research questions?

Reviewer 2

  • Section 2 discusses the idea of why not just ask developers about the effect of code smells when the research literature is contradictory. I find it odd to believe that practitioners should be able to resolve inconsistencies in the research literature. As researchers, we should obviously investigate more into why the results are contradictory. The reason may be varying quality of the research or varying contexts. For example, several papers state that there are more problems in software components with a large number of smells than components with a low number of smells. One should dismiss such research if the size of the components is not adjusted for. There are certainly more problems with larger components than smaller components given other properties the same. The authors agree that practitioners should not be expected to resolve contradictions. My point is that it is so obvious that most of Section 2 should be deleted. Figure 1 as part of a related work section may remain.

  • It is stated that this paper reports a case study. I know that within the SW community it is common to use the term case study to denote just a demonstration of an example performed by the researchers. But in more mature disciplines, which we should aim to become a member of, “case study” has a certain meaning (e.g. Yin 2003). It would mean an evaluation in a real software development context. This is not the case here. See also the work by Runeson and Host 2009 on case studies in software engineering. Regarding case study, what kind of work remains before the proposed tool could be used by practitioners?

Reviewer Comments

Issues currently being addressed

  • Evaluating bad smells based on defect prediction is not enough to decide when it should be ignored or not. A bad smell is useful if it helps to identify and remove a maintenance problem.
  • No mention is made of the risks or mitigation of a less than optimal tree
  • need table showing all changes between all pairs of leafX-to-better-leafY. note that if attribute Z is statistically indistinguishable between X,Y then don't list Z
  • Reducing LOC by splitting a long method into smaller ones may increase other metrics, such as coupling among methods and files, but the paper didn't further explain how XTREE could possibly address this problem.
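The requested leafX-to-better-leafY table could be generated along these lines. The issue does not name a statistical test, so a small Cliff's delta effect size stands in here as an assumption; all names are illustrative:

```python
# Sketch: when listing metric changes between a leaf X and a better leaf Y,
# drop any attribute whose values are statistically indistinguishable.
def cliffs_delta(xs, ys):
    """Fraction of pairs where x > y minus fraction where x < y."""
    gt = sum(1 for x in xs for y in ys if x > y)
    lt = sum(1 for x in xs for y in ys if x < y)
    return (gt - lt) / (len(xs) * len(ys))

def changed_attributes(leaf_x, leaf_y, names, small=0.147):
    """Attributes worth listing: |delta| above the 'negligible' cutoff."""
    keep = []
    for i, name in enumerate(names):
        xs = [row[i] for row in leaf_x]
        ys = [row[i] for row in leaf_y]
        if abs(cliffs_delta(xs, ys)) > small:
            keep.append(name)
    return keep

# Toy leaves: rows are [loc, cbo]; loc differs clearly, cbo does not.
leaf_x = [[400, 5], [420, 6], [390, 5]]
leaf_y = [[120, 5], [100, 6], [130, 5]]
print(changed_attributes(leaf_x, leaf_y, ["loc", "cbo"]))
```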

Reviewer 1

  • The comparison only focuses on two aspects---effectiveness and verbosity, and XTREE only demonstrated significant improvement of the latter.
  • How much defect and history data is needed to make a reliable prediction?

Reviewer 2

  • None of the motivations appear to provide evidence that by removing code smells developers could reduce defects: the principal assumption of the work.
  • The approach does not propose concrete refactoring strategies. It provides only metric’s thresholds. None of the two research questions presented in the paper are clearly linked to this problem.
  • The paper does not provide a single concrete example of how XTREE is used to validate developer’s bad smells.
  • The use of thresholds is linked to bad smell detection. However, this relation must be made more explicit in the text. (Extracting Relative Thresholds for Source Code Metrics)

Reviewer 3

  • How does XTREE handle overfitting? Is there some limit on the tree being built, either in terms of the number of tree levels backtracked or in terms of final defect probability outcomes?

    (Note: The rest of this reviewer's comment was just reiterating our experiment.)

IST paper... over to you

paper is now committed to github

please

  • fix latex compiles
  • look for YYY or XXX in the source code and fix
  • check reference list at back for repeats or anything garbled
  • proof read
  • look for bullshit in my new text
  • Scream loudly cause I dumped the stability study and replaced it with other rhetoric since it did not seem to be a major point. then get over it.
  • maybe check it all out and latex it locally since my recompiles were getting kinda slow.

Figure label broken

LaTeX Warning: Reference `fig:jur' on page 10 undefined on input line 1236.
LaTeX Warning: Reference `fig:jur' on page 10 undefined on input line 1280.
LaTeX Warning: Reference `fig:jur' on page 11 undefined on input line 1290.
LaTeX Warning: Reference `fig:jur' on page 11 undefined on input line 1330.
LaTeX Warning: Reference `fig:jur' on page 14 undefined on input line 1650.
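These warnings usually mean the `\label{fig:jur}` is missing, misspelled, or placed before its `\caption` (a label before the caption binds to the wrong counter). A hypothetical fix, where the filename and caption are placeholders:

```latex
% Ensure the figure carries the label the \ref calls expect;
% \label must come after (or inside) the \caption.
\begin{figure}
  \centering
  \includegraphics[width=\linewidth]{jur}  % filename is illustrative
  \caption{...}
  \label{fig:jur}
\end{figure}
```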

