CReN
Consistent Renaming Tool
Built On Eclipse
News
Screen Shots

8.02.2007
CReN version 1.0 is released!

We have tested CReN on three examples from the literature, each showing an inconsistent renaming of identifiers in a pasted code fragment.

The first example, published in a paper by Li et al. [2], is from the file memory.c in Linux version 2.6.6. The original code fragment (on the left) is a for loop that was copied, pasted, and then modified. In the modified pasted fragment (on the right), the programmer intended to change every instance of the array name "prom_phys_total" to "prom_prom_taken" but missed one instance (in the last line). The compiler did not detect this error because "prom_phys_total" is still in scope. In this example, the for loop was copied and pasted within the same function, void __init prom_meminit(void), which begins at line 68 of memory.c (not shown).
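To make the pattern concrete, here is a minimal Java sketch of a missed rename of this kind. It is a hypothetical analogue, not the code from memory.c: only the two array names come from the example, while the class name, method names, element values, and loop body are assumptions made for illustration.

```java
// Hypothetical Java analogue of the missed-rename pattern.
// NOT the code from memory.c; only the two array names come from the example.
class MissedRenameDemo {
    // Both arrays are fields, so both names are in scope in every method below.
    static int[] prom_phys_total = {10, 20, 30};
    static int[] prom_prom_taken = {1, 2, 3};

    // Original fragment: sums the entries of prom_phys_total.
    static int sumOriginal() {
        int sum = 0;
        for (int i = 0; i < prom_phys_total.length; i++) {
            sum += prom_phys_total[i];
        }
        return sum;
    }

    // Pasted fragment with one missed rename in the last line of the loop.
    static int sumPastedBuggy() {
        int sum = 0;
        for (int i = 0; i < prom_prom_taken.length; i++) {
            sum += prom_phys_total[i]; // should read prom_prom_taken[i]
        }
        return sum;
    }

    // Pasted fragment after every instance has been renamed.
    static int sumPastedFixed() {
        int sum = 0;
        for (int i = 0; i < prom_prom_taken.length; i++) {
            sum += prom_prom_taken[i];
        }
        return sum;
    }

    public static void main(String[] args) {
        System.out.println(sumOriginal());    // 60
        System.out.println(sumPastedBuggy()); // 60, silently wrong
        System.out.println(sumPastedFixed()); // 6
    }
}
```

The buggy method compiles cleanly because the old name still resolves, which is why this kind of error tends to survive until someone inspects the results.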

The second example, from a paper by Liblit et al. [3], is code from the GNU command "bc", in the file storage.c. The original copied code fragment (on the left) is a function named "more_variables" that allocates a larger block of memory for the "variables" array. It then copies the values from the smaller array named "old_var" to the larger array named "variables" (in the first for loop) and fills the rest of the "variables" array with NULL. In the modified pasted code fragment (on the right), which is an entire function in this case, the function was renamed from "more_variables" to "more_arrays", the type "bc_var" was renamed to "bc_var_array", and all instances of the arrays "old_var", "variables", and "v_names" were renamed to "old_ary", "arrays", and "a_names", respectively. However, one instance of the variable "v_count" (in the second for loop's condition) was missed and not renamed to "a_count", resulting in a buffer overrun error [3]. Because "v_count" is defined as a global variable, the compiler does not detect this copy-paste error.
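The effect of a stale loop bound like this can be sketched in Java as follows. This is a hypothetical analogue rather than the code from storage.c: the identifier names "v_count", "a_count", "old_ary", and "arrays" come from the example, while the sizes, the element type, and the class and method names are assumptions. Note that Java's bounds checks turn what is a silent overrun in C into an exception.

```java
// Hypothetical Java analogue of the "more_arrays" bug; not the code from storage.c.
// Identifier names come from the example; sizes and types are invented.
class MoreArraysDemo {
    static int v_count = 8; // size tracked for the "variables" array
    static int a_count = 4; // size tracked for the "arrays" array

    // Pasted function with the missed rename in the second loop's condition.
    static Object[] moreArraysBuggy(Object[] old_ary) {
        Object[] arrays = new Object[a_count];
        int indx;
        for (indx = 0; indx < old_ary.length; indx++) {
            arrays[indx] = old_ary[indx];
        }
        for (; indx < v_count; indx++) { // missed rename: should be a_count
            arrays[indx] = null;
        }
        return arrays;
    }

    // Same function after "v_count" has been renamed consistently.
    static Object[] moreArraysFixed(Object[] old_ary) {
        Object[] arrays = new Object[a_count];
        int indx;
        for (indx = 0; indx < old_ary.length; indx++) {
            arrays[indx] = old_ary[indx];
        }
        for (; indx < a_count; indx++) {
            arrays[indx] = null;
        }
        return arrays;
    }

    // Returns true if the buggy version runs past the end of "arrays".
    static boolean buggyOverruns(Object[] old_ary) {
        try {
            moreArraysBuggy(old_ary);
            return false;
        } catch (ArrayIndexOutOfBoundsException e) {
            return true;
        }
    }
}
```

With the assumed sizes, the buggy loop keeps writing until it reaches index 4 of a four-element array, whereas the fixed loop stops at the array's real size.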

The third example, from a paper by Jiang et al. [1], is code in the file dependency.c from the GCC Fortran compiler. In the original fragment, the identifier "l_stride" appears both in the condition of an if statement and in its body. In the modified pasted fragment, one of these instances was changed to "r_stride" when it was supposed to remain "l_stride". This is a different type of error from the other two, but it is still an inconsistency in renaming that neither the compiler nor the programmer caught during development.

We now demonstrate how CReN would catch each of the identifier renaming errors in the three examples, assuming that each program is being written in the IDE. The examples have been rewritten in Java.

In the first example, the for loop is copied and pasted from lines 92-99 to lines 111-118 in the memory.c file. CReN detects this and, with support from the ASTs, extracts a rule stating that all occurrences of the identifier "prom_phys_total" in lines 93-98 must map to one and the same identifier in the new copy. With this rule, when the programmer changes any instance of "prom_phys_total" in the pasted code fragment to "prom_prom_taken", all of the other instances in the group are also renamed to "prom_prom_taken" consistently, as shown below. Hence, CReN can prevent the missed renaming shown in the first cell of the buggy column in the examples table.

CP-Miner
CReN consistently renames all instances of "prom_phys_total" to "prom_prom_taken" in the fragment when any one instance of "prom_phys_total" in the fragment is modified.
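The behavior described in this caption can be approximated with a short Java sketch. It is not CReN's implementation: CReN derives its groups from Eclipse ASTs, whereas this sketch substitutes a whole-identifier regular-expression match over a range of lines. The class name, the method renameInRange, and its parameters are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Simplified stand-in for consistent renaming inside a pasted fragment.
// Assumption: whole-identifier text matching replaces CReN's AST-based grouping.
class RangeRenameSketch {
    // Renames every whole-identifier occurrence of oldName in lines
    // first..last (0-based, inclusive) and leaves all other lines untouched.
    static List<String> renameInRange(List<String> lines, int first, int last,
                                      String oldName, String newName) {
        // Lookarounds keep "prom_phys_total_x" from matching "prom_phys_total".
        Pattern wholeIdentifier = Pattern.compile(
                "(?<![A-Za-z0-9_$])" + Pattern.quote(oldName) + "(?![A-Za-z0-9_$])");
        List<String> result = new ArrayList<>(lines);
        for (int i = first; i <= last && i < result.size(); i++) {
            Matcher m = wholeIdentifier.matcher(result.get(i));
            result.set(i, m.replaceAll(Matcher.quoteReplacement(newName)));
        }
        return result;
    }
}
```

Restricting the rename to the pasted range is the point of the sketch: the original fragment keeps its identifier, while every instance in the copy changes together.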

In the second example, the entire function is copied and pasted from lines 118-150 to lines 152-184 in the file storage.c. CReN detects the copying and pasting and, from the ASTs, extracts a rule stating that all occurrences of the identifier "v_count" in lines 118-150 must map to one and the same identifier in the new copy. With this rule, when the programmer changes any instance of "v_count" in the pasted code fragment to "a_count", all of the other instances in the group are also renamed to "a_count" consistently, as shown below. CReN can therefore prevent the missed renaming in the second for loop of this example.

Bug Isolation
CReN consistently renames all instances of "v_count" to "a_count" in the fragment when any one instance of "v_count" in the fragment is modified.

The third example is different from the other two. Here, an if statement was copied and pasted from lines 414-415 to lines 422-423 in the dependency.c file. CReN detects the copying and pasting and, from the ASTs, extracts a rule stating that all occurrences of the identifier "l_stride" in lines 414-415 must map to one and the same identifier in the new copy. With this rule, when the programmer changes any instance of "l_stride" in the pasted code fragment (for example, the bottom one) to "r_stride", all of the other instances in the group (for example, the top one) are also renamed to "r_stride" consistently.

However, according to Jiang et al. [1], although the GCC developers confirmed that the inconsistency (one "l_stride" and one "r_stride") is a bug, the cause is not a missed rename: the programmers intended to rename neither instance of "l_stride" in this clone. We do not focus on exactly this case, since we expect pasted code to be modified (we consider the kind of copy-and-paste in which code is reused as a template, as opposed to exact duplication), but CReN would still alert the programmer to the inconsistency. On seeing the other instance of "l_stride" also being renamed to "r_stride", programmers should realize that they did not intend to make either modification. (This differs from the case in which the programmer intends to rename one instance of an identifier independently of the others. For that case, CReN directly lets the programmer remove an instance of an identifier from a group that is to be renamed consistently.)
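The idea of a rename group with an opt-out can be sketched in Java as below. This is an illustrative model only, not CReN's code or API: the class RenameGroup, its methods, and the token list used to exercise it are all hypothetical, and tokens are matched by plain string equality rather than through an AST.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical model of a group of identifier instances renamed together.
class RenameGroup {
    private final List<String> tokens;
    private final Set<Integer> members = new TreeSet<>();

    // Groups every token in the pasted fragment equal to the given identifier.
    RenameGroup(List<String> pastedTokens, String identifier) {
        this.tokens = new ArrayList<>(pastedTokens);
        for (int i = 0; i < tokens.size(); i++) {
            if (tokens.get(i).equals(identifier)) {
                members.add(i);
            }
        }
    }

    // Detaches one instance so that it can be renamed independently.
    void remove(int tokenIndex) {
        members.remove(tokenIndex);
    }

    // Editing a grouped instance renames the whole group;
    // editing a detached instance changes only that instance.
    void edit(int tokenIndex, String newName) {
        if (members.contains(tokenIndex)) {
            for (int i : members) {
                tokens.set(i, newName);
            }
        } else {
            tokens.set(tokenIndex, newName);
        }
    }

    // Returns a copy of the current token sequence.
    List<String> tokens() {
        return new ArrayList<>(tokens);
    }
}
```

Under this model, editing one grouped "l_stride" changes the other as well, which is the visible cue the paragraph above relies on; calling remove first reproduces the independent rename.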

The three examples discussed here (written in Java) are available from the download page.

References

[1] L. Jiang, Z. Su, and E. Chiu, "Context-Based Detection of Clone-Related Bugs", European Software Engineering Conference (ESEC) and ACM SIGSOFT International Symposium on the Foundations of Software Engineering (FSE), 2007.

[2] Z. Li, S. Lu, S. Myagmar, and Y. Zhou, "CP-Miner: A Tool for Finding Copy-paste and Related Bugs in Operating System Code", USENIX-ACM SIGOPS Symposium on Operating Systems Design and Implementation (OSDI), 2004.

[3] B. Liblit, A. Aiken, A.X. Zheng, and M.I. Jordan, "Bug Isolation via Remote Program Sampling", ACM SIGPLAN Conference on Programming Language Design and Implementation (PLDI), 2003.