Finding
Paper
Citations: 1
Abstract
Developers may choose to implement a library, despite the existence of similar libraries, considering factors such as computational performance, language or platform dependency, and accuracy. As a result, GitHub is a host to several library projects that have overlaps in the functionalities. These overlaps have been of interest to developers from the perspective of code reuse or preferring one implementation over the other. We present an empirical study to explore the extent and nature of existence of these similarities in the library functions. We have further studied whether the similarity among functions across different libraries and their associated test suites can be leveraged to reveal defects in one another. Applying a natural language processing based approach on the documentations associated with functions, we have extracted matching functions across 12 libraries, available on GitHub, over 2 programming languages and 3 themes. Our empirical evaluation indicates existence of a significant number of similar functions across libraries in same as well as different programming languages where a language can influence the extent of existence of similarities. The test suites from another library can serve as an effective source of defect revealing tests. The study resulted in revealing 72 defects in 12 libraries. Further, we analyzed the source of origination of the defect revealing tests. We deduce that issue reports and pull requests can be beneficial in attaining quality test cases not only to test the libraries in which these issues are reported but also for other libraries that are similar in theme.
Authors
Devika Sondhi, Divya Rani, Rahul Purandare
Journal
2019 12th IEEE Conference on Software Testing, Validation and Verification (ICST)