Chapter 10 Final Words

This book was produced as part of the course IN4334. The goal of the course was to familiarize students with analytics as it is applied in software development to improve its performance. As part of their curriculum, students organized themselves into groups of three to four and studied one aspect of software development in detail. The topics covered in the course are testing, build, bug prediction, ecosystems, release engineering, code review, runtime and performance, and app stores. Some of these topics are quite specific, such as bug prediction, while others have a broad scope, such as release engineering.

Each group conducted a systematic literature review to answer the following three questions: (1) What is the ‘state-of-the-art’ and/or ‘state-of-the-practice’ in the studied topic? (2) What aspects of the topic are studied in software analytics research? (3) What aspects of the topic should be studied in future software analytics research? Below we discuss the findings for each topic.

In testing practice, developers update test cases on an as-needed basis, and indications from test coverage tools, if such tools are used at all, are often ignored. Further, test-driven development, while applied in practice, is not well understood. In research, studies focus on generating test cases, mostly semi-automatically, based on risks and evolving software. In the future, more work is required to generate test cases automatically.

Recent studies on software builds focus on continuous integration, aiming to identify best practices, to improve and predict build success rates, and to estimate build time. In practice too, a significant percentage of projects are now adopting continuous integration, supported by popular websites such as GitHub and services such as Travis. There are, however, challenges with the adoption of continuous integration, resulting from the lack of scalable tools, an incomplete understanding of user needs, and the limits of the techniques used for analysis.

For bug prediction, several models have been built using machine learning techniques and statistics; nonetheless, these tools see little use in industry. From a research perspective, there are still gaps in predicting bugs, especially in accounting for developer-related factors.

Current research on software ecosystems mostly concerns change impact analysis approaches, analysis of developers’ behavior towards breaking changes, investigation of issues related to library and API deprecation, and dependency ‘health’ metrics. Many studies show that, in practice, developers are reluctant to update their dependencies. There is room for developing tools that help developers fix dependency issues in their software systems automatically. Open research questions for the future include whether results from studies of particular ecosystems generalize to others, and how to establish metrics and approaches that automatically assess the ‘health’ of a system and recommend improvements to developers.

Advances in release engineering, quite unlike other aspects of development, are driven by industry. Today, several aspects of ‘modern’ release engineering are studied, including rapid release, DevOps, continuous integration, continuous deployment, branching and merging, build configuration, and infrastructure-as-code. Most current research on release engineering focuses on the transition from traditional release to rapid release through case studies. In the future, more generalizable findings may better inform the community.

Recent studies on code reviews show that they can improve code robustness and maintainability. It is worth noting that there is a correlation between reviewers’ expertise and their influence on the members of their teams during the review process. In practice, developers use many tools, such as Gerrit, Crucible, Stash, GitHub pull requests, Upsource, Collaborator, ReviewClipse, and CodeFlow, to help them comprehend software systems and fix bugs in them. Future research has many directions to pursue, including improving existing tools and their accuracy, and developing new tools that automate code review tasks.

In the field of runtime and performance, especially where energy efficiency is concerned, many current studies introduce guidelines and tools that aim to increase developers’ energy awareness while they are programming modern software apps. However, these guidelines are too generic, whereas developers require project-specific energy-related information. Also, the available tools for measuring the energy consumption of software programs and providing refinement recommendations are often poorly documented and impractical. Therefore, there is a need to apply state-of-the-art approaches to practical case studies, which would make those approaches more useful. Furthermore, studies that compare the energy efficiency of programming tasks written in different programming languages show that commonly used languages, such as Java, are not the most energy-efficient ones. This leaves many open research questions for future studies about why developers keep choosing less energy-efficient programming languages for modern apps.

Regarding app store analytics, research interest in review analytics for mobile apps is currently increasing. Since the appearance of mobile app stores in 2008, many empirical studies have mined app stores to identify usability issues, software bugs, and security vulnerabilities. An interesting finding from recent studies on review analytics is that users tend to raise their app ratings when developers respond to user comments and requests. Future research directions in review analytics include the use of sentiment analysis and machine learning. Additionally, researchers can consider studying how to make existing tools that mine and categorize user reviews more practical.