Software maintenance is a significant problem whose solution will save a great deal of money throughout the software industry. In pursuit of these savings much research is being done. However, as with other software engineering areas, there is a concern that our efforts lack hard evidence and critical evaluation, and that without these, we can't develop a deep understanding of what tools and processes work, when they work, or why. Consequently, many people believe that rigorous empirical methods must be one of the cornerstones of research in this field.
The Code Decay Project.
We are conducting a long-term, multidisciplinary project to discover the fundamental causes and symptoms of code decay, and remedies for it. The project team contains researchers in Statistics, Experimentation, Organizational Theory, Programming Languages, Software Engineering, and Visualization.
Our primary object of study is the AT&T 5ESS™ switching system. It is composed of more than 50 subsystems and contains over 18 million lines of code. Along with the source, we have the system's change control history for the past 15 years, covering 3.6 million code changes that implement 672,000 change requests. We also have data on its planned and actual development milestones, effort and testing data, organizational history, development policies, and coding standards.
Our goals are to (1) define response variables and document the existence of code decay, (2) develop code decay indices, (3) identify factors causing it, and (4) create and evaluate prevention strategies.
My specific interests lie in analyzing the version control systems (VCS) of large development efforts. These systems contain significant amounts of data that could be, but currently are not, exploited in the study of system evolution. Together with my student Jung-Min Kim and Drs. Siy and Thomas Ball, I am exploring novel uses of this information. So far, we have derived VCS-related metrics such as connection strength, which is based on the probability that two classes are modified together. We are exploring several extensions to this work, including time-series analyses, improved visualization techniques, and automated restructuring.
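To make the idea of connection strength concrete, here is a minimal sketch in Python. It is an illustration only, not the project's actual definition: it assumes each change is recorded as the set of classes it touched, and it estimates the strength of a pair as the fraction of changes touching either class that touch both. The function name, the estimator, and the sample history are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def connection_strength(change_sets):
    """Estimate how often each pair of classes is modified together.

    ASSUMPTION (not from the project): strength(a, b) is
    (# changes touching both a and b) / (# changes touching a or b).
    Pairs that never change together are omitted from the result.
    """
    touched = Counter()   # class -> number of changes touching it
    together = Counter()  # (class, class) -> number of changes touching both
    for change in change_sets:
        classes = sorted(set(change))  # sort so each pair has one canonical key
        touched.update(classes)
        together.update(combinations(classes, 2))
    return {
        (a, b): n / (touched[a] + touched[b] - n)
        for (a, b), n in together.items()
    }


# Hypothetical change history: each entry is the set of classes in one change.
history = [
    {"Parser", "Lexer"},
    {"Parser", "Lexer", "Ast"},
    {"Parser"},
    {"Ast", "Printer"},
]
strengths = connection_strength(history)
```

In this made-up history, Lexer and Parser change together in two of the three changes that touch either one, so they come out as the most strongly connected pair. Other estimators, such as a conditional probability of one class changing given that the other did, would fit the same description and might rank pairs differently.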
This project is sponsored by the National Science Foundation. Our industrial partner is Lucent Technologies.