Although experimentation is an accepted approach to scientific validation in most scientific disciplines, it has only recently gained acceptance within the software development community. Each scientific discipline has its own methods for validating its theories, however, and borrowing the experimental model of the particle accelerator from physics or the human factors studies from psychology may not be appropriate for computer science.

In this study, we examine experimental designs that are appropriate for validating claims within the software engineering community. What models have been used, what models are effective, and what data can we collect from them in order to provide for "good science"?


The goal is to develop a theory of software engineering experimentation in order to understand how empirical studies can affect the development of new software engineering technology.


Experimental designs, measurement data, data analysis, data modeling, experimental software engineering


Marvin Zelkowitz, Dolores Wallace (National Institute of Standards and Technology, Information Technology Laboratory)


In this paper we discuss a 12-model classification scheme for performing experimentation within the software development domain. We evaluate over 600 published papers in the computer science literature to determine how well the computer science community is succeeding at validating its theories.

In this paper we summarize the 12-model classification scheme described above and extend the analysis to over 100 papers from other scientific disciplines. We also consider the role that theory plays in developing validation models. (PostScript version of paper).

