Projects of the Experimental Software Engineering Group at the University of Maryland

Empirical Modeling:

Optimized Set Reduction

Problem

Planning, controlling, and evaluating the software development process requires collecting and analyzing historical data from similar projects. Classical data analysis techniques have limitations when applied to software engineering data because of the following inherent constraints:
  • it is very difficult to make valid assumptions about the relationships between variables or about the probability distributions of variables over their ranges;
  • data sets can contain both continuous and discrete explanatory variables;
  • outliers can occur for both explanatory and dependent variables;
  • the interdependencies of explanatory variables can affect the understandability of models but are not always harmful to their accuracy;
  • an independent variable may be a much stronger factor on a particular part of its range or value domain;
  • missing information is a common problem in software measurement.
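
The constraint about a variable being a stronger factor on part of its range can be illustrated with a small sketch. The data below are hypothetical and are not taken from the cited studies; the names `size_kloc` and `defects` and the threshold of 6 are assumptions made only for illustration, and Pearson correlation stands in for a classical technique that summarizes a relationship with one global number.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: module size (KLOC) and defect counts.
# Defects are unrelated to size for small modules and grow
# steadily with size once modules exceed 6 KLOC.
size_kloc = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
defects = [2, 1, 2, 1, 2, 1, 4, 7, 10, 13]

# Split the observations at the (assumed) threshold of 6 KLOC.
low = [(s, d) for s, d in zip(size_kloc, defects) if s <= 6]
high = [(s, d) for s, d in zip(size_kloc, defects) if s > 6]

r_all = pearson(size_kloc, defects)
r_low = pearson([s for s, _ in low], [d for _, d in low])
r_high = pearson([s for s, _ in high], [d for _, d in high])

print(f"whole range: r = {r_all:.2f}")
print(f"size <= 6:   r = {r_low:.2f}")
print(f"size > 6:    r = {r_high:.2f}")
```

A single coefficient over the whole range hides the fact that size carries almost no information about defects below the threshold and a great deal above it.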

Goal

Design alternatives to classical data analysis techniques that account for the specific constraints of software engineering data, in order to build empirical models that are more interpretable, more accurate, and easier to use.

Keywords

Data analysis, classification, prediction, empirical modeling, machine learning, stochastic modeling, OSR, quality evaluation

Participants

Lionel Briand, Bill Thomas, Chris Hetmanski, Victor R. Basili

References

Modeling and Managing Risk Early in Software Development.
L. Briand, W. Thomas, and C. Hetmanski.
In Proc. of the 15th Int'l Conf. on Software Engineering, pp. 55-65, Baltimore, May 1993.

Developing Interpretable Models for Identifying High Risk Software.
L. Briand, V. Basili, and C. Hetmanski.
IEEE Transactions on Software Engineering, 19(11):1028-1044, November 1993.

A Pattern Recognition Approach for Software Engineering Data Analysis.
L. Briand, V. Basili, and W. Thomas.
IEEE Transactions on Software Engineering, 18(11):931-942, November 1992.

Last updated: January 13, 1997 by Filippo Lanubile
