Lessons Learned from our Experiments


1) Set realistic goals for the experiment.

During the experimental design phase, our goal was to have the students handle all phases of the software engineering lifecycle, including requirements review, design, design review, implementation, and test. We wanted to gather data at each of these phases. We had the students review the requirements, and then we handed back a "repaired" requirements document in which all the defects submitted by the class were removed, but defects not found by the class remained. We had hoped that students would uncover defects during the design and design review phases, and that we could analyze which types of defects were found when. Unfortunately, this goal was too ambitious, and it led us down a path that made the class more difficult from a pedagogical standpoint. While withholding from the students the information about the defects remaining in the "fixed" requirements document made sense from an experimental standpoint, it caused some confusion from the students' point of view. Although we did hold a 30-minute session during class where the students could ask questions about the "fixed" requirements, it was clear that the students were still overwhelmed. A better solution would have been to separate the design review and the requirements review experiments into two smaller experiments, providing the students with closure at the end of each phase. This would improve things from a teaching standpoint and also keep the experiments smaller and more manageable.


2) Do not overestimate the skill level of the test subjects - give them experience with the experimental treatment before the experiment starts and choose reasonable test documents and treatments.

Going into this experiment, we knew that the students involved were mostly undergraduates with little or no professional experience. This was their first class in Software Engineering. While the students were mostly Juniors, Seniors, and first-year Grad students, this was still the first time many of them had seen a real requirements document or UML design techniques. The project in this class may also have been the largest piece of software these students had ever written. To compensate for this lack of experience, we spent a class session training the students on the techniques they would need to apply during the requirements reviews, and another covering the techniques for the design reviews.

The students reported that they had not received enough training in the techniques. In hindsight, we realized that just as the students had never before seen a real requirements document, they had never used any of the experimental techniques. The training lectures were passive experiences for the students. Perhaps they would have performed better during the experiments had they actively practiced the techniques on a sample document prior to the experiment. When the students cannot perform the technique correctly due to lack of experience, the results are biased against the experimental treatment.

An additional difficulty in our case was the length of the checklist during the requirements reading experiment. We used the Volere Template as a foundation for our checklist. We edited it down, but it was still quite long. This checklist also instructed the students to list certain classes of defects (lack of security and maintenance requirements) that were deliberately excluded from the requirements document. Thus the students spent a good deal of time listing as defects items that were outside the scope of the project. In experiments such as this, the checklists should be brief and tailored more carefully to the document being inspected.


3) Keep the documents simple

In previous experiments, subjects had reported that the test documents were unrealistic and obviously designed specifically for an experiment. Partially in response to this, and partially due to our ambitious goals for the experiment, we sought to make the document for the requirements review as much like a "real world" document as possible. We originally sought to get a requirements document from a company in industry, but were unable to do so. A company did give us a requirements document used to train their software engineers, and we used this to flesh out our requirements document. Unfortunately, neither the students nor anybody on the experimental design team had any domain experience in this area. Additionally, we wrote the requirements document to serve the role of both the experimental document and the requirements document for the system that the students were going to implement. The end result was a requirements document that was significantly larger than the documents used in previous experiments, and one that was difficult to understand in the short period of time the students had for the requirements review.

Compounding the problem, we chose to include defects that we had actually injected ourselves during the writing of the requirements document. The goal was to make the defects more realistic. Unfortunately, the result was that the defects were harder to find. Additionally, debate continues as to the exact list of defects in the document.

In hindsight we probably would have been better served using a shorter requirements document from a domain with which the students had experience.


4) When dealing with student subjects, it is important to carefully choose which aspect of the experiment will be graded, in an effort to maximize process conformance.

In a previous iteration of the requirements reading experiment, with graduate students as the test subjects, we found that the students assumed that their grade was directly proportional to the number of defects they reported. The result was an overabundance of false positives reported by the test subjects. In this experiment we sought to maximize process conformance while simultaneously getting results as close to "real world" results as possible, where the only defects reported are those that the subjects actually believe to be defects. To this end, we emphasized to the students that we would grade the artifacts they produced during the reading experiments and not their fault lists. Additionally, we told the students that only defects they found in the requirements document would be repaired, and that were they to report false positives, they would waste time repairing perfectly good requirements. This last bit was a bluff; we never intended for them to repair the requirements document. However, they believed they would have to do so as they performed the requirements reading experiment. The students also knew that the requirements document they were reviewing was the requirements document for the system they would have to build. While the requirements document was too difficult for the requirements review, the students did find that by going through the requirements review process they had a better understanding of the document. It is difficult to judge whether these measures improved process conformance because of the difficulty level of the reading experiments. However, this is a key area to address when designing experiments. When students are the test subjects, the portion of the experiment on which the students will be graded must be carefully chosen.


5) Assign "management" of each portion of the experiment to one experimenter.

In order to run this experiment we had to do a great deal of advance preparation. The requirements document was created from scratch at the same time that the object-oriented reading techniques were being designed, and multiple people were working on each task. We learned that it is critical to maintain communication between teams when multiple teams are working on different parts of the same experiment at the same time. We learned the hard way that, in order to minimize confusion, it is a great help for each team to have a leader or manager with whom the other teams can coordinate. Additionally, it helps to have one person overseeing the entire experiment to help coordinate the various teams.



O-O specific lessons learned


1) O-O reading techniques are a good idea

While we readily admit that we encountered some difficulties with the techniques and with the documents we used, the students nearly unanimously indicated that if they had to read a design document, they would use the techniques they had learned. There is very little guidance available on how to read designs, and any direction we can give students seems to help them.


2) Semantic techniques are better than syntactic techniques

One major reason for the success of the perspective based reading for requirements is that the reading techniques require significant semantic processing by the reader. The reader must understand the requirements document enough to produce an additional artifact (use cases, test plan, data flow diagram). During the course of the semantic analysis, the reader discovers defects.

Our first draft of O-O design reading techniques involved verifying that certain words and attributes appearing in one section of the document also appeared in the proper places in other sections. These techniques are largely syntactic in nature and do not require that the reader do the same level of semantic processing done in the requirements reading.
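To illustrate the difference, a syntactic check of this kind could in principle be automated, which is a sign of how little semantic processing it demands of the reader. The sketch below is our own hypothetical illustration, not the actual technique: it cross-checks the set of class names appearing in one section of a design document against the names appearing in another.

```python
def cross_check(class_diagram_names, interaction_diagram_names):
    """Purely syntactic check: every class named in the interaction
    diagrams should be defined in the class diagram, and every class
    in the class diagram should appear in some interaction diagram.
    Returns (undefined_names, unused_names)."""
    undefined = sorted(interaction_diagram_names - class_diagram_names)
    unused = sorted(class_diagram_names - interaction_diagram_names)
    return undefined, unused

# Hypothetical names extracted from two sections of a design document.
class_names = {"Order", "Customer", "Invoice"}
interaction_names = {"Order", "Customer", "Cart"}

undefined, unused = cross_check(class_names, interaction_names)
print(undefined)  # ['Cart'] - used in interactions but never defined
print(unused)     # ['Invoice'] - defined but never exercised
```

A mismatch found this way flags a possible defect, but notice that the reader never has to understand what any of the classes mean, which is exactly the limitation we observed.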

We discuss this a great deal in the "Detecting Defects in Object-Oriented Designs" paper. (link will be provided shortly)


3) When developing a new technique, test it out before running the experiment.


This may sound obvious. However, while we had several people read over the new techniques, we did not have anyone actually apply them. Our techniques involved underlining different words with different colors. In our reviews, we missed the fact that in one technique, nouns were underlined in one color, while in another technique to be performed by the same person, that same color was used to underline something completely different. The result was that the colors, which we had intended to assist the students, ended up confusing them. This could easily have been avoided by actually performing the tasks before assigning them.


4) Choose the test document carefully.

In our class, we had the student groups each produce a design based on the requirements. We then chose the "best" design from that group and had the students review it using the design review techniques. We then gave each group a choice between using the design that they had reviewed or the design that they themselves created. The downside here was that the "best" design was not very good. After all, this was the first time these students had ever created object-oriented designs. One result of this is that it became very difficult to evaluate the effectiveness of the techniques.

We made this choice in part because of our ambitious goals for the experiments as a whole. We wanted to see if the students discovered requirements defects during the design and design review phases. We also overestimated the quality of the designs the students would produce. When we evaluated the designs they turned in, before handing back the "best" one for review, we realized our error but did not have much time to correct it. Some members of the team who were not as experienced in O-O design had assumed that we could take good portions from multiple designs and create a conglomerate, but that is not possible. At that point we did not have enough time to create a design from scratch for them to evaluate. Additionally, since we did not produce the design ourselves, we did not seed defects, so we had no control over the type and number of defects in the design. In the future, it would be better to run more controlled experiments and provide the students with a design that has been tailored to their skill level, with carefully seeded defects.
