|
C M S C 2 1 4 C o m p u t e r S c i e n c e I I F a l l 2 0 0 2 |
Since many computer scientists have strong math backgrounds, it's only natural to think that math might help in creating better software. Math is good at many things. In particular, it's good at being precise and unambiguous. If you understand math, then you can be much more precise in your description, at least, more so than in English or other natural languages.
In essence, that's where "design by contract" comes from. The idea is to describe properties of the classes. These properties are called: preconditions, postconditions, and class invariants
Let's define all three, and demonstrate the basics of design by contract on a stack class.
A method is a short name for a member function, which is the name used in another object oriented language: Smalltalk.
A condition is a logical statement that is either true or false, depending on the current values used. It can be written in English (or some other natural language), but is less precise when written in English.
You can write it in first order predicate logic. This gives you precision, but then you have to make sure that it translates to the code you've written.
So, a reasonable solution is to write it, as best you can, in first order predicate logic (or something close) and implement the condition as a helper function. By implementing it as a function, you have the added benefit that you can test whether the condition is being met or not.
If the person calling the method fails to observe the precondition, by passing in a negative argument, the depositMoney() may actually cause the account to become negative. This is actually OK, because that's the purpose of the precondition. It's a warning to the caller of the method to follow the precondition, and if the caller fails to heed the warning, then the object will (potentially) be corrupted.
Sometimes Or, if you have withdrawMoney(), the precondition may affect both the parameter (e.g., money > 0) and the data member (e.g., balance > 0).
You might, however, make a precondition for a method to be "true". This means, you place no restrictions on the input or the data members. Again, consider depositMoney(). You could allow the user to send in money that's negative. If negative, nothing is done (since it wouldn't make sense).
The purpose of a postcondition is threefold:
In a nutshell, the postconditions tell you what happened after calling the method. The reason it's difficult to specify a postcondition is that it can be difficult to describe the underlying data structure.
For example, if you have a linked list, how do you describe it, so that it reflects the nature of the linked list? The answer is often: you don't. You may have to come up with some math notation that describes the essentials of what is happening.
Here's an example. For an unsorted list, you might add an element
to the front of back. So, we can write the following.
list = elt + list'
where + is basically a concatentation operator, and elt is appended
at the front of a list.
There's another reason postconditions are harder to specify. You have to describe what changed. That means, you have to describe what used to be, and what is now.
If you're careful, you may have noticed a quote. Call this mark, "prime". list' (read: list prime) will be the list as it was just as the method was being called. list is the list at the end of the call.
Thus, list = elt + list' means that the new list is the old list with elt attached at front. This becomes even more of a problem to describe for trees.
Ultimately, it's easier to write code to test postconditions than to describe it in English/first order logic. However, we'll do both, when we can, since a person who's looking at the classes will want to read it.
The only times it might not be sorted is while a method is running. For example, if you add an element to a list, it may temporarily be unsorted, but then you sort.
Class invariants are useful because they help you avoid writing unnecessary code, and also help you think about what kind of behavior a class ought to have.
Again, consider the sorted list example. Suppose you implement the sorted list using an STL vector as a data member. STL vectors can be sorted by calling the sort algorithm.
You might be tempted to sort the vector data member all the time. For example, suppose you're trying to determine whether some element is in the sorted list. You decide, "I will sort it first, because I sort it elsewhere". But that's totally unnecessary, and in fact, may not compile.
A class invariant for a sorted list is that it's sorted. That means, you can assume, prior to running any method (except constructors) that the list is sorted. If the method doesn't change the list, then it doesn't make sense to sort it, even if you "just want to be safe". In fact, if it doesn't change the list, then the method is probably const, and if it's const, you won't be able to sort without causing compiler errors.
The class invariant should hold after an object has been constructed, before it has been destructed, and the only time it may not hold is while a method of the class is being run. Prior to and after a method completes, the class invariant should hold.
Let's come up with a list of class invariants. These invariants may change based on the implementation.
Since we're using a linked list to represent the stack, we can use as much memory as needed, to the limits of what the runtime system will provide. Thus, the precondition is simply "true", meaning that there's nothing special required of the stack of the string sent as parameter. If, however, a static array were used, then its size would be fixed, and a precondition would have to be that the stack size is smaller than the maximum stack size.
The postcondition is trickier. It requires that we keep track of the size prior to adding, and also the data elements prior to adding. We'll refer to the old size as size' (size prime) and the previous elements as list'.
Thus, the postcondition is that size = size' + 1 and that the list is list' plus the str just added.
Later on, we'll discuss how to write a testing function for this. In the meanwhile, let's write down pre and post conditions.
The real difficulty with describing the postcondition is to use some math terminology, which is precise and short, so that we avoid discussing code, which can be verbose. When using math terminology, it can be easy to lose "precision". For example, what does list == str + list' exactly mean?
The precondition for pop() is a little trickier. Theoretically, you can add as many elements to a stack as you want (in reality, you worry about the amount of memory available). However, you can't remove as many elements as you want. In particular, if the stack is empty, there's no more elements to remove. What should you do?
We can establish that the precondition is size > 0. A precondition is like a warning label. It says that if the precondition is not met, then the operation may fail. This could be as "harmless" as simply not recording the data, to as serious as corrupting data or core dumping.
But suppose you did the following. You tested, within the pop() code if the size was 0, and if not, you returned back a default "error" object (something the user would be able to tell). In that case, the precondition would be "true", because the code actually takes care of the case when the stack is empty. The precondition indicates when you will correctly handle the call, and "true" is more accurate than size > 0.
The precondition and postcondition for size() is both true. size() is a query method. That means, it doesn't modify the data members. Therefore, there's no postcondition. We can leave it as "true". Or you can say "result == size" which indicates what that "size" was returned.
Like size(), there's not really much of a precondition. You can always call this method. The postcondition only applies to the return value, which returns true if size is equal to 0.
You can only peek when there's something to peek at. The difficult is describing what gets returned as a result.
From a user's perspective, the conditions should be unaffected by the actual implementation. So, either one needs to present a model to the user and give model variables (for example, let the user think there is a size data member, regardless of whether there is one or not), or to describe things purely in terms of the operations provided, which may be harder to do (algebraic descriptions try to do this).
However, from an implementer's perspective, writing down the conditions based on the implementation help specify the implementation, so that it's more likely to be correct.
Generally, we'll favor the implementer's perspective, since the goal is to write conditions that help us code the class, but it helps to realize when writing conditions (or parts of conditions) whether those conditions are from the user's or implementer's perspective.