(** * Datatypes: An Introduction to Basic Coq Datatypes *)

(* ================================================================= *)
(** ** Booleans *)

(** It's worth taking a look at some of the datatypes in Coq's
    standard library. Note that all of these are manually defined:
    none of them are primitive. Let's start with booleans, which
    should remind us of the _coin_ type in the last chapter. *)

Inductive bool : Type :=
  | true
  | false.

(** Although we are rolling our own booleans here for the sake of
    building up everything from scratch, Coq does, of course, provide
    a default implementation of the booleans, together with a
    multitude of useful functions and lemmas. (Take a look at
    [Coq.Init.Datatypes] in the Coq library documentation if you're
    interested.) Whenever possible, we'll name our own definitions
    and theorems so that they exactly coincide with the ones in the
    standard library.

    Functions over booleans can be defined in the same way as
    above: *)

Definition negb (b:bool) : bool :=
  match b with
  | true => false
  | false => true
  end.

Definition andb (b1 b2: bool) : bool :=
  match b1 with
  | true => b2
  | false => false
  end.

Definition orb (b1 b2:bool) : bool :=
  match b1 with
  | true => true
  | false => b2
  end.

(** The last two of these illustrate Coq's syntax for
    multi-argument function definitions. The corresponding
    multi-argument application syntax is illustrated by the following
    "unit tests," which constitute a complete specification -- a truth
    table -- for the [orb] function: *)

(* While we can use [Compute] to check that [orb] behaves correctly,
   sometimes it's useful to include unit tests as small lemmas
   declared with [Example]. *)

Example test_orb1: (orb true false) = true.
Proof. apply eq_refl. Qed.
Example test_orb2: (orb false false) = false.
Proof. apply eq_refl. Qed.
Example test_orb3: (orb false true) = true.
Proof. apply eq_refl. Qed.
Example test_orb4: (orb true true) = true.
Proof. apply eq_refl. Qed.
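
(** As a hedged illustration that is not part of the original text,
    the same truth-table style of specification can be sketched for
    [andb]. The names [test_andb1] through [test_andb4] are
    hypothetical, chosen only to mirror the [orb] tests above. *)

(* Each case is checked by computation alone: [andb] inspects its
   first argument and either returns the second or returns [false]. *)
Example test_andb1: (andb true true) = true.
Proof. apply eq_refl. Qed.
Example test_andb2: (andb true false) = false.
Proof. apply eq_refl. Qed.
Example test_andb3: (andb false true) = false.
Proof. apply eq_refl. Qed.
Example test_andb4: (andb false false) = false.
Proof. apply eq_refl. Qed.

(** Together these four cases cover every combination of inputs, so
    they pin down [andb] just as the earlier tests pin down [orb]. *)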

(** We can also introduce some familiar syntax for the boolean
    operations we have just defined. The [Notation] command defines a
    new symbolic notation for an existing definition. *)

Notation "x && y" := (andb x y).
Notation "x || y" := (orb x y).

Example test_orb5: false || false || true = true.
Proof. apply eq_refl. Qed.

(** _A note on notation_: In [.v] files, we use square brackets
    to delimit fragments of Coq code within comments; this convention,
    also used by the [coqdoc] documentation tool, keeps them visually
    separate from the surrounding text. In the HTML version of the
    files, these pieces of text appear in a [different font].

    The command [Admitted] can be used as a placeholder for an
    incomplete proof. We'll use it in exercises, to indicate the
    parts that we're leaving for you -- i.e., your job is to replace
    [Admitted]s with real proofs. *)

(** **** Exercise: 1 star (nandb) *)
(** Remove "[Admitted.]" and complete the definition of the following
    function; then make sure that the [Example] assertions below can
    each be verified by Coq. (I.e., fill in each proof, following the
    model of the [orb] tests above.) The function should return [true]
    if either or both of its inputs are [false]. *)

Definition nandb (b1:bool) (b2:bool) : bool
  (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

Example test_nandb1: (nandb true false) = true.
(* FILL IN HERE *) Admitted.
Example test_nandb2: (nandb false false) = true.
(* FILL IN HERE *) Admitted.
Example test_nandb3: (nandb false true) = true.
(* FILL IN HERE *) Admitted.
Example test_nandb4: (nandb true true) = false.
(* FILL IN HERE *) Admitted.
(** [] *)

(** **** Exercise: 1 star, optional (andb3) *)
(** Do the same for the [andb3] function below. This function should
    return [true] when all of its inputs are [true], and [false]
    otherwise. *)

Definition andb3 (b1:bool) (b2:bool) (b3:bool) : bool
  (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

Example test_andb31: (andb3 true true true) = true.
(* FILL IN HERE *) Admitted.
Example test_andb32: (andb3 false true true) = false.
(* FILL IN HERE *) Admitted.
Example test_andb33: (andb3 true false true) = false.
(* FILL IN HERE *) Admitted.
Example test_andb34: (andb3 true true false) = false.
(* FILL IN HERE *) Admitted.
(** [] *)

(* Let's prove some theorems about booleans. The first will look
   familiar from the previous class. *)

Definition negb_involutive (b : bool) : negb (negb b) = b :=
  match b with
  | false => eq_refl
  | true => eq_refl
  end.

(* Note that the other direction doesn't work, since doing case
   analysis on [negb b] or [negb (negb b)] won't tell Coq anything
   about [b]. *)

Fail Definition negb_involutive (b : bool) : negb (negb b) = b :=
  match negb (negb b) with
  | false => eq_refl
  | true => eq_refl
  end.

(* In general, we would rather do these proofs inside the proof
   assistant. As shorthand for [apply eq_refl] we can use Coq's
   built-in [reflexivity] tactic. *)

Theorem negb_involutive' : forall b : bool,
  negb (negb b) = b.
Proof.
  intros b. destruct b.
  - reflexivity.
  - reflexivity.
Qed.

(** It is sometimes useful to invoke [destruct] inside a subgoal,
    generating yet more proof obligations. In this case, we use
    different kinds of bullets to mark goals on different "levels."
    For example: *)

Theorem andb_commutative : forall b c, andb b c = andb c b.
Proof.
  intros b c. destruct b.
  - destruct c.
    + reflexivity.
    + reflexivity.
  - destruct c eqn:Ec.
    + reflexivity.
    + reflexivity.
Qed.

(* ================================================================= *)
(** ** Modules *)

(** Coq provides a _module system_, to aid in organizing large
    developments. In this course we won't need most of its features,
    but one is useful: If we enclose a collection of declarations
    between [Module X] and [End X] markers, then, in the remainder of
    the file after the [End], these definitions are referred to by
    names like [X.foo] instead of just [foo].
    We will use this feature to introduce the definition of the type
    [nat] in an inner module so that it does not interfere with the
    one from the standard library (which we want to use in the rest
    of the file because it comes with a tiny bit of convenient
    special notation). *)

Module NatPlayground.

(* ================================================================= *)
(** ** Numbers *)

(** The types we have defined so far, "enumerated types" such as
    [coin], [play] and [bool], share the property that each type has a
    finite set of values. The natural numbers are an infinite set,
    and we need to represent all of them in a datatype with a finite
    number of constructors. There are many representations of numbers
    to choose from. We are most familiar with decimal notation (base
    10), using the digits 0 through 9, for example, to form the number
    123. You may have encountered hexadecimal notation (base 16), in
    which the same number is represented as 7B, or octal (base 8),
    where it is 173, or binary (base 2), where it is 1111011. Using an
    enumerated type to represent digits, we could use any of these to
    represent natural numbers. There are circumstances where each of
    these choices can be useful.

    Binary is valuable in computer hardware because it can in turn be
    represented with two voltage levels, resulting in simple
    circuitry. Analogously, we wish here to choose a representation
    that makes _proofs_ simpler.

    Indeed, there is a representation of numbers that is even simpler
    than binary, namely unary (base 1), in which only a single digit
    is used (as one might do while counting days in prison by
    scratching on the walls). To represent unary with a Coq datatype,
    we use two constructors. The capital-letter [O] constructor
    represents zero. When the [S] constructor is applied to the
    representation of the natural number _n_, the result is the
    representation of _n+1_. ([S] stands for "successor", or "scratch"
    if one is in prison.) Here is the complete datatype definition.
*)

Inductive nat : Type :=
  | O
  | S (n : nat).

(** With this definition, 0 is represented by [O], 1 by [S O],
    2 by [S (S O)], and so on. *)

(** The clauses of this definition can be read:
      - [O] is a natural number (note that this is the letter "[O],"
        not the numeral "[0]").
      - [S] can be put in front of a natural number to yield another
        one -- if [n] is a natural number, then [S n] is too. *)

(** Again, let's look at this in a little more detail. The definition
    of [nat] says how expressions in the set [nat] can be built:

    - [O] and [S] are constructors;
    - the expression [O] belongs to the set [nat];
    - if [n] is an expression belonging to the set [nat], then [S n]
      is also an expression belonging to the set [nat]; and
    - expressions formed in these two ways are the only ones belonging
      to the set [nat]. *)

(** The same rules apply for our definitions of [coin], [bool],
    [play], etc.

    The above conditions are the precise force of the [Inductive]
    declaration. They imply that the expression [O], the expression
    [S O], the expression [S (S O)], the expression [S (S (S O))], and
    so on all belong to the set [nat], while other expressions built
    from data constructors, like [true], [andb true false], [S (S
    false)], and [O (O (O S))] do not.

    A critical point here is that what we've done so far is just to
    define a _representation_ of numbers: a way of writing them down.
    The names [O] and [S] are arbitrary, and at this point they have
    no special meaning -- they are just two different marks that we
    can use to write down numbers (together with a rule that says any
    [nat] will be written as some string of [S] marks followed by an
    [O]). If we like, we can write essentially the same definition
    this way: *)

Inductive nat' : Type :=
  | stop
  | tick (foo : nat').

(** The _interpretation_ of these marks comes from how we use them to
    compute.
*)

(** We can do this by writing functions that pattern match on
    representations of natural numbers just as we did above with
    booleans and days -- for example, here is the predecessor
    function: *)

Definition pred (n : nat) : nat :=
  match n with
  | O => O
  | S n' => n'
  end.

(** The second branch can be read: "if [n] has the form [S n']
    for some [n'], then return [n']." *)

End NatPlayground.

(** Because natural numbers are such a pervasive form of data,
    Coq provides a tiny bit of built-in magic for parsing and printing
    them: ordinary decimal numerals can be used as an alternative to
    the "unary" notation defined by the constructors [S] and [O]. Coq
    prints numbers in decimal form by default: *)

Check (S (S (S (S O)))).
  (* ===> 4 : nat *)

Definition minustwo (n : nat) : nat :=
  match n with
  | O => O
  | S O => O
  | S (S n') => n'
  end.

Compute (minustwo 4).
  (* ===> 2 : nat *)

(** The constructor [S] has the type [nat -> nat], just like
    [pred] and functions like [minustwo]: *)

Check S.
Check pred.
Check minustwo.

(** These are all things that can be applied to a number to yield a
    number. However, there is a fundamental difference between the
    first one and the other two: functions like [pred] and [minustwo]
    come with _computation rules_ -- e.g., the definition of [pred]
    says that [pred 2] can be simplified to [1] -- while the
    definition of [S] has no such behavior attached. Although it is
    like a function in the sense that it can be applied to an
    argument, it does not _do_ anything at all! It is just a way of
    writing down numbers. (Think about standard decimal numerals: the
    numeral [1] is not a computation; it's a piece of data. When we
    write [111] to mean the number one hundred and eleven, we are
    using [1], three times, to write down a concrete representation of
    a number.)

    For most function definitions over numbers, just pattern matching
    is not enough: we also need recursion. For example, to check that
    a number [n] is even, we may need to recursively check whether
    [n-2] is even.
    To write such functions, we use the keyword [Fixpoint]. *)

Fixpoint evenb (n:nat) : bool :=
  match n with
  | O => true
  | S O => false
  | S (S n') => evenb n'
  end.

(** We can define [oddb] by a similar [Fixpoint] declaration, but here
    is a simpler definition: *)

Definition oddb (n:nat) : bool := negb (evenb n).

Example test_oddb1: oddb 1 = true.
Proof. apply eq_refl. Qed.
Example test_oddb2: oddb 4 = false.
Proof. apply eq_refl. Qed.

(** Naturally, we can also define multi-argument functions by
    recursion. *)

Module NatPlayground2.

Fixpoint plus (n : nat) (m : nat) : nat :=
  match n with
  | O => m
  | S n' => S (plus n' m)
  end.

(** Adding three to two now gives us five, as we'd expect. *)

Compute (plus 3 2).

(** The simplification that Coq performs to reach this conclusion can
    be visualized as follows: *)

(*  [plus (S (S (S O))) (S (S O))]
==> [S (plus (S (S O)) (S (S O)))]
      by the second clause of the [match]
==> [S (S (plus (S O) (S (S O))))]
      by the second clause of the [match]
==> [S (S (S (plus O (S (S O)))))]
      by the second clause of the [match]
==> [S (S (S (S (S O))))]
      by the first clause of the [match]
*)

(** As a notational convenience, if two or more arguments have
    the same type, they can be written together. In the following
    definition, [(n m : nat)] means just the same as if we had written
    [(n : nat) (m : nat)]. *)

Fixpoint mult (n m : nat) : nat :=
  match n with
  | O => O
  | S n' => plus m (mult n' m)
  end.

Example test_mult1: (mult 3 3) = 9.
Proof. apply eq_refl. Qed.

(** You can match two expressions at once by putting a comma
    between them: *)

Fixpoint minus (n m:nat) : nat :=
  match n, m with
  | O , _ => O
  | S _ , O => n
  | S n', S m' => minus n' m'
  end.

End NatPlayground2.

Fixpoint exp (base power : nat) : nat :=
  match power with
  | O => S O
  | S p => mult base (exp base p)
  end.

(** **** Exercise: 1 star (factorial) *)
(** Recall the standard mathematical factorial function:

       factorial(0)  =  1
       factorial(n)  =  n * factorial(n-1)     (if n>0)

    Translate this into Coq.
*)

Fixpoint factorial (n:nat) : nat
  (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

Example test_factorial1: (factorial 3) = 6.
(* FILL IN HERE *) Admitted.
Example test_factorial2: (factorial 5) = (mult 10 12).
(* FILL IN HERE *) Admitted.
(** [] *)

(** Again, we can make numerical expressions easier to read and write
    by introducing notations for addition, multiplication, and
    subtraction. *)

Notation "x + y" := (plus x y)
                      (at level 50, left associativity)
                      : nat_scope.
Notation "x - y" := (minus x y)
                      (at level 50, left associativity)
                      : nat_scope.
Notation "x * y" := (mult x y)
                      (at level 40, left associativity)
                      : nat_scope.

Check ((0 + 1) + 1).

(** (The [level], [associativity], and [nat_scope] annotations
    control how these notations are treated by Coq's parser. The
    details are not important for our purposes, but interested readers
    can refer to the "More on Notation" section at the end of this
    chapter.)

    Note that these do not change the definitions we've already made:
    they are simply instructions to the Coq parser to accept [x + y]
    in place of [plus x y] and, conversely, to the Coq pretty-printer
    to display [plus x y] as [x + y]. *)

(** When we say that Coq comes with almost nothing built-in, we really
    mean it: even equality testing is a user-defined operation!
    Here is a function [eqb], which tests natural numbers for
    [eq]uality, yielding a [b]oolean. Note the use of nested
    [match]es (we could also have used a simultaneous match, as we did
    in [minus].) *)

Fixpoint eqb (n m : nat) : bool :=
  match n with
  | O => match m with
         | O => true
         | S m' => false
         end
  | S n' => match m with
            | O => false
            | S m' => eqb n' m'
            end
  end.

(** Similarly, the [leb] function tests whether its first argument is
    less than or equal to its second argument, yielding a boolean. *)

Fixpoint leb (n m : nat) : bool :=
  match n with
  | O => true
  | S n' =>
      match m with
      | O => false
      | S m' => leb n' m'
      end
  end.

Example test_leb1: (leb 2 2) = true.
Proof. apply eq_refl. Qed.
Example test_leb2: (leb 2 4) = true.
Proof. apply eq_refl. Qed.
Example test_leb3: (leb 4 2) = false.
Proof. apply eq_refl. Qed.

(** Since we'll be using these (especially [eqb]) a lot, let's give
    them infix notations. *)

Notation "x =? y" := (eqb x y) (at level 70) : nat_scope.
Notation "x <=? y" := (leb x y) (at level 70) : nat_scope.

Example test_leb3': (4 <=? 2) = false.
Proof. simpl. reflexivity. Qed.

(** **** Exercise: 1 star (ltb) *)
(** The [ltb] function tests natural numbers for [l]ess-[t]han,
    yielding a [b]oolean. Instead of making up a new [Fixpoint] for
    this one, define it in terms of a previously defined
    function. (It can be done with just one previously defined
    function, but you can use two if you need to.) *)

Definition ltb (n m : nat) : bool
  (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

Notation "x <? y" := (ltb x y) (at level 70) : nat_scope.

(* ... *)
plus_O_n. reflexivity. Qed.

(** **** Exercise: 2 stars (mult_S_1) *)
Theorem mult_S_1 : forall n m : nat,
  m = S n ->
  m * (1 + n) = m * m.
Proof.
  (* FILL IN HERE *) Admitted.

(* (N.b. This proof can actually be completed with tactics other than
   [rewrite], but please do use [rewrite] for the sake of the
   exercise.) *)
(** [] *)

(* ################################################################# *)
(** * Case Analysis on Inductive Types *)

(** Sometimes we will need to do case analysis on non-simple inductive
    types, like the natural numbers. For a natural number [n], we get
    two subcases: [n = 0] and [n = S x] for some [x]. Here we will
    want to name any new variables that appear. *)

Theorem plus_1_neq_0 : forall n : nat,
  (n + 1) =? 0 = false.
Proof.
  intros n. destruct n as [| n'] eqn:E.
  - reflexivity.
  - reflexivity.
Qed.

(** Before closing the chapter, let's mention one final
    convenience. As you may have noticed, many proofs perform case
    analysis on a variable right after introducing it:

       intros x y. destruct y as [|y] eqn:E.

    This pattern is so common that Coq provides a shorthand for it: we
    can perform case analysis on a variable when introducing it by
    using an intro pattern instead of a variable name. For instance,
    here is a shorter proof of the [plus_1_neq_0] theorem
    above. (You'll also note one downside of this shorthand: we lose
    the equation recording the assumption we are making in each
    subgoal, which we previously got from the [eqn:E] annotation.) *)

Theorem plus_1_neq_0' : forall n : nat,
  (n + 1) =? 0 = false.
Proof.
  intros [|n].
  - reflexivity.
  - reflexivity.
Qed.

(** **** Exercise: 1 star (zero_nbeq_plus_1) *)
Theorem zero_nbeq_plus_1 : forall n : nat,
  0 =? (n + 1) = false.
Proof.
  (* FILL IN HERE *) Admitted.
(** [] *)

(* ================================================================= *)
(** ** More on Notation (Optional) *)

(** (In general, sections marked Optional are not needed to follow the
    rest of the book, except possibly other Optional sections. On a
    first reading, you might want to skim these sections so that you
    know what's there for future reference.)

    Recall the notation definitions for infix plus and times: *)

Notation "x + y" := (plus x y)
                      (at level 50, left associativity)
                      : nat_scope.
Notation "x * y" := (mult x y)
                      (at level 40, left associativity)
                      : nat_scope.

(** For each notation symbol in Coq, we can specify its _precedence
    level_ and its _associativity_. The precedence level [n] is
    specified by writing [at level n]; this helps Coq parse compound
    expressions. The associativity setting helps to disambiguate
    expressions containing multiple occurrences of the same
    symbol. For example, the parameters specified above for [+] and
    [*] say that the expression [1+2*3*4] is shorthand for
    [(1+((2*3)*4))]. Coq uses precedence levels from 0 to 100, and
    _left_, _right_, or _no_ associativity. We will see more examples
    of this later, e.g., in the [Lists] chapter.

    Each notation symbol is also associated with a _notation scope_.
    Coq tries to guess what scope is meant from context, so when it
    sees [S(O*O)] it guesses [nat_scope], but when it sees the
    cartesian product (tuple) type [bool*bool] (which we'll see in
    later chapters) it guesses [type_scope]. Occasionally, it is
    necessary to help it out with percent-notation by writing
    [(x*y)%nat], and sometimes in what Coq prints it will use [%nat]
    to indicate what scope a notation is in.

    Notation scopes also apply to numeral notation ([3], [4], [5],
    etc.), so you may sometimes see [0%nat], which means [O] (the
    natural number [0] that we're using in this chapter), or [0%Z],
    which means the integer zero (which comes from a different part of
    the standard library).

    Pro tip: Coq's notation mechanism is not especially powerful.
    Don't expect too much from it! *)

(* ================================================================= *)
(** ** Fixpoints and Structural Recursion (Optional) *)

(** Here is a copy of the definition of addition: *)

Fixpoint plus' (n : nat) (m : nat) : nat :=
  match n with
  | O => m
  | S n' => S (plus' n' m)
  end.

(** When Coq checks this definition, it notes that [plus'] is
    "decreasing on 1st argument." What this means is that we are
    performing a _structural recursion_ over the argument [n] -- i.e.,
    that we make recursive calls only on strictly smaller values of
    [n]. This implies that all calls to [plus'] will eventually
    terminate. Coq demands that some argument of _every_ [Fixpoint]
    definition is "decreasing."

    This requirement is a fundamental feature of Coq's design: In
    particular, it guarantees that every function that can be defined
    in Coq will terminate on all inputs. However, because Coq's
    "decreasing analysis" is not very sophisticated, it is sometimes
    necessary to write functions in slightly unnatural ways.
*)

(** **** Exercise: 2 stars, optional (decreasing) *)
(** To get a concrete sense of this, find a way to write a sensible
    [Fixpoint] definition (of a simple function on numbers, say) that
    _does_ terminate on all inputs, but that Coq will reject because
    of this restriction. (If you choose to turn in this optional
    exercise as part of a homework assignment, make sure you comment
    out your solution so that it doesn't cause Coq to reject the whole
    file!) *)

(* FILL IN HERE *)
(** [] *)

(* ################################################################# *)
(** * Optional Exercises *)

(** **** Exercise: 2 stars, optional (andb_true_elim2) *)
(** Prove the following claim using [rewrite] and [destruct]. *)

Theorem andb_true_elim2 : forall b c : bool,
  andb b c = true -> c = true.
Proof.
  (* FILL IN HERE *) Admitted.
(** [] *)

(** **** Exercise: 1 star, optional (identity_fn_applied_twice) *)
(** Use the tactics you have learned so far to prove the following
    theorem about boolean functions. *)

Theorem identity_fn_applied_twice :
  forall (f : bool -> bool),
  (forall (x : bool), f x = x) ->
  forall (b : bool), f (f b) = b.
Proof.
  (* FILL IN HERE *) Admitted.
(** [] *)

(** **** Exercise: 1 star, optional (negation_fn_applied_twice) *)
(** Now state and prove a theorem [negation_fn_applied_twice] similar
    to the previous one but where the second hypothesis says that the
    function [f] has the property that [f x = negb x]. *)

(* FILL IN HERE *)

(* The [Require Export] statement on the next line tells Coq to use
   the standard library String module. We'll use strings more in
   later chapters, but for the moment we just need syntax for literal
   strings for the grader comments. *)
From Coq Require Export String.

(* Do not modify the following line: *)
Definition manual_grade_for_negation_fn_applied_twice : option (nat*string) := None.
(** [] *)

(** **** Exercise: 3 stars, optional (andb_eq_orb) *)
(** Prove the following theorem. (Hint: This one can be a bit tricky,
    depending on how you approach it.
    You will probably need both [destruct] and [rewrite], but
    destructing everything in sight is not the best way.) *)

Theorem andb_eq_orb :
  forall (b c : bool),
  (andb b c = orb b c) ->
  b = c.
Proof.
  (* FILL IN HERE *) Admitted.
(** [] *)

(** **** Exercise: 3 stars, optional (binary) *)
(** We can generalize our unary representation of natural numbers to
    the more efficient binary representation by treating a binary
    number as a sequence of constructors [A] and [B] (representing 0s
    and 1s), terminated by a [Z]. For comparison, in the unary
    representation, a number is a sequence of [S]s terminated by an
    [O].

    For example:

        decimal    binary              unary
           0       Z                   O
           1       B Z                 S O
           2       A (B Z)             S (S O)
           3       B (B Z)             S (S (S O))
           4       A (A (B Z))         S (S (S (S O)))
           5       B (A (B Z))         S (S (S (S (S O))))
           6       A (B (B Z))         S (S (S (S (S (S O)))))
           7       B (B (B Z))         S (S (S (S (S (S (S O))))))
           8       A (A (A (B Z)))     S (S (S (S (S (S (S (S O)))))))

    Note that the low-order bit is on the left and the high-order bit
    is on the right -- the opposite of the way binary numbers are
    usually written. This choice makes them easier to manipulate. *)

Inductive bin : Type :=
  | Z
  | A (n : bin)
  | B (n : bin).

(** (a) Complete the definitions below of an increment function [incr]
        for binary numbers, and a function [bin_to_nat] to convert
        binary numbers to unary numbers. *)

Fixpoint incr (m:bin) : bin
  (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

Fixpoint bin_to_nat (m:bin) : nat
  (* REPLACE THIS LINE WITH ":= _your_definition_ ." *). Admitted.

(** (b) Write five unit tests [test_bin_incr1], [test_bin_incr2], etc.
        for your increment and binary-to-unary functions. (A "unit
        test" in Coq is a specific [Example] that can be proved with
        just [reflexivity], as we've done for several of our
        definitions.) Notice that incrementing a binary number and
        then converting it to unary should yield the same result as
        first converting it to unary and then incrementing.
*)

(* FILL IN HERE *)
(** [] *)

(** NEW NAME: The next line is a temporary hack to allow
    [zero_nbeq_plus_1] to be used as a synonym for the "more
    up-to-date" (i.e., consistent with the Coq library) name
    [zero_neqb_plus_1]... *)

Notation zero_neqb_plus_1 := zero_nbeq_plus_1 (only parsing).
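
(** A quick sanity check, offered as a sketch rather than as part of
    the original text: since the abbreviation above is declared
    [only parsing], the name [zero_neqb_plus_1] should be accepted
    wherever [zero_nbeq_plus_1] is, while Coq keeps printing the
    older name. This assumes the theorem [zero_nbeq_plus_1] stated
    earlier in the file has been accepted, whether proved or
    [Admitted]. *)

(* Ask Coq for the statement behind the new synonym. *)
Check zero_neqb_plus_1.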