On this page:
8.1 Insertion Sort
8.17

8 Sorting🔗

    8.1 Insertion Sort

We now turn to a study of functional algorithms. The next example will focus on sorting, beginning with an examination of the formal specification of sorting.

We’ll be working on our familiar lists.

Dafny

datatype List<X> = Nil | Cons(head: X, tail: List<X>)

function Length<T>(xs: List<T>): nat {
  match xs
  case Nil => 0
  case Cons(_, tail) => 1 + Length(tail)
}

function Append<X>(xs: List<X>, ys: List<X>): List<X>
  ensures Length(Append(xs, ys)) == Length(xs) + Length(ys)
{
  match xs
  case Nil => ys
  case Cons(x, tail) => Cons(x, Append(tail, ys))
}

predicate Member<X(==)>(x: X, xs: List<X>) {
  match xs
  case Nil => false
  case Cons(y, xs') => x == y || Member(x, xs')
}

function At<X>(xs: List<X>, i: nat): X
  requires i < Length(xs)
{
  if i == 0 then xs.head else At(xs.tail, i - 1)
}

In general, we want to define sorting in a generic way—that is, for lists of any type. We can achieve this by using an lt function, as introduced in the previous lecture. However, to keep our attention on the algorithms rather than the properties of lt, we will, for now, restrict ourselves to integer orderings.

type IntList = List<int>

To check whether a list is sorted, we must compare elements in pairs. One way to do this is as follows.

Dafny

predicate Ordered(xs: IntList) {
  match xs
  case Nil => true
  case Cons(_, Nil) => true
  case Cons(x, Cons(y, _)) => x <= y && Ordered(xs.tail)
}

Because match cases are evaluated in order, we can write the same function more concisely as follows:

Dafny

predicate Ordered'(xs: IntList) {
  match xs
  case Cons(x, Cons(y, _)) => x <= y && Ordered'(xs.tail)
  case _ => true
}

When writing a specification, it’s possible to make mistakes—just as we can when writing a program. One way to check a specification is to prove certain properties about it, ensuring it behaves as expected. Here are two such properties:

Dafny

// The smallest element of a list is at its head
lemma HeadIsSmallest(xs: IntList, y: int)
  requires Ordered(xs) && xs.Cons?
  requires Member(y, xs)
  ensures xs.head <= y
{
}

More generally, we can say that any two elements in a list are ordered according to their positions.

Dafny

lemma AllOrdered(xs: IntList, i: nat, j: nat)
  requires Ordered(xs) && i <= j < Length(xs)
  ensures At(xs, i) <= At(xs, j)
{
  if i != 0 {
    // let's move closer to the elements i and j
    AllOrdered(xs.tail, i - 1, j - 1);
  } else if i == j {
    // easy
  } else {
    AllOrdered(xs.tail, 0, j - 1);
  }
}

A function isn’t a sorting function simply because it returns an ordered list. A true sorting function must return exactly the same elements it was given, only rearranged into order. There are several ways to formalize this “same elements” property. One common approach is to compare the counts of each possible element, requiring that the input and output lists match for every value.

Dafny

function Count<X(==)>(xs: List<X>, k: X): nat {
  match xs
  case Nil => 0
  case Cons(x, tail) => (if k == x then 1 else 0) + Count(tail, k)
}

8.1 Insertion Sort🔗

With the notions of Ordered and Count, we can now formally define what it means for a function to be a sorting function. As a first example, let’s implement the classic insertion sort algorithm.

Dafny

function InsertionSort(xs: IntList): IntList {
  match xs
  case Nil => Nil
  case Cons(x, tail) =>
    var sortedTail := InsertionSort(tail);
    Insert(x, sortedTail)
}

Dafny

function Insert(x: int, xs: IntList): IntList {
  match xs
  case Nil => Cons(x, Nil)
  case Cons(y, tail) =>
    if x <= y then Cons(x, xs) else Cons(y, Insert(x, tail))
}

Notice that Insert is defined for any IntList, not only for sorted ones. However, we will only apply it to sorted lists. Therefore, when proving properties about Insert, we will restrict our lemmas to that case.

To prove that InsertionSort is correct, we must show two things: it produces an ordered list (using Ordered), and it preserves the number of elements (using Count).

Here is the lemma for the first of those two tasks:

Dafny

lemma InsertionSortOrdered(xs: IntList)
  ensures Ordered(InsertionSort(xs))
{
  match xs
  case Nil =>
    // easy
  case Cons(x, tail) =>
    // When functions get complicated, it can give us peace of mind
    // to start by writing down what the function we're trying to
    // prove something about does.
    var sortedTail := InsertionSort(tail);
    var result := Insert(x, sortedTail);
    assert result == InsertionSort(xs);
    // this assertion makes sure we copied the previous two lines
    // correctly

    // We can start writing out the proof in detail. But if we
    // think the situation looks pretty simple, we can start by
    // calling the induction hypothesis in a way analogous to
    // how the function is used.
    InsertionSortOrdered(tail);

    // This was not enough. Not too surprising, actually. We'll need
    // to know something about what Insert does. We'll define a
    // lemma InsertOrdered (below) and call it from here.
    InsertOrdered(x, sortedTail);
}

Dafny

lemma InsertOrdered(x: int, xs: IntList)
  requires Ordered(xs)
  ensures Ordered(Insert(x, xs))
{
  // This proof is completed by automatic induction.
}

Here is a lemma for the second task, showing that InsertionSort preserves element counts. The easiest way to write such a lemma is to parameterize it by an arbitrary element.

Dafny

lemma InsertionSortPreservesCounts(xs: IntList, k: int)
  ensures Count(xs, k) == Count(InsertionSort(xs), k)
{
  match xs
  case Nil =>
  case Cons(x, tail) =>
    var sortedTail := InsertionSort(tail);
    var result := Insert(x, sortedTail);
    assert result == InsertionSort(xs);

    calc {
      Count(InsertionSort(xs), k);
      Count(Insert(x, sortedTail), k);
      { InsertCount(x, sortedTail, k); }
      (if k == x then 1 else 0) + Count(sortedTail, k);
      { InsertionSortPreservesCounts(tail, k); }
      (if k == x then 1 else 0) + Count(tail, k);
      Count(xs, k);
    }
}


lemma InsertCount(x: int, xs: IntList, k: int)
  ensures Count(Insert(x, xs), k) == (if k == x then 1 else 0)
                                     + Count(xs, k)
{
}

This concludes the proof of correctness for InsertionSort, which has a time complexity of O(n²). In class, we focused exclusively on Insertion Sort. If you’re curious, try to write and verify MergeSort, which has a more efficient time complexity of O(n log n).