There are three procedures involved in HeapSort: Heapify, Buildheap, and HeapSort itself, which calls the other two. Buildheap also calls Heapify, so Heapify is key to the everything.
First, though, we need to know what a heap is. A word of caution: ``heap'' is used to mean more than one thing in computer science; the use we will make of it is the original one.
A heap is a special data structure with two key properties, both related to (binary) trees. A picture will help:
level
a 0 2^0 = 1 node
b c 1 2^1 = 2 nodes
d e f g 2 2^2 = 4 nodes
h i j k l 2 5 nodes
Here we have a binary tree of some height
The second property has to do with the relation between each parent
node and its children. For this we need each node to have associated
with it a numerical value. That is, each node can be thought of as a
location with a stored value. The standard way to do this is to form
an array with one location for each node, and the node's value being
stored in that array location. In the example there are 12 nodes, so
we'll need an array
of length 12. The nodes can be thought of as the
array locations, named by the indices
, and their values
are then the data items in
.
If we use the letters a,b,...,l as above for the node values, then the array is this:
----------------------------------
A: a b c d e f g h i j k l
----------------------------------
1 2 3 4 5 6 7 8 9 10 11 12
where we have indicated the positions below the array. Thus the root
value, a, is in location 1, and the leaves are in locations 7 through
12. The root is always in location 1, and the leaves are always in the
rightmost array locations (though of course not necessarily in
locations 7 through 12).
Now at last we can state the parent-child property: the value stored in a leaf node cannot be larger than the value stored at its parent node. (This is for so-called max-heaps; the opposite relation gives a min-heap; unless we say otherwise, we will always mean max-heap when we talk of a heap).
So, a heap is a complex structure satisfying the shape property and
the parent-child property. It is, in a sense, both a tree and an array
at the same time. Typically it is represented in a computer as an
array (how might one store a tree in a computer?) together with
procedures for finding the children and parent of any node. There are
very nice formulas for this: the left child of any parent node
is simply at location
; and if there is a right child,
it is at location
. Similarly, the parent of any child node
is at location
.
Note that the left and right subtrees below the root of a heap, are also heaps. Now at last we describe Heapsort, beginning with Heapify.
Given a heap, the root value is automatically the maximum value, due to the parent-child property. This is the key to using heaps for sorting, as well as other applications. We can take out the root (max) value and store it safely somewhere. This reduces the heapsize (number of nodes), but leaves the root empty; if we replace it with some other value then we might no longer have a heap. But it would almost be a heap (we might call it a near-heap) with the only ``trouble'' being at the top (ie, at the root) That is, the left and right subtrees below the root would still be heaps. So we need a way to ``fix up trouble at the top''. That is the job of Heapify, which checks the parent-child property at the root, moves the larger child up if need be, then recurses to lower levels.
HEAPIFY(A,i) 1 left <- left(i) 2 right <- right(i) 3 if node i has a larger child 4 then swap A[i] and largest child 5 HEAPIFY(A,location largest child was in)
This is fine, but how do we get a heap (or near-heap) to start with? This is the job of BuildHeap, which calls Heapify again and again starting at the lowest parent (rightmost in the array) to make sure that that little piece (parent and children) is a heap and working up to the actual root fixing things as it goes.
BUILDHEAP(A) 1 heapsize <- length(A) 2 for i <- floor(length(A)/2) downto 1 **lowest parent up to root** 3 do HEAPIFY(A,i)
Finally, Heapsort works as follows: it calls Buildheap once, to get things started in heap form. Then it swaps the root value and the lowest (rightmost) value, which puts the max value at the end of the array where it belongs; and the heapsize is decremented so that the max value is no longer considered as part of the heap. Then it calls Heapify at the top (root) to fix things up into a heap again; which means the root value is again the (new) max of the remaining heap values. So again the max is swapped out, and so on until the heap has just one node left.
HEAPSORT(A) 1 BUILDHEAP(A) 2 for i <- length(A) downto 2 **start at bottom and work up** 3 do swap A[1] and A[i] 4 heapsize <- heapsize - 1 5 HEAPIFY(A,1)
Here is an example: sort the array 4 1 3 5 9. Here is the corresponding tree (not yet a heap, though):
4
1 3
5 9
Now we run through the steps of Heapsort:
4 1* 3 5 9 the original array; 5 and 9 are children of the lowest parent (*) 4 9 3 5 1 Buildheap has called Heapify at lowest parent, swapping 9 and 1 9 4 3 5 1 Heapify is called at the next higher parent (happens to be the root) 9 5 3 4 1 Heapify recurses to where larger child was; Buildheap is now finished 1 5 3 4 (9) swap root and lowest value, and take it off the (near-)heap 5 1 3 4 (9) Heapify begins work at root 5 4 3 1 (9) Heapify recurses lower 1 4 3 (5 9) swap root and lowest value, and take it off the (near-)heap 4 1 3 (5 9) Heapify works at root and stops since no lower parents 3 1 (4 5 9) swap root and lowest value, and take it off the (near-)heap 3 1 (4 5 9) Heapify works at root, nothing to do 1 (3 4 5 9) swap root and lowest value, and take it off the (near-)heap 1 3 4 5 9 end of execution
Drawing the tree diagrams for the above steps is a very helpful, and highly recommended, exercise!
Now we turn to the runtime analysis of Heapsort, and in particular, to
the number of data comparisons it makes, in the worst case, for an
-item array.
We begin with Heapify, which is where all the data comparisons are done, in line 3. That line decides which is largest of A[i] and its children (if it has children). The worst case occurs when it has two children, and it then takes two comparisons to find the max of the three values. So each execution of line 3 of Heapify will make at most two comparisons.
Moreover, Heapify is recursive, so the number
of comparisons it makes on
items is
where
is
the number of items in the recursive call. But that call is to the
sub-heap rooted at the location where the large child was; and the
worst case will occur when that subheap is as large as possible. How
many items
can a subheap of a heap with
items have? It is easy
to see that the largest
can be is
; this occurs when the
bottom level is exactly half full, with all its nodes in the left
subheap:
a
b c
d e f g
h i j k * * * *
For then the tree breaks into three parts that are almost exactly
equal: the right subheap, the left subheap minus the bottom level, and
the bottom level. This is because in a binary tree, the maximum
number possible of nodes down to level
We then have the recurrence
, and
.
Solving by the Master Theorem, we get
; on a closer analysis the constant turns out to be 2, i.e.,
.
Now we turn to the runtime for Buildheap. Since the only comparisons
Heapify does are in its
calls to Heapify, then it does at most
comparisons. In fact, since most of those calls involve
only subheaps with fewer then
elements, one can get a tighter
upper bound, namely
; but we will not need this fact.
Finally, the data comparisons done by Heapsort consist of those done
by its one call to Buildheap (with
comparisons), together
with those done by Heapsort's own
calls to Heapify (each of
which makes
comparisons), for a total of
We will find out soon that
is the best possible
lower bound on the number of comparisons a so-called comparison-based
sorting algorithm must make in the worst case. Since Heapsort does that
well, then it is in some sens as good as it can be.
Another application of heaps is priority queues. A (max-)priority queue is a data structure with nodes and values, operations for removing the node with the maximum value, and inserting a new node and value. Uses of priority queues include job-scheduling, where a node represents a job waiting to be executed, and its value represents its priority among other waiting jobs, where the highest-valued node is the next to be removed from the queue and then executed, and possible new jobs are inserted.
Heaps are a common way to implement a priority queue. As an example, here is pseudocode for removing the max node:
REMOVEMAX(A) 1 maxval <- A[1] 2 A[1] <- A[heapsize] **last value is placed at root** 3 heapsize <- heapsize - 1 4 HEAPIFY(A,1) 5 return maxval