To implement an efficient Dijkstra's algorithm you will need a priority queue, which the JAVA API sadly does not provide. This can be implemented through clever use of a TreeSet/Map and Comparators (the problem with a plain set is that you can't have two elements with the same priority, so you need a way to always break ties). However, since we had the Fibonacci heap in Part 1, everyone has access to at least one working F-heap program.
Note that insertion/deletion from this structure is O(log(n)log(m)), where n is the number of nodes in the graph and m is the degree of the graph (the max number of edges incident to a single vertex). This represents a binary search to find the correct row of the list to find the starting vertex, followed by a binary search to check for existence of the ending vertex.
You'll use the graph to implement shortest path using dijkstra's algorithm. Using your handy Fibonacci Heap, the run time of this algorithm is O(V*lg(V)+E) (where E is the number of edges and V is the number of vertices). It would be nice if you understand where this bound comes from, so I will explain it below, but it suffices that if you implement the algorithm correctly, that is the runtime you will have.
Allow me to sketch the algorithm to explain the running time(I'm not trying to teach the algorithm here- see your book/class/newsgroup/google). Every iteration of this algorithm you are guaranteed to find the correct shortest path to exactly one node- so we know right away there will be V iterations. At each state your graph is split in two sections- the 'solved' section, for which you know the correct distances, and the rest, which have some distance values associated with them which may or may not be accurate- this set is stored in some kind of priority queue. An iteration begins by selecting the node (call it 'N') from this queue with the best distance value, adding it to the 'solved' set, and 'relaxing' all it's edges. Relaxing is when for each node adjacent to N we see if it is faster to get to that node through N than it's current best known path. We update distances and back-pointers appropriately (so we know what the shortest path actually is when we finish), and that ends the round. Note that if a node's distance value is changed, its position in the priority queue has to be fixed up somehow. (this is where the magical fibonnaci heap would come into play, it's got an advantage in this 'fix up' step). One way to do this is just to have multiple copies of the same node in the queue and ignore them when they come back up (this is the approach I will use in the explanation), or else to remove the old value before reinserting it. Either works.
Now, how long does all this take. There are V rounds, and in every round we have to
pull something out of the front of the
priority queue using the F-heap is amortized
.
Rather than try and deal
with how many elements are added to and removed from the queue in any single round, it
is easier to think about how many such operations can occur in the life of the algorithm.
Every single edge in the graph has exactly one opportunity
to add a vertex to the queue (during a relax operation), so there are O(E) possible insertions.
If we allow duplicates, the size of the queue can grow to O(E), so we may treat each
That gives (amortized)
running time for all queue operations
during the life of the algorithm. Together that gives a running time of
O(VlgE+E), which is the required running time of your search.
Please be careful with your implementation details. In particular, do not use a linear time priority queue. This nailed a lot of people last semester. We will test your project on *very* large inputs.