next up previous
Next: Part 1 Command Specification Up: Part 1: Crash Course Previous: Java- last part, comparators

Adjacency List

The only data structure actually have to implement for project 1 is and adjacency list. You'll recall that an adjacency is conceptually a 'list of lists', ie:
A->C->D
C
D
F->K->J
H->K
J
K
You should implement this with TreeSets, TreeMaps, Comparators, and any other java classes you find handy. (You of course may not use an existing graph class if one already exists. You are also admonished to not use HashMaps/Sets over TreeMap/Sets.)

      
      To implement
      dijkstra's algorithm you will also need a priority queue which the Java
      api sadly does not provide.  This can be implemented through clever use of a 
      TreeSet/Map and Comparators(the problem with a plain set is that you can't
      have two elements with the same priority, so you need a way to always break
      ties).  


      Note that insertion/deletion from this 
      structure is O(log(n)log(m)), where n is the number of nodes in the graph and m is the 
      degree of the graph (the max number of edges incident to a single vertex).  This represents
      a binary search to find the correct row of the list to find the starting vertex, followed
      by a a binary search to check for existance of the ending vertex.  

              
      You'll use the graph to implement shortest path using dijkstra's algorithm.  If you have
      a fibbonacci heap handy the run time of this algorithm is O(V*lg(V)+E) 
      (where E is the number of edges and V is the number of vertices).  And a fibonacci heap
      is what you ask?  Well, I have no idea, but those of you who go on to take algorithms
      classes will no doubt hear about fibonacci heaps again, and I just wanted to be the first
      to share.  Let's just say that they are magical, and that you will probably *never* learn how
      they work.  If you've printed this spec on paper, feel free to X is this paragraph to avoid 
      reading it ever again.  End digression.  


      So anyway, you will be required to do shortest path in O(ElgV).  It would be nice if you
      understand where this bound comes from, so I will explain it below, 
      but it suffices that if you implement the algorithm correctly, that is the runtime you
      will have.


      Allow me to sketch the algorithm to explain the running time(I'm not trying to 
      teach the algorithm here- see your book/class/newsgroup/google).  
      Every iteration of this
      algorithm you are guaranteed to find the correct shortest path to exactly one 
      node- so we know right away there will be V iterations.  At each state your graph is 
      split in two sections- the 'solved' section, for which you know the correct distances,
      and the rest, which have some distance values associated with them which may or may not
      be accurate- this set is stored in some kind of priority queue. An iteration begins by selecting the node (call it 'N')
      from this 
      queue with the best distance value, adding it to the 'solved' set, and 'relaxing' all it's
      edges.  Relaxing is when for each node adjacent to N we see if it is faster  
      to get to that node through
      N than it's current best known path.  We update distances and back-pointers 
      appropriately (so we know what
      the shortest path actually is when we finish), and that ends the round.  Note
      that if a node's distance value is changed, its position in the priority queue has to be 
      fixed up somehow. (this is where the magical fibonnaci heap would come into play, it's got an 
      advantage in this 'fix up' step).  One way to do this is just to have multiple copies
      of the same node in the queue and ignore them when they come back up (this is the approach
      I will use in the explanation), or else to remove the old value before reinserting it. Either works.


      Now, how long does all this take.  There are V rounds, and in every round we have to 
      pull something out of the front of the 
      priority queue- which with a treeMap is an O(lgE) operation, 
      so each round takes O(VlgE) time.  Rather than try and deal 
      with how many elements are added to and removed from the queue in any single round, it 
      is easier to think about how many such operations can occur in the life of the algorithm.
      Every single edge in the graph has exactly one opportunity
      to add a vertex to the queue (during a relax operation), so there are O(E) possible insertions.
      If we allow duplicates, the size of the queue can grow to O(E), so we may treat each 
      queue operation as O(log(E)).  That gives O(ElogE) running time for all queue operations
      during the life of the algorithm.  Together that gives a running time of O(VlgE+ElgE), which
      since E is bounded by $V^2$, is O(ElgV).  And that's the required running time of your search;)


      Please be careful with your implementation details.  In particular, do not use 
      a linear time priority queue.  This nailed a lot of people last semester.  I will time your
      project on *very* large inputs.  Check out:

http://www.cs.umd.edu/class/spring2003/cmsc420/p2graphs/

              
      If you scroll down you'll runtimes for the large shortest path test from last semester.
      A negative time indicates that the test was failed.  The bottom third of students  
      timed out.  They may have found the right answers eventually, but all the same
      they got no credit for that test.  The middle third did not find the correct paths. Only
      one third of the class got the correct output in the 60 second time limit, and you can see
      that most of them were much slower than the best solutions.  Lots of people like to 
      blame java for the slowness of their own code, but before  you start to pick on 
      java you need to make sure you implement your code intelligently ;) That is all, 
      you've been warned.

    
next up previous
Next: Part 1 Command Specification Up: Part 1: Crash Course Previous: Java- last part, comparators
Brian Krznarich 2003-05-24