Nearest Neighbor Algorithm

The nearest neighbor query ranks all objects in terms of their distance from a query object which can be one of the following types:

  1. Point: Specified by positioning the mouse over the point and clicking it with the left button.
  2. Rectangle: Specified by positioning the mouse over a point that serves as one vertex, clicking it with the left button, and then dragging the mouse to the position of the diagonally opposite vertex, and releasing.
  3. Polygon: Specified by positioning the mouse over a point that serves as one vertex, clicking it with the left button, and then repeatedly drag and click, until reaching a position for the final vertex at which a click operation is performed with the right button and the polygon is closed by the implicit creation of an edge joining it with the first vertex.
  4. Path: Specified by positioning the mouse over a point that serves as one vertex, clicking it with the left button, and then repeatedly drag and click, until reaching a position for the final vertex at which a click operation is performed with the right button.
  5. Sector: Specified by positioning the mouse over a point that serves as the root of the sector, clicking it with the left button, using the middle button to set the starting angle, and using the right button to input the extent of the sector.
The objects are displayed in the order of their distance from the query object along with their position in the ranking.

The neighboring objects are found in an incremental manner. In other words, having found the k nearest neighbors, in order to find the k+1st nearest neighbor, the algorithm does not recompute the set of k+1 nearest neighbors; it just finds the additional neighbor. The incremental nearest neighbor algorithm (see G. R. Hjaltason and H. Samet, Ranking in spatial databases in Advances in Spatial Databases - 4th Symposium, SSD'95, M. J. Egenhofer and J. R. Herring, Eds., Lecture Notes in Computer Science 951, Springer-Verlag, Berlin, 1995, 83-95; G. R. Hjaltason and H. Samet, Distance browsing in spatial databases, ACM Transactions on Database Systems, 24(2):265-318, June 1999; and H. Samet, Foundations of Multidimensional and Metric Data Structures, Morgan-Kaufmann, San Francisco, pp. 490-501) makes use of a priority queue where the queue elements are the blocks of the underlying data structure as well as the objects themselves. The priority queue is ordered on the basis of the distance of its elements from the location of the query object which is a point in our implementation. In case of a tie between two spatial objects (i.e., two non-block objects have the same distance from query point p ) and if the distance is zero and if the object has extent and area (i.e., a rectangle), then use the distance from p to the nearest boundary of an object that contains p (if such an object exists) as the discriminator for the ordering.

The algorithm works in a top-down manner in the sense that as elements are removed from the queue, they are checked if they correspond to blocks that are not at the lowest level of the hierarchy (i.e., nonleaf nodes). If this is the case, then their immediate descendants (i.e., the sons) are inserted in the queue ordered according to their distance from the query object. Otherwise, the objects that they contain are inserted into the queue ordered according to their distance from the query object. If the element e that has been removed from the queue is a data object, then e is reported as the next nearest neighbor of the query object.

In order to be able to visualize the behavior of the incremental nearest neighbor algorithm, at any instance of time we distinguish between the following entities by using different colors in a consistent way for all of the data structures and data types:

  1. Blocks in the priority queue denoted by light blue.
  2. Objects in the priority queue denoted by green.
  3. Objects that have not yet been processed (i.e., entered explicitly into the queue or output into the ranking) denoted by red.
  4. Objects that have been processed and hence have been ranked denoted by blue. The numeric position of the object in the ranking is displayed in blue next to the object, unless the "Blend" option was selected at the time at which the query object was specified in which case the ranking is displayed in colors ranging from red to green.
  5. Blocks that have been processed (although their objects may still be in the queue) denoted by gray.
  6. The next item in the queue to be processed (could be a block or an object) denoted by yellow.
  7. The query object denoted by orange.