DYNAMAPS:   DYNAMIC QUERIES ON CHOROPLETH MAPS

 

Gunjan Dang, Chris North, Human-Computer Interaction Lab,

University of Maryland, College Park MD 20742

Draft 1.0   5/4/2000

 

Introduction:

 

One of the biggest challenges facing organizations that make ever-increasing amounts of data available to users is the need to present that data in a comprehensible form.

The Census Bureau is a perfect example of an organization that first collects an enormous amount of data, and then needs to disseminate this information to the public. It is necessary that this information be represented in a way that is both convenient to visualize and easy to analyze.

 

Most conventional database systems demand that users possess the skills to formulate queries, and presume that users are familiar with the structure of the database and other details. This expectation of end users is hardly reasonable.

 

What is required is a way for users to grasp the content of the data, whether in its entirety or only the portions relevant to them, without great effort. Users might be looking for answers to specific questions, or might be interested in noting trends or exceptions inherent in the data. Studies have shown that representing data in an animated visual form and allowing dynamic user control over it is an efficient way to facilitate comprehension of the data.

 

The Census summary data is mainly organized by region, with each region associated with a large amount of data in the form of attributes. For example, data about counties might typically consist of a list of all 3148 counties of the USA, with related information about each of them such as population, per capita income, distribution of various ethnic groups in that region, population distribution among various age groups, median rent, median value, area, etc. This kind of data is useful for a wide range of applications. For instance, consider a senior citizen looking for a place to settle after she retires, a business considering relocation, or a foreigner interested in learning more about the country.

 

The wide range of applications of the Census data and the broad mix of users it caters to have been the main motivating factors for the creation of Dynamap, a generalized map-based tool built for the Census Bureau to facilitate convenient representation of its data.

 

Related work:

 

The concept of dynamic queries was developed at the Human-Computer Interaction Lab (HCIL) and applied to a number of practical problems. The dynamic query method originated with an HCIL tool called the Dynamic HomeFinder. This tool consisted of a map of DC, with homes displayed as dots on the map. Sliders represented the query graphically: each slider covered the possible range of values of the parameter it controlled, and dragging a slider was equivalent to entering a parameter of the query. The results of the query were displayed by filtering the dots representing the houses in or out. The tool thus gave a visual display of both the query formulation and the real-time update of the results. The main limitation of the Dynamic HomeFinder was that it could not easily be scaled to more sliders or attributes, because it was custom-programmed for a particular database.


 

 


Fig 1: The HomeFinder tool

 

 

 

An improvement over the HomeFinder was the FilmFinder, a tool designed to explore a film database, which generalized dynamic queries to non-spatial databases. It consisted of a starfield display, a double-box range selector for making dynamic queries, and tight coupling among all the components. The FilmFinder still needed to be extended to deal with larger databases and more varied kinds of information.

 

 

 

 


 


Fig 2: The Dynamic FilmFinder Tool

 

The Census web site also presents a number of data access tools, such as the American FactFinder, which supports thematic maps. ESRI ArcView is a popular desktop mapping tool that provides mapping and spatial analysis techniques.

 

 

 

Fig 3: The American FactFinder on the Census website

Other related work includes an interface for data exploration using dynamic queries on a health statistics map, developed for the National Center for Health Statistics. The interface consisted of a choropleth map of the US that allowed dynamic querying on the map to filter out areas in accordance with the parameters of the query. Its restrictions were that only a limited number of queries could be made and that its zooming capability was limited.

 


 


Fig 4: Dynamic Queries on a Health Statistics Map

 

Dynamap borrows some of its features from these tools, especially from the health statistics map. It is, however, a generalized, distributable tool that can be used with any kind of data associated with geographic regions (essentially, the elements of a shapefile). Its main feature is dynamic querying on a choropleth map; its other features are described in detail in the next section. It overcomes some of the limitations of the earlier interfaces: the number of sliders is not limited, the attributes they correspond to are not fixed, and there is no constraint on the size of the database that can be loaded.

 

Dynamap:

 

Dynamap is a generalized map-based tool designed to allow easier viewing and better analysis of map-related Census summary data. The tool handles two kinds of tasks especially well: open-ended exploratory tasks and specific scenario tasks.

 

Interface description:

 

The Dynamap interface starts with a dialog box for loading the shapefile input. Once the file is loaded, the map is displayed, and the attributes of the map elements appear at the side as adjustable slider widgets. Each slider represents the range of values (minimum to maximum) of one attribute. Moving a slider formulates a query, and the map elements are filtered in or out depending on the parameters of the query. The real advantage lies in having multiple sliders: the user can formulate complex queries by adjusting more than one slider and view the results on the map. The tool can also color the map elements according to any of the available attributes.
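The slider behavior described above can be illustrated with a minimal Python sketch. It is not Dynamap's Visual Basic code: the function names `slider_ranges` and `passes` and the sample values are hypothetical, and the sketch assumes that each slider selects a closed interval and that multiple sliders combine conjunctively.

```python
from typing import Dict, List, Tuple

Record = Dict[str, float]


def slider_ranges(records: List[Record]) -> Dict[str, Tuple[float, float]]:
    """Return the (min, max) range of every attribute, one range per slider."""
    ranges: Dict[str, Tuple[float, float]] = {}
    for rec in records:
        for name, value in rec.items():
            lo, hi = ranges.get(name, (value, value))
            ranges[name] = (min(lo, value), max(hi, value))
    return ranges


def passes(rec: Record, sliders: Dict[str, Tuple[float, float]]) -> bool:
    """A region stays visible only if every slider interval contains its value."""
    return all(lo <= rec[name] <= hi for name, (lo, hi) in sliders.items())


# Hypothetical sample data, not real Census figures.
regions = {
    "A": {"MEDIAN_RENT": 300.0, "AGE_65_UP": 50.0},
    "B": {"MEDIAN_RENT": 450.0, "AGE_65_UP": 120.0},
    "C": {"MEDIAN_RENT": 700.0, "AGE_65_UP": 90.0},
}

# Initial slider extents, derived from the loaded attribute table.
full_ranges = slider_ranges(list(regions.values()))

# Two sliders adjusted by the user form one composite query.
sliders = {"MEDIAN_RENT": (300.0, 500.0), "AGE_65_UP": (80.0, 120.0)}
visible = sorted(name for name, rec in regions.items() if passes(rec, sliders))
```

In this sketch, adjusting a second slider simply adds one more interval test, which is why the number of sliders does not need to be fixed in advance.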

 

For example, consider a senior citizen, about to retire, who is looking for a suitable location to move to. One of her primary concerns could be to find places with low median rent. The map in Dynamap can be colored according to this attribute, as shown.

 

 

 

Fig 6: The Dynamap loaded with the States Map colored according to the attribute Median Rent

 

The darker regions have lower values of median rent, and the lighter ones have higher values.
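One way to produce such a shading is a linear mapping from the attribute value to a gray level. The sketch below is an assumption for illustration only: the function `shade`, the linear scale, and the 0 to 255 gray range are hypothetical, and the text does not state which color scale Dynamap actually uses.

```python
def shade(value: float, lo: float, hi: float) -> int:
    """Map a value in [lo, hi] to a gray level from 0 (dark) to 255 (light).

    Low values come out dark and high values light, matching the
    convention described for the median rent map.
    """
    if hi == lo:
        return 128  # degenerate range: use a mid gray for every region
    t = (value - lo) / (hi - lo)
    t = min(1.0, max(0.0, t))  # clamp values that fall outside the range
    return int(255 * t)


# Hypothetical median rent values, not real Census figures.
rents = {"A": 300.0, "B": 450.0, "C": 700.0}
lo, hi = min(rents.values()), max(rents.values())
levels = {name: shade(value, lo, hi) for name, value in rents.items()}
```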

 

Besides low median rent, the user might also want to live in a place with more people of her age group. She could then use Dynamap to find the states with a larger number of people in that age group. All she needs to do is adjust the slider to filter for regions with large values of the attribute “AGE_65_UP”, as shown in the next figure.

 

 

 

Fig 7: Dynamap showing the combined results of coloring by MEDIAN RENT and filtering by the attribute AGE_65_UP.

 

 

The regions that are grayed out are the ones that fall outside the range specified by the slider. The ones that remain colored by the attribute are the ones still under consideration.

 

 

As shown, the search has been narrowed down to a few states. Suppose she prefers to live closer to the East Coast; we then zoom into the selected state. The tool supports zooming and panning to observe data patterns in smaller or denser regions. Refresh and coloring are much faster when the user has zoomed into a particular region. Clicking on a region displays all of its details in the text box at the bottom right-hand corner.

 

 

 

 

Fig 8: Zooming into the selected state and clicking to view other details in the text box.

 

 

An additional feature of Dynamap comes into play when there are multiple datasets in one folder. In that case, a second map, when loaded, is zoomed by the same factor as the previous one. So if the user now wants to decide which county of her selected state to move to, all she has to do is load the county map into the tool. The attributes of the county map then load as widgets on the side and can be manipulated to further explore county-specific patterns and data.

 

 

 

 

                       

 

 

 

 

Fig 6: The Counties map opens at the same scale as the States map, with multiple dynamic queries and coloring by attribute on the county map.

 

 

 

Besides handling polygonal geographic regions, Dynamap can also handle map elements of other types, such as lines or points on a map. The next figure shows the Highways map of the US, where we can explore the patterns and note that most of the long highways are towards the Central and Western parts of the country, and that most of the highways in the Eastern region are divided. Similarly, Figure 8 tells us that New Mexico is the state in which most females above age 65 reside. Hence this tool, besides providing a way to get specific information, also makes casual exploration of data more interesting, owing to the animation that the sliders lend to the dynamic querying process.

 

 


                       

                                    Fig 8: Display of map-elements that are in the form of lines

 

                         

                                    Fig 9: Display of map-elements in the form of points

 


Design and Implementation:

 

Dynamap has been built using ESRI MapObjects 2.0 and Visual Basic 6.0. The tool loads map data from a shapefile workspace. A shapefile consists of the following: a main file (.shp) containing GIS spatial vector data, an index file (.shx), and a dBASE file (.dbf) containing the attributes of the map elements.

 

The ESRI MapObjects software provides a lot of functionality, which can be easily exploited from VB. However, MapObjects focuses mainly on static representation of map data. Extending this functionality to support an efficient dynamic representation was the major challenge in this project.

 

The first task we faced was forming a composite query every time a slider moved, quickly enough that the update seemed dynamic.

The naïve algorithm simply read all the slider values whenever any slider was scrolled and then formed the query. However, this method was unacceptably slow. So an event was added to the slider widget to indicate that it had been touched. The new algorithm worked as follows: every time a slider was touched for the first time, all the sliders that had been touched were read to form the query. When the same slider was scrolled further, only the part of the query relating to that particular slider was recomputed. This showed considerable improvement in performance.
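The incremental query formulation can be sketched as follows. This is an illustrative Python model, not the original Visual Basic code: the class `QueryBuilder`, the `read_slider` callback, and the SQL-like clause text are hypothetical, and the `reads` counter exists only to show how many slider reads each event costs.

```python
from typing import Callable, Dict, List, Optional, Tuple


class QueryBuilder:
    """Builds a composite range query, re-reading as few sliders as possible."""

    def __init__(self, read_slider: Callable[[str], Tuple[int, int]]) -> None:
        self.read_slider = read_slider       # hypothetical widget accessor
        self.touched: List[str] = []         # sliders the user has moved so far
        self.clauses: Dict[str, str] = {}    # cached clause text per slider
        self.active: Optional[str] = None    # slider currently being scrolled
        self.reads = 0                       # slider reads, for illustration

    def _clause(self, attr: str) -> str:
        self.reads += 1
        lo, hi = self.read_slider(attr)
        return f"({attr} >= {lo} AND {attr} <= {hi})"

    def on_scroll(self, attr: str) -> str:
        if attr != self.active:
            # First touch of this slider: read every touched slider once.
            if attr not in self.touched:
                self.touched.append(attr)
            for name in self.touched:
                self.clauses[name] = self._clause(name)
            self.active = attr
        else:
            # Same slider scrolled further: recompute only its own clause.
            self.clauses[attr] = self._clause(attr)
        return " AND ".join(self.clauses[name] for name in self.touched)


# Hypothetical slider positions standing in for the real widgets.
positions = {"MEDIAN_RENT": (300, 500), "AGE_65_UP": (80, 120)}
builder = QueryBuilder(lambda attr: positions[attr])

q1 = builder.on_scroll("MEDIAN_RENT")   # first touch: one slider read
positions["MEDIAN_RENT"] = (300, 450)
q2 = builder.on_scroll("MEDIAN_RENT")   # continued scroll: one slider read
q3 = builder.on_scroll("AGE_65_UP")     # new slider: both touched sliders read
```

The saving comes from the continued-scroll branch, which is by far the most frequent event during a drag: its cost stays constant no matter how many sliders take part in the query.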

 

With the efficient query formulation algorithm in place, the only remaining performance bottleneck was the map coloring and refresh mechanism. The need to color the map fast enough to keep it in sync with the slider motion, and so maintain the semblance of a dynamic action, led us to try several algorithms, each improving upon the previous one. The first used two map layers: a gray layer to indicate the items that had been filtered out, and a yellow layer on top of it that corresponded to the results of the query. The query acted only on the top (yellow) layer, but all the layers needed to be redrawn, which made the map display flash and seem a little sluggish.

 

The optimization used to improve this algorithm was to refresh and redraw only what was necessary instead of the entire map. To implement this, the two map layers were used so that one acted as a positive query and the other as a negative query. The positive query selected the elements to be displayed on the map, and the negative query selected the elements filtered out, so that together the two layers covered all the elements of the map. The negative query is simply the complement of the positive query. Now, whenever the user tightened the query (moved sliders inwards), only the negative query layer needed to be redrawn, and whenever the query was loosened (sliders moved outwards), only the positive query layer was redrawn. This led to a significant improvement in performance.
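The redraw rule for the two layers can be sketched with plain sets of element identifiers. The function names below are hypothetical, and the final branch (redrawing both layers when a change neither purely tightens nor purely loosens the query) is an assumption added for completeness; the text describes only the two pure cases.

```python
from typing import AbstractSet, Set


def negative_query(all_ids: AbstractSet[str], positive: AbstractSet[str]) -> Set[str]:
    """The negative layer is the complement of the positive layer."""
    return set(all_ids) - set(positive)


def layers_to_redraw(prev_positive: AbstractSet[str],
                     new_positive: AbstractSet[str]) -> Set[str]:
    """Decide which layer(s) must be repainted after a slider change."""
    if new_positive == prev_positive:
        return set()                      # nothing changed on screen
    if new_positive < prev_positive:
        return {"negative"}               # tightened: more elements turn gray
    if new_positive > prev_positive:
        return {"positive"}               # loosened: elements come back
    return {"positive", "negative"}       # mixed change (assumed fallback)


# Hypothetical element identifiers.
all_ids = {"MD", "VA", "NM", "FL"}
before = {"MD", "VA", "NM"}
tightened = layers_to_redraw(before, {"MD", "VA"})
loosened = layers_to_redraw(before, {"MD", "VA", "NM", "FL"})
gray = negative_query(all_ids, {"MD", "VA"})
```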

 

This approach pointed us to a further optimization, namely the computation of a differential query. Every time a slider is moved, a query is created that computes the difference between the previous view of the map and the current view. This query lists only those elements that were just filtered (or unfiltered) by the most recent slider change. The change is simply incorporated into the corresponding layer, which is then redrawn. On every pan or zoom, however, the entire map needs to be refreshed. The coloring-by-attribute implementation also has to be coordinated with the filtering, so that if the user first colors by an attribute and then filters, the elements remain colored by that attribute unless they have been filtered out (in which case they are gray).
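A differential query amounts to two set differences between the previous and the current selection. The following Python sketch models the idea with hypothetical names (`differential`, `Layers`, `apply`); in Dynamap itself the corresponding work is done through queries on the map layers, which this sketch does not reproduce.

```python
from dataclasses import dataclass, field
from typing import AbstractSet, Set, Tuple


def differential(prev: AbstractSet[str],
                 new: AbstractSet[str]) -> Tuple[Set[str], Set[str]]:
    """Return (just filtered out, just restored) for one slider change."""
    return set(prev) - set(new), set(new) - set(prev)


@dataclass
class Layers:
    positive: Set[str] = field(default_factory=set)   # elements shown in color
    negative: Set[str] = field(default_factory=set)   # elements shown in gray

    def apply(self, filtered_out: Set[str], restored: Set[str]) -> None:
        """Move only the changed elements between the two layers."""
        self.positive -= filtered_out
        self.negative |= filtered_out
        self.negative -= restored
        self.positive |= restored


# Hypothetical element identifiers.
layers = Layers(positive={"MD", "VA", "NM"}, negative={"FL"})
out, back = differential(layers.positive, {"MD", "VA"})
layers.apply(out, back)
```

Only the elements in `out` or `back` need repainting, so the cost of an update depends on how many elements changed rather than on the size of the map.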

 

 

Performance:

 

The goal of a dynamic interface is that the results appear in sync with the change in the query. Dynamap showed a considerable improvement in performance with the new algorithms. With the query optimization in place but the naïve redraw algorithm, which manipulated just the top layer of the map, the update speed increased but the display flashed too much. After the algorithm was changed to use two layers, one for the positive query and one for the negative query, the flashing disappeared and the update speed doubled. With the final optimization in place, which required only one layer to be redrawn, updates on the States map (all 51 states visible) were in real time, i.e., in complete sync with the slider movement. With the Counties map (over 3000 counties), however, the update speed drops to about one update per second. Dynamap gives near-real-time performance for up to almost 500 counties. These measurements were made on a Gateway PC with a 450 MHz Pentium II processor and 128 MB of RAM.

 

Limitations and Future Work:

 

As Dynamap moves closer to becoming a complete working tool, there are a number of enhancements that could make it more practical and efficient.

One of the immediate things to work on is the ability to add more data (in the form of attributes associated with the geographic elements) from other sources, e.g., an Excel database. Since ESRI MapObjects 2.0 supports this functionality, this task is achievable. Other possible enhancements include adding excentric labeling to Dynamap, adding a search feature, providing better legends, enabling multi-select, and displaying details of selected regions in a table instead of a text box. Further refining the algorithms to make the update process even faster would of course be a plus. It would also be of practical advantage to make this tool portable to the Web.

           

Usability studies need to be conducted to get feedback on the effectiveness of the tool, i.e., to ascertain whether Dynamap actually delivers what it promises.

 

Conclusions:

 

Dynamap appears to be a promising tool and might prove to be yet another success story for the dynamic querying approach. It also provides flexibility that most of the previous applications lack. It seems to have enough potential to assist in the Census Bureau’s gargantuan task of presenting data conveniently to the general public. We are working to put Dynamap on the Census CD-ROM alongside the data, to make the data more intelligible to the end user. There is also motivation to put it on the Census website, where it can be accessed by an even greater cross section of the population.

Acknowledgements:

 

This project is supported by the Census Bureau.