The Democratization of GIS and Applications in Community Planning: Development of the GURTIE System



Steven Philip Helfand
shelfand@wam.umd.edu
University of Maryland
April 2004


Special Thanks to Mr. Benjamin Trahan, Co-developer of the GURTIE System



I. Motivation

I am currently a member of an interdisciplinary research team with interests in urban planning and revitalization. This team, the Gemstone Urban Revitalization Team (GURT), has partnered with neighborhoods in East Baltimore with the objective of developing plans for revitalization. This objective, in combination with the developer's interests in data structures and algorithms, suggests that an information technology tool might be effectively applied within the realm of community planning and urban revitalization. However, there are many challenges originating from the lack of access to such technology. Such challenges become immediately apparent, as is evidenced by Ghose, who states that Geographic Information Systems (GIS) have emerged as "an elitist, anti-democratic technology by virtue of its technological complexity and cost" [Ghose2001, 144]. Clearly, there are issues surrounding technology in community planning and urban revitalization that need to be understood and addressed.

II. Background

Within the field of urban planning and revitalization, knowledge is truly a very powerful tool. With knowledge comes the ability to form community plans, the ability to observe neighborhood trends in real time, and the ability to pursue grant money for community projects. Just as importantly, knowledge brings stories about revitalization projects from other communities, and explanations of why they failed or why they succeeded. Armed with such knowledge, communities can pursue their revitalization projects with a much better understanding of how to develop and enrich their communities. Ultimately, such knowledge speeds up the process of revitalization and helps to avoid years wasted on projects that are either useless or leave the community in worse shape than when they started.

Knowledge is also important for another reason. One of the great revelations in the field of urban revitalization over the past several decades is that community projects simply do not work without the cooperation of the community. While the residents of a city often have a keen knowledge of the problems in their community -- crime, illegal waste dumping -- they may not have a perspective on broader trends, or the motivation behind city-sponsored plans. For a city to gain the cooperation of its citizens in revitalization projects, it must be able to convince them of the wisdom of its plans. This requires the residents to have at least a limited understanding of the relevant theories of urban planning, and of the city wide trends that motivate the revitalization plans. Again, knowledge is the key to improving the quality of life within the community.

Disseminating knowledge within a community, however, is an extremely difficult task [Ghose2001, 142]. Communities are generally fairly decentralized, and academic knowledge is scattered. Low-income communities may not have easy access to government planning resources or academic urban planning texts. And often, those that do have such access may be unfamiliar with productive research methodology. However, even working under the assumption that all of these problems are overcome, the language and resources that are generally used to describe urban planning theory or the status of a neighborhood may well be useless to the average community leader. As an example, most any community leader can effectively extract information from a large printed map, but a CD-ROM filled with computer files describing the city would, in the majority of instances, be largely unhelpful. The inherent challenge lies in the fact that community members frequently have little to no experience dealing with data in this format [Ghose2001, 142]. If it were possible for community leaders to more effectively access and utilize existing data related to their neighborhoods, it might become much easier for community leaders to effectively manage and improve their communities' planning processes. It would be easier to inform the members of the community and give them pertinent background.

The obvious question that arises is how community leaders might be able to more effectively utilize existing data. More specifically, one can ask whether there is some effective way of adapting or utilizing information technology in order to develop tools to aid in community planning and revitalization.

III. Design Motivations

Geographic Information Systems in Planning

At first glance, Geographic Information Systems (GIS) technology seems very well suited to community planning. A GIS is essentially a tool that allows for the storage, visualization, and analysis of spatial data. But as is evidenced throughout the works of Ghose and Chrisman, there is more to the GIS paradigm than just a technology [Ghose2001 and Chrisman1999]. Rather, there is an entire realm of political and socio-economic considerations that keep this technology from being accessible or ultimately beneficial to community residents and groups. Chrisman suggests that the definition of a GIS should be as follows: "Organized activity by which people measure and represent geographic phenomena then transform these representations into other forms while interacting with social structures" [Chrisman1999, 166]. Clearly there are social and societal implications associated with the use of GIS. The problem according to Ghose is that people in blighted communities do not have the resources or expertise to effectively access and utilize these systems.

Conversely, city government organizations generally do have the expertise and resources to implement information technology solutions for city organization and planning. The city of Baltimore, Maryland serves as a strong model for how Geographic Information System technology can be effectively used by a city for planning and internal management. Project 5000 is a Baltimore city initiative to acquire and demolish 5000 abandoned housing units in an attempt to promote redevelopment. In order to track properties and mark them as being identified, acquired, or demolished, a GIS system is utilized. The Mayor's Office of Information Technology (MOIT) has developed a large body of GIS data in order to perform the desired mapping. The effect of being able to easily view the target properties and the various stages of the project is dramatic, and suggests the power of data visualization with respect to urban planning and city management.

Input from Community Members

Of paramount importance in the development of a tool is the input from the individuals at whom it is targeted. Evidence suggests that effective visualization of data is key to a community leader. As such, an immediate goal is to create a tool that can effectively visualize the data describing a neighborhood. Simply having access to electronic maps in this form provides community leaders and groups with a useful tool. But it also becomes clear that community leaders would benefit from fairly simple features and functionality beyond mapping of physical attributes such as streets and buildings. For example, being able to store and retrieve contact information or event-based data such as crimes or Citizens-on-Patrol (COP) would be extremely useful for a community leader to identify problem areas or important contacts within the neighborhood. The ideal tool has been described as a personal digital assistant tied to a map of the neighborhood being managed by the leader. The essential point presented here is that the incorporation of non-spatial or fairly simplistic spatial data would result in great functionality and use for community groups and leaders.

Data Sets


No model or tool has the potential to be effective if data is not available for it to store and manipulate. In order to create a model of urban neighborhoods, the question arises immediately as to whether there is data available to describe these neighborhoods. As previously described, the city of Baltimore has developed a wealth of GIS data representing the city. Typical data sets include the street grid centerline, pavement edges, buildings, and land parcels, all partitioned in one-square-mile blocks, with the two-dimensional spatial coordinate system being centered at the city's Washington Monument. In addition to these geometric data sets, the city also has aerial orthophotography describing the same regions. The GIS data files are stored in a standard file format referred to as an Environmental Systems Research Institute (ESRI) shapefile [ESRI1998, 1]. Licensing issues surrounding these data sets greatly complicate the basic architecture required of the model. Essentially, the data is licensed on an individual basis. Due to the large time and monetary resources required to obtain the data sets, the MOIT does not freely distribute these data sets to its constituency. According to MOIT's Bill Ballard, the licensing practices employed are primarily intended to prevent commercial interests from profiting from the city's efforts [Personal communication, 2004]. Nonetheless, the accountability and extreme cost associated with licensing of the MOIT's GIS data make it very difficult for community groups to gain access to it.

Data Licensing


The researcher has obtained an academic license for the use of this data. Such a license allows usage of the data with the following restrictions. It is permissible to create maps and analyses of the data, and to distribute these items to neighborhood groups, whether they be in electronic form or hard copy. However, distributing the data sets to anyone is a violation of the licensing agreement. Thus, it becomes necessary to somehow deploy the desired tool to neighborhood groups and community leaders without distributing data sets.

IV. The GURTIE Tool

The Gemstone Urban Revitalization Team Interactive Environment (GURTIE) is a tool designed to overcome the political, social, and technological barriers that often prevent community leaders and residents from effectively utilizing information technology tools in neighborhood planning and management. GURTIE utilizes traditional GIS spatial data indexing, combined with non-traditional, non-spatial data capabilities for handling things such as contact information and event-based data.



Fig. 1: The GURTIE tool. Displayed is a list of available data sets.

GURTIE Architecture


The GURTIE system utilizes a client-server architecture. Essentially, all of the data is stored and manipulated on one machine, while individual users can access this machine using a software client. This document focuses primarily upon the development methodology and architecture of the data server, the component of the tool developed by the researcher. The benefits of this architecture are numerous, and particularly powerful with respect to the design motivations of the GURTIE tool. Communication between client and server occurs over a TCP/IP network connection, which allows client and server to speak to one another as long as both have Internet access. Utilizing a common protocol for communication is important in ensuring that users of GURTIE will not require excessively expensive or sophisticated hardware.

Though not currently implemented, it is planned that in actual use the client will use some sort of secure authentication scheme in order to connect to the data server. Additionally, the data server only transfers a geometric representation of the GIS data, rather than the actual data sets. As a result, the use of the client-server architecture aids in preserving the terms of the data licensing agreement.

Data Format


As stated previously, Baltimore's GIS files come in standard ESRI shapefile format. However, the developers of GURTIE felt that the ESRI format has certain shortcomings and difficulties associated with its lack of consistency and its internal use of both big-endian and little-endian values [ESRI1998, 2]. As a result, GURTIE uses a format referred to as .gurt files. This is essentially a derivative of the ESRI format, utilizing separate files for record indices, attributes, and geometries. A simple conversion script written in Perl converts ESRI shapefiles to .gurt files.
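The mixed byte order mentioned above can be illustrated with the fixed 100-byte header of a shapefile's main file, as laid out in the ESRI white paper: the file code and file length are big-endian, while the version, shape type, and bounding box are little-endian. The following is a minimal Python sketch of reading that header; it is not the GURTIE conversion script, and the function names and the synthetic header values are assumptions made for illustration.

```python
import struct

def parse_shapefile_header(header: bytes) -> dict:
    """Decode the fixed 100-byte main-file header of an ESRI shapefile."""
    if len(header) < 100:
        raise ValueError("shapefile header must be at least 100 bytes")
    # File code and file length are stored as big-endian 32-bit integers.
    (file_code,) = struct.unpack(">i", header[0:4])
    (file_length_words,) = struct.unpack(">i", header[24:28])
    # Version, shape type, and bounding box are stored little-endian.
    version, shape_type = struct.unpack("<ii", header[28:36])
    xmin, ymin, xmax, ymax = struct.unpack("<4d", header[36:68])
    if file_code != 9994:
        raise ValueError("not a shapefile: unexpected file code")
    return {
        "file_length_bytes": file_length_words * 2,  # length is in 16-bit words
        "version": version,
        "shape_type": shape_type,
        "bbox": (xmin, ymin, xmax, ymax),
    }

def build_example_header() -> bytes:
    """Build a synthetic header (hypothetical values) for demonstration only."""
    return (
        struct.pack(">i", 9994)          # file code, big-endian
        + bytes(20)                      # five unused integers
        + struct.pack(">i", 50)          # file length in 16-bit words, big-endian
        + struct.pack("<ii", 1000, 3)    # version 1000, shape type 3 (PolyLine)
        + struct.pack("<8d", 0.0, 0.0, 5280.0, 5280.0, 0.0, 0.0, 0.0, 0.0)
    )

info = parse_shapefile_header(build_example_header())
```

A reader that assumes a single byte order for the whole header would misread either the file length or the shape type, which is the kind of inconsistency the paragraph above describes.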

The Data Server


The data server, developed in C++, is the heart of the GURTIE system. Located on a central machine controlled by the developers, the server has the ability to load selected data sets, stored and indexed according to attributes described in a configuration file for each set. Data is stored utilizing a scheme referred to as layering. Essentially, each data set sits in memory without any awareness of what other data sets are present. This is a powerful feature, in that it makes no assumptions about which data sets might be loaded at one time. At a more abstract level, the concept of layering allows the definition of a "neighborhood" to be a dynamic entity. When modeling anything, it is necessary to clearly define the item being modeled, in order to ensure that the effort captures the desired properties. By layering, GURTIE makes no assumptions about what data sets are being stored, and thus the working definition of a neighborhood can be limited or expanded to whatever data sets are available.

Layer Structure


It is important that each layer be able to determine the type and structure of the data set that it is to store, through analysis of a configuration file associated with the data set. A data layer should be able to configure itself in order to index the data being stored according to its unique attributes. Essentially, each data layer has access to all of the data structures and indexing tools that might be necessary, and is able to select which tools and structures are necessary for the specific data which it will store.

Every data set, for example, is initially indexed into dictionary structures. This allows all of the data to be read into memory and organized in such a way as to be accessible. For a given layer, each data object has attribute fields. Streets for example have attribute fields containing address range, length, and name. The configuration file for the street grid can specify for example that street data should be initially keyed by street name. As the streets are read into the model, they are stored in a B+ Tree according to the specified key [Shaffer2001, 321]. This allows efficient, logarithmic time access to data objects based on their key. A B+ tree is also optimal in the following manner. In the B+ tree, data objects are only stored in the leaves, which are also associated with one another as an ordered linked list of data objects. When it is necessary to transfer an entire data layer to the client, having this linked list provides a very easy linear time traversal of all the objects in the tree. A regular binary tree traversal would have the same asymptotic time, but in trials proved to be less time efficient.
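The role of the linked leaves can be sketched in a few lines. The Python below models only the leaf level of a B+ tree, bulk-built from sorted records, with a sorted list of first keys standing in for the internal index nodes; it is not a full B+ tree and not the GURTIE implementation. The class names, the `leaf_size` parameter, and the assumption of unique keys are all illustrative.

```python
from bisect import bisect_right

class Leaf:
    """One leaf: sorted keys, parallel values, and a link to the next leaf."""
    def __init__(self, keys, values):
        self.keys = keys
        self.values = values
        self.next = None

class LeafChain:
    """Leaf level of a B+ tree built in bulk; keys are assumed unique."""
    def __init__(self, records, leaf_size=3):
        items = sorted(records, key=lambda kv: kv[0])
        self.leaves = []
        for i in range(0, len(items), leaf_size):
            chunk = items[i:i + leaf_size]
            leaf = Leaf([k for k, _ in chunk], [v for _, v in chunk])
            if self.leaves:
                self.leaves[-1].next = leaf  # link leaves in key order
            self.leaves.append(leaf)
        # First key of each leaf; a stand-in for the internal index nodes.
        self.first_keys = [leaf.keys[0] for leaf in self.leaves]

    def find(self, key):
        """Return the value stored under key, or None if absent."""
        pos = bisect_right(self.first_keys, key) - 1
        if pos < 0:
            return None
        leaf = self.leaves[pos]
        for k, v in zip(leaf.keys, leaf.values):
            if k == key:
                return v
        return None

    def scan(self):
        """Yield every (key, value) in key order by following leaf links."""
        leaf = self.leaves[0] if self.leaves else None
        while leaf is not None:
            yield from zip(leaf.keys, leaf.values)
            leaf = leaf.next

streets = [("Orleans St", 1), ("Monument St", 2), ("Broadway", 3),
           ("Wolfe St", 4), ("Caroline St", 5)]
chain = LeafChain(streets, leaf_size=2)
ordered_names = [name for name, _ in chain.scan()]
```

Transferring a whole layer corresponds to `scan()`, which never revisits an index node, while a keyed lookup corresponds to `find()`.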

Objects are also dynamically assigned identification numbers as they are read in, and are keyed by these numbers as well. The identification numbers are utilized so that each object within the entire model, irrespective of layer, has a unique identification number. This feature is necessary should we ever desire GURTIE to perform analyses that involve interaction of different data layers. These ID numbers are stored in an AVL tree, with pointers to the actual objects, which are still contained in the B+ tree [Cormen2001, 296]. The AVL tree has the benefit of being balanced, which is necessary, as the ID numbers are generated in ascending numeric order. If a more naive structure such as a Binary Search Tree were utilized, the structure would resemble a linked list once insertion was complete, and thus search time would end up being linear instead of logarithmic.
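The effect of ascending keys can be checked directly. The following Python sketch, which is illustrative and not taken from the GURTIE server, inserts the same ascending ID sequence into an unbalanced binary search tree and into an AVL tree and compares the resulting heights. All function names are assumptions.

```python
class Node:
    def __init__(self, key):
        self.key = key
        self.left = None
        self.right = None
        self.height = 1  # height of the subtree rooted at this node

def height(node):
    return node.height if node else 0

def update(node):
    node.height = 1 + max(height(node.left), height(node.right))

def rotate_left(x):
    y = x.right
    x.right, y.left = y.left, x
    update(x)
    update(y)
    return y

def rotate_right(y):
    x = y.left
    y.left, x.right = x.right, y
    update(y)
    update(x)
    return x

def avl_insert(node, key):
    """Insert key and rebalance; returns the new root of this subtree."""
    if node is None:
        return Node(key)
    if key < node.key:
        node.left = avl_insert(node.left, key)
    else:
        node.right = avl_insert(node.right, key)
    update(node)
    balance = height(node.left) - height(node.right)
    if balance > 1:  # left-heavy
        if height(node.left.left) < height(node.left.right):
            node.left = rotate_left(node.left)
        return rotate_right(node)
    if balance < -1:  # right-heavy
        if height(node.right.right) < height(node.right.left):
            node.right = rotate_right(node.right)
        return rotate_left(node)
    return node

def avl_height(keys):
    root = None
    for key in keys:
        root = avl_insert(root, key)
    return height(root)

def plain_bst_height(keys):
    """Height of an unbalanced BST after inserting keys in the given order."""
    root, max_depth = None, 0
    for key in keys:
        if root is None:
            root, max_depth = Node(key), 1
            continue
        node, depth = root, 1
        while True:
            depth += 1
            side = "left" if key < node.key else "right"
            child = getattr(node, side)
            if child is None:
                setattr(node, side, Node(key))
                break
            node = child
        max_depth = max(max_depth, depth)
    return max_depth

ids = range(1, 1001)  # IDs handed out in ascending order
plain, balanced = plain_bst_height(ids), avl_height(ids)
```

With 1000 ascending IDs the unbalanced tree has height 1000, since every new key becomes the right child of the previous one, while the AVL tree stays within its logarithmic height bound.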

A data layer in GURTIE also contains many structures for dealing with various sorts of spatial objects. Every object in an ESRI shapefile contains a bounding box, which is simply a rectangle bound tightly to the extreme points of a spatial object. Each data layer has an R-Tree, a spatial data structure designed to store rectangles [Guttman1984]. This makes an R-tree the ideal structure with which to store these bounding boxes. The R-tree is a strong choice for generic spatial data, in that it is not an overly complex or intricate structure. This results from the fact that the R-Tree makes no further assumptions about the shape of a spatial object beyond the fact that it has a bounding box.
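The bounding-box idea, and the pruning that an R-tree performs with it, can be sketched as follows. This Python fragment is a deliberately flattened, two-level stand-in (groups of entries, each summarized by one enclosing rectangle) rather than a real R-tree with node splitting; the function names and the `groups` layout are assumptions.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float]
Box = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)

def bounding_box(points: Iterable[Point]) -> Box:
    """Tightest axis-aligned rectangle around a shape's vertices."""
    xs, ys = zip(*points)
    return (min(xs), min(ys), max(xs), max(ys))

def boxes_intersect(a: Box, b: Box) -> bool:
    """True when two rectangles overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def enclosing_box(boxes: List[Box]) -> Box:
    """Rectangle covering a group of boxes, as an R-tree node would store."""
    return (min(b[0] for b in boxes), min(b[1] for b in boxes),
            max(b[2] for b in boxes), max(b[3] for b in boxes))

def search(groups, window: Box) -> List[int]:
    """Return ids whose boxes meet the window, skipping groups that cannot."""
    hits = []
    for group in groups:
        if not boxes_intersect(enclosing_box([box for _, box in group]), window):
            continue  # the whole group is pruned with one comparison
        hits.extend(i for i, box in group if boxes_intersect(box, window))
    return hits

street = [(1.0, 5.0), (3.0, 2.0), (2.0, 8.0)]  # vertices of one polyline
box = bounding_box(street)
```

Only the bounding box of each object is consulted, which is why the structure works for any shape type stored in a layer.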

Certain layers have more specific spatial properties, such as the street grid. Curved streets are captured by a geometric construct called a polyline. This is essentially a collection of line segments connected end to end, which can then be interpreted as a linear interpolation of the actual shape of the street. The street grid is actually stored as a collection of street centerlines, with intersections thus being represented geometrically as points. We can view the street grid as a network of vertices and edges, where the vertices are intersections and the edges are streets. In the street grid layer, we might want to store just the intersections. This is done using a PR, or Point Region Quadtree [Shaffer2001, 441]. The PR Quadtree implementation utilized here contains only one copy of each point, as we are making the assumption that there can be only one intersection at a given point. However, it is also important to store the actual street centerlines spatially. Suppose we want to locate the street closest to a given location. In order to accomplish this sort of analysis, we utilize a structure referred to as a first degree Polygonal Map (PM1) Quadtree [Samet1990]. This structure allows GURTIE to capture the spatial location of the streets and their endpoints within the same spatial construct.
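A PR Quadtree that keeps a single copy of each point can be sketched briefly. The Python below is an illustrative version, not the GURTIE code: the class name, the square region described by a center and half-width, and the assumption that all inserted points fall inside that region are choices made for this example. The PM1 Quadtree used for the centerlines is not shown.

```python
class PRQuadtree:
    """Point-region quadtree; each leaf holds at most one point."""

    def __init__(self, cx, cy, half):
        self.cx, self.cy, self.half = cx, cy, half  # square region
        self.point = None
        self.children = None  # None for a leaf, else four subtrees

    def _child_for(self, x, y):
        # Quadrant index: bit 0 is east of center, bit 1 is north of center.
        index = (1 if x >= self.cx else 0) + (2 if y >= self.cy else 0)
        return self.children[index]

    def _split(self):
        h = self.half / 2
        self.children = [
            PRQuadtree(self.cx + dx * h, self.cy + dy * h, h)
            for dy in (-1, 1) for dx in (-1, 1)
        ]

    def insert(self, x, y):
        """Insert a point; return False if that location is already stored."""
        if self.children is not None:
            return self._child_for(x, y).insert(x, y)
        if self.point is None:
            self.point = (x, y)
            return True
        if self.point == (x, y):
            return False  # only one intersection per location
        old = self.point
        self.point = None
        self._split()  # region is divided until the two points separate
        self._child_for(*old).insert(*old)
        return self._child_for(x, y).insert(x, y)

    def contains(self, x, y):
        if self.children is not None:
            return self._child_for(x, y).contains(x, y)
        return self.point == (x, y)

tree = PRQuadtree(0.0, 0.0, 1024.0)
tree.insert(10.0, 10.0)
tree.insert(-5.0, 20.0)
duplicate_accepted = tree.insert(10.0, 10.0)
```

Rejecting the duplicate insert is what lets two streets that end at the same coordinates be recognized as meeting at one intersection.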

This assumption that there exists only one street intersection at a given point is important, as it allows GURTIE to input the street objects and determine adjacency, specifically which intersections a street is incident upon. By comparing the endpoints of a street to those points already in the PR Quadtree, we can easily resolve the issue of adjacency, and use a graph structure to encapsulate this property [Cormen2001, 525]. The graph structure utilized by GURTIE is implemented using an adjacency list, as the graph that we are creating will be extremely sparse. Specifically, each vertex will have a constant number of edges incident upon it, so we need O(n) storage space for the graph, where n is the number of nodes or intersections. Hence, an O(n^2) structure such as an adjacency matrix would be inappropriate.
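The construction of the adjacency list from street endpoints can be sketched as below. In this Python example a dictionary keyed by coordinate pairs stands in for the PR Quadtree lookup, and the record layout `(name, start, end, length)` is a hypothetical simplification of a street object.

```python
from collections import defaultdict

def build_street_graph(streets):
    """Build an adjacency list keyed by intersection coordinates.

    streets: iterable of (name, start_point, end_point, length) records.
    Two streets are adjacent when they share an endpoint exactly.
    """
    graph = defaultdict(list)
    for name, start, end, length in streets:
        # Each street contributes one entry at each of its two endpoints.
        graph[start].append((end, length, name))
        graph[end].append((start, length, name))
    return dict(graph)

streets = [
    ("Orleans St", (0, 0), (100, 0), 100.0),
    ("Broadway", (100, 0), (100, 80), 80.0),
    ("Wolfe St", (0, 0), (0, 80), 80.0),
]
graph = build_street_graph(streets)
total_entries = sum(len(edges) for edges in graph.values())
```

The list holds exactly two entries per street, so storage grows with the number of streets rather than with the square of the number of intersections.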

In addition to these important spatial constructs, GURTIE's data layers also employ data structures for dealing efficiently with non-purely spatial, non-traditional GIS data. Naively, GURTIE employs an R-Tree to store events associated with time intervals. Essentially, it is possible to regard a time event as a rectangle with height one, such that the width of the rectangle represents the duration of the time event. An Interval Tree has been implemented for more effective storage of objects with time ranges, but an interface for this structure has not yet been developed [Cormen2001, 311]. For events with a single time point, a B+ tree can be utilized, as its linked list attribute allows very efficient time range queries. GURTIE is also able to store information such as phone numbers and email addresses, tied to locations in the map. These objects are regarded as points, and can be stored in a PR Quadtree structure allowing for multiple points to be stored at any given spatial location. The attribute fields of contact information objects contain the actual information, while the spatial component is simply the point to which the information is tied.
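The two time-based representations described above can be sketched together. In this Python example, an event with a duration becomes a height-one rectangle that is tested against a query window, and single-time-point events are kept in a sorted list searched with `bisect`, which stands in for walking the linked leaves of a B+ tree. The linear scan over rectangles replaces a real R-tree search, and all names are illustrative assumptions.

```python
from bisect import bisect_left, bisect_right

def event_to_rect(start, end):
    """Encode a time interval as a rectangle (xmin, ymin, xmax, ymax) of height one."""
    return (start, 0.0, end, 1.0)

def rects_overlap(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def events_in_window(events, window_start, window_end):
    """Names of interval events that overlap the window (linear scan)."""
    window = event_to_rect(window_start, window_end)
    return [name for name, start, end in events
            if rects_overlap(event_to_rect(start, end), window)]

def points_in_window(sorted_times, window_start, window_end):
    """Time stamps within the window, found by two binary searches."""
    lo = bisect_left(sorted_times, window_start)
    hi = bisect_right(sorted_times, window_end)
    return sorted_times[lo:hi]

# Times are hours of the day; the events themselves are hypothetical.
events = [("patrol", 18.0, 22.0), ("meeting", 9.0, 10.0), ("cleanup", 21.0, 23.5)]
evening = events_in_window(events, 20.0, 21.5)
reports = points_in_window([3, 8, 15, 21], 5, 15)
```

Because every rectangle has the same vertical extent, the overlap test reduces to a comparison of the two time ranges.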



Fig 2: The GURTIE interface for accessing and creating contact information and event data.

Data Queries


The various data structures employed in each layer allow for a wide variety of queries to be performed. The most basic sort of query is a range query, which is available in every layer. A user can specify a region in a particular layer, either a rectangle or circle. The user can then select an attribute field and range of values for this field. GURTIE will then display all items within the specified region whose attribute matches the desired range. For example, a community leader could utilize this feature to identify all schools within a certain radius of a specified point.
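A range query of this kind combines a spatial filter with an attribute filter. The Python sketch below does both with a linear scan over a small list, where the server would consult a spatial index for the region test; the object layout (`x`, `y`, `attrs`) and the sample records are hypothetical.

```python
import math

def range_query(objects, center, radius, field, low, high):
    """Objects inside a circle whose attribute value lies in [low, high]."""
    cx, cy = center
    results = []
    for obj in objects:
        inside = math.hypot(obj["x"] - cx, obj["y"] - cy) <= radius
        value = obj["attrs"].get(field)
        if inside and value is not None and low <= value <= high:
            results.append(obj)
    return results

# Hypothetical layer of schools with an enrollment attribute.
schools = [
    {"x": 100.0, "y": 100.0, "attrs": {"name": "School A", "enrollment": 420}},
    {"x": 900.0, "y": 900.0, "attrs": {"name": "School B", "enrollment": 380}},
    {"x": 150.0, "y": 120.0, "attrs": {"name": "School C", "enrollment": 90}},
]
nearby = range_query(schools, (120.0, 110.0), 200.0, "enrollment", 0, 1000)
nearby_large = range_query(schools, (120.0, 110.0), 200.0, "enrollment", 300, 1000)
```

Passing a very wide attribute range, as in the first call, reduces the query to the purely spatial case of finding everything within a radius.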



Fig 3: A shortest street route, highlighted in black across the center of the image.

A special sort of range query is available on layers whose objects represent events with time ranges rather than single points. Suppose that GURTIE contains a layer representing crimes in a neighborhood, and a user wants to see all of the crimes occurring within a specific time window. Using either the Interval Tree structure or a B+ Tree of these time stamps, it is possible to efficiently compute and return the answer to this query.

Layers with an adjacency property, such as the street grid, allow us to perform shortest-path queries. Within the street grid, a user of GURTIE can select two points. The data server will then calculate the shortest route between these points and display it for the user. The calculation is performed using Dijkstra's shortest paths algorithm [Cormen2001, 595-601]. The current implementation of this algorithm naively utilizes an array as the priority queue structure, though certainly the running time might be improved with a Min-heap or Fibonacci-heap [Cormen2001, 476]. However, due to the relatively small size of the street grid data set (~5000 intersections), this optimization is not seen as being crucial.
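A shortest-path query with the array-style priority queue described above can be sketched as follows. This Python version is illustrative rather than the server's C++ code: the graph is a dictionary mapping each vertex to a list of `(neighbor, length)` pairs, every vertex is assumed to appear as a key, and the linear `min` scan plays the role of the array-based queue.

```python
import math

def shortest_path(graph, source, target):
    """Dijkstra's algorithm; returns (distance, path) or (inf, []) if unreachable."""
    dist = {v: math.inf for v in graph}
    prev = {}
    dist[source] = 0.0
    unvisited = set(graph)
    while unvisited:
        # Linear scan for the closest unvisited vertex (the "array" queue).
        u = min(unvisited, key=dist.__getitem__)
        if dist[u] == math.inf or u == target:
            break
        unvisited.remove(u)
        for v, length in graph[u]:
            if v in unvisited and dist[u] + length < dist[v]:
                dist[v] = dist[u] + length
                prev[v] = u
    if dist.get(target, math.inf) == math.inf:
        return math.inf, []
    # Walk the predecessor links back from the target.
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    path.reverse()
    return dist[target], path

# Hypothetical four-intersection grid with edge lengths.
grid = {
    "A": [("B", 1), ("C", 4)],
    "B": [("A", 1), ("C", 2), ("D", 5)],
    "C": [("A", 4), ("B", 2), ("D", 1)],
    "D": [("B", 5), ("C", 1)],
}
distance, route = shortest_path(grid, "A", "D")
```

Each pass scans all unvisited vertices, giving quadratic time in the number of intersections; replacing the scan with a binary heap such as Python's `heapq` is the improvement the paragraph alludes to.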

The Client


Though developed by the researcher's colleague Mr. Benjamin Trahan, a brief discussion of the client and its features is necessary in order to fully grasp the abilities of the GURTIE system. The client, written in Java, allows a user to access the data server described above. The user can access a list of all data layers that are available from the server, and then choose to visualize any or all of these data layers. Once these data sets are displayed within the client, it is then possible to perform any of the queries described above. It is also possible for a user to load data files locally, and have them appear as a layer in the model. However, this feature is thus far limited to data visualization, and as such no queries will be available. An interface is currently under development that will allow a user to input contact information, and for the actual contact information data sets to be generated in ESRI Shapefile format by the client. Future development will likely allow for a more limited version of the client that can be accessed via the World Wide Web as a Java Applet.

V. Proposed Uses of GURTIE

The architectures and features associated with the GURTIE system suggest several powerful applications for this tool. On the most basic level, GURTIE can be utilized by community leaders who desire to store, visualize, and analyze basic data sets describing their neighborhoods. But the client-server architecture also provides a powerful set of capabilities for the communication and broadcasting of information. If a community leader desires to broadcast information regarding crimes in their neighborhood, it would suffice to place such data sets on a data server. Community members could then use a web-based version of the client to access this information. GURTIE also has the potential to allow a city to communicate with its community groups.

A notable example of such communication arises from current circumstances within East Baltimore. The city has developed plans for the creation of a broad boulevard along a corridor called Orleans Street that leads from the downtown district to the Johns Hopkins Biomedical Park, currently under development. This project involves the demolition and removal of several blocks of housing. However, affected community leaders and residents have not been provided with details regarding the project, creating a great deal of distrust and tension between the involved neighborhoods and the Mayor's office. Now suppose the Mayor's Office of Information Technology were to run a GURTIE data server containing data files precisely detailing the plans for the Orleans Street boulevard. Community groups could then use a GURTIE client to access these plans and have a clear idea of what the proposed project entails.

VI. Conclusions

The most important theme here is that GIS technology can be molded and adapted in various ways to have extremely relevant uses within community planning and urban revitalization. A client-server architecture can transform a GIS into a tool for effective intra- and inter-community communication, as well as government-community interaction. The researcher believes that the incorporation of an interface and data structures for dealing with both spatial and non-spatial data such as contact information and Citizens on Patrol data is the key to making a GIS-like tool powerful and effective within the desired domain. Part of the power of GURTIE lies in the fact that it is not designed exclusively for the neighborhoods in Baltimore with which the Gemstone Urban Revitalization Team is associated. Rather, it is a tool scaled to function most effectively when dealing with data sets describing a dense urban area of roughly one to four square miles. As a result, GURTIE has the potential to be a very powerful and useful tool in the realm of community and neighborhood planning.

Research Citations:

Baxter, Richard S. Computer and Statistical Techniques for Planners. London: Methuen & Co Ltd, 1976.

Chadwick, George. A Systems View of Planning: Towards a Theory of the Urban and Regional Planning Process. New York: Pergamon Press, 1988.

Chrisman, Nicholas R. What Does 'GIS' Mean? Transactions in GIS, 1999, 3(2): 175-186. Malden, MA: Blackwell Publishers.

Cormen, Thomas H, et al. Introduction to Algorithms. Cambridge, MA: The MIT Press, 2001.

Environmental Systems Research Institute. ESRI Shapefile Technical Description: An ESRI White Paper. http://www.esri.com. ESRI, 1998.

Ghose, Rina. Use Of Information Technology for Community Empowerment: Transforming Geographic Information Systems into Community Information Systems. Transactions in GIS, 2001, 5(2): 141-163. Malden, MA: Blackwell Publishers.

Guttman, Antonin. R-Trees: A Dynamic Index Structure for Spatial Searching. ACM SIGMOD International Conference on Management of Data. Boston, MA, 1984, pp. 47-57.

Iwerks, Glenn S. and Samet, Hanan. Visualization of Dynamic Spatial Data and Query Results Over Time in a GIS Using Animation. VISUAL '00 166-177 Conference Proceedings. Lyon, France, November 2000.

Lee, Sang Yong. An Integrated Model of Land Use/Transportation System Performance. University of Maryland College Park: Ph.D. Thesis, 1995.

Samet, Hanan. Applications of Spatial Data Structures: Computer Graphics, Image Processing, and GIS. Addison-Wesley, Reading, MA, 1990.

Samet, Hanan. The Design and Analysis of Spatial Data Structures. Addison-Wesley, Reading, MA, 1990.

Shaffer, Clifford A. A Practical Introduction to Data Structures and Algorithm Analysis. New York: Prentice Hall, 2001.