Mario Field and Ed Meckel

From Start to Finish: Steps in a Comprehensive Technical GIS Implementation (Paper #132)

Abstract

Baltimore County has developed an enterprise GIS program to introduce, promote, and dispense GIS technology to county departments and organizations. As part of this effort, the County is obtaining planimetric, topographic, and cadastral data layers at 1" = 200' scale. To successfully establish this new technology, Baltimore County staff have developed and refined several technical implementation stages, as follows: 1.) Database Development; 2.) Data QA/QC; 3.) Data Preparation; and 4.) Data Loading and Distribution. Baltimore County anticipates future implementation stages, such as application development, ORACLE integration, and data maintenance. To date, these stages are still in the design phase.

I. Introduction

When introducing a Geographic Information System (GIS) in an organization, a sound implementation plan must be in place for success. GIS development involves significant resources, staff, and expertise over a long period of time. Successful development is an evolutionary process, which learns from previous mistakes and incorporates educated and savvy planning (Wiley 1997). This ensures that it is a one-time effort that fulfills organizational expectations. Furthermore, to realize a cost-effective GIS, an organization must understand the GIS implementation process and possess GIS technology knowledge, various levels of GIS expertise, willing participants, and an adequate budget (Somers 1996).

GIS implementation steps occur at various levels. At the administrative level, organizations design planning stages that address broad managerial, administrative, and technical issues. The administrative level examines applications, users, data, and organizational and budget issues, and fits the GIS program into an organization. The stages have been categorized into planning, analysis, design, acquisition and development, and operation and maintenance (Somers 1996).

At the technical level, the general administrative steps are filled in with technical details that generate hardware-, software-, and data-specific parameters. Technical implementation steps are classified into the database design, pilot project, data conversion, GIS procurement, training, and data maintenance stages (Warlass 1996). Others classify them into data creation (data entry, digitizing, and data conversion), coverage processing, QA/QC, and documentation steps (Riggs and Krumm 1996).

GIS database and application requirements determine many of the intricate steps contained in the development stages. Montgomery and Schuch (1993) identify four critical elements that an organization needs to address in GIS development. These are: 1.) data requirements (features, accuracy, software, etc.); 2.) data sources; 3.) database population techniques; and 4.) implementation steps (risk management, political support, realistic schedule, etc.).

This paper discusses Baltimore County's technical implementation plan, consisting of database development, QA/QC, data preparation, and data loading and dissemination stages. It outlines the database requirements and issues integral to Baltimore County's GIS program. The paper does not discuss particular technical details. Instead, it provides a more general technical discussion of the implementation steps and issues.

II. Project Background

In 1994, Baltimore County initiated an enterprise GIS for county departments. Baltimore County Public Works introduced this effort to get more out of less, after the County experienced downsizing in the early 1990s.

In early 1995, the county hired four GIS professionals to coordinate, design, and carry out the county-wide GIS program. The county created a GIS Services Unit in the Office of Central Services. Later, the GIS group was placed into the newly formed Office of Information Technology.

The GIS staff developed database and hardware/software designs. From there, the County selected ArcInfo as the primary GIS software, with ArcView for desktop PCs, and ORACLE as the relational database management system (RDBMS), running on IBM RS/6000 workstations.

For the data conversion project, the county was divided into three geographic phases. The first two phases, mapped first, encompassed the majority of county development and covered all of the public water and sewer service area.

The county contracted out ground control monumentation and aerial photography, using airborne global positioning systems (GPS), for Phase I. The aerial triangulation was completed by midsummer 1995 and the county released the data conversion contract in early fall. The data conversion project collected planimetric, cadastral, and topographic data, and orthophotographs. The County awarded the contract in the fall of 1995 and data capture followed in December. The pilot phase, consisting of two tiles of data, was finished in early spring 1996 and by midsummer, Phase I was fully underway.

As of April 1997, two-thirds of the Phase I data conversion project is complete. The county has released part of this on-line for county departments and organizations. Staff is also working on future implementation stages, including application development and data maintenance.

III. Stage I: Database Development

DATABASE DESIGN

The most important elements that influence the database design of a GIS system are:

Baltimore County stresses that a database should be designed around the supported applications and requisite database components. In addition, the database design is influenced by the commonality of the data sources and maps used by county personnel, and by how county staff use and apply geographic data.

The GIS staff conducted interviews with management, users, and existing support staff from each department to determine their GIS and mapping activities. These interviews focused on their everyday activities, not wishful tasks. In these interviews, the GIS staff performed the tasks that each department was required to complete on a daily basis. The county learned that the data used by these groups spanned departments. For instance, each department: 1.) used data that relied on address information; 2.) was interested in street networks; and 3.) utilized topography to examine elevation of particular sites. Only the way in which the data was applied differed.

Consequently, the county concentrated on the layers that affected most of the departments. These layers (buildings, roads, centerlines, hydrography, and topography) became the most detailed because of their importance to multiple departments.

DATABASE REQUIREMENTS

The development of these layers was application driven and some unique specifications, as follows, were required:

It is important to ensure that the relationship between features is recognized and to produce a database design that considers data consistency, redundancy, and management. For example, Baltimore County carefully identified features that have corresponding arcs, such as buildings, roads, or shorelines. In the county's design, features of one layer that correspond to features in other layers are 'cut out' and coded as null. This design prevents data redundancy and makes data maintenance between layers consistent.

Graphic and annotation requirements were included in the database creation, besides specifications related to layers and features. The county specified and provided symbology specifications to ensure aesthetic presentation of displays and plots.

Annotation placement was designed and developed so that it is readable and interpretable if plotted or displayed at a scale of 1" = 100'. The annotation associated with certain features is: 1.) placed to obscure the minimum amount of annotation between other planimetric features; 2.) splined along linear features; and 3.) marked at least once on each map sheet on which the feature exists.

DATABASE DICTIONARY

The database dictionary is the most important document of the database design and implementation. This documentation includes layer names, attribute table layouts, annotation subclass definitions, feature definitions, and attribute parameters. The database dictionary becomes a working document that the conversion vendor uses to create the database (Chambers 1989). Baltimore County has placed its database dictionary on-line to assist in the QA/QC process by providing input to programs performing automated checks of coverage content and parameters. Applying the database dictionary as part of the QA/QC process ensures accurate and consistent data sets that conform to the database design.

The primary task involved in creating the database dictionary is to identify specific features and match them with appropriate layers. The feature types include:

In the database dictionary, each feature is assigned a four-digit feature code. The first two digits of the feature code designate the layer (e.g., 10 - Control, 20 - Planimetric, 30 - Topography, etc.). The next two digits designate the feature. This convention makes it easy to add other features to the layers.
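The feature code convention can be illustrated with a short Python sketch. It is only an illustration of the two-plus-two digit scheme described above: the three layer prefixes come from the examples in the text, while the function names and the sample code 2005 are hypothetical and do not come from the County's database dictionary.

```python
# Layer prefixes taken from the examples in the text. The complete list lives
# in the County's database dictionary and is not reproduced here.
LAYER_PREFIXES = {"10": "Control", "20": "Planimetric", "30": "Topography"}


def split_feature_code(code):
    """Split a four-digit feature code into (layer prefix, feature number)."""
    text = str(code)
    if len(text) != 4 or not text.isdigit():
        raise ValueError(f"feature code must be four digits, got {code!r}")
    return text[:2], text[2:]


def describe_feature_code(code):
    """Return a readable label for a feature code."""
    layer, feature = split_feature_code(code)
    layer_name = LAYER_PREFIXES.get(layer, "unknown layer")
    return f"{layer_name} / feature {feature}"


# 2005 is a made-up code used only to show the split.
print(describe_feature_code(2005))  # prints "Planimetric / feature 05"
```

Because the layer is carried in the first two digits, a new feature can be added to a layer by assigning the next unused two-digit suffix without disturbing existing codes.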

Figure 1 contains a summary of the feature capture in Baltimore County's database dictionary.

IV. Stage II: Quality Assurance/Quality Control (QA/QC)

The County's QA/QC program, developed in ArcInfo and ArcView using Arc Macro Language (AML), consists of manual, automated, tile interaction, and monitoring routines. The county performs these tasks throughout the data conversion process.

AUTOMATED ROUTINE

The automated routines, consisting of the data report and data integrity tasks, examine the content of the data and perform automated checks of data completeness and usability. To match the data content with documented specifications, the automated portion uses the County's GIS database dictionary. When finished, these routines generate a report documenting data substance and integrity, and flag data discrepancies with the database dictionary.
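The idea of checking delivered data against the database dictionary can be sketched in Python. This is a minimal illustration, not the County's AML routines: the dictionary excerpt, the feature codes in it, and the report layout are all hypothetical.

```python
from collections import Counter

# Hypothetical dictionary excerpt: coverage name -> allowed feature codes.
# The coverage names appear in Figure 1; the codes are invented for the example.
DICTIONARY = {
    "BLDG": {2001, 2002, 2003},
    "ROADS": {2010, 2011},
}


def check_coverage(cover_name, feature_codes, dictionary=DICTIONARY):
    """Compare delivered feature codes with the dictionary and build a report."""
    allowed = dictionary.get(cover_name)
    if allowed is None:
        return {
            "coverage": cover_name,
            "counts": {},
            "discrepancies": [f"{cover_name}: coverage not in dictionary"],
        }
    counts = Counter(feature_codes)
    # Any code that the dictionary does not list for this coverage is flagged.
    discrepancies = [
        f"{cover_name}: code {code} not in dictionary ({n} features)"
        for code, n in sorted(counts.items())
        if code not in allowed
    ]
    return {
        "coverage": cover_name,
        "counts": dict(counts),
        "discrepancies": discrepancies,
    }


report = check_coverage("BLDG", [2001, 2001, 0, 2099])
for line in report["discrepancies"]:
    print(line)
```

The report carries both the content summary (feature counts per code) and the list of discrepancies, mirroring the two outputs described above.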

However, the automated checks capture only part of the potential mistakes. Most of the data quality checks must be performed with a "geographic eye." Certain erroneous spatial patterns and behavior can only be identified in this manner.

MANUAL ROUTINE

Consequently, the manual portion (Visual Inspection), using checkplots and on-screen mapping, inspects visual characteristics, patterns, and content. These inspections address data quality checks that cannot be done in automated routines. Moreover, the visual inspection allows county personnel to utilize the orthophotos for enhanced data integrity and accuracy checks.

The contractor delivers initial checkplots that staff compare to aerial photographs for correct and complete data capture. In addition, the county corrects visible feature coding and capture errors. The checkplots supply an easy mechanism to inspect initial mapping of the county.

After the contractor delivers digital data, staff generate internal checkplots and data integrity reports, and perform on-screen QC analysis. Staff execute more comprehensive checks on the digital data. Besides data capture and coding, the county examines coverage specifications and requirements.

The county considered performing geographic checks on the data, for both horizontal and vertical accuracy. For this check, county surveying crews would go out and use GPS to sample certain locations in the field. Staff would compare these coordinates to identical features in the GIS database and note discrepancies. However, given the county's very precise geodetic control and the lack of available resources, this QC check was not included in the final design.
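Although this check was not adopted, the comparison it describes is simple to express. The Python sketch below assumes each sampled location is an (x, y, z) tuple in feet in the same coordinate system as the GIS data; the point names and the tolerance values are hypothetical, since the paper does not state accuracy thresholds for this check.

```python
import math


def accuracy_discrepancy(gps_point, gis_point):
    """Return (horizontal, vertical) differences between two (x, y, z) points."""
    dx = gis_point[0] - gps_point[0]
    dy = gis_point[1] - gps_point[1]
    horizontal = math.hypot(dx, dy)          # planar distance
    vertical = gis_point[2] - gps_point[2]   # signed elevation difference
    return horizontal, vertical


def flag_discrepancies(pairs, h_tol, v_tol):
    """List the sampled points whose differences exceed either tolerance.

    pairs is a list of (name, gps_point, gis_point); h_tol and v_tol are
    assumed tolerances chosen by the reviewer.
    """
    flagged = []
    for name, gps_point, gis_point in pairs:
        h, v = accuracy_discrepancy(gps_point, gis_point)
        if h > h_tol or abs(v) > v_tol:
            flagged.append((name, round(h, 2), round(v, 2)))
    return flagged


samples = [
    ("MON-1", (0.0, 0.0, 100.0), (3.0, 4.0, 101.0)),
    ("MON-2", (0.0, 0.0, 50.0), (0.5, 0.0, 50.2)),
]
print(flag_discrepancies(samples, h_tol=2.0, v_tol=0.5))
```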

Figure 1: Baltimore County Database Dictionary

200' SCALE ArcInfo COVERAGES IN THE DATABASE

MAP THEME     COVER NAME   MAP FEATURE

Control       CNTRL        Horizontal control points, Vertical control points,
                           H/V control points, Analytical points

Map Layout    INDEX        Mapsheet index (200' scale grid)
Grids         SPGRID       State Plane Coordinate System grid lines,
                           Mapsheet grids

Planimetric   BLDG         Residential, Commercial/Industrial, Institutional
                           buildings, Garages, Other structures, Buildings
                           under construction, Toll booth plazas, Rail
                           stations, Water towers, Storage tanks
              ROADS        Paved roads, Unpaved roads, Curbs, Hidden roads,
                           Unnamed roads, Roads under construction, Paved
                           alleys, Paved parking lots, Driveways,
                           Runway/Taxiways, Bridges, Overpasses, Tunnel
                           portals, Road intersections
              CLINE        Street centerlines, Alley centerlines,
                           Ramp centerlines
              TRANS        Rail lines, Hidden rail lines, Abandoned rail
                           lines, Metro rail, Light rail, Transmission lines,
                           Pipelines
              CULT         Junkyards, Quarries, Gravel/Sand pits, Landfills,
                           Cemeteries, Areas under construction, Power
                           substations, Race tracks
              VEG          Wooded areas, Tree rows, Orchards/Nurseries
              REC          Commercial pools, Golf courses, Baseball
                           diamonds/athletic fields, Tennis courts,
                           Parks/recreation areas, Bike/Hike trails,
                           Playgrounds
              HYDROP       Streams, Rivers, Lakes/Ponds, Reservoirs,
                           SW retention ponds, Bay area, Boat ramps, Piers,
                           Docks, Dams, Drainage connectors, Culverts,
                           Hidden hydrography, Wetlands/swamps
              HYDROL       Bulkheads, Floodwalls/Headwalls
              HYDROC       Hydrologic centerline network
              COMM         Radio towers, Transmission towers,
                           Microwave towers

Topography    TOPO         Index, Intermediate, Depression, Obscured,
                           Hidden contours
              SPOT         Spot elevations, Water surface elevations,
                           Bridge elevations, Rooftop elevations
              DTM          Masspoints and breakpoints

Cadastral     PARCELP      Property parcels, Road ROW parcels, Rail ROW
                           parcels, Utility ROW parcels
              PARCELC      Parcel centroids
              SUBDIV       Subdivision boundaries

Orthoimages   ORTHO        1:2400 Digital orthophoto images

TILE INTERACTION ROUTINE

Since the data is delivered in 4000 by 6000 foot tiles, it undergoes a tile interaction routine. This routine examines the relationship of the GIS data in one tile with pertinent GIS data in neighboring tiles. This ensures that feature coding and capture are consistent across tile boundaries.
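One way to picture the tile interaction check is to compare the arc endpoints that two neighboring tiles place on their shared edge. The Python sketch below is an illustration under stated assumptions, not the County's routine: the input format (coordinate and feature code pairs), the function name, the sample codes, and the matching tolerance are all hypothetical.

```python
def boundary_mismatches(tile_a_ends, tile_b_ends, tolerance=1.0):
    """Compare arc endpoints on the edge shared by two neighboring tiles.

    Each input is a list of ((x, y), feature_code) for arcs that end on the
    shared edge. Returns the tile A endpoints that have no partner in tile B
    within the tolerance, or whose partner carries a different feature code.
    The tolerance value is an assumption made for this example.
    """
    problems = []
    for (ax, ay), a_code in tile_a_ends:
        partner_code = None
        for (bx, by), b_code in tile_b_ends:
            if abs(ax - bx) <= tolerance and abs(ay - by) <= tolerance:
                partner_code = b_code
                break
        if partner_code is None:
            problems.append(((ax, ay), a_code, "no matching feature in neighbor"))
        elif partner_code != a_code:
            problems.append(((ax, ay), a_code, f"neighbor codes it {partner_code}"))
    return problems


# Made-up endpoints along a shared edge at x = 4000.
tile_a = [((4000.0, 100.0), 2010), ((4000.0, 500.0), 2010), ((4000.0, 900.0), 2011)]
tile_b = [((4000.2, 100.0), 2010), ((4000.0, 500.3), 2011)]
for problem in boundary_mismatches(tile_a, tile_b):
    print(problem)
```

In this sample, the first endpoint matches, the second is flagged for inconsistent coding, and the third is flagged for inconsistent capture because the neighbor has no corresponding feature.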

MONITORING ROUTINE

The county documents errors on discrepancy reports. This permits county and contractor personnel to track and resolve data errors throughout the various stages of data review. In addition, these reports provide a documented record of errors and their resolution.

County staff locate discrepancies on the checkplots, mark them with a number, and place the number on a discrepancy form with a detailed description. This prevents the checkplots from being marked up extensively, especially when little white-space exists on a checkplot.

Additional monitoring tasks include QA/QC process sheets, which record the dates on which certain QA/QC tasks are completed, and status maps, which show data conversion status on a tile basis. Status maps give upper management a key guide to the progress of the GIS project, in both spatial and project status terms.

V. Stage III: Data Preparation

After the GIS data completes the QA/QC phase, staff prepare the data for import into the county GIS database. This stage involves tile edgematching, data repair, and final QA/QC tasks. Typically, the data is processed and loaded in manageable blocks of tiles, usually around 16-20, to reduce processing and edit time lags.

TILE EDGEMATCHING

Tile edgematching joins the data layers for a specified number of tiles and ensures that data aligns and connects at tile boundaries. The county uses a one-foot tolerance, the accuracy of the data capture, to snap features along tile boundaries. For each layer, this process first identifies the tile boundaries on which edgematching will occur. Then it performs the edgematch along these boundaries. The result is a workspace containing all of the GIS data layers for a multiple-tiled area of the County.
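The snapping step can be sketched in Python as follows. This is a simplified illustration of tolerance-based snapping, not the ArcInfo edgematching tools the county uses: only the one-foot tolerance comes from the text, and the function name and sample coordinates are hypothetical.

```python
import math

SNAP_TOLERANCE_FT = 1.0  # the one-foot tolerance stated in the text


def snap_to_neighbor(points_a, points_b, tolerance=SNAP_TOLERANCE_FT):
    """Snap each point of tile B onto the nearest tile A point within tolerance.

    Returns (snapped_b, unmatched). Points with no partner inside the
    tolerance are left where they are and reported for manual review.
    """
    snapped, unmatched = [], []
    for bx, by in points_b:
        best, best_distance = None, tolerance
        for ax, ay in points_a:
            distance = math.hypot(ax - bx, ay - by)
            if distance <= best_distance:
                best, best_distance = (ax, ay), distance
        if best is None:
            snapped.append((bx, by))
            unmatched.append((bx, by))
        else:
            snapped.append(best)
    return snapped, unmatched


# Made-up endpoints near a shared tile edge.
edge_a = [(4000.0, 100.0), (4000.0, 500.0)]
edge_b = [(4000.4, 100.3), (4000.0, 502.0)]
print(snap_to_neighbor(edge_a, edge_b))
```

The first tile B point lies about half a foot from its partner and is snapped onto it; the second is two feet away, exceeds the tolerance, and is returned unchanged in the unmatched list.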

DATA REPAIR

The coverages still contain neatlines and possible dangle nodes. Therefore, the county performs data repair and cleanup on the coverages for final data loading. This process entails the following steps:

1.) The county removes interior neatlines and repairs any consequent dangle nodes.

2.) Staff perform noted QC edits on the arcs, remove pseudonodes between arcs with the same feature code, and fix arcs feature-coded as zero.

3.) Staff then build polygon topology for layers, identify and repair label errors, perform noted QC edits, and code newly created null polygons.

4.) For certain layers, the county does additional data-dependent checks (data connectivity and pseudonode existence).

Data repair fixes new errors introduced in the edgematching process. It also completes some minor edits, noted in the QA/QC stage, which were not resolved in previous steps. Correcting data in larger blocks is more manageable than on a tile basis since fewer maps and folders are shuffled around. In addition, edgematched coverages make cross-tile edits easier to complete.
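The dangle node and pseudonode checks in the repair steps can be illustrated with a small Python sketch. It assumes each arc is described by its two end nodes and a feature code; this input format, the node labels, and the sample codes are hypothetical, and the county performs these checks with ArcInfo tools rather than with code like this.

```python
from collections import defaultdict


def classify_nodes(arcs):
    """Find dangle nodes and removable pseudonodes in a list of arcs.

    arcs is a list of (from_node, to_node, feature_code). A dangle node is
    touched by exactly one arc end. A removable pseudonode joins exactly two
    different arcs that carry the same feature code.
    """
    incident = defaultdict(list)
    for arc_id, (start, end, code) in enumerate(arcs):
        incident[start].append((arc_id, code))
        incident[end].append((arc_id, code))

    dangles = sorted(node for node, ends in incident.items() if len(ends) == 1)
    pseudonodes = sorted(
        node
        for node, ends in incident.items()
        if len(ends) == 2
        and ends[0][0] != ends[1][0]   # two distinct arcs, not one closed loop
        and ends[0][1] == ends[1][1]   # same feature code on both arcs
    )
    return dangles, pseudonodes


# Made-up road arcs: A-B and B-C share a code, so node B is a removable pseudonode.
sample_arcs = [("A", "B", 2010), ("B", "C", 2010), ("C", "D", 2011), ("C", "E", 2010)]
print(classify_nodes(sample_arcs))  # prints (['A', 'D', 'E'], ['B'])
```

A node where two arcs with different feature codes meet is deliberately not reported, since the node marks a real change in coding.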

FINAL QA/QC

The data undergoes several final QC tasks. Since errors can be introduced in any of the processing stages, the county generates final QC plots using a feature-code-based color scheme to mark coding inconsistencies. This step is especially useful for high-maintenance layers (such as roads, buildings, and hydrography), which contain numerous features and have a high potential for coding errors.

VI. Stage IV: Data Loading

ARCSTORM DATABASE DEVELOPMENT

ARCSTORM is the County's database management tool because it provides a way of managing multi-user displays and edits on data at the feature level. ARCSTORM contains useful historical tracking and data access management tools. In addition, the data is viewed in the various ArcInfo modules as a seamless data set stored in one location.

The ARCSTORM design is almost as important as the database design. There are many factors that influence the ARCSTORM design. As with the database design, the county determines user needs, data requirements, types of interaction (select, query, display), and data content.

Data interaction and content prove to be the most important factors to consider. These elements directly affect which layers are incorporated into each library. Layers are placed into specific libraries based on their use. Most of the users in Baltimore County will query and display data. Therefore, choosing a library layout and tiling scheme that facilitates quick select and display of the data is important. Placing comparable layers into a single library makes access and management simple. For example, vegetation, recreation, and cultural features are all stored in the same library since they have similar feature density and are often used together for analysis.

The tiling scheme chosen for each library is also an important factor to consider. There are two methods of defining a tiling scheme: an evenly spaced grid based on levels and a grid based on feature density. For instance, the building layer is quite dense in urban areas and sparse in rural areas. To make access to this layer quick and efficient, a density-based tiling scheme is used instead of a level-based tiling scheme.
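The difference between the two schemes can be illustrated with a quadtree-style subdivision, in which a tile is quartered only where it holds too many features. The Python sketch below is a generic illustration of density-based tiling and is not ARCSTORM's own tiling method; the function name, the feature limit, and the sample points are hypothetical.

```python
def density_tiles(points, bounds, max_features, min_size=1.0):
    """Quarter a rectangle until each tile holds at most max_features points.

    bounds is (xmin, ymin, xmax, ymax). Tiles are half-open, so a point that
    lies exactly on the outer xmax or ymax edge is not counted. Returns a list
    of (tile_bounds, feature_count).
    """
    xmin, ymin, xmax, ymax = bounds
    inside = [(x, y) for x, y in points if xmin <= x < xmax and ymin <= y < ymax]
    too_small = (xmax - xmin) <= min_size or (ymax - ymin) <= min_size
    if len(inside) <= max_features or too_small:
        return [(bounds, len(inside))]

    xmid, ymid = (xmin + xmax) / 2, (ymin + ymax) / 2
    quadrants = (
        (xmin, ymin, xmid, ymid),
        (xmid, ymin, xmax, ymid),
        (xmin, ymid, xmid, ymax),
        (xmid, ymid, xmax, ymax),
    )
    tiles = []
    for quadrant in quadrants:
        tiles.extend(density_tiles(inside, quadrant, max_features, min_size))
    return tiles


# Made-up building locations: a dense cluster plus one isolated building.
buildings = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (80, 80)]
for tile, count in density_tiles(buildings, (0, 0, 100, 100), max_features=2):
    print(tile, count)
```

The dense corner ends up covered by many small tiles while the sparse remainder stays in a few large ones, which is the behavior that makes a density-based scheme attractive for a layer such as buildings.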

The county evaluates each layer individually to decide the best tiling scheme. These issues combine to produce Baltimore County's ARCSTORM database layout, as shown in Figure 2.

ARCSTORM DATABASE REQUIREMENTS

ARCSTORM provides the county with another level of quality control on data being incorporated into the database. ARCSTORM requires that all data being loaded in the library meet certain requirements. These requirements, as follows, have been incorporated into the County's QA/QC process:

These requirements must be addressed and corrected, if necessary, before loading data into ARCSTORM. The data will not be loaded into the database if any one of these is not met. ARCSTORM ensures that the data loaded into the database is consistent, spatially accurate, and topologically correct.

Baltimore County's ARCSTORM database is designed so that only one user, with DBA access, can load data into the database. The GIS staff review and release all data before it becomes a part of the main county database. Only the Database Administrator and the QA/QC Coordinator have write access to the database. All other users have read and execute access.

Figure 2: Baltimore County's ARCSTORM Layout

1. INDEX: PHASE1 SPGRID

2. STRCTURE: BLDG

3. ROAD: STREET

4. ADDRESS: CLINE

5. TRANSCOM: TRANSCOMM

6. OPSPACE: REC CULT VEG

7. HYDRO: FACILITY HYDROP HYDROL

8. TOPO: CONTOUR SPOT

9. PROP

VII. Conclusion

To date, Baltimore County has refined and solidified many of the implementation stages discussed in this paper. They were all part of a GIS plan, described on paper. Now, through rigorous development and testing, many technical components are in place and the data conversion effort is in full production mode.

Future implementation steps include application development, data maintenance, and ORACLE integration. Baltimore County has an application development plan that defines the types and order of application development. Data maintenance will also be initiated as more data comes on-line. The county will assign data layers to responsible agencies for updates, yet the data will always pass through the GIS Unit before being placed in the county database. ORACLE integration will become an important step since the county possesses numerous tabular databases that can be linked to the GIS data.

Mario Field, QA/QC Coordinator
Ed Meckel, GIS Database Administrator
Baltimore County Government
Office of Information Technology, GIS Services Unit
400 Washington Avenue, Room 32
Towson, MD 21204
Telephone (410) 887-4963
Fax (410) 820-8024
mfield@co.ba.md.us
emeckel@co.ba.md.us


Bibliography

Chambers, Don. 1989. "Overview of GIS Database Design." ARC News. 11(2) Reprint.

Montgomery, Glenn E. and Harold C. Schuch. 1993. GIS Data Conversion Handbook. GIS World Inc. Fort Collins, Colorado.

Riggs, Matthew H. and Robert J. Krumm. 1996. "Procedures for Implementing a Sensitive GIS Project." Illinois State Geological Survey. http://www.isgs.uiuc.edu/isgshome/html

Somers, Rebecca. 1996. "How to Implement a GIS." Geo/Info Systems. 6(1) pp 18-21.

Warlass, Jeffrey. 1996. "GIS Implementation and Application in Civil Engineering." Brigham Young University.

Wiley, Loy. 1997. "Think Evolution, Not Revolution, For Effective GIS Implementation." GIS World. 10(4) pp 48-51.