Gregory W. May

Using GPS to Keep Your GIS Database Up-to-Date

As GIS databases begin to mature, emphasis shifts from the collection of new data to the maintenance of existing data. This technical session discusses ways to maintain a GIS database using a GPS system. Topics include identifying and extracting data from the GIS; visiting, updating, and verifying data in the field using GPS; and merging this data back into the GIS.

Introduction

Until recently, GIS developers have focused on obtaining spatial and descriptive data for their systems. Sources of data range from "flat" databases to aerial photography, satellite imagery, and digitizing. Another method that has been gaining popularity in recent years is the collection of position and attribute data using the Global Positioning System (GPS).

Regardless of the source, GIS data can rapidly become out-of-date. For example, new urban and suburban infrastructure is being built all the time, while storms, hurricanes, fires, and floods change or destroy what already exists.

Decision-makers rely on up-to-date GIS data to make evaluations and decisions that affect our lives intimately, from long-term planning of new subdivisions to emergency response strategies. Out-of-date data can lead to incorrect conclusions and possibly life-threatening situations.

To remedy this problem, GIS administrators are increasingly focused on the challenge of keeping their data as current as possible. This paper discusses how GPS can be used to facilitate this process, which is known as data maintenance.

Overview of GPS

The Global Positioning System is a constellation of satellites orbiting the Earth. An Earth-based GPS antenna and receiver can track information from these satellites and calculate the antenna’s position accurately in three dimensions, anywhere on Earth and at any time of the day or night.

Several GPS vendors make systems especially designed for the collection and maintenance of GIS data. These systems are designed to organize GPS positions into GIS-style points, lines, and areas, along with associated attribute information. For example, a GPS data collection system user can quickly and accurately store the coordinates and attributes of new features, such as trees, roads, coastlines, and power poles, or relocate existing objects with known coordinates.

Until May 1, 2000, an intentional degradation of the GPS signal, known as Selective Availability (S/A), meant that an unaugmented GPS position could be inaccurate by up to 100 meters. A process known as differential correction was used to remove errors from the position, enabling GPS receivers to achieve accuracies of 50 cm or better. Now, with the removal of S/A, unaugmented GPS positions can be accurate to 10 meters or better. For most applications, however, differential correction is still required to obtain better accuracy.

For more information on GPS, refer to http://www.spacecom.af.mil/usspace/gps.htm.

The Data Maintenance Cycle

The data maintenance cycle represents repeated phases of revision and expansion that a GIS database must go through to remain up-to-date. The cycle can be broken down into a series of steps as follows. The diagram to the right shows this cycle graphically.

Pre-planning.
Extraction of GIS data into a GPS field system.
Relocation of features in the field, and verification and update of these features (and collection of new features, if required).
Return of the updated data to the GIS.

The length of each data maintenance cycle and the interval between each cycle will depend on the requirements of your particular datasets. For example, in forestry applications, your data may need updating every six months or a year, but in expanding urban and suburban neighborhoods, updates may need to happen every few weeks.

The concepts of data maintenance can also be used to populate a flat information database, such as a spreadsheet, with GPS positions. This flat database can then be converted into a spatial GIS layer while keeping all of the columnar, or attribute, information intact.
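As a rough illustration of the flat-database case, the sketch below joins spreadsheet-style rows to GPS positions by a shared key and produces GeoJSON-style point features. The function name, the "id" key, and the sample values are assumptions made for illustration; they do not come from any particular GPS or GIS product.

```python
def rows_to_point_features(rows, positions, key="id"):
    """Attach a GPS position to each flat-database row, keeping every column.

    rows      -- list of dicts, one per spreadsheet row (hypothetical layout)
    positions -- dict mapping a row's key value to an (x, y) coordinate pair
    Returns (features, missing): GeoJSON-style point features, plus the keys
    of rows that have no GPS position yet.
    """
    features, missing = [], []
    for row in rows:
        xy = positions.get(row[key])
        if xy is None:
            missing.append(row[key])  # still positionless; visit it in the field
            continue
        features.append({
            "type": "Feature",
            "geometry": {"type": "Point", "coordinates": [xy[0], xy[1]]},
            "properties": dict(row),  # all columnar information stays intact
        })
    return features, missing


rows = [{"id": "H1", "color": "red"}, {"id": "H2", "color": "yellow"}]
positions = {"H1": (-105.0, 40.0)}  # invented sample coordinates
features, missing = rows_to_point_features(rows, positions)
```

Rows that come back in the missing list are the ones that still need a field visit before they can appear in the spatial layer.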

Pre-Planning

Before commencing the data maintenance cycle, it is important to identify all of the requirements for your particular project. Communication should be established between the GIS administrators and the GPS data collection coordinators to ensure that all expectations are clearly understood. Some of the questions to be asked are:

1. Which layers of the GIS are to be maintained, and how frequently?
2. Is each layer to be maintained in its entirety, or broken into smaller pieces and split among several field crews?
3. Will the field crews be able to update all the information for a given feature, or should some items be read-only?
4. Is data maintenance limited to verifying the existing features in the GIS, or can new features also be added?
5. Are you planning to add more attribute information to your existing features?
6. Is maintenance status or date/time-visited information for individual features important? If so, create fields in the GIS layers to accommodate this information.
7. Does the data structure include a key or unique ID? In some circumstances this is essential to ensure that data returned from the field can be matched to the original source data.
8. Can the actual positions be modified, or just attribute information?
9. How will the returned data sets be managed and integrated back into the GIS?
10. How will the original data sets be treated? Will they be archived or incorporated as temporal layers for comparison with new data?
11. Is data-locking and multi-user access of your database an issue?

Considering all of these questions in advance can save repeated excursions to the same locations to recollect data. Remember that your data may be in the hands of field crews who are not intimately familiar with the scope and design of your GIS database.

Extraction of GIS Data into a GPS Field System

The first step in a given instance of the data maintenance cycle is to identify and extract a set of GIS data. You should thoroughly understand the scope of the data maintenance cycle for your project.

It is important to consider questions 1 and 2 outlined in the Pre-planning section above. GPS systems often use a small handheld computer to store information while in the field, and these devices generally do not have as much storage space as a workstation. In addition, clearly breaking GIS layers into smaller, manageable pieces gives field crews a much better idea of the limits of their particular region of interest, which helps avoid overlap.

As indicated by questions 3 through 7 in the Pre-planning section, if you intend to add information to your data sets (as opposed to just verifying what is currently there) it is important to modify the structure of the GIS layers before you upload the data to the GPS system. This is because not only the data from your GIS, but the structure of that data as well, will be uploaded to a GPS system. Your data maintenance will not be successful if the data structure does not match the information you intend to gather as you verify and update your GIS features.
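The following is a minimal sketch of adjusting a layer's structure before upload, assuming the structure can be modeled as a list of named fields with a read-only flag. The classes FieldDef and LayerSchema and the sample "poles" layer are hypothetical; an actual GIS or GPS package will have its own way of defining fields.

```python
from dataclasses import dataclass, field


@dataclass
class FieldDef:
    """One attribute column in a layer (hypothetical structure)."""
    name: str
    kind: str                # for example "text", "number", "datetime"
    read_only: bool = False  # True: visible to field crews but not editable


@dataclass
class LayerSchema:
    name: str
    fields: list = field(default_factory=list)

    def add_field(self, new_field):
        """Add a column, refusing duplicate names."""
        if any(f.name == new_field.name for f in self.fields):
            raise ValueError(f"field {new_field.name!r} already exists")
        self.fields.append(new_field)

    def editable_fields(self):
        """Names of the columns a field crew is allowed to change."""
        return [f.name for f in self.fields if not f.read_only]


# Existing structure: one reference-only column and one editable column.
layer = LayerSchema("poles", [
    FieldDef("owner", "text", read_only=True),
    FieldDef("condition", "text"),
])
# New columns added before the layer is uploaded to the GPS system.
layer.add_field(FieldDef("pole_id", "text"))
layer.add_field(FieldDef("visited_at", "datetime"))
```

Making the change on the GIS side first means the uploaded structure already has a place for everything the crews are expected to record.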

Once you have identified the GIS layers to be taken to the field, and modified the structures of those layers as appropriate, it is a relatively simple process to convert those layers into the format of your GPS data collection system and upload them to the field device. In most cases, an import and communications program will be standard with the GPS system.

It is also possible to upload data to a GPS system for reference only. For example, street centerlines may prove valuable to field crews trying to locate features in the field.

Extraction example

In the example below, the applicable question from the Pre-planning section is denoted in parentheses after each statement.

As the GIS administrator for a county, you have decided to update your database of fire hydrants (question 1). Your GIS contains the fire hydrants for your entire county in a single point layer. For each fire hydrant, you have recorded the number of spouts, the make and model of the hydrant, the flow rate, the body color, and the condition.

You intend to assign the update task to eight field crews. The crews will cover roughly equal sections of the county (question 2). Their main task will be to verify and confirm the flow rate and body color for each hydrant, and update the condition if necessary. Items such as the make and model will be available for reference, but will be flagged as read-only (question 3). Recently the county assigned and attached ID plaques to the base of each fire hydrant, so they must also collect this ID (question 5). You also want to store the date and time that each hydrant was visited, so that you can plan future maintenance cycles appropriately. To this end, you add an ID column and a date/time column to your hydrant layer before taking it into the field (question 6).

Because all new hydrants are reported and added to the GIS, you are confident that no new hydrants will be discovered (question 4). However, the position information you have for some of the older hydrants is of dubious quality, so you will be using the GPS system to compare and replace the position of each hydrant if outside a certain tolerance (question 8).
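The tolerance check described above might look something like the sketch below. It assumes both positions are in the same projected coordinate system with units of meters; the function name and the 2-meter tolerance are illustrative choices, not values from this paper.

```python
import math


def resolve_position(gis_xy, gps_xy, tolerance_m):
    """Decide whether the GPS position should replace the stored GIS position.

    Both positions are (x, y) pairs in the same projected coordinate system,
    in meters (an assumption of this sketch). Returns (position, replaced).
    """
    offset = math.hypot(gps_xy[0] - gis_xy[0], gps_xy[1] - gis_xy[1])
    if offset > tolerance_m:
        return gps_xy, True   # stored position is outside tolerance
    return gis_xy, False      # stored position is close enough; keep it


# Invented sample: the stored position is 5 m from the GPS position.
position, replaced = resolve_position((0.0, 0.0), (3.0, 4.0), tolerance_m=2.0)
```

A sensible tolerance depends on the accuracy of the GPS positions themselves, so it should not be set tighter than the accuracy the field equipment can deliver.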

After modifying the structure of the layer, you split it into eight files and upload those files onto eight GPS systems.
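One simple way to divide a point layer into roughly equal pieces is sketched below. It sorts features by their x coordinate and cuts the sorted list into strips, which is only an assumption for illustration; in practice the split might follow district boundaries or road networks instead. The function name and feature layout are hypothetical.

```python
def split_for_crews(features, crew_count):
    """Split point features into crew_count pieces of nearly equal size.

    Features are dicts with "x" and "y" keys (hypothetical layout). Sorting
    by x first gives each crew a west-to-east strip, so the pieces do not
    overlap.
    """
    ordered = sorted(features, key=lambda f: (f["x"], f["y"]))
    base, extra = divmod(len(ordered), crew_count)
    pieces, start = [], 0
    for i in range(crew_count):
        size = base + (1 if i < extra else 0)  # spread any remainder evenly
        pieces.append(ordered[start:start + size])
        start += size
    return pieces


# Invented sample: 20 hydrants split among eight crews.
hydrants = [{"id": n, "x": float(n * 37 % 100), "y": float(n)} for n in range(20)]
pieces = split_for_crews(hydrants, 8)
```

Each piece can then be exported as its own file and uploaded to one GPS system.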

Relocation, Verification, and Update

The sequence of events that a field crew will generally follow is:

Relocate each feature in the field, using GPS positions or by identifying unique attributes.
Verify the attributes and/or positions of the feature.
Make updates to the attributes and/or positions as appropriate, and collect any new information per requirements.
Save the feature and go to the next feature in the list, until all features have been visited.

GPS systems designed for data maintenance generally have comprehensive functions for guiding field crews to each feature in an efficient manner. Often the handheld computer will have a map display showing all features in relationship to each other, overlaid with your current position as computed by the GPS receiver. You may also have background layers visible, such as road centerlines, as a visual aid. Navigation functions give distances and directions from your current position to a given feature, and may give audible warnings as you approach the feature.
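The distance and direction readout of a navigation function can be approximated as in the sketch below. It assumes projected coordinates in meters and measures the bearing clockwise from grid north; the function names and the 10-meter alert radius are hypothetical and are not taken from any specific GPS system.

```python
import math


def distance_and_bearing(current_xy, target_xy):
    """Return (distance in meters, bearing in degrees clockwise from north).

    Assumes both points are (x, y) pairs in a projected coordinate system
    with x pointing east and y pointing north.
    """
    dx = target_xy[0] - current_xy[0]  # offset to the east
    dy = target_xy[1] - current_xy[1]  # offset to the north
    distance = math.hypot(dx, dy)
    bearing = math.degrees(math.atan2(dx, dy)) % 360.0
    return distance, bearing


def within_alert_radius(distance_m, alert_radius_m=10.0):
    """True when the crew is close enough to trigger an approach warning."""
    return distance_m <= alert_radius_m
```

Over the short distances involved in walking to a feature, this flat-plane calculation is usually adequate; geographic (latitude and longitude) coordinates would need a different formula.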

There are two ways to approach relocation of features in the field using a GPS system:

Use the map and navigation displays to guide you from feature to feature. Once you are at a feature, most GPS systems use a point-and-click approach from the map to access the feature’s attributes for verification and update. You can also compare the map position with your GPS position to determine if the map position is inaccurate. This method is good for features that are sparsely located or where position is the only way to determine which feature is which.
Locate each feature manually and search for that feature in the GPS system, using some unique identifier. This method does not use GPS positions to assist in locating the feature. Rather, you are simply visiting a string of features, such as power poles, where you don’t need GPS navigation assistance. At each feature, a search function allows you to locate the correct feature on the handheld and verify its attributes and/or position, or you can simply sort the list of features on proximity to your current location and choose the first one. This method must be used if you are converting a flat database into a GIS layer, that is, if you are adding positions to positionless data, because no map display will be available.
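The two lookups mentioned in the second method, searching by a unique identifier and sorting by proximity, can be sketched as follows. The feature layout (dicts with "id", "x", and "y" keys) and the function names are assumptions for illustration, and the proximity sort only applies to features that already have positions.

```python
import math


def find_by_id(features, feature_id):
    """Return the feature with the given unique ID, or None if absent."""
    for feature in features:
        if feature["id"] == feature_id:
            return feature
    return None


def nearest_first(features, current_xy):
    """Sort features by straight-line distance from the current position."""
    cx, cy = current_xy
    return sorted(
        features,
        key=lambda f: math.hypot(f["x"] - cx, f["y"] - cy),
    )


# Invented sample data: three power poles in projected coordinates (meters).
poles = [
    {"id": "P1", "x": 100.0, "y": 0.0},
    {"id": "P2", "x": 5.0, "y": 5.0},
    {"id": "P3", "x": 0.0, "y": 50.0},
]
closest = nearest_first(poles, (0.0, 0.0))[0]
```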

GPS systems designed for data maintenance will sometimes assign an update status to each feature in the data being updated. This status will normally be one of three values:

Not updated: This value will be assigned by default to data uploaded from a GIS to a GPS system. It indicates that the data on the GPS system is unchanged when compared to the GIS.
Updated: This value will be assigned to originally non-updated features once either their positional or attribute data has been changed. A field crew can use this value to make sure that they do not visit or update a given feature more than once.
New: This value will be assigned to new features collected as a part of the data maintenance process. It indicates that there is no representation of this feature in the GIS. This value is a flag to GIS administrators that the feature should be appended to the GIS layer and not used as a replacement for another feature.
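The three status values and the way they change in the field can be modeled as in the sketch below. The enum, the function names, and the dict-based feature layout are hypothetical. The sketch also assumes that any edit to an uploaded feature marks it as updated, and that a new feature stays new even if it is edited again.

```python
from enum import Enum


class UpdateStatus(Enum):
    NOT_UPDATED = "not updated"  # default for data uploaded from the GIS
    UPDATED = "updated"          # position or attributes changed in the field
    NEW = "new"                  # collected in the field; not yet in the GIS


def apply_edit(feature, **changes):
    """Apply attribute or position changes and adjust the update status."""
    feature.update(changes)
    if feature["status"] is UpdateStatus.NOT_UPDATED:
        feature["status"] = UpdateStatus.UPDATED
    return feature


def still_to_visit(features):
    """Features a crew has not yet touched in this maintenance cycle."""
    return [f for f in features if f["status"] is UpdateStatus.NOT_UPDATED]


# Invented sample: two uploaded hydrants, one of which is then edited.
hydrants = [
    {"id": "H1", "condition": "good", "status": UpdateStatus.NOT_UPDATED},
    {"id": "H2", "condition": "good", "status": UpdateStatus.NOT_UPDATED},
]
apply_edit(hydrants[0], condition="leaking")
```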

Returning Data to the GIS

When the field crews are finished maintaining the GIS data on the GPS system, that data needs to be downloaded from the field computers and reintegrated into the GIS. As with upload when preparing for fieldwork, download is normally handled by the GPS system export and transfer programs.

As with all data being introduced to a GIS, the downloaded data should be held in a temporary location and checked for quality before being integrated back into the main GIS. How this integration is done is up to the GIS administrator, but there are a couple of factors to consider. Again, the appropriate question from the Pre-planning section is referenced in parentheses.

The first factor is the actual mechanics of integrating the returned field data with the original data (question 9). Two ways of doing this are:

Use the "cookie cutter" approach of deleting the original GIS data en masse and replacing it with the data from the field. This is the simplest method of integration.
Use an automated script or macro to examine each feature from the field and take appropriate action depending on the status of the feature. For example, a feature flagged as new should be appended to the end of the GIS layer, and a feature flagged as updated should replace the older instance of that feature in the GIS layer. A feature flagged as not updated should be discarded, as no change was made to that feature in the field. Note that the status flag (updated, not updated, or new) needs to be downloaded with the field data, normally as an attribute of the feature. The GIS administrator may want to have the script discard this attribute before integration with the main GIS data, using it only as a trigger for the script.

Note that a feature-by-feature comparison of this nature requires that each feature have some unique identifier, such as an ID (question 7). This ID is used to match features that have been returned from the field with their original counterparts in the GIS. Without a unique ID the script may not be able to form a match successfully, unless you are able to match on spatial information such as the XY coordinate.
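A script-based integration along these lines might be sketched as follows, assuming each feature is a dict carrying an "id" attribute and a "status" attribute with the values "new", "updated", or "not updated". The function name and data layout are hypothetical; a real script would use whatever scripting interface the GIS provides.

```python
def integrate(gis_layer, field_features):
    """Merge field data into a copy of the GIS layer, driven by status flags.

    Returns (merged, unmatched). unmatched holds features flagged as updated
    whose ID has no counterpart in the GIS, so they can be reviewed by hand.
    """
    merged = [dict(f) for f in gis_layer]             # leave the original intact
    index = {f["id"]: i for i, f in enumerate(merged)}
    unmatched = []
    for feat in field_features:
        # The status flag only triggers the action; it is not stored in the GIS.
        record = {k: v for k, v in feat.items() if k != "status"}
        if feat["status"] == "new":
            merged.append(record)                     # append to the layer
        elif feat["status"] == "updated":
            pos = index.get(record["id"])
            if pos is None:
                unmatched.append(record)              # no match on unique ID
            else:
                merged[pos] = record                  # replace older instance
        # "not updated": discard, since nothing changed in the field
    return merged, unmatched
```

Because the function works on a copy, the original layer remains available for quality checks or archiving until the merged result is accepted.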

Another factor to consider is the treatment of the original GIS layer before integration (question 10). For legal or archival reasons, you may want to keep a backup copy of the original layer.

A third factor is how to update multi-user databases, or databases where records are locked during a transaction (question 11). Choosing a time to reintegrate the data may be as simple as running the process during downtime. If the database is accessible 24 hours a day, as many internet-accessible databases are, the only option may be the second, script-based approach to integration. In this approach, the script must test each feature to see if it is locked, keep track of those that cannot be updated immediately, and retry the update later.
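The retry logic for locked records could be sketched as shown below. The try_update callable is a hypothetical stand-in for whatever the database offers: it is assumed to return False when a record is locked and True when the update succeeds. The number of passes and the wait time are illustrative defaults.

```python
import time


def update_with_retry(features, try_update, max_passes=3, wait_seconds=0.0):
    """Update every feature, retrying those whose records are locked.

    try_update(feature) is a caller-supplied function (an assumption of this
    sketch) that returns True on success and False if the record is locked.
    Returns the features that were still locked after the final pass.
    """
    pending = list(features)
    for attempt in range(max_passes):
        # Keep only the features that could not be updated on this pass.
        pending = [f for f in pending if not try_update(f)]
        if not pending:
            return []
        if attempt < max_passes - 1:
            time.sleep(wait_seconds)  # give other users time to release locks
    return pending
```

Anything returned by the function still needs attention, for example by rerunning the script later or notifying the GIS administrator.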

Conclusion

GPS systems designed for GIS data collection and maintenance are invaluable tools for keeping GIS databases up-to-date. The cycle itself is as simple as uploading GIS data to a field computer, using GPS to relocate features, verifying and updating features (and collecting new features as appropriate), and downloading and integrating the updated data back into the GIS. Careful pre-planning is essential for the successful implementation of a data maintenance cycle.

Gregory W. May
Product Manager, Mapping & GIS Division
Trimble Navigation, Inc.