Larry Wolfson and Katherine Douglas

Addressing Addressing - The Science of Eclectic Interpolation

You’ve got a great road network. The streets that you’ve spent months digitizing from the latest aerials have been GPS-verified to +/- 25 feet. It puts the TIGER (Topologically Integrated Geographic Encoding and Referencing) cover to shame, making missing arcs, curves, and attributes obvious. There’s only one problem. You can’t find half the addresses for your voters with any degree of reliability, so you have to fall back on the old flat files to get your work done. The Assessor, Planning and Development, and the Fire and Sheriff’s Departments are asking, "What use is this GIS if it can’t improve the response time that our customers demand?"

Accurate geocoding depends not only on an accurate road base but also on an accurate address database. We have found at least nine databases with address ranges at our disposal: 911; elections; postal; TIGER; GDT; BLR; assessor; planning; and Maricopa Association of Governments data. All bear some resemblance to their real-world counterparts. All contain data that disagree with one another.
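To make the dependence on both the road base and the address ranges concrete, here is a minimal sketch of address-range interpolation along a single straight segment. The `Segment` fields and the `interpolate` function are hypothetical names chosen for illustration, not the schema of any database named above, and the sketch ignores left/right (odd/even) parity and curved geometry.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Segment:
    """A street segment with one address range (hypothetical schema)."""
    name: str
    from_addr: int              # house number at the start node
    to_addr: int                # house number at the end node
    start: Tuple[float, float]  # (x, y) of the start node
    end: Tuple[float, float]    # (x, y) of the end node


def interpolate(seg: Segment, house_number: int) -> Optional[Tuple[float, float]]:
    """Place a house number on the segment by linear interpolation.

    Returns None when the number falls outside the segment's range.
    """
    lo, hi = sorted((seg.from_addr, seg.to_addr))
    if not lo <= house_number <= hi:
        return None
    span = seg.to_addr - seg.from_addr
    # A one-address range has no span; place it at the midpoint.
    t = 0.5 if span == 0 else (house_number - seg.from_addr) / span
    x = seg.start[0] + t * (seg.end[0] - seg.start[0])
    y = seg.start[1] + t * (seg.end[1] - seg.start[1])
    return (x, y)
```

If either the geometry or the range is wrong, the interpolated point is wrong, which is why a spatially accurate network with poor address ranges still geocodes badly.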

As one of the largest (9,226 square miles), most rapidly growing counties in the nation, with 33 separate political jurisdictions (including county, state, federal, 25 city, and 5 Indian community), we feel that lessons we have learned may be instructive for other rapidly growing counties and/or jurisdictions.


The Problem

One of the most challenging problems in any Geographic Information System is to provide an accurate and addressable road network. A computer-aided dispatcher needs to direct a 911 call to an officer in the field; a permit inspector needs to find a new building site; the human resources office needs to find viable carpool corridors; the health department needs to track a potential epidemic; a constituent needs to know where to vote. These are just a small sampling of daily geocoding uses. But how do you create and maintain a reliable base road network in the first place? How do you deal with the changing address ranges and naming conventions as you cross the borders in a multi-jurisdictional environment? And when can you say that you are confident in your road network while 45,000 new addresses are being added yearly?

The Process

The first instinct in GIS is to seek out existing data in the hope that it will be accurate enough to suit your business needs. For Maricopa County, we had a base network developed in cooperation with the Arizona DOT. Though it didn't have address information, it did have some name attribution and, due to its stereo pair digital origins, was spatially accurate to +/- 25 feet. We began updates through annual aerial photography, spot checks of digitizing accuracy through GPS intersection verification, and attribute updates from our operations field crews. This process was slow and laborious, and its results were outdated the day after the aerials were flown.

The county was growing at an incredible rate, and new roads along existing alignments were being given new names in an environment where independent jurisdictions were not set up to share and synchronize data. The local association of governments, the Maricopa Association of Governments (MAG), continued to maintain its own alignments, based on old DIME files, and updated this line work with interpolated locations based on new building permits. Though short on geographical accuracy, this road network was rich in attribution, especially addresses. So we conflated the attributes from the MAG road network onto our own, matching over 80% of the arcs from one network to the other.
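The paper does not describe the conflation routines themselves, so the following is only a rough sketch of the general idea: copy attributes from an attribute-rich network onto a spatially accurate one and report the match rate. Matching arcs by nearest midpoint within a tolerance is an assumption made for illustration, and the dictionary keys (`start`, `end`, `attrs`) are hypothetical.

```python
import math


def midpoint(arc):
    """Midpoint of an arc treated as a straight line between its end nodes."""
    (x1, y1), (x2, y2) = arc["start"], arc["end"]
    return ((x1 + x2) / 2, (y1 + y2) / 2)


def conflate(target_arcs, source_arcs, tolerance):
    """Copy attributes onto each target arc from the nearest source arc.

    A source arc qualifies only if its midpoint lies within `tolerance`
    of the target arc's midpoint. Returns the fraction of target arcs
    that received attributes (the match rate).
    """
    matched = 0
    for tgt in target_arcs:
        tx, ty = midpoint(tgt)
        best, best_dist = None, tolerance
        for src in source_arcs:
            sx, sy = midpoint(src)
            dist = math.hypot(tx - sx, ty - sy)
            if dist <= best_dist:
                best, best_dist = src, dist
        if best is not None:
            tgt["attrs"] = dict(best["attrs"])  # copy, do not share
            matched += 1
    return matched / len(target_arcs) if target_arcs else 0.0
```

Target arcs with no source arc inside the tolerance are left without attributes, which is one way arcs end up as mismatches that need manual review.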

The conflation routines added their own peculiar errors into the updated network. Where new intersections split an arc, its attributes were duplicated rather than recalculated. Many of the "to" and "from" nodes became reversed. New roads that didn't exist in the latest, yet already outdated, aerial images were dropped as mismatches.
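The first error type above, duplicated rather than recalculated attributes on a split arc, can be illustrated with a small sketch. It assumes a single-parity address range (all odd or all even) and divides the addresses in proportion to where the split falls along the arc; `split_range` and its parameters are hypothetical names, not one of the county's routines.

```python
def split_range(from_addr, to_addr, fraction):
    """Split an address range where a new intersection divides an arc.

    `fraction` is the split position along the arc, measured from the
    from-node (0 < fraction < 1). Returns two (from, to) ranges that
    keep the original direction and parity and do not overlap.
    """
    if not 0.0 < fraction < 1.0:
        raise ValueError("fraction must be strictly between 0 and 1")
    step = 2 if to_addr >= from_addr else -2
    count = abs(to_addr - from_addr) // 2 + 1  # addresses in the range
    if count < 2:
        # Only one address: nothing to divide, both halves keep it.
        return (from_addr, to_addr), (from_addr, to_addr)
    # Give the first half its proportional share, at least one address,
    # and leave at least one for the second half.
    first = max(1, min(count - 1, round(count * fraction)))
    first_to = from_addr + step * (first - 1)
    second_from = from_addr + step * first
    return (from_addr, first_to), (second_from, to_addr)
```

For example, splitting the range 100 to 198 at the arc's midpoint yields 100 to 148 and 150 to 198, instead of two arcs that both claim 100 to 198.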

In order to allow multi-user editing on the road network, we chose to use ArcStorm. While we felt this was the best tool in the UNIX environment, allowing us to roll back to a transaction in case of error, the platform migration from UNIX to NT brought out many undocumented "features".

In order to "address" the problems produced in the conflation routine as well as the problems produced by ArcStorm "features", and to provide continual update capability, we had to create semi-automatic routines to detect errors and conflicts in both the spatial and the attribute integrity of the road network. This has been accomplished through error-checking AMLs run on ArcStorm arc, section, route, and node "copyouts", as well as SQL data integrity routines that check for mismeasures such as gaps, overlaps, and duplicate measures (please refer to the Bo Guo/Maricopa County database presentation at this year's Esri conference). The routines produce lists of conflicts, which in the majority of cases are then hand scrubbed. This portion of the network improvement has proven to be non-automatable due to the unique nature of each error instance.
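As an illustration of the kind of mismeasure check described above, the sketch below flags gaps, overlaps, and duplicate measures among the sections of one route. It is written in Python rather than AML or SQL, and the tuple layout and the `check_measures` name are assumptions for the example, not the county's actual routines.

```python
def check_measures(sections, tol=0.001):
    """Report measure conflicts between neighboring sections of a route.

    `sections` is a list of (section_id, from_measure, to_measure).
    Sections are sorted by measure and each is compared with the next,
    so only conflicts between neighbors in that order are reported.
    Returns a list of (kind, id_a, id_b) tuples.
    """
    problems = []
    ordered = sorted(sections, key=lambda s: (s[1], s[2]))
    for prev, cur in zip(ordered, ordered[1:]):
        same_from = abs(cur[1] - prev[1]) <= tol
        same_to = abs(cur[2] - prev[2]) <= tol
        if same_from and same_to:
            problems.append(("duplicate", prev[0], cur[0]))
        elif cur[1] - prev[2] > tol:
            problems.append(("gap", prev[0], cur[0]))
        elif prev[2] - cur[1] > tol:
            problems.append(("overlap", prev[0], cur[0]))
    return problems
```

A check like this only produces the conflict list; deciding which section is right remains the hand-scrubbing step.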

The final scrubbed network is then converted for use in a MapObjects/Visual Basic application, where existing field data can be compared against implied GIS data through yet another error-checking routine. The listed differences between theoretical and existing arc measurement and directionality are then resolved, and the road network is edited accordingly and rechecked in the next "copyout".

Continued Impedance to Success

In becoming a jurisdiction, an area puts up an invisible boundary. In doing so, it adds a new wall of bureaucracy as a natural variable. Business processes don't always consider the sharing of data between agencies a priority. Phoenix, the major city within our borders, is not necessarily the historical choice of other cities for centering their addressing schema. With 33 jurisdictions operating within our own boundary, we have at least 7 different addressing origins. Thus we recognize the need to include a jurisdictional zone field and/or zip left/zip right fields. These fields accommodate the fact that our boundaries hold up to 14 instances of the same road name, prefix, and suffix combination. Roads change names as they pass from one jurisdiction into another and then to another. The Assessor's record of a road name may be entirely different from how the road is signed or how the postal service uses it. Some roads are of limited access and were never intended to be addressable. Some, like the interstates, use mile marker fractions as their addressing schema. Circular roads create multiple intersections of the same designation, distinguishable only by route measure or directional differentiation. Available data sets carry conflicting attributes, fields, and geometry depending on their source. Errors are often made when addresses are designated during their planning stages. The lines of interagency communication and sharing are beginning to open, but changes to systems come slowly. The sheer magnitude of the project relative to available resources, and the multi-jurisdictional scope of the road network, extend the duties of an addressing project to include administrative tasks such as creating cooperative internship programs with local universities and attending public meetings to develop new community relations and contacts.
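The role of the jurisdictional zone field can be sketched as a lookup key: when the same road name occurs in several jurisdictions with overlapping address ranges, the name alone is ambiguous, but the name plus zone is not. The field names, zone codes, and sample values below are invented for illustration.

```python
from collections import defaultdict


def build_index(segments):
    """Index street segments by (road name, jurisdictional zone)."""
    index = defaultdict(list)
    for seg in segments:
        index[(seg["name"], seg["zone"])].append(seg)
    return index


def find_segments(index, name, zone, house_number):
    """Return segments in the given zone whose range holds the number."""
    hits = []
    for seg in index.get((name, zone), []):
        lo = min(seg["from_addr"], seg["to_addr"])
        hi = max(seg["from_addr"], seg["to_addr"])
        if lo <= house_number <= hi:
            hits.append(seg)
    return hits


# Made-up sample: one road name, two zones, overlapping ranges.
segments = [
    {"id": 1, "name": "N MAIN ST", "zone": "ZONE_A",
     "from_addr": 100, "to_addr": 198},
    {"id": 2, "name": "N MAIN ST", "zone": "ZONE_B",
     "from_addr": 100, "to_addr": 298},
]
index = build_index(segments)
matches = find_segments(index, "N MAIN ST", "ZONE_B", 150)
```

Without the zone in the key, house number 150 would match both segments; with it, only the segment in `ZONE_B` is returned. A zip left/zip right pair could serve as the discriminating key in the same way.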

The Progress

We have been fortunate in that the small staff concentrating on the road network improvements is extremely talented. In an atmosphere of a constantly multiplying network, improvements to existing geometric problems and attribute shortcomings have been possible due to in-house program creation and processing. We have also established lines of communication with the majority of jurisdictional players, within the county organization as well as with most of the 33 other stakeholders. We've developed new in-house business processes for collecting data through use of our signing and striping crews, our sheriff's department's full-time and volunteer personnel, emergency response teams, local chambers of commerce, and part-time employees dedicated to field verification. We have created automated processes that point out nearly every conceivable error type and have dedicated the time necessary to correct those errors. We are achieving a 98% hit rate on county-maintained routes and have increased the overall county geocoding matches from 61% to 80%. We have identified new sources of information and added GIS-friendly steps to the business data exchange processes. We have entered into cooperative data exchanges with commercial mapping interests for mutually beneficial updates of attributes. Measures have been taken to ensure that address data coming into the county is collected electronically for more fluid conversion into GIS products. We are honing all of our collection methods to make the best use of knowledgeable field personnel without adding to their workload, and we are letting various accident location GPS and departmental AVL projects provide the double benefit of improving our network while fulfilling their mandated duties.

Lessons Learned

There is no single answer to the question "How do I find an accurate, addressable road network?" Some start with a complete control and land information system and are able to geocode based on original situs address. Many use the TIGER files and wait for the Census Bureau to improve their final coverage. We feel that our county is too large and too dynamic to wait. We've tried all the available networks and found the discrepancies too big to ignore. Route building and ADD files have their place as a starting point. From there, error routines using AML, SQL, MO, VB, and PowerBuilder can find errors. Avenue scripts can help the novice user correct a variety of those attribute errors. Aerials can point out gaps in the network. GPS can help to fill them in. Legislation can persuade others to cooperate in data collection and sharing. But don't discount the value of commercial enterprises. They're in business to get the data to best serve their customers. They have the manpower and a customer base, giving them the ability to improve the network product through comment cards and 800 lines from those customers. Use whatever sources are available to collect the data - fire departments, libraries, E911 tables, elections departments, assessors, planners, permits, sign crews, striping crews, law enforcement, pavement management, USGS, FEMA, TIGER, chambers of commerce, the USPS, and last but not least, your entire customer base. These are all valuable resources in your efforts to update an ever-growing addressable road network.

Acknowledgments

I would like to express my appreciation to both Katherine Douglas and Charlene Howard McDonald for their efforts in providing invaluable background and resource information in the composition of this paper. Any omissions or lack of clarity in content are due to my own procrastinatory compositional style. LTW - 12:24/6/2/00

Larry Wolfson
Senior Decision Support Analyst, Infrastructure Technology Center
Maricopa County, Arizona
2901 W. Durango St.
Phoenix, AZ 85009
PH# 602.506.4686
FAX: 602.506.8594
larrywolfson@mail.maricopa.gov