Margo M. Blosser

Modeling Address Ranges

Abstract

A fundamental database for many urban GIS applications is a roads centerline file with address ranges. Geocoding events to a range-based system is essential for crime analysis, transportation studies and emergency dispatch. Early address matching efforts at Clark County used Census TIGER files and the county parcel centroids layer. The number of correctly matched addresses was low, and missing address ranges were responsible for most of the unmatched addresses. Clark County is currently assigning address ranges to the roads centerline file.

This new layer is being developed as a modeling process rather than a data entry or conflation process. The model assigns theoretical address ranges based on intersecting block numbers. Before address ranges are assigned, all arcs are oriented (flipped where necessary) based on an address datum. The model uses a turntable as its primary data structure.

Background

Many applications at Clark County are driven by geocoding events through an address matching process. Crime Analysis is based on taking tabular crime data recorded by address and creating spatial coordinates for these incidents.

Transportation studies need to know the location of employees to facilitate ride sharing. Emergency dispatching requires the ability to dynamically locate events by address. A variety of data sets have been applied to these applications with varying results. Crime Analysis used the TIGER file for the spatial component of address matching. To keep false positives to a minimum, this application required that all components of an address be matched, i.e., number, street name, street type and direction. A minimum score of 100 was used in the scoring table, and 67 percent of the incident addresses were matched.

Rejects from this process were then address matched to a parcel centroids layer, matching an additional 13 percent. Employees were matched to employer locations using the county's parcel centroids layer. Keeping a minimum score of 100 percent, 163 employees out of 5757 (under 3 percent) were matched to their employer locations (McCarley, 1995). Only by relaxing the components of an address to be matched was a 57 percent match rate achieved. Emergency dispatch is a future project at Clark County, but clearly a match rate of 57 percent is not acceptable for such an application. The need for more accurate address matching is the driving force behind the creation of address ranges.

Clark County's Approach

A quick review of the GIS journals shows a wide variety of approaches for creating a range-based system from a roads centerline file. Some have treated this problem as a data entry task by examining building permits or emergency dispatch files, or by obtaining ranges from field research (McCarley, Brandt, 1992).

Others have taken parcel centroid coverages and "spatially related" centroid addresses to arc intersections to create ranges (Sosinski, 1992). Conflation of the TIGER file to a large-scale roads centerline file has also been applied as a means to spatially transfer TIGER file range attributes (Bosworth, 1995).

Clark County's approach has been somewhat different. The roads centerline file is updated on a weekly basis. This is in part due to the rapid growth that is currently underway, as well as the GIS department's commitment to accurate data. A visual comparison of the TIGER file to the roads centerline file demonstrates the temporal differences between the two data sets. The Clark County roads network is a much denser network than the TIGER file.

Thus the county's approach to this problem has been to create address ranges for its roads centerline file modeled after the TIGER file. Given the large number of numeric street names in the county, it was feasible to approach this problem as a modeling exercise. Clark County has approximately 7200 unique street names; of these, 5000 have a numeric value and 2200 have an alpha name. The hundred block model, developed by the county, is based on the calculation of 100 block numbers at street intersections.

The Hundred Block Model

The hundred block model calculates 100 block numbers for every street intersection. For example, an arc of Winters Street that runs between 10th and 11th would have its 100 block number calculated as 1000 for its from node (fnode) and 1100 for its to node (tnode).
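The calculation above can be sketched in a few lines of Python. This is a minimal illustration rather than the county's program: the function name `hundred_block` and the assumption that numeric street names begin with an ordinal such as "10th" are hypothetical. An alpha-named street yields no block number, since the model has nothing to calculate from.

```python
import re
from typing import Optional

# Assumption: numeric street names start with an ordinal ("1st", "22nd", "103rd", "10th").
_ORDINAL = re.compile(r"^\s*(\d+)(?:st|nd|rd|th)\b", re.IGNORECASE)


def hundred_block(street_name: str) -> Optional[int]:
    """Return the 100 block number implied by a numeric street name, else None."""
    match = _ORDINAL.match(street_name)
    if match is None:
        # Alpha-named street: no block number can be derived from the name.
        return None
    return int(match.group(1)) * 100


# Winters Street between 10th and 11th: 1000 at the fnode, 1100 at the tnode.
fnode_block = hundred_block("10th St")
tnode_block = hundred_block("11th St")
print(fnode_block, tnode_block)
```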

This method is applicable only for street intersections that fall between numeric streets. Once 100 block numbers have been determined for each street intersection, these numbers are then used to calculate non-overlapping odd and even address ranges. The directionality of an arc must conform to address assignments in the county.

The rules that govern the assignment of addresses are based on a datum that divides the county into four quadrants. Addresses increase in different directions depending on the quad in which a road lies. In the NE quad, roads that trend in a north/south direction have addresses that increase from south to north. Roads trending in a west/east direction have addresses that increase from west to east (see figure 1). Even numbers are located on the north side of west/east trending streets and on the west side of north/south trending streets. Odd numbers are located on the south side of west/east trending streets and on the east side of north/south trending streets.

Thus an even or odd address may be on the left or right side of an arc depending on which quad the arc is located in. The left and right sides of an arc are determined by "standing" on an arc's fnode and looking toward its tnode.
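The side rule can be made concrete with a short sketch. It assumes planar coordinates with x increasing eastward and y increasing northward, and it classifies an arc as west/east trending when its x extent is at least as large as its y extent. Both the classification and the function name `even_side` are assumptions made for illustration.

```python
from typing import Tuple

Point = Tuple[float, float]  # (x, y); x grows eastward, y grows northward (assumed)


def even_side(fnode: Point, tnode: Point) -> str:
    """Return 'left' or 'right': the side of the arc that carries even addresses."""
    dx = tnode[0] - fnode[0]
    dy = tnode[1] - fnode[1]
    if abs(dx) >= abs(dy):
        # West/east trending street: even numbers are on the north side.
        # Looking east from the fnode (dx > 0), north is on the left.
        return "left" if dx > 0 else "right"
    # North/south trending street: even numbers are on the west side.
    # Looking north from the fnode (dy > 0), west is on the left.
    return "left" if dy > 0 else "right"


print(even_side((0.0, 0.0), (100.0, 0.0)))  # arc running west to east
print(even_side((100.0, 0.0), (0.0, 0.0)))  # same street, arc running east to west
```

The same street therefore reports a different even side depending on how its arc is oriented, which is why arc direction has to be settled before ranges are assigned.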

Data Structures and Programs

Program Flipper

The Flipper program orients each arc according to the quadrant of the address datum in which it occurs. The program uses the coordinate values of an arc's fnode and tnode. For arcs trending in a west/east direction, the x coordinate values of each arc's fnode and tnode are compared. If an arc is in the NE quad of the address datum, where addresses increase from west to east, and its fnode has the lower x coordinate value, then it does not require flipping. If the tnode has the lower x coordinate value, then the arc needs to be flipped. For arcs trending in the north/south direction, y coordinate values are compared in the same way and incorrectly oriented arcs are flipped.
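A possible shape for the flip test is sketched below. It assumes that a correctly oriented arc runs from its low-address end (fnode) to its high-address end (tnode), as in the Winters Street example where the fnode takes 1000 and the tnode 1100. Only the NE rule is stated in this paper; the entries for the other three quads assume addresses grow away from the datum origin and are hypothetical, as are the names `INCREASE`, `needs_flip` and `orient`.

```python
from typing import Dict, Tuple

Point = Tuple[float, float]  # (x, y); x grows eastward, y grows northward (assumed)

# Direction of address growth per quad as (x sign, y sign).
# NE is taken from the text; NW, SE and SW are assumptions for illustration.
INCREASE: Dict[str, Tuple[int, int]] = {
    "NE": (1, 1),
    "NW": (-1, 1),
    "SE": (1, -1),
    "SW": (-1, -1),
}


def needs_flip(quad: str, fnode: Point, tnode: Point) -> bool:
    """True when the arc runs against the direction of address growth."""
    x_sign, y_sign = INCREASE[quad]
    dx = tnode[0] - fnode[0]
    dy = tnode[1] - fnode[1]
    if abs(dx) >= abs(dy):
        # West/east trending arc: compare x coordinates.
        return dx * x_sign < 0
    # North/south trending arc: compare y coordinates.
    return dy * y_sign < 0


def orient(quad: str, fnode: Point, tnode: Point) -> Tuple[Point, Point]:
    """Return (fnode, tnode), swapped if the arc needed flipping."""
    return (tnode, fnode) if needs_flip(quad, fnode, tnode) else (fnode, tnode)


# An NE arc digitized east to west is flipped so that it runs west to east.
print(orient("NE", (100.0, 0.0), (0.0, 5.0)))
```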

Program Nodewalk

The Nodewalk program looks at an arc's fnode and tnode to determine the street names of intersecting arcs. At each node the program "walks" to every arc that meets at that node. The data structure that supports the Nodewalk program is a turntable.

Turntables

Turntables are designed to model information about street networks for routing and allocation applications. Their use as a data structure for this application is less common. The turntable models information about nodes. Its basic structure is used to represent all possible turns that can occur from a node. In this application it is used to find all arcs that are associated with a given node. Where an AAT stores information about objects that share common geometry with an arc, i.e., from and to nodes, a turntable stores information about objects that share common geometry with nodes, i.e., arcs.

It differs from a node attribute table (NAT) in that it has many records per node. Every node has many arcs associated with it, and a record exists for every turn that would cross over a node. A node with four arcs would have sixteen possible turns (four from arcs times four to arcs, U-turns included). Each possible turn from a node exists as a record in the turntable. The from and to arcs of the turn are represented in the ARC1# and ARC2# attributes. The angle or azimuth of the turn is also contained in the turntable.
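To make the record count concrete, the sketch below builds a simplified turntable from a list of arcs. It is not the ARC/INFO turntable format: the `Turn` class and `build_turntable` function are hypothetical, the arc and node numbers are made up, and the angle attribute is left out for brevity.

```python
from dataclasses import dataclass
from itertools import product
from typing import Dict, List, Tuple


@dataclass(frozen=True)
class Turn:
    node: int  # node at which the turn occurs
    arc1: int  # from arc of the turn (ARC1#)
    arc2: int  # to arc of the turn (ARC2#)


def build_turntable(arcs: Dict[int, Tuple[int, int]]) -> List[Turn]:
    """Build one Turn per ordered pair of arcs meeting at each node.

    `arcs` maps an arc number to its (fnode, tnode) pair.
    """
    arcs_at_node: Dict[int, List[int]] = {}
    for arc_id, (fnode, tnode) in arcs.items():
        arcs_at_node.setdefault(fnode, []).append(arc_id)
        arcs_at_node.setdefault(tnode, []).append(arc_id)
    turns: List[Turn] = []
    for node, arc_ids in arcs_at_node.items():
        # Every arc at the node can turn onto every arc at the node, itself included.
        for arc1, arc2 in product(arc_ids, repeat=2):
            turns.append(Turn(node, arc1, arc2))
    return turns


# Four made-up arcs meeting at node 1.
arcs = {10: (1, 2), 11: (1, 3), 12: (4, 1), 13: (5, 1)}
turntable = build_turntable(arcs)
print(len([t for t in turntable if t.node == 1]))  # sixteen turns at the four-arc node
```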

Nodewalk Algorithm

To find intersecting arcs, several tests are required to find the correct record in the turntable. A relation is formed between the roads AAT and the roads turntable. A one-to-many relationship exists between the AAT and the turntable: for every node referenced in the AAT, many records exist in the turntable. To manage this relationship the AAT is related to the turntable file, thus creating a many-to-one relationship. The following tests are used to determine the correct intersecting arc:

Cursor through the coverage and visit every arc.

1) For a given arc find its from and to nodes (fnode and tnode respectively).
2) Perform the following checks on both the fnode and tnode.
3) For the fnode in the AAT find the matching nodes in the turntable.
4) Find the nodes in the turntable that are turns originating from the processing arc.
5) For the processing arc find arcs that are at a right angle to it.
6) For the arc that is at a right angle to the processing arc find the street name of this intersecting arc.
7) Calculate a 100 block number for this intersecting arc.
8) Repeat steps 3 through 7 for the tnode.
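The steps above can be sketched as a filter over turntable records. Everything in the example data is made up for illustration (arc and node numbers, street names, angles), and the 20 degree tolerance used to decide what counts as a right angle is an assumption, as are the names `intersecting_block` and `nodewalk`.

```python
import re
from typing import Dict, List, NamedTuple, Optional, Tuple


class Turn(NamedTuple):
    node: int
    arc1: int     # from arc of the turn (ARC1#)
    arc2: int     # to arc of the turn (ARC2#)
    angle: float  # turn angle in degrees; the sign convention is assumed


def block_number(street_name: str) -> Optional[int]:
    """100 block number for a numeric street name, None for an alpha name."""
    match = re.match(r"\s*(\d+)(?:st|nd|rd|th)\b", street_name, re.IGNORECASE)
    return int(match.group(1)) * 100 if match else None


def intersecting_block(node: int, arc: int, turntable: List[Turn],
                       street_names: Dict[int, str],
                       tolerance: float = 20.0) -> Optional[int]:
    for turn in turntable:
        if turn.node != node:                          # step 3: same node
            continue
        if turn.arc1 != arc:                           # step 4: turn starts on processing arc
            continue
        if abs(abs(turn.angle) - 90.0) > tolerance:    # step 5: roughly a right angle
            continue
        # Steps 6 and 7: take the first qualifying record and use its street name.
        return block_number(street_names[turn.arc2])
    return None


def nodewalk(arc: int, fnode: int, tnode: int, turntable: List[Turn],
             street_names: Dict[int, str]) -> Tuple[Optional[int], Optional[int]]:
    """Steps 1, 2 and 8: run the same checks on the fnode and the tnode."""
    return (intersecting_block(fnode, arc, turntable, street_names),
            intersecting_block(tnode, arc, turntable, street_names))


# Made-up network: arc 1 runs from node 50 to node 51.
STREET_NAMES = {1: "103rd Ave", 2: "10th St", 3: "10th St", 4: "103rd Ave", 5: "11th St"}
TURNTABLE = [
    Turn(50, 1, 1, 180.0), Turn(50, 1, 2, 90.0), Turn(50, 1, 3, -90.0),
    Turn(50, 1, 4, 0.0), Turn(50, 2, 1, -90.0), Turn(51, 1, 5, 90.0),
]
print(nodewalk(1, 50, 51, TURNTABLE, STREET_NAMES))
```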

These tests make more sense in an actual example. For instance, in figure 2 the processing arc is arc number 7434, which is 103rd Ave. It has fnode number 6315 and tnode number 6471. To find the intersecting arc for the fnode, node number 6315, select the records in the turntable that have the same node number (step 3). This selects the records shown in table 1. These records represent all possible turns from node number 6315. For this application only the turns that originate from the processing arc (number 7434) are of interest (step 4).

This test selects the first four records of the turntable. Of the four arcs selected, only those at a right angle to the processing arc are candidates (step 5). Two arcs meet this requirement: arc numbers 7269 and 7272. Arc number 7272 is selected from the turntable because it is the first record of the selected set. Tenth Ave is found to be the street that intersects arc number 7434, 103rd Ave (step 6). The 100 block number is calculated to be 1000 (10 * 100) (step 7). This process is then repeated for the tnode of the 103rd Ave arc.

The final part of the process is to take the 100 block numbers and create non-overlapping address ranges for the right and left side of an arc. The left and right sides of an arc have a from and to attribute. Thus, four attributes are created for an arc, L_from, L_to, R_from and R_to. These attributes are modeled after the address ranges in the TIGER file. Odd and even addresses may be contained in either the left or right from and to ranges. A simple calculation either increments or decrements the 100 block values based on whether the node is a from or to node and whether a value is to be an odd or even address.
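The exact increments are not given here, so the sketch below shows one plausible reading and should be treated as an assumption: the even range runs from the from block to two less than the to block, and the odd range from one more than the from block to one less than the to block. The function name `address_ranges` and its `even_side` parameter are hypothetical.

```python
from typing import Dict


def address_ranges(from_block: int, to_block: int, even_side: str) -> Dict[str, int]:
    """Build L_from, L_to, R_from and R_to from two 100 block numbers.

    Assumes from_block < to_block and that both are multiples of 100.
    """
    if even_side not in ("left", "right"):
        raise ValueError("even_side must be 'left' or 'right'")
    # Assumed increments: stop short of the next block so ranges never overlap.
    even = (from_block, to_block - 2)
    odd = (from_block + 1, to_block - 1)
    left, right = (even, odd) if even_side == "left" else (odd, even)
    return {"L_from": left[0], "L_to": left[1],
            "R_from": right[0], "R_to": right[1]}


# Arc between the 1000 and 1100 blocks with even addresses on its left side.
print(address_ranges(1000, 1100, "left"))
```

Under this reading the next arc along the street, from 1100 to 1200, starts at 1100 and 1101, so adjacent ranges do not overlap.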

Results

This model is being run as this paper is being written. The success of the model depends on the number of streets that are intersected by numeric streets. For the roads tile with the highest number of named streets, the model is able to correctly identify 50 percent of the ranges: out of roughly 8000 arcs, 4000 have correct ranges assigned. In more rural areas of the county the proportion of numbered streets to named streets is much higher. In these areas a larger share of correctly assigned address ranges could be expected.

Acknowledgements

I would like to thank my boss Bob Pool, GIS Coordinator, for his wealth of technical tricks, arc wizardry, ideas and encouragement for this project.

Appendixes


References

Bosworth, Mark, 1994. Informal conversation. Sr. GIS Specialist, Metro, Portland, Or.

Brandt, 1992. Using ArcInfo to Determine Requirements for a Postal TIGER File. Proceedings of the Twelfth Esri User Conference, pp. 475-486.

McCarley, Cliff, 1994. Informal conversation. GIS Analyst, Clark County Department of Assessment and GIS, Vancouver, Wa.

Sosinski, 1992. Street Centerline Coverage Address Range Propagation. Proceedings of the Fourteenth Annual Esri Conference, May 1994, p. 1134.


Author
Margo M. Blosser
GIS Analyst
Clark County Department of Assessment and GIS
Vancouver, Wa. 98666-5000
Phone (360) 699-2391
Fax (360) 737-6046