Gary W. Johnson

Beyond Zip Codes: Outcome Measuring in Human Services Programs Using Geocoding

Abstract

The field of human services (education, health, welfare) is in the midst of a radical transformation characterized by a movement toward what are often termed "comprehensive, integrated services" (CIS). Across the nation, service delivery systems are being reorganized into smaller units that are more neighborhood-based and multi-agency in nature, and that focus on the impacts of the services provided. These new decentralized, collaborative, outcome-driven service delivery systems pose special challenges to evaluators, who historically have monitored narrowly defined programs operated jurisdiction-wide by a single agency, and whose focal point has been to track inputs or processes. Arguably, the most formidable of these challenges is the widespread use of zip codes as the traditional basis for data collection. In the new service delivery paradigm, reliance on zip code areas as the primary outcome measuring element is ill-advised: they are usually too large to produce the discrete data needed; their boundaries are subject to frequent change; they do not coincide with "neighborhoods" targeted for services as defined by residents and program planners; and their use requires assuming that socio-economic and demographic characteristics are evenly distributed within their boundaries. The use of GIS and Geocoding allows existing zip code data sets to be reengineered, and provides a means for integrating different data sets into a study area defined in human terms. This paper presents the results of Fresno County, California’s use of its Family Preservation and Support Program (FPSP) as the vehicle to establish such a system, and describes the experience gained using GIS and Geocoding in developing a "Spatial Evaluative Perspective" for CIS outcomes.


Introduction

By definition, the data needed for measuring CIS outcomes involve a large number of agencies and institutions that gather and maintain information within an equally large number of timeframes and formats. Therefore, any outcome measuring approach needs to possess tools to manage these differences, integrate the data, ensure their integrity, and analyze them once they are integrated.

Fresno County's existing GIS contains an abundance of geographic data in several hundred coverages spanning 6,000 square miles, with extensive data attributes associated with them. However, the primary developers and users of these data are traditional "map-based" entities such as the Public Works, Assessor, and Elections Departments. Integrating GIS efficiently and effectively into the development of outcome measurements for CIS required a gateway (Geocoding) that would give agencies whose data are person- or family-based access to this spatially based information.

 

Need

FPSP is a five-year federal initiative established by the Omnibus Budget Reconciliation Act of 1993. It has as its goals the reduction of child abuse, delinquency, teenage pregnancy/parenthood, substance abuse and other serious social pathologies among families with minor children through the development and testing of new community-based, collaborative, outcome-driven service delivery systems.

Based upon extensive community input, a comprehensive five-year FPSP operational plan was developed. As a result, seven high-need "neighborhoods" (defined by elementary school attendance boundaries) were selected as FPSP’s pilot sites. A school-linked "Neighborhood Resources Center" was established in each area to provide residents with a user-friendly "one-stop" access point to a broad range of human services and programs jointly provided by multiple agencies. Reflecting the requirement that services be "outcome driven," thirteen specific, well-defined and objective measures were developed to gauge the effectiveness of this new service delivery approach (e.g., as a result of pregnancy-related services, the incidence of low birth weight/drug-exposed babies in a target area will decrease by 0.25 percentage points per year over baseline).

The choice of school attendance areas as target sites neatly defines the problem at hand. First, they are subject to modification from year to year based upon population changes. Second, they seldom coincide with the zip code boundaries that serve as the traditional basis for data collection. Third, experience has shown that even when a school attendance area is located entirely within one zip code, its population does not necessarily reflect that of the zip code as a whole. Finally, school boundaries often cross a number of different zip code areas, as Figure 1 illustrates dramatically for one of Fresno County's FPSP target sites.

Figure 1. Addams School Boundaries with Zip Codes Overlaid.

 

Setting

In 1973, the County of Fresno began developing an in-house mainframe GIS called the Environmental Management Information System (EMIS). EMIS was used until 1992, when Esri’s "ArcInfo" was selected as the backbone for a broader regional GIS. Currently, three units of local government within Fresno County are members of this regional system: the County of Fresno, the City of Fresno, and the City of Clovis. Each member is responsible for developing and maintaining the data needed for its own use, and for sharing these data with the other agencies.

Within county government, GIS is enterprise-wide, with seven different departments sharing data and data-maintenance responsibilities. The system uses "ArcInfo Librarian." It is made up of three primary libraries distinguished by the number of "tiles" each contains (one, few or many) and called, appropriately enough, "PDONETILE," "PDFEWTILE," and "PDMANYTILE." The library PDMANYTILE has over 750 tiles, which contain data on over 250,000 parcels of land in Fresno County.

 

Solution

Addresses have a spatial component because they are linked to parcels (six different coverages) by a primary key called an "Assessor's Parcel Number" (APN). To exploit this link, a geocoded point address file needed to be developed. This required merging two different tabular address files maintained by the Fresno County Assessor's Office, both of which carry the APN key. An initial review of these files quickly identified a major problem that would have to be rectified: parcels with multiple addresses, such as apartment complexes, were entered by range (e.g., 1010 First St. #'s 1-10). A program was written that identified those range-based records and then produced a separate record for each individual address within the range. This file was then verified using "FINALIST" software, resulting in over 320,000 valid addresses.
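The range-expansion step can be illustrated with a minimal Python sketch. It is not the program the County wrote: the record layout, the exact range syntax matched by `RANGE_PATTERN`, and the function name `expand_unit_range` are all assumptions made for illustration, and the APN in the usage note is a placeholder.

```python
import re

# Assumed range syntax: a street address followed by "#'s<first>-<last>",
# for example "1010 First St. #'s1-10". The real Assessor format may differ.
RANGE_PATTERN = re.compile(
    r"^(?P<street>.+?)\s*#'?s?\s*(?P<first>\d+)\s*-\s*(?P<last>\d+)$"
)


def expand_unit_range(apn, address):
    """Return one (apn, address) record per unit in a range-based address."""
    # Treat a curly apostrophe the same as a straight one before matching.
    normalized = address.replace("\u2019", "'").strip()
    match = RANGE_PATTERN.match(normalized)
    if match is None:
        return [(apn, address)]  # not range-based: keep the record unchanged
    first, last = int(match.group("first")), int(match.group("last"))
    if first > last:
        return [(apn, address)]  # malformed range: leave it for manual review
    street = match.group("street")
    return [(apn, f"{street} #{unit}") for unit in range(first, last + 1)]
```

Calling `expand_unit_range("000-000-00", "1010 First St. #'s1-10")` yields ten records that all carry the same APN, which is what allows every unit in a complex to be geocoded to its parcel.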

A second problem was the compounding of an otherwise minor side effect of tile boundaries, brought on by the addition of over 200,000 parcels as an ArcView theme. Specifically, if a parcel (polygon) was split by a tile boundary, the split produced new polygons for that parcel, each with the same APN (Figure 2). In the project under discussion, over 12,000 new pieces resulted. The initial effort to consolidate each multi-piece parcel back into one parcel using "ArcView 2.1a" was unsuccessful because it caused a fatal system error. This was overcome by drafting an ArcView script (Appendix 1) that took each multi-piece parcel sequentially and "summed" it into one complete parcel. It should be noted that "ArcView 3.0" seems to have resolved the system error problem.

Figure 2. Parcels Broken by Tile Boundary Line
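The idea of collapsing the pieces of a split parcel into a single point per APN can be sketched in Python as follows. This is an illustration only, not a translation of the Appendix 1 script: it assumes each piece is a simple, non-overlapping polygon with nonzero area given as a list of (x, y) vertices, and it uses an area-weighted centroid, which may differ from the center that ArcView computes. The function names are hypothetical.

```python
from collections import defaultdict


def ring_area_and_centroid(ring):
    """Shoelace area and centroid of a simple polygon given as [(x, y), ...]."""
    twice_area = cx = cy = 0.0
    for (x0, y0), (x1, y1) in zip(ring, ring[1:] + ring[:1]):
        cross = x0 * y1 - x1 * y0
        twice_area += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    signed_area = twice_area / 2.0  # assumed nonzero
    centroid = (cx / (6.0 * signed_area), cy / (6.0 * signed_area))
    return abs(signed_area), centroid


def parcel_points(pieces):
    """Collapse (apn, ring) pieces into one area-weighted point per APN."""
    totals = defaultdict(lambda: [0.0, 0.0, 0.0])  # area, area*x, area*y
    for apn, ring in pieces:
        area, (x, y) = ring_area_and_centroid(ring)
        totals[apn][0] += area
        totals[apn][1] += area * x
        totals[apn][2] += area * y
    return {apn: (sx / a, sy / a) for apn, (a, sx, sy) in totals.items()}
```

Because the pieces of a split parcel do not overlap, the area-weighted mean of their centroids equals the centroid of the reassembled parcel, so the point lands where it would have if the tile boundary had never cut the polygon.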

An additional problem was the time needed to merge the aforementioned six parcel themes into a single one. A second script (Appendix 2) was written that appended each of the individual parcel themes to the end of the largest parcel theme. This composite theme of over 250,000 parcels was then geocoded against the tabular address file by APN, resulting in a point address theme of over 320,000 different addresses. These 320,000 address points represent about 97% of the known addresses in Fresno County. (The remaining 3% are within mobile home parks, for which coverage is currently under development.) As further non-verified addresses are identified and verified, they will be added to the theme.
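Geocoding by APN amounts to a keyed join between the address table and the parcel points. The Python sketch below shows that join together with a simple match-rate calculation. The data structures and the names `geocode_by_apn` and `match_rate` are assumptions for illustration; the County performed this step in ArcView.

```python
def geocode_by_apn(addresses, parcel_points):
    """Attach a parcel point to each (apn, address) record with a known APN.

    addresses: iterable of (apn, address) tuples.
    parcel_points: dict mapping apn -> (x, y).
    Returns (matched, unmatched) lists.
    """
    matched, unmatched = [], []
    for apn, address in addresses:
        point = parcel_points.get(apn)
        if point is None:
            unmatched.append((apn, address))  # e.g. parcel not yet in the theme
        else:
            matched.append((address, apn, point))
    return matched, unmatched


def match_rate(matched, unmatched):
    """Percentage of address records that received a point."""
    total = len(matched) + len(unmatched)
    return 100.0 * len(matched) / total if total else 0.0
```

Several addresses may share one point, as happens for every unit in an apartment complex. Keeping the unmatched records in a separate list makes it straightforward to add them to the theme once they are verified.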

 

Outcome Measurements

The biggest challenge to developing and generating valid and reliable outcome measures in human services is the lack of a centralized data system for client population(s). As noted earlier, clients can receive services from any number of different agencies, sequentially or simultaneously, within separate or overlapping timeframes. This is then compounded by the fact that client/case information is maintained in multiple databases each with a different system/data format that utilizes a different primary identifier (e.g., Social Security Number, Student Identification Number, birthday etc.).

Due to cost, time constraints and technological limitations, most traditional program evaluation efforts have been limited to using a small number of data sets linked by the most convenient common denominator: a zip code. Consequently, these efforts often fail to define or describe the real impact of services on a study area or target population unless that area or population has been defined by zip code boundaries (the shortcomings and rarity of which have been previously discussed). Most often, zip code-indexed data are extrapolated to "fit" the area of study or the target sub-population.

One way to deal with these limitations is to utilize technologies that can exploit the address information that is contained in most existing data sets but, for numerous reasons, is very difficult to access systematically. The application of geocoding provides a means to coordinate and define data attributes (e.g., birthweight, crimes) down to the address level. It allows different types of data to be layered on top of one another, regardless of their database format or primary identifier. The resulting spatial data can then be aggregated from discrete addresses into a larger study area, and any required changes to the boundaries of that study area can be made seamlessly. Because study areas are dynamic, geocoding also provides tools to maintain data consistency over time. In addition, it provides the ability to evaluate the data from any point in the data history against the study area. Fresno County has termed this new technique for coordinating addresses spatially with their associated data for use in program evaluation the "Spatial Evaluative Perspective."
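Once records carry a point location, assigning them to a study area reduces to a point-in-polygon test. The Python sketch below uses the standard ray-casting method and is an assumption-laden illustration, not the County's ArcView procedure: the boundary is taken to be a simple polygon given as (x, y) vertices, points exactly on the boundary are not treated specially, and the function names are hypothetical.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point lies inside the simple polygon."""
    x, y = point
    inside = False
    for (x0, y0), (x1, y1) in zip(polygon, polygon[1:] + polygon[:1]):
        # Does this edge straddle the horizontal line through the point?
        if (y0 > y) != (y1 > y):
            x_cross = x0 + (y - y0) * (x1 - x0) / (y1 - y0)
            if x < x_cross:
                inside = not inside
    return inside


def summarize_study_area(records, boundary, is_case):
    """Count records inside the boundary and how many of them are cases.

    records: iterable of (point, value) tuples from any geocoded data set.
    is_case: function applied to value, e.g. lambda grams: grams <= 2500.
    """
    total = cases = 0
    for point, value in records:
        if point_in_polygon(point, boundary):
            total += 1
            cases += 1 if is_case(value) else 0
    return total, cases
```

When an attendance boundary is redrawn, the same geocoded records can simply be summarized again against the new polygon. This is the practical meaning of keeping data consistent as the study area changes.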

 

Example

The 1995 birth information (15,976 live births) for Fresno County was geocoded against the address point theme. As stated, one of the outcomes for FPSP is to reduce the incidence of low birthweight/drug-exposed infants by 0.25 percentage points per year. Figure 3 shows the occurrence of low birthweight infants (2500 grams or less) within the Addams School boundaries. Viewing these data from the Spatial Evaluative Perspective (SEP) reveals that in zip code 93728 (the largest zip code in the study area) birth data are not homogeneous. The incidence of low birthweight infants decreases at the School's eastern border (defined by railroad tracks).

Figure 3. Addams School Boundaries with Low Birthweight Occurrence

A simple analysis (table) of the birthweight data for the Addams School area provides some insight into how zip codes can bias the data. Zip code 93706 has a 10.7 percent incidence of low birthweight (84 babies) yet only one of these babies was born in the Addams attendance area.

Table of Birth Weight (Grams)

Study Area     Addams    93705    93706    93722    93728    County
Count              89      678      788    1,063      351    15,976
Mean Weight     3,335    3,326    3,236    3,355    3,303     3,326
<= 2500             5       41       84       73       27     1,140
% <= 2500         5.6      6.0     10.7      6.9      7.7       7.1
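The percentages in the table can be reproduced from its counts, and the same arithmetic shows the bias that a zip code-based estimate would introduce. The Python sketch below copies the counts from the table. Pooling the four zip codes into a single estimate is an assumption about how a zip code-based evaluation might proceed, not something the County reported.

```python
# (live births, births at or below 2500 grams), copied from the table above
BIRTHS = {
    "Addams": (89, 5),
    "93705": (678, 41),
    "93706": (788, 84),
    "93722": (1063, 73),
    "93728": (351, 27),
    "County": (15976, 1140),
}
ZIP_CODES = ["93705", "93706", "93722", "93728"]


def low_birthweight_rate(total, low):
    """Low birthweight incidence as a percentage, rounded to one decimal."""
    return round(100.0 * low / total, 1)


rates = {area: low_birthweight_rate(*counts) for area, counts in BIRTHS.items()}

# Hypothetical zip code-based estimate: pool every birth in the four zip codes.
pooled_total = sum(BIRTHS[z][0] for z in ZIP_CODES)
pooled_low = sum(BIRTHS[z][1] for z in ZIP_CODES)
pooled_rate = low_birthweight_rate(pooled_total, pooled_low)

print(rates["Addams"], pooled_rate)
```

Under that assumption the pooled zip code figure is 7.8 percent, against 5.6 percent for the births actually geocoded inside the Addams boundary. The gap of more than two percentage points is far larger than the 0.25 percentage point annual change the program is trying to detect.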

 

Conclusions

In spite of the time needed to develop a countywide address point theme for evaluation purposes, this approach represents a major, cost-effective advance over the use of zip code-based data in program evaluation. It provides the evaluator with the freedom to access and layer any address-related data feature without regard to database type, primary identifier or the number of data sources. It reduces the cost and the time required to perform data manipulation, eliminates data-limiting boundaries, and streamlines the evaluation process while increasing the validity and reliability of its products. By providing a spatial evaluation environment with the ability to integrate data while protecting their integrity (both in form and content), evaluators can provide timely program outcome information to help policy makers and service providers structure or modify the delivery of services to be more effective and efficient.

Any program evaluator analyzing data for human services programs should consider using GIS and Geocoding as a means of developing baseline data and measuring program outcomes. The effort invested in developing a GIS with its geocoding components will be recovered many times over.

 

Acknowledgements

The author would like to acknowledge the Fresno County Board of Supervisors (Stan Oken, Chairman), and Fresno County Department of Social Services Director Ernest E. Velasquez, whose continued support made this effort possible; the staffs of the Fresno County Computer Services Department, the Fresno Interagency Council for Children and Families, and all the agencies involved in Fresno County's FPSP for their valuable technical assistance.

 

Appendix

Appendix 1

'Script to Sum Parcel
'Merges the pieces of each split parcel and writes one center point per APN
theview = av.getactivedoc
theftab = theview.findtheme("Filename.shp").getftab
'The selection below is made on the first theme in the view,
'so "Filename.shp" is presumably expected to be that theme
thetheme = theview.getthemes.get(0)
thevtab = thetheme.getftab
thetable = av.getproject.finddoc("Sum_Filename.dbf")
the2vtab = thetable.getvtab
thebitmap = thevtab.getselection

'Create the output point theme with a single APN field
anftab = ftab.makenew("Add_Point".asfilename,point)
apn = field.make("No-apn",#field_char,10,0)
anftab.addfields({apn})
shapefield = anftab.findfield("Shape")
napnfield = anftab.findfield("No-apn")
apnfield = the2vtab.findfield ("No-apn")
apnfld = theftab.findfield ("No-apn")
shpfld = theftab.findfield ("Shape")

'Process one APN at a time from the Sum_Filename.dbf table
for each r in the2vtab
  'Select every parcel piece that carries this APN
  searchstr = the2vtab.returnvaluestring(apnfield,r)
  thequery = ("[No-apn] = " + searchstr.quote)
  thevtab.query(thequery, thebitmap, #vtab_seltype_new)
  thevtab.setselection(thebitmap)
  thevtab.updateselection
  'Sum the selected pieces into one shape per APN
  newftab = theftab.summarize( "1file".asfilename, shape, apnfld, {shpfld},{#vtab_summary_sum})

  'Write the center of each summed parcel to the point theme
  for each rec in newftab
    apnadd = newftab.findfield("No-apn")
    sfield = newftab.findfield("shape")
    xpoint = newftab.returnvalue(sfield,rec)
    apoint = xpoint.returncenter
    fieldvalue = newftab.returnvaluestring(apnadd,rec)
    newrecnum = anftab.addrecord
    anftab.setvalue(shapefield,newrecnum,apoint)
    anftab.setvalue(napnfield,newrecnum,fieldvalue)
  end
  thebitmap.clearall
end
'Add the finished point theme to the "Address" view
v = av.getproject.finddoc("Address")
v.addtheme(ftheme.make(anftab))

Appendix 2

'Script to Merge Theme to End Of Other Themes
'Appends every record of the second parcel theme to the first
theview = av.getactivedoc
theftab = theview.findtheme("Filename.shp").getftab
the2ftab = theview.findtheme("2Filename").getftab
theftab.seteditable(true)

'Fields of the target theme (no suffix) and of the source theme ("2")
shpfld = theftab.findfield("Shape")
apnfld = theftab.findfield("No-apn")
cntfld = theftab.findfield("Count")
shp2fld = the2ftab.findfield("Shape")
apn2fld = the2ftab.findfield("No-apn")
cnt2fld = the2ftab.findfield("Count")

for each rec in the2ftab
  shp = the2ftab.returnvalue(shp2fld,rec)
  apn = the2ftab.returnvalue(apn2fld,rec)
  cnt = the2ftab.returnvalue(cnt2fld,rec)
  if(shp <> Nil) then
    'Copy the shape, APN and count into a new record of the target theme
    newrec = theftab.addrecord
    theftab.setvalue(shpfld,newrec,shp)
    theftab.setvalue(apnfld,newrec,apn)
    theftab.setvalue(cntfld,newrec,cnt)
  else
    'Stop at the first record that has no shape
    exit
  end
end
theftab.seteditable(false)

Gary W. Johnson
Systems and Procedure Analyst
Fresno County Department of Social Services
2600 Ventura Avenue
Fresno, CA 93721
Telephone: 209.453.6761
FAX: 209.453.6100
gjohnson@fresno.ca.gov