Shih-Lung Shaw and Phillip Lall Dass

A GIS Analysis of Geographic Variations in Travel Characteristics

Abstract:

Traditionally, a travel characteristics study collects and analyzes data at the zonal level (e.g., traffic analysis zones). This zonal approach precludes the possibility of examining some important variations in travel characteristics at a finer geographic scale. With the address-matching and map overlay capabilities available through a geographic information system (GIS), transportation planners are no longer constrained by the traditional zonal approach.

This paper discusses the procedures of creating ArcInfo GIS databases of household and individual trip log data for the Treasure Coast Travel Characteristics Study recently conducted in south Florida. In this study, both household and individual trip log data were geocoded to the street-address level and point coverages of their locations were created. The derived ArcInfo coverages can show detailed geographic patterns of various travel characteristics using the ArcView 2 software. In addition, statistical analysis of these geographic variations can be performed on the data generated from GIS spatial analysis functions.


Introduction.

Travel characteristics surveys provide transportation planners with important baseline data for model validation and model calibration. Since transportation movements reflect the spatial interactions among different places, geographic location plays a critical role in modeling travel patterns. Traditionally, travel demand modeling and analysis are based on data collected at a zonal level (most frequently by traffic analysis zones, TAZs). With the increasing use of geographic information systems (GIS) in the transportation planning community, it is now easier to deal with the geographic locations of travel characteristics at a finer scale than the traditional zonal approach allows.

This paper discusses the procedures of using the ArcInfo GIS software to create GIS databases of the individual point locations of both household and trip log data collected in the Treasure Coast Travel Characteristics Study. The Treasure Coast Region consists of Indian River, St. Lucie, and Martin Counties located in south Florida. Unlike the continuous urban development pattern found in the Miami/Fort Lauderdale and West Palm Beach metropolitan areas, the Treasure Coast Region has five major urban centers (i.e., Sebastian, Vero Beach, Fort Pierce, Port St. Lucie, and Stuart) separated by less developed areas. Additionally, three major highways (US-1, I-95, and Florida's Turnpike) provide the key transportation linkages among these five urban centers, which are linearly distributed within the Region, and connect them to other major urban centers located north and south of the Region.

The main objective of this study is to create GIS databases of the Treasure Coast Travel Characteristics Study such that the geographic variations of travel characteristics within the study area can be examined in detail. The next section discusses some important issues related to a travel characteristics survey design and its GIS database creation. It is followed by examples of analyzing the geographic variations of residential trip generation rates with respect to the proximity to urban centers and the proximity to the major highways within the Treasure Coast Region.

Design Issues of Questionnaire Survey and GIS Database Creation.

The Treasure Coast Travel Characteristics Study consists of two parts in its survey (FDOT, 1995). The first part is a "household survey" that collects background data about each sampled household. The second part is a "trip log survey" that records every trip made by each individual (six years of age or older) in a sampled household. In order to create GIS databases that can show the point locations of each household and of individual trip ends, several important issues must be addressed in the questionnaire design and the GIS database creation processes.

1) Geocoding.

For the household survey, street address is used as the geocode to derive household locations. Since survey respondents readily know the street addresses of their residences, this geocode does not present a problem. On the other hand, the trip log survey requires alternative geocodes, since respondents may not know the exact street addresses of their trip ends (e.g., a park or a grocery store). The questionnaire design thus needs to allow respondents to fill in either the street address, the closest street intersection (e.g., ABC Street/XYZ Avenue), or the place name with its associated street and city names (e.g., ABC Bank on XYZ Street in City X). Both street addresses and street intersections can be easily handled with the ArcInfo address-match function on an address coverage created from sources such as TIGER/Line, ETAK, or GDT files. Place names, however, require an additional step of first finding their corresponding street addresses from a phone directory (either paper copy or digital copy). The ArcInfo ALIAS command can then be used during the address-match process to match the place names to their street locations (Esri, 1992).

Another important consideration for trip log data is that some trip ends may be located outside the study area. If address coverages are unavailable for those areas, these trip ends will be rejected in the address-match process. One alternative for handling these rejected records is to group them by their city codes and then match them to the representative point location of each city. Although this alternative approach does not yield the exact locations of those rejected records, it gives a reasonable geographic distribution pattern of their approximate locations.
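The city-level fallback described above can be sketched in a few lines of Python. This is a minimal illustration outside ArcInfo, not the procedure used in the study: the record layout, the city codes, and the coordinates are all hypothetical values chosen for the example.

```python
# Hypothetical representative points (x, y) per city code; values are made up.
CITY_POINTS = {
    "ORL": (-81.38, 28.54),
    "MIA": (-80.19, 25.76),
}

def locate_trip_end(record, matched_points, city_points=CITY_POINTS):
    """Return (x, y, method) for one trip-end record.

    matched_points maps record IDs to coordinates produced by a
    successful address match; rejected records fall back to the
    representative point of their city, if one is known.
    """
    rec_id = record["record_id"]
    if rec_id in matched_points:
        x, y = matched_points[rec_id]
        return x, y, "address"
    city = record.get("city_code")
    if city in city_points:
        x, y = city_points[city]
        return x, y, "city"
    return None, None, "unmatched"

# Record 1 was address-matched; records 2 and 3 were rejected.
matched = {1: (-80.40, 27.64)}
records = [
    {"record_id": 1, "city_code": "VERO"},
    {"record_id": 2, "city_code": "ORL"},
    {"record_id": 3, "city_code": None},
]
located = [locate_trip_end(r, matched) for r in records]
```

Keeping the match method alongside each coordinate makes it possible to exclude the approximate city-level points from analyses that need exact locations.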

2) Unique ID's for Linking the Household Survey and the Trip Log Survey Databases.

Many analyses of travel patterns require the access of data collected from both the household survey and the trip log survey. It therefore is critical to establish unique identification (ID) items to link these two data sets together. There are three key ID items that must be built into the GIS databases. The first item is the "Household ID" that establishes a one-to-many relation between the household survey database and the trip log survey database. Within the trip log survey database, another two ID items ("Individual ID" and "Trip ID") also need to be built into the database. These two data items keep track of the records for each individual and each trip made by every individual in a household, respectively. In the case of a multiple-day trip log survey, a fourth item is needed to record the "Travel Date". With these ID items built into the two databases, it is possible to retrieve, analyze, and display the geographic variations of travel characteristics for any combination of households, individuals, trips, and travel dates.
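As a rough sketch of how these ID items work together, the following Python fragment checks that the four items identify each trip log record uniquely and relates trip log records to their household record. The field names and sample values are assumptions made for illustration and do not reproduce the study's actual item definitions.

```python
from collections import defaultdict

# Hypothetical sample rows; field names are illustrative only.
households = {
    101: {"dwelling_type": "single-family", "household_size": 3, "autos": 2},
    102: {"dwelling_type": "multi-family", "household_size": 1, "autos": 1},
}
trip_ends = [
    {"household_id": 101, "individual_id": 1, "trip_id": 1, "travel_date": "day1"},
    {"household_id": 101, "individual_id": 1, "trip_id": 0, "travel_date": "day1"},
    {"household_id": 101, "individual_id": 2, "trip_id": 1, "travel_date": "day1"},
    {"household_id": 101, "individual_id": 2, "trip_id": 0, "travel_date": "day1"},
    {"household_id": 102, "individual_id": 1, "trip_id": 1, "travel_date": "day2"},
    {"household_id": 102, "individual_id": 1, "trip_id": 0, "travel_date": "day2"},
]

def composite_key(rec):
    """The four ID items together identify one trip log record."""
    return (rec["household_id"], rec["individual_id"],
            rec["travel_date"], rec["trip_id"])

def keys_are_unique(records):
    keys = [composite_key(r) for r in records]
    return len(keys) == len(set(keys))

def select_trip_ends(records, **criteria):
    """Retrieve records for any combination of ID items."""
    return [r for r in records
            if all(r[k] == v for k, v in criteria.items())]

# One-to-many relation: one household record, many trip log records.
trips_by_household = defaultdict(list)
for rec in trip_ends:
    trips_by_household[rec["household_id"]].append(rec)

# Attach household attributes to every trip log record.
joined = [dict(rec, **households[rec["household_id"]]) for rec in trip_ends]
```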

3) Individual Trip End Locations versus Trip Links.

A trip is normally considered as a link connecting an origin location and a destination location. Unfortunately, this "trip link" approach is incompatible with the GIS address-match function that deals with individual point locations. If both the origin end and the destination end of each trip are address-matched in ArcInfo, it will result in redundant records in the Point Attribute Table (PAT) since the destination end of a trip is always the origin end of the next trip. To overcome this problem, only the origin locations of all trips and the destination location of the last trip made by an individual on a given date should be address-matched. All data related to a particular trip are recorded with the origin location of that trip. The destination location of the last trip made by an individual, which has a "Trip ID" of zero and no trip-related data, is included to represent the complete trip sequence made by an individual on a given date. Since each record has its unique "Household ID", "Individual ID", "Trip ID", and "Travel Date", it is easy to re-assemble the trip sequence made by any individual on a given survey date from the database.
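The record layout described in this paragraph can be illustrated with a short Python sketch. The function and field names are hypothetical; the only behavior taken from the text is that every trip contributes its origin, the last trip also contributes its destination with a "Trip ID" of zero and no trip-related data, and the sequence can be rebuilt from the ID items.

```python
def build_trip_end_records(household_id, individual_id, travel_date, trips):
    """Turn one person's trips for one date into trip-end records."""
    base = {"household_id": household_id,
            "individual_id": individual_id,
            "travel_date": travel_date}
    records = []
    for trip_id, trip in enumerate(trips, start=1):
        # Trip-related data are stored with the origin of that trip.
        records.append(dict(base, trip_id=trip_id,
                            location=trip["origin"],
                            purpose=trip["purpose"]))
    if trips:
        # Final destination: Trip ID 0 and no trip-related data.
        records.append(dict(base, trip_id=0,
                            location=trips[-1]["destination"],
                            purpose=None))
    return records

def reassemble_sequence(records):
    """Rebuild the ordered list of visited locations."""
    origins = sorted((r for r in records if r["trip_id"] > 0),
                     key=lambda r: r["trip_id"])
    final = [r for r in records if r["trip_id"] == 0]
    return [r["location"] for r in origins + final]

# Hypothetical two-trip day: home -> school -> home.
day = [
    {"origin": "home", "destination": "school", "purpose": "school"},
    {"origin": "school", "destination": "home", "purpose": "home"},
]
records = build_trip_end_records(101, 1, "day1", day)
sequence = reassemble_sequence(records)
```

Two trips produce three records rather than four, which is the redundancy the approach is meant to avoid.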

One major shortcoming of such "TRIP ENDS" databases is that only the trip origin locations are taken into account when the geographic pattern of a particular travel characteristic is displayed or analyzed. This could generate a misleading result for certain travel pattern analyses. It is possible to write an Arc Macro Language (AML) program to GENERATE a line coverage that shows the trip links. This can be accomplished by linking the trip end locations with reference to their "Household ID", "Individual ID", "Trip ID", and "Travel Date" items stored in the database. The derived "trip links" coverage gives a better representation of the geographic distribution of individual trips, but it could also result in a cluttered graphic display.
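The linking logic can be sketched independently of AML. The Python fragment below is an illustration under assumed field names, not the AML program mentioned above: it groups trip-end records by household, individual, and travel date, orders them by "Trip ID" with the zero-coded final destination last, and connects consecutive points into links.

```python
from collections import defaultdict

def build_trip_links(trip_end_records):
    """Connect consecutive trip ends of each person-day into line segments."""
    groups = defaultdict(list)
    for rec in trip_end_records:
        key = (rec["household_id"], rec["individual_id"], rec["travel_date"])
        groups[key].append(rec)

    links = []
    for key, recs in groups.items():
        # Trip ID 0 marks the final destination, so it sorts last.
        ordered = sorted(recs, key=lambda r: (r["trip_id"] == 0, r["trip_id"]))
        for start, end in zip(ordered, ordered[1:]):
            links.append({"person_day": key,
                          "trip_id": start["trip_id"],
                          "from_xy": start["xy"],
                          "to_xy": end["xy"]})
    return links

# Hypothetical coordinates for one person-day: home -> work -> home.
HOME, WORK = (0.0, 0.0), (3.0, 4.0)
trip_ends = [
    {"household_id": 101, "individual_id": 1, "travel_date": "day1",
     "trip_id": 0, "xy": HOME},
    {"household_id": 101, "individual_id": 1, "travel_date": "day1",
     "trip_id": 2, "xy": WORK},
    {"household_id": 101, "individual_id": 1, "travel_date": "day1",
     "trip_id": 1, "xy": HOME},
]
links = build_trip_links(trip_ends)
```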

4) Multiple Records at the Same Geographic Location.

It is a common pattern for an individual to visit the same location more than once on a given date. For example, a parent drives a child from home to school in the morning and picks up the child from the school in the afternoon. In this case, the same school location represents two separate trips. A visual display of the database therefore will show fewer point locations than the actual number of trips. This problem also will persist with the creation of a "trip links" coverage. However, this drawback in visualization of the data does not present a problem when the spatial query and analysis functions in ArcInfo are performed. For example, "point-and-click" on a specific point location will retrieve all of the trip records associated with that point. Similarly, spatial search and map overlay functions will also identify all of the records associated with the same point location.
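The school example can be reproduced with a small Python sketch. The coordinates and field names are hypothetical; the point is only that a display shows one symbol per distinct location, while a query on that location still returns every record stored there.

```python
from collections import defaultdict

def group_by_location(trip_end_records):
    """Group trip-end records that share the same point location."""
    groups = defaultdict(list)
    for rec in trip_end_records:
        groups[rec["xy"]].append(rec)
    return dict(groups)

# Hypothetical day: home -> school -> home -> school -> home.
HOME, SCHOOL = (10.0, 20.0), (12.5, 21.0)
records = [
    {"trip_id": 1, "xy": HOME},
    {"trip_id": 2, "xy": SCHOOL},
    {"trip_id": 3, "xy": HOME},
    {"trip_id": 4, "xy": SCHOOL},
    {"trip_id": 0, "xy": HOME},   # final destination, no trip data
]
by_location = group_by_location(records)

# A map would show two points, yet the school point holds two trip origins.
distinct_points = len(by_location)
school_records = by_location[SCHOOL]
```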

GIS for Analyzing Geographic Variations in Travel Characteristics.

The traditional method of geocoding travel characteristics data by traffic analysis zones presents a major limitation in the creation of GIS databases. An ArcInfo polygon coverage of traffic analysis zones creates one record for each TAZ. When there are multiple records (i.e., multiple households or multiple trip ends) associated with the same TAZ, there exists no appropriate way of storing the data except for the creation of a separate table with the TAZ ID as one of the data items. This prohibits the display of any geographic variation of travel characteristics within a traffic analysis zone. In addition, since all records located within the same TAZ are assigned to the polygon label point (i.e., TAZ centroid), many GIS spatial analysis functions cannot be properly performed. For example, a spatial search or a polygon overlay will generate a result that either includes or excludes all of the records associated with each TAZ. Individual data records associated with different household locations or different trip end locations within a TAZ cannot be properly retrieved and analyzed.

This limitation becomes even more obvious when we would like to analyze the trip-chaining behavior using the trip log data. A trip chain is defined as a connected sequence of trips that has the origin and the destination of the chain located at the same point in space (Meyer and Miller, 1984). One example of a multiple-stop trip chain could be a home-work-shop-home trip sequence. If the trip ends are geocoded by TAZ, trips made within the same zone are represented by the same centroid location. Therefore, it becomes difficult to analyze the trip-chaining behavior within a GIS environment. This defeats the purpose of using a GIS to create a trip log database for spatial analysis.
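To make the definition concrete, the following Python sketch splits an ordered sequence of visited locations into chains that start and end at the same anchor point. Treating the home location as the anchor is an assumption made for the example; locations are represented by labels here, whereas in a point coverage they would be compared by their geocoded positions.

```python
def split_trip_chains(locations, anchor):
    """Split a visit sequence into chains that begin and end at anchor."""
    chains = []
    current = []
    for loc in locations:
        if loc == anchor:
            # Close a chain only if at least one other stop was visited.
            if len(current) > 1:
                chains.append(current + [loc])
            current = [loc]
        elif current:
            current.append(loc)
    return chains

# Hypothetical day with a multiple-stop chain and a single-stop chain.
day = ["home", "work", "shop", "home", "school", "home"]
chains = split_trip_chains(day, "home")
```

If every stop in the day fell inside one TAZ and was coded to its centroid, all entries in the sequence would compare equal and no chain structure could be recovered, which is the limitation described above.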

On the other hand, an ArcInfo point coverage created from the address-match function overcomes the above shortcomings. For example, simple queries based on the "trip start time" data item in the "TRIP ENDS" point coverage can easily show the trip origin locations for both the morning peak hours as well as the afternoon peak hours (see Figures 1 and 2, respectively). Additionally, GIS analysis functions, such as spatial search or map overlay, can be performed on this point coverage to identify individual trip locations for further geographic analysis.
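A query of this kind amounts to a filter on the start-time item. The Python sketch below is illustrative only: the peak-hour windows (7 to 9 a.m. and 4 to 6 p.m.) and the field names are assumptions, since the windows used for the figures are not stated here.

```python
from datetime import time

# Assumed peak-hour windows; not taken from the study.
AM_PEAK = (time(7, 0), time(9, 0))
PM_PEAK = (time(16, 0), time(18, 0))

def origins_in_window(trip_end_records, window):
    """Return origin coordinates of trips that start inside the window."""
    start, end = window
    return [r["xy"] for r in trip_end_records
            # Trip ID 0 records carry no trip data, so skip them first.
            if r["trip_id"] != 0 and start <= r["start_time"] < end]

# Hypothetical records for one person-day.
records = [
    {"trip_id": 1, "start_time": time(7, 30), "xy": (1.0, 1.0)},
    {"trip_id": 2, "start_time": time(12, 0), "xy": (2.0, 2.0)},
    {"trip_id": 3, "start_time": time(17, 15), "xy": (3.0, 3.0)},
    {"trip_id": 0, "start_time": None, "xy": (1.0, 1.0)},
]
am_origins = origins_in_window(records, AM_PEAK)
pm_origins = origins_in_window(records, PM_PEAK)
```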

Figure 1. Distribution of Trip Origins during the Morning Peak Hours.

Figure 2. Distribution of Trip Origins during the Afternoon Peak Hours.

The method of geocoding travel survey data down to the street address level also encounters a major challenge. Because the specific street address of each trip end must be recorded, the survey is more likely to yield incomplete or incorrect locational data. Compounded with a less-than-perfect address coverage, the percentage of successful matches from an automated address-match run may be lower than desired. This address-matching rate can normally be improved if an interactive method of processing the rejected records is implemented. However, such an interactive processing method could significantly increase the database creation time.

Based on the above discussions, it is clear that the traditional method of geocoding travel characteristics data by traffic analysis zones is incompatible with the ArcInfo GIS database structure and cannot take full advantage of GIS spatial analysis capabilities. Although the creation of GIS databases for travel characteristics data using the address-match function also has some shortcomings, it shows significant improvements over the traditional method in data retrieval, data analysis, and data display within a GIS environment.

Geographic Variations of Residential Trip Generation Rates.

This section presents some examples of analyzing geographic variations of the residential trip generation rates using both the ArcInfo spatial analysis functions and the SPSS statistical analysis procedures. There are two conventional methods (i.e., regression analysis and cross-classification analysis) used in the estimation of trip generation rates (Kanafani, 1983; Meyer and Miller, 1984). Since the Florida Standard Urban Transportation Modeling Structure (FSUTMS) employs a cross-classification method, this study focuses on the three classification variables (dwelling type, household size, and auto ownership) used in the FSUTMS. It should be noted that this study does not distinguish home-based trips from non-home-based trips. Instead, the total number of trips made by all individuals in a household represents the trip generation rate of that household. The hypothesis tested in this study is that households located at different distances from the urban centers and from the major highways would exhibit different trip generation rates.

In addition to the "HOUSEHOLDS" and the "TRIP ENDS" point coverages created from the data collected in the Treasure Coast Travel Characteristics Study, two more ArcInfo coverages were created for analysis purposes. The first is a polygon coverage consisting of the city limits of the five major urban centers in the Treasure Coast Region (Figure 3). The second is a line coverage including the three major highways (US-1, I-95, and Florida's Turnpike) located within the Region (Figure 4). In order to derive the trip generation rate of each household, the FREQUENCY command was used on the "Household ID" and the "Individual ID" in the "TRIP ENDS" coverage to derive the total number of trip-end records on a selected date for each individual. Since the "TRIP ENDS" coverage geocoded the origins of all trips plus the destination of the last trip made by each individual, the numbers derived from the FREQUENCY command include one extra record for each individual. Therefore, these numbers need to be adjusted by subtracting the total number of individuals in each household in order to derive the correct total number of trips made by all individuals in a household. Once the trip generation rates of each household are written into an Info table, this table is joined with (or related to) the PAT of the "HOUSEHOLDS" coverage using the "Household ID" as the relate item.
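The counting adjustment can be checked with a small Python sketch that mimics a frequency count on the two ID items. It is an illustration rather than the ArcInfo procedure itself; the field names and sample records are hypothetical, and the sketch subtracts one record for each individual who appears in the trip-end records for the selected date.

```python
from collections import Counter

def household_trip_counts(trip_end_records):
    """Total trips per household for a single travel date."""
    # Frequency of records by (Household ID, Individual ID).
    per_person = Counter((r["household_id"], r["individual_id"])
                         for r in trip_end_records)
    per_household = Counter()
    for (household_id, _), n_records in per_person.items():
        # Each person has one extra record: the final destination.
        per_household[household_id] += n_records - 1
    return dict(per_household)

# Hypothetical records as (household, individual, trip ID) tuples.
raw = [
    (101, 1, 1), (101, 1, 2), (101, 1, 0),   # person 1: two trips
    (101, 2, 1), (101, 2, 0),                # person 2: one trip
    (102, 1, 1), (102, 1, 2), (102, 1, 3), (102, 1, 4), (102, 1, 0),
]
records = [{"household_id": h, "individual_id": i, "trip_id": t}
           for h, i, t in raw]
trip_counts = household_trip_counts(records)
```

Household 101 has five records for two individuals, giving three trips; household 102 has five records for one individual, giving four.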

Figure 3. Distribution of the Major Urban Centers within the Study Area.

Figure 4. Distribution of Major Highways within the Study Area.

For a preliminary evaluation of the impact of geographic locations on the trip generation rates, a one-mile interval was used to create buffer zones around the city limits and around the major highways. These buffer zones were then overlaid with the "HOUSEHOLDS" point coverage to assign a buffer zone ID to each household location. The data in the point attribute table of the "HOUSEHOLDS" coverage were then written out to an ASCII file for performing "difference-in-the-means" tests, using the SPSS statistical package, on the average trip generation rates between each pair of buffer zones. The test results provided some useful information. First, due to the geographic distribution of the sample households, the one-mile interval created some zones with zero or very few observations. Second, the one-mile interval appeared to be too small to reflect geographic variations between some zones. Lastly, the "t" statistics for different zonal pairs provided a good basis for estimating the best buffer intervals for dividing up the study area.

Based on the results of the preliminary tests, the study area was divided into two buffer zones for further evaluation of the impact of proximity to urban centers and proximity to major highways on trip generation rates. For the "URBAN CENTERS" coverage, the study area was divided into one zone within four miles of the city limits and another zone beyond four miles of the city limits. For the "MAJOR HIGHWAYS" coverage, Florida's Turnpike was dropped from further analysis because only a limited number of household locations lie near it. In addition, US-1 and I-95 were combined, due to their close proximity to each other, to generate one zone within 2 miles (less than or equal to 2 miles) and another zone beyond 2 miles (greater than 2 miles). These new buffer zones were again overlaid with the "HOUSEHOLDS" point coverage, and two separate ASCII files were created for statistical analyses (Figures 5 and 6).
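Assigning a household to a highway buffer zone is equivalent to comparing its distance to the nearest highway segment against the buffer width. The Python sketch below illustrates this for the 2-mile highway case; it is not the ArcInfo buffer and overlay procedure, and it assumes planar coordinates expressed in miles with a made-up highway alignment. The urban-center case would additionally require a test for points inside the city-limit polygons.

```python
import math

def point_segment_distance(p, a, b):
    """Shortest planar distance from point p to segment a-b."""
    px, py = p
    ax, ay = a
    bx, by = b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp to its end points.
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))

def buffer_zone(point, polyline, threshold_miles=2.0):
    """Classify a point as 'within' or 'beyond' the buffer distance."""
    nearest = min(point_segment_distance(point, a, b)
                  for a, b in zip(polyline, polyline[1:]))
    return "within" if nearest <= threshold_miles else "beyond"

# Hypothetical north-south highway and three household locations (miles).
highway = [(0.0, 0.0), (0.0, 10.0)]
zones = [buffer_zone(p, highway) for p in [(1.5, 5.0), (2.0, 5.0), (3.0, 5.0)]]
```

A household exactly 2 miles from the highway falls in the "within" zone, matching the less-than-or-equal-to definition above.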

Figure 5. Distribution of Households in the Buffer Zones around Major Urban Centers.

Figure 6. Distribution of Households in the Buffer Zones around US-1 and I-95.

The results of the "difference-in-the-means" tests for the two files are shown in Tables 1 and 2, respectively. Both cases indicate that the average trip generation rates of the households in the two geographic zones are statistically different at a 95% confidence level. In other words, geographic location, in terms of both proximity to major urban centers and proximity to the major highways, appears to play a significant role in trip generation rates.


    Table 1. Results of the "Difference-in-the-Means" Test for
             Buffer Zones around the Major Urban Centers.
    -------------------------------------------------------------
                    |  Mean Trip Rate  |  t value  |  Sig. Level
    -------------------------------------------------------------
    Within 4 miles  |       8.244      |    2.80   |    0.007
    Beyond 4 miles  |       5.750      |           |
    -------------------------------------------------------------



    Table 2. Results of the "Difference-in-the-Means" Test for
             Buffer Zones around the US-1 and I-95.
    -------------------------------------------------------------
                    |  Mean Trip Rate  |  t value  |  Sig. Level
    -------------------------------------------------------------
    Within 2 miles  |       8.417      |    2.00   |    0.048
    Beyond 2 miles  |       6.562      |           |
    -------------------------------------------------------------
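For readers who wish to see the arithmetic behind a "difference-in-the-means" test, the following Python sketch computes a two-sample t statistic with a pooled variance estimate. Whether the SPSS runs in this study used the pooled or the separate-variance form is not stated, so the pooled form is an assumption, and the sample values are invented for illustration and are not the survey data.

```python
import math
from statistics import mean, variance

def pooled_t_statistic(sample_a, sample_b):
    """Two-sample t statistic assuming equal variances.

    Returns the t value and its degrees of freedom.
    """
    n_a, n_b = len(sample_a), len(sample_b)
    df = n_a + n_b - 2
    pooled_var = ((n_a - 1) * variance(sample_a)
                  + (n_b - 1) * variance(sample_b)) / df
    std_err = math.sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df

# Invented household trip counts for two buffer zones.
within_zone = [2, 4, 6, 8, 10]
beyond_zone = [1, 2, 3, 4, 5]
t_value, degrees_of_freedom = pooled_t_statistic(within_zone, beyond_zone)
```

The significance level is then obtained by comparing the t value with the t distribution for the given degrees of freedom, a step that a statistical package such as SPSS performs directly.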

In order to further examine the relationships between the geographic locations and the classification variables used in the trip generation stage of the FSUTMS, the same test procedures were applied to the trip generation rates of households classified by the three variables (dwelling type, household size, and auto ownership) within each of the buffer zones created above. The analysis results indicate that single-family households have statistically different average trip generation rates between the two zones around the major urban centers (Table 3) and around the major highways (Table 4). Additionally, households with 2 or more vehicles also show statistically different trip rates between the two zones around the major highways (Table 5). Test results for all other classes do not show statistically different trip rates between the two geographic zones around either the major urban centers or the major highways.


    Table 3. Results of the "Difference-in-the-Means" Test for
             Single-Family Households in the Buffer Zones around
             the Major Urban Centers.
    -------------------------------------------------------------
                    |  Mean Trip Rate  |  t value  |  Sig. Level
    -------------------------------------------------------------
    Within 4 miles  |       8.727      |    3.12   |    0.003
    Beyond 4 miles  |       5.815      |           |
    -------------------------------------------------------------



    Table 4. Results of the "Difference-in-the-Means" Test for
             Single-Family Households in the Buffer Zones around
             the US-1 and I-95.
    -------------------------------------------------------------
                    |  Mean Trip Rate  |  t value  |  Sig. Level
    -------------------------------------------------------------
    Within 2 miles  |       8.868      |    2.94   |    0.004
    Beyond 2 miles  |       6.342      |           |
    -------------------------------------------------------------



    Table 5. Results of the "Difference-in-the-Means" Test for
             Households with 2 or more vehicles in the Buffer
             Zones around the US-1 and I-95.
    -------------------------------------------------------------
                    |  Mean Trip Rate  |  t value  |  Sig. Level
    -------------------------------------------------------------
    Within 2 miles  |       9.333      |    2.50   |    0.014
    Beyond 2 miles  |       6.917      |           |
    -------------------------------------------------------------

Conclusions.

Travel characteristics studies traditionally collect and analyze travel data by traffic analysis zones. With the growing use of GIS in the transportation planning community, this traditional zonal approach should be re-examined in order to take full advantage of the spatial analysis capabilities available through a GIS. This paper describes the additional query, analysis, and display capabilities that can be gained from using the address-match function to create travel characteristics GIS databases. With the change from polygon-based GIS databases to point-based GIS databases for travel characteristics data, this paper also discusses important issues that must be considered during both the questionnaire survey design stage and the GIS database creation stage. Specific examples of using both ArcInfo GIS spatial analysis functions and SPSS statistical analysis procedures to evaluate geographic variations in trip generation rates demonstrate the advantages of this new approach over the traditional approach.

It is important to note that geographic information systems have presented a great challenge to the user community to rethink how we have traditionally done our work and what improvements could be achieved through the use of GIS. Hopefully, taking on this challenge will lead not only to better ways of doing our work but also to further development of GIS capabilities.

Acknowledgement.

This study was sponsored by a Florida Department of Transportation (FDOT) Research Grant (WPI#0510720). We thank Shi-Chiang Li at the FDOT District 4 Office for his assistance.

References.

Environmental Systems Research Institute (Esri). 1992. PC Network User's Guide. Redlands, CA: Environmental Systems Research Institute, Inc.

Florida Department of Transportation (FDOT) District IV Office. 1995. Survey Design and Methodology of Treasure Coast Travel Characteristics Study. Prepared by Walter H. Keller, Inc., Coral Springs, FL.

Kanafani, A. 1983. Transportation Demand Analysis. New York, NY: McGraw-Hill Book Company.

Meyer, M.D. and Miller, E.J. 1984. Urban Transportation Planning: A Decision-Oriented Approach. New York, NY: McGraw-Hill Book Company.


Shih-Lung Shaw and Phillip Lall Dass
Department of Geography
Florida Atlantic University
Boca Raton, FL 33431
E-mail: shawsl@acc.fau.edu