Integration of Engineering Systems
ABSTRACT
GIS has become an integral element of overall systems development in local government. Responding to an ever-increasing demand for data and analytical capabilities, the County of Riverside, California, has developed the Geographic Information System-Based Accident Records System (GIS BARS) over a two-year period. The GIS BARS Project is funded jointly by the County of Riverside, the State of California Office of Traffic Safety, and the Federal Highway Administration. 'Collision_Info', an accident records program, has recently been released by the project team. The key to the successful development of this product is its ability to integrate data from a variety of sources to view the 'Big Picture', in an effort to move away from cause-and-effect traffic engineering toward a proactive posture. Consideration and prioritization of data conversion and integration are delicate processes in the ever-changing fiscal and political climate of local government in Southern California. 'Collision_Info' offers integration of accident locations with centerline, parcel, traffic volume, land use and survey GIS layers, and provides data links to Traffic Control Device and Pavement Management databases. The GIS BARS project and 'Collision_Info' house and provide traffic accident data for 26 cities, 5 California Highway Patrol Areas, 2 California Department of Transportation Districts and the unincorporated area of Riverside County.
INTRODUCTION
The County of Riverside, California, is in the final year of the three-year GIS BARS Grant Project. We have decided to focus our discussion on each of the project modules. In this manner, we can address issues of data integration, database development and systems development specifically, on a case-by-case basis, rather than simply summarizing the project and applying a moral in the conclusion. It is our opinion that moralistic conclusions are overused and overrated. This discussion will examine each major facet of GIS BARS, explain its function, and give an overview of its development, with special emphasis on the assumptions and issues which gave each project module individuality (or personality, if you will).
The overall objective of the Geographic Information System-based Accident Records System (GIS BARS) Grant Project is to develop and implement an efficient, ongoing, County-wide, GIS-based accident records system that will provide surveillance and identification of significant accident locations through the use of sophisticated display, modeling, and analysis tools and through the integration of diverse engineering information on the GIS. The system will help to identify high accident rates for locations including intersections and roadway segments for deployment of Federal, State, County, and City resources to do the following within one year after the three-year project:
Many types of data have been manipulated for inclusion into the GIS BARS database. It seems that no data or database was ready for use by the system. The reasons for this are varied. The conversion of hard copy data, typically the most labor-intensive of operations, turned out to be the most straightforward of our conversion efforts: we knew what we wanted and created it. Existing databases, of which the County of Riverside has no shortage, were typically either too absolute for functional use (typical of any GIS with near engineering-level accuracy) or inconsistent (the result of generations of data input without formal and consistent quality control measures). The amount of time required for adaptation to the GIS BARS system was, and continues to be, extensive.
SYSTEM OVERVIEW - PROJECT APPLICATION MODULES
The project can be functionally categorized into three primary application groups: the Data Management Module, the User Application Module, and the Query Modules. The Data Management Module consists of two sub-menu groups: Collision Manager and Volume Manager.
Collision Manager provides privileged user access to maintenance and development programs for 'Collision_Info'. All modifications, upgrades, new feature development and maintenance functions are conducted in this directory structure to ensure system security and to avoid mixing prototype and functional application releases. Volume Manager focuses on data entry and editing functionality and also provides a secure directory structure for project and prototype development. Each of these two modules contains suites of customized tools for automated data conversion, interactive data input applications, menu-driven query and reporting applications, and links to programs contained in both the User Application and Query Modules.
The User Application Module currently contains 'Collision_Info'. 'Collision_Info' is the foremost project application, providing accident query data, mapping and reporting to the end user. It provides both standard and customized reporting capabilities and is the first menu-driven application to successfully handle an accident database of this size without lengthy delays. 'Collision_Info' uses the entire data set as the default set. In this manner, county-wide surveys and queries are available to the user without requiring programming assistance. Reselection sets are defined by location, date, type of collision, etc., or by graphic selection.
Query Modules allow system users to select specific records in databases which are significant to traffic accident investigation. The Query Module currently provides links to data such as number of lanes, pavement width, and Pavement Management and Federal Functional classifications from the Pavement Management database; Traffic Control Devices, through a link to the Traffic Control Device Inventory; and reselected information on Roadway Volume. The Pavement Management and Traffic Control Device Inventory databases were not converted into spatial databases, as they extend beyond the primary focus of the Grant. These databases will, however, need to be converted in order to provide users with condition diagrams. Funding sources are currently being sought to accomplish the conversions after completion of the GIS BARS Project.
MANAGEMENT MODULES - COLLISION MANAGER
The following provides specific details of the Collision Manager features:
Data Conversion - Data Sources - Data received for integration into the collision database comes from two sources. The primary source of data is provided at no cost (excepting the data transfer medium) by the State of California Highway Patrol, Records Management Division, in the form of Statewide Integrated Traffic Records System (SWITRS) ASCII data files. These files are obtained under existing agreements with all traffic enforcement jurisdictions within the county. Data is provided quarterly, about four months after the end of each calendar quarter. Approximately 3,500 accident records are submitted each quarter. The GIS BARS collision database currently maintains 48,601 accident records for the period from January 1992 through September 1995.
The second source of data submitted for inclusion in the collision database is records received from Electronic Transfer Centers (ETC), which were originally prototyped under the GIS BARS Project. An ETC provides accident data to the system as the records are finalized by the reporting agency. This method allows for efficient and timely identification of potential hot spots county-wide. ETC prototypes are discussed later in detail, but a brief description is necessary as a prelude to the conversion process. ETC records must be converted from California Statewide Accident Reporting System (C*STARS) format to SWITRS format and electronically transferred to the GIS BARS Project Operations Center. The C*STARS to SWITRS conversion can be performed at the issuing agency to remove sensitive data, or after transmission by the GIS BARS staff. After these two steps are completed, the data is processed just as SWITRS data records are, except that ETC records are processed on a daily basis.
Collision Conversion Specifics - Preprocessing - After receipt, accident records are copied to the GIS BARS System. The incoming records are merged and sorted by collision date, time and officer badge number. The records are then screened for duplicates. Once the incoming data is confirmed, it is added (using the 'Info' ADD command with the (from) option) to the collision 'Info' database (see the discussion on conversion for an evaluation of the ADD (from) command). Street names are then parsed and verified against an alias table to correct common misspellings and abbreviations in the SWITRS records. Street names without a match in the alias table or the Street Identification file (STIDS) are output to a table for verification (scrub.tab). The names are then corrected (scrubbed) through a street name quality control program. Unmatched names cannot be plotted. These are typically landmark references or other uncorrectable errors in the SWITRS data. These zero-location records comprise about 4% of the total collisions processed.
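The preprocessing steps above can be sketched in a few lines of Python. This is only an illustration, not the project's 'Info' or AML code: the field names (date, time, badge), the use of that triple as the duplicate key, and the helper names merge_sort_dedupe and scrub_street are all assumptions made for the example.

```python
from typing import Dict, Iterable, List, Set, Tuple

Record = Dict[str, str]  # hypothetical layout: one dict per collision record


def merge_sort_dedupe(batches: Iterable[List[Record]]) -> List[Record]:
    """Merge incoming batches, sort by (date, time, badge), drop duplicates.

    Assumes dates and times are stored as sortable strings (e.g. ISO dates)
    and that two records sharing the same key are duplicates.
    """
    merged = [rec for batch in batches for rec in batch]
    merged.sort(key=lambda r: (r["date"], r["time"], r["badge"]))
    unique: List[Record] = []
    seen: Set[Tuple[str, str, str]] = set()
    for rec in merged:
        key = (rec["date"], rec["time"], rec["badge"])
        if key not in seen:
            seen.add(key)
            unique.append(rec)
    return unique


def scrub_street(name: str, aliases: Dict[str, str],
                 known_streets: Set[str]) -> Tuple[str, bool]:
    """Normalize a street name through an alias table.

    Returns the corrected name and whether it matched a known street.
    Unmatched names would go to a review table (scrub.tab in the text).
    """
    cleaned = " ".join(name.upper().split())  # collapse case and whitespace
    cleaned = aliases.get(cleaned, cleaned)   # apply alias correction, if any
    return cleaned, cleaned in known_streets
```

Names that come back with a False flag correspond to the records that need manual scrubbing before they can be plotted.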
The data is now processed to minimize program runtime, using two major time-saving routines. The first indexes 19 'Info' file items in the SWITRS, STIDS and County Wide Centerline (CWCL) files. The second generates WRITESELECT files which group sets of arcs sharing a common street name; these are used in the conversion process to quickly identify the arcs named in the collision location. These files are contained in the selection-file directory, which is used in the preprocessing stage of conversion as well as in the physical conversion stage.
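The idea behind the second routine, precomputing the set of arcs for each street name so that later lookups are cheap, can be illustrated with a simple in-memory index. This is a sketch of the concept only; the actual system stores selection sets in WRITESELECT files, and the function name and the (arc_id, street_name) input layout here are assumptions.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def build_street_index(arcs: Iterable[Tuple[int, str]]) -> Dict[str, List[int]]:
    """Group arc ids by street name, preserving input order within a group."""
    index: Dict[str, List[int]] = defaultdict(list)
    for arc_id, street_name in arcs:
        index[street_name].append(arc_id)
    return dict(index)


# Example: arcs 1 and 3 both belong to MAIN ST.
index = build_street_index([(1, "MAIN ST"), (2, "OAK AVE"), (3, "MAIN ST")])
main_st_arcs = index.get("MAIN ST", [])
```

Building the grouping once and reusing it for every collision record avoids rescanning the full centerline set (more than 70,000 arcs) per record.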
Physical Conversion - The physical conversion process is performed on records which have no location flag attribute (the flag is added at the end of this process, once the location coordinate value is accepted). The process RESELECTs arcs which have matching street names and tests for nodes intersecting Primary and Secondary street name arcs. As each collision record is selected, Primary Street Name (prime_name) and Secondary Street Name (secondary_name) nodes are calculated to a value of one (1). Prime_name nodes in the selected set are then calculated to a value of negative one (-1). All secondary_name nodes are then multiplied by a value of two (2). The node(s) holding a value of negative two (-2) are accepted as viable intersections, and the coordinate location(s) are recorded in the data conversion output file (SWITOUT). Next, the direction of the arcs leaving the node is analyzed to determine whether it corresponds to the direction in the collision record. The number of corresponding directions is also recorded in SWITOUT. Finally, the arcs meeting both intersection and direction criteria are traversed to check for good distance. If one good intersection, one good direction and a good distance exist, the collision point is appended to the collision coverage and the collision location flag attribute is filled. Records not attaining the single match requirements remain unflagged and are located during the Collision Q.C. program.
Conversion Quality Control Programming - The records which are not converted in the automated routine are reviewed in this process, which utilizes Info, ARC, AP commands, ArcEdit and ArcPlot to allow GIS BARS staff to update the coverage (CWCL) or the Collision Record data file. Cursor processing, AP commands and ArcPlot are used to display the identified Primary and Secondary Street arc locations (if located) and all possible collision points (if any). After the necessary editing is complete, the unflagged records are reprocessed through the conversion processes. This procedure ensures that modifications are made correctly while maintaining the highest possible level of accuracy.
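The node-marking arithmetic described above can be traced with a short Python sketch. It is a restatement of the scoring rule only, not the project's code: node ids are plain integers, and the function names find_intersections and is_single_match are hypothetical.

```python
from typing import Dict, List, Set


def find_intersections(primary_nodes: Set[int],
                       secondary_nodes: Set[int]) -> List[int]:
    """Return nodes shared by the primary and secondary street arcs."""
    # Step 1: every node on either street starts at 1.
    score: Dict[int, int] = {n: 1 for n in primary_nodes | secondary_nodes}
    # Step 2: nodes on the primary street are set to -1.
    for n in primary_nodes:
        score[n] = -1
    # Step 3: nodes on the secondary street are multiplied by 2.
    for n in secondary_nodes:
        score[n] *= 2
    # Only nodes on both streets end at -2
    # (primary only: -1, secondary only: 2).
    return sorted(n for n, s in score.items() if s == -2)


def is_single_match(n_intersections: int, n_directions: int,
                    distance_ok: bool) -> bool:
    """Accept only one intersection, one matching direction, good distance."""
    return n_intersections == 1 and n_directions == 1 and distance_ok
```

For example, with primary nodes {1, 2, 3} and secondary nodes {3, 4}, only node 3 ends with a score of -2. Records that fail is_single_match would stay unflagged for the quality control pass.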
Conversion Discussion - There are important issues regarding the development of the coverages and data files which were created to facilitate conversion processing. Street centerlines in the County of Riverside GIS database (more than 70,000 arcs once combined) are stored in a typical ArcInfo tile structure broken out by Assessor's Book boundaries (approximately 544 books). It was necessary to append all of the tiles to provide the continuity needed to traverse arcs for collision placement. Creating routes to accommodate cross-tile continuity was considered and ruled out. Extensive edge matching was necessary due to irregularities in the parcel-based development of the base map. Street names, as a matter of policy, were only assigned on dedicated roadways. Unfortunately, collision reports list the primary street name and the distance and direction from the nearest cross street, whether or not that roadway is dedicated. Issues of accuracy also had a significant impact on the man-hours required to modify the Centerline Network. The County of Riverside's GIS Staff prides itself on the accuracy of the centerline attributes, which are maintained to a near engineering level of accuracy. This posed a problem for the GIS BARS project, in that more than one hundred years of minor errors which were not detected on recorded instruments and maps, and forty years of Road Identification Numbers, were transferred to the database as a matter of record. This created inconsistencies which made reselection of roadways by either street name or road number inadequate for collision conversion. After two years of intermittent editing, the GIS BARS 'County Wide Centerline Layer' (CWCL) can now be referred to as a functional database.
In addition to database limitations, the length and structure of the SWITRS records posed challenges. The record length of each SWITRS data set depends on the longest record in it. Victim and Party data are contained in the center of each record in non-fixed-length fields. The ADD (from) command limits record length to 200 fields, with additional accommodation of redefined items. SWITRS record length can be in excess of 600 characters if a collision involves a bus or other mass transit vehicle. Again, options were considered, including the use of the EXTERNAL command to access the data from a text file. This was ruled out due to the non-fixed-length structure of the record. The best solution to date has been to use UNIX script routines outside of ArcInfo to create consistent blocks of individual SWITRS records that have equal length, and to add these blocks to the Info database in units.
VOLUME MANAGER
The primary purpose of establishing the Volume Layer in the GIS is to provide actual and estimated traffic volumes as factors for calculating accident rates in the 'Collision_Info' program group. Due to the nature of the development process, several other programs, or by-products, were identified. These by-products provide the Transportation Department with tools that were not previously available and will replace and enhance existing tasks. Volume Manager provides the same functionality to the Traffic Volume layer utilized by the GIS BARS Project as Collision Manager provides to the Collision Layer. There are, however, major differences in the nature of the two databases. The purpose of traffic volumes in a collision-based system is to determine an accident rate by which the severity of the collisions can be accurately determined. To achieve this, traffic volume values must be assigned to every roadway county-wide. The volume layer uses a semi-automated conversion process, and data transfer capability between computerized traffic counters and the GIS database has yet to be developed. The Volume layer is far more static than the collision database, adding only 300-600 new counts annually. The GIS BARS Project Staff reviewed the existing Volume Maintenance and Count Program and requested numerous revisions to accommodate the data requirements of the GIS BARS System. As a result, the project took over authority of the count program and restructured the entire program. Primary functionality can be described in four major categories: a link from 'Collision_Info' to arc volumes or estimates for calculation of accident rates; 'Volumeedit', programming for maintenance updates; the Traffic Volume Query and Reporting Program and Traffic Count Management Program; and Traffic Flow Map applications. The following paragraphs are preliminary functionality descriptions of each of the applications listed above.
Link from 'Collision_Info' - This is a hidden application. Its fundamental purpose is to provide volume values for all roadways within the County of Riverside as factors for the calculation of accident rates. Actual count values will be used where available. When an actual count value is not available, an extrapolation along continuous arcs or an estimate of volume will be assigned, as defined by the Volume Assignment Algorithm. Volume or assigned volume figures are displayed on the appropriate 'Collision_Info' reports as an independent field and are indicated as estimates if appropriate. The volume values are also used as factors in the accident rate calculation performed in the 'Collision_Info' reports process. The process is not independently initiated by the user, but is set by default.
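The fallback order described above (actual count first, then extrapolation, then estimate) and the use of volume as a rate factor can be sketched as follows. The paper does not state its rate formula, so the segment rate here uses a common traffic engineering convention, collisions per million vehicle-miles, purely as an assumption. The ArcVolume class, the function names, and the choice to flag every non-actual value as an estimate are also illustrative.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ArcVolume:
    """Hypothetical per-arc volume values (vehicles per day)."""
    actual: Optional[float] = None        # true count, if one exists
    extrapolated: Optional[float] = None  # carried along continuous arcs
    estimated: Optional[float] = None     # assigned by classification


def pick_volume(v: ArcVolume) -> Tuple[float, bool]:
    """Return (volume, is_estimate), preferring actual counts."""
    if v.actual is not None:
        return v.actual, False
    if v.extrapolated is not None:
        return v.extrapolated, True
    if v.estimated is not None:
        return v.estimated, True
    raise ValueError("arc has no volume value of any kind")


def segment_rate(collisions: int, daily_volume: float,
                 years: float, length_miles: float) -> float:
    """Collisions per million vehicle-miles (assumed convention)."""
    exposure = daily_volume * 365 * years * length_miles
    return collisions * 1_000_000 / exposure
```

A report routine could call pick_volume for each arc and print the returned flag next to the volume, which matches the requirement that estimates be indicated as such.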
'Volumeedit' GUI (interface for maintenance updates and Count Program management) - The 'Volumeedit' GUI provides privileged user access to the volume database maintenance functions and Count Program management. The functions for database maintenance updates include addition, modification and deletion of volume points and point attributes in the point attribute table (PAT) through attribute item and graphic selection routines. The Count Program management programs also included under this GUI will allow automated Count Book generation and automated output formatting for Count Book diskettes, as well as the generation of annual and monthly Traffic Census Count Location Reports.
'Volume_Info' GUI (interface for system users) - The Traffic Volume Query and Reporting Program application has the same 'look and feel' as 'Collision_Info'. The 'Volume_Info' GUI provides non-graphic query routines on Info data file records. These query results provide actual and estimated counts (designated as such) for full segment and intersection entering volume search requests. The user will have the option to include additional records or omit selected records.
Traffic Flow Map - The map product itself displays only true volume count data and allowable extrapolations under the Volume Algorithm Methodology. The map product reflects traffic volume values on roadways with General Plan Types of Major or above. The map products are produced in five standard sections encompassing the entire county, and for each city individually (based on availability of roadway classifications in appropriate ArcInfo or dBASE formats). Volume values will be defined by bands of width and color.
Volume Layer Development Discussion - The first phase in the development of the volume database was to review the existing Traffic Census Count Program. Each count station location was added to a point coverage in ArcEdit. All stations were reviewed for redundancy due to proximity. Stations were deleted, moved or added as necessary to create a comprehensive coverage of the County Transportation Network. Each point was then populated with Old and New Station Number attributes to develop key fields for the relates used to access the volume Info file, which had been converted from a dBASE III file. Counts dating back to 1992 from the Engineering Count Request Program were manually added to the volume PAT and Info file through Volumeedit (a Formedit menu-driven data input utility).
Once the database was fully attributed, the GIS Staff attempted to develop an algorithm which would define roadway volume attributes based on attributes in related Info files. The first attempt used a combination of General Plan Type, Road Book Type and Street Name to sort roadways into six categories to which volume estimates could be assigned. The expected results never materialized. Each of the six categories, defined by evaluation of 28 Road Book Types and 16 General Plan Types, yielded roadway volumes from less than 5,000 to over 35,000, without producing meaningful average values. The second methodology attempted to isolate statistical anomalies and correlations between virtually all physical and identification attributes. Attributes from the Centerline, Pavement Management and Volume databases were grouped into variable combination groups and analyzed in ARC through the FREQUENCY and STATISTICS commands. Attributes such as Pavement Management and Federal Functional Classifications, Road Book Type, Number of Lanes, Roadway Width, etc., were analyzed in varying combinations to detect grouping characteristics with street names, and again no meaningful volume classification groups emerged.
Finally, working from the set of arcs which contained volumes, the GIS BARS Staff categorized street arcs into three regions: Cities, Mountain & Desert, and Western County. All of the roadways which lacked a volume attribute, PMS Functional Classification or Federal Functional Classification were populated with a Volume Classification attribute between 1 and 10, representative of true function characteristics. The groups were reselected by region and Road Number (not Street Name) to give better definition to changes in condition which affect volume levels. Volumes were then analyzed and successfully grouped by Volume Classification.
An algorithm which extrapolates and estimates volumes by Road Number was then finalized. The program which accomplishes the extrapolation identifies road arcs containing a volume value and records the Road Number. Road Number groups are then selected and analyzed for multiple count locations within the set. Where the number of count locations within a group is one, all arcs within the group are assigned that volume value. Where multiple count locations are present, the arcs within the Road Number group not holding a volume value are either calculated as an average (if between two count locations) or assigned the value of the last volume station through to the end of the Road Number group. Road Number groups not containing any volume value are assigned an estimate based on Volume Classification attributes.
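The extrapolation rules for a single Road Number group can be sketched as below. This is an illustration under stated assumptions, not the project's program: arcs are assumed to be listed in order along the road, a missing volume is represented as None, and the function name fill_group and the class_default parameter (standing in for the Volume Classification estimate) are hypothetical.

```python
from typing import List, Optional


def fill_group(volumes: List[Optional[float]],
               class_default: Optional[float] = None) -> List[Optional[float]]:
    """Fill missing volumes for the ordered arcs of one Road Number group."""
    counted = [i for i, v in enumerate(volumes) if v is not None]
    if not counted:
        # No count anywhere in the group: fall back to the class estimate.
        return [class_default] * len(volumes)
    filled = list(volumes)
    for i, v in enumerate(volumes):
        if v is not None:
            continue
        before = [j for j in counted if j < i]
        after = [j for j in counted if j > i]
        if before and after:
            # Between two count locations: average the neighbouring counts.
            filled[i] = (volumes[before[-1]] + volumes[after[0]]) / 2
        elif before:
            # Past the last station: carry its value to the end of the group.
            filled[i] = volumes[before[-1]]
        else:
            # Ahead of the first station: carry its value back to the start.
            filled[i] = volumes[after[0]]
    return filled
```

With a single count in the group, every arc falls into one of the two carry branches, so the whole group receives that one value, which matches the single-count rule in the text.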
USER'S MODULE - 'COLLISION_INFO'
'Collision_Info' provides users with access to, and an interface for, the underlying databases and coverages that make up GIS BARS. 'Collision_Info' is a Formedit menu-driven application providing users with predefined query, reporting and mapping applications, with customization available for severity, type, jurisdiction, motor vehicle involved with, day of week, vehicle type, road type and primary collision factors. The query application serves as the selection reference for reporting and mapping routines. By default, the entire collision database is the selected set; the query application handles the reselection process. The specific types of reports and maps which are available are listed in the following sections.
Collision Location Reports
Collision Summary Reports
Maps and Diagrams
QUERY MODULES
Query applications allow the user to access data that has not been converted into a spatial coverage, or information on spatial coverages, without mapping or graphics capabilities. The following gives specific detail on individual query applications.
Pavement Management System (PMS) Query Screen
This Formedit menu application allows user-defined selection criteria in combinations using any of the following attributes:
Traffic Control Device Inventory (TCDI) Query Screen
Provides user-defined search parameters based on:
Volume Query Screen
Provides user-defined search parameters based on:
CONCLUSION
The GIS BARS Grant Project will continue to be developed under grant funding through December of 1996. Product deliverables are still scheduled, and enhancements to further improve speed and reliability are still being made. The GIS BARS system will continue to receive upgrades and features after the grant period is complete, although with decreased intensity, depending on the availability of resources. We have both enjoyed the process under which GIS BARS was developed, even when the product delivery schedule seemed like a pipe dream. Comments, suggestions and questions are encouraged and welcome. Ron Filian can be contacted at (909) 275-6807 or e-mailed at rfilian@co.riverside.ca.us. Jeff Higelin can be contacted at (909) 275-2088 or e-mailed at jhigelin@co.riverside.ca.us.
ACKNOWLEDGMENTS
We would like to take this opportunity to thank the following people for their assistance, patience, support and vision: Lawrence T. Tai, P.E., GIS BARS Project Director and Riverside County Traffic Engineer; Frank Sherkow; William (Pat) Egetter; Richard (Dick) Barrera, P.E.; William Kaftan (thanks for the code, Bill!); and Kenneth Logan (at OTS).
CREDITS
ArcInfo, ARC, ARCEDIT, ARCPLOT, INFO and some of the command references throughout this document are registered trademarks of Environmental Systems Research Institute (Esri).
Statewide Integrated Traffic Records System, SWITRS, and C*STARS are products of the State of California, Highway Patrol and are used by permission.
dBASE is a registered trademark of Borland International.