Spatially Enabling an Incident Data Warehouse
Baron O. Grey, Ph.D.
GIS/Trans, Ltd.
Abstract
This paper discusses the use of the Esri Spatial Database Engine (SDE), ArcView GIS, and Avenue relative to the Freeway Incident Response Services Tracking (FIRST) system's Incident Data Warehouse. It describes how FIRST is using SDE and ArcView to manage geo-spatial data and conduct spatial analysis in conjunction with a data warehouse. The Los Angeles County Metropolitan Transportation Authority (MTA) funds the FIRST project, and the FIRST system is implemented for the California Highway Patrol (CHP). The prime contractor is bd Systems, Inc., the parent of GIS/Trans, Ltd. FIRST has been in beta test since April 1988 by 16 organizations and over 80 registered users, including CHP, MTA, the LA County Coroner, the LA Department of Transportation, Caltrans, and several media organizations.
Introduction
With over 527 centerline miles of freeways, Los Angeles County has some of the highest traffic densities and motorized congestion in the world. Over 4000 traffic incident-related activities are reported to the California Highway Patrol (CHP) daily. This data, collected by the CHP's computer-aided dispatch (CAD) system and the FIRST system, can be very useful to CHP department groups, area and divisional offices, planning and funding organizations such as the Los Angeles County Metropolitan Transportation Authority (MTA), other State and Federal agencies, and the public at large. The data collected serves as a critical input to the development of the State's multi-billion-dollar transportation programs. Current methods of collecting, analyzing, and distributing this data rely on manual, labor-intensive processes that require merging data files from archive media and printing and sorting through data; non-standard requests may even require the generation of new code. The FIRST system addresses these problems by implementing an online Data Warehouse that collects and summarizes incident data and allows analyses to be performed both spatially and non-spatially.
This paper discusses the use of the Esri Spatial Database Engine (SDE), ArcView GIS, and Avenue in the implementation of the FIRST system’s Incident Data Warehouse. It describes how FIRST is using SDE and ArcView to manage geo-spatial data and conduct spatial analysis in the Data Warehouse.
The FIRST System
The FIRST system is a joint project of the Los Angeles County Metropolitan Transportation Authority (MTA) and the California Highway Patrol (CHP). It is an Intranet-based incident information and communication system that streamlines the entire incident management process, linking the CHP bi-directionally to the MTA, the media, allied agencies, first responders, and other incident and emergency management organizations as depicted in Figure 1.
Figure 1: FIRST Concept Architecture
The system helps users to obtain real-time information on ongoing incident activity on roadways in Los Angeles County and, in turn, provide status and updates to the CHP. By utilizing the power and efficiency of multi-tier open systems, Intranet, and World-Wide Web technologies, the FIRST system is enhancing and revolutionizing incident management in Los Angeles County. The system encompasses multiple agency jurisdictions, including the CHP, MTA, Los Angeles County Department of Public Works, California Department of Transportation (Caltrans), Los Angeles Department of Transportation (LADOT), Los Angeles Police Department (LAPD), Los Angeles Sheriff's Office (LASO), Los Angeles County Coroner, Los Angeles County Fire Department, local media traffic reporters, and many first responders.
Features of the FIRST system that users benefit from include:
· Simplicity
The system has been developed to be simple and intuitive to use. All operations are accessible from graphical user interfaces using a simple point-and-click metaphor consistent with familiar Web browser technologies. Information is logically arranged and presented.
· Two-Way Communication
The system supports bi-directional communication so that authorized users can not only receive information from the CHP, but they can provide information as well, make inquiries, or give status on how they are responding to an incident as it progresses through its life cycle.
· Notification
The system can notify users when selected events occur. By using the rich information content provided by the system, users not only know that they should respond, but they can determine how they should best respond.
· Efficiency
The system is not only easy to use, it will make the user’s job easier. Information that is needed to respond to an incident is right at their fingertips, often at a glance, thereby allowing them to be more productive and efficient in their job.
· Integration
The system is designed using open-systems technologies specifically for integration with other computer systems. This allows information to be shared to further enhance responsiveness to incidents.
Open-Systems Architecture
In order to leverage information technologies now and in the future, the FIRST system is developed using open-systems principles that leverage national and international standards in the following ways.
· By employing a 3-tier architecture (see Figure 2), client processing is logically and physically decoupled from server processing. This allows client and server components to be changed as long as interfaces remain compatible. The middle tier supports extensibility since it can be distributed over multiple sites, thereby allowing the system to easily evolve into a geographically distributed architecture.
· By using the TCP/IP protocol family for implementing network communications between clients and servers (via the second tier), and for implementing communications with external systems using ITS National Architecture compliant protocols as they become available.
· By using CORBA as an application-level protocol for communicating with external systems. FIRST also supports Microsoft’s COM/DCOM for external users who wish to use this de-facto standard for distributed computing. Optionally, FIRST can also support the Enterprise Java Beans component object model should the need arise.
· By using an industry standard database (Oracle Enterprise Server) supporting the SQL standard and the XA-Open distributed transaction protocol.
· By using the Microsoft desktop (Windows 95/98 or Windows NT/2000) for client processing.
· By using the Windows NT operating system, which is POSIX-compliant, for all servers.
· By employing asynchronous messaging (IBM MQSeries) as an alternative for communications with external systems that do not need or cannot support synchronous communications with the FIRST system.
Figure 2 depicts the primary components of the system’s architecture. The Map Services component is used to deliver maps to Web clients. The Spatial Data Server hosts the SDE service application and an Oracle database that contains the SDE layers.
Figure 2: FIRST System Architecture Block Diagram
Geo-Spatial Data Architecture
Figure 3: FIRST Geo-Spatial Architecture
The Data Warehouse and the Geo-Spatial Database are instances of the Oracle Enterprise Server RDBMS (Version 8.x). The SDE (Version 3.0) application manages the Geo-Spatial Database via Oracle, where geographic features and other spatial information are maintained in SDE layers. The Data Warehouse contains freeway incident-related data and is sized to accommodate 2-3 years' worth of data with an expected capacity of 80-100 GB. The Geo-Spatial Database contains static data layers comprising information such as streets, land use, political boundaries, and demographics, and a dynamic data layer containing geographically referenced incident location data obtained from the Data Warehouse. Data from the Data Warehouse and the Geo-Spatial Database are relationally linked, thereby allowing queries to span both instances. The Data Warehouse and the Geo-Spatial Database are kept on separate dedicated servers on the FIRST network.
Although it is not necessary to use a separate Oracle instance for spatial data when using SDE, this architecture was chosen for performance, capacity, and maintenance reasons. Data is bulk-loaded into the Data Warehouse periodically; once data is loaded into the Data Warehouse, the FIRST SDELoader, a custom C++ application that uses the SDE API, is used to update the Geo-Spatial Database’s dynamic incident layer using spatial reference data from the Data Warehouse. The dynamic incident layer can also be updated in real-time using the CAD Interface Processor (CIP) application, which is a custom C++ application that utilizes the SDE API.
Although the SDE API can be used directly to formulate spatial and attribute queries against the databases via custom programming, our approach was to use the ArcView application as a front-end. ArcView is used to establish a database connection to the Geo-Spatial Database; however, since the databases are relationally linked, this gives it the capability to perform spatial queries that relate Geo-Spatial Database data with Data Warehouse data. This arrangement demonstrates the power of SDE and ArcView in dealing with spatial and non-spatial data dispersed across distributed databases. To provide a more user-friendly interface, ArcView has been extended via Avenue scripting to provide customized user interfaces that make spatial analysis simple; however, the full features of ArcView are available to the experienced user.
Incident Data Management
Incident data is collected in real time from the CAD system and is transferred in ASCII format as "Incident" records. Whenever geocoding is relevant for the incident, the location information is already geocoded by the CAD system, although a geocoded location can be overridden with a text description if the text description is deemed more accurate. In any case, every incident has a text description of its location, whether or not it is geocoded. These records are parsed and the parsed values are stored in the Transactional database and the Data Warehouse. Each incident has a key value, IncidentID, which is used to relate incident information across the multiple database instances in support of spatial queries.
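The record handling described above can be illustrated with a minimal sketch. The paper does not give the CAD record layout, so the pipe-delimited format, the field names, and the sample values below are all assumptions made purely for illustration. The sketch shows only the rule stated in the text: a text location is always present, while geocoded coordinates are optional.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional, Tuple


@dataclass
class Incident:
    incident_id: int                        # key used to relate data across databases
    reported_at: datetime
    location_text: str                      # always present
    coords: Optional[Tuple[float, float]]   # (lon, lat), or None if not geocoded


def parse_incident(line: str) -> Incident:
    """Parse one record in the hypothetical 'id|timestamp|text|lon|lat' layout."""
    incident_id, stamp, text, lon, lat = line.rstrip("\n").split("|")
    if not text.strip():
        raise ValueError("every incident needs a text location description")
    coords = (float(lon), float(lat)) if lon and lat else None
    return Incident(
        int(incident_id),
        datetime.strptime(stamp, "%Y-%m-%d %H:%M"),
        text.strip(),
        coords,
    )


# Illustrative records only; not real CAD data.
geocoded = parse_incident("1001|1999-06-01 08:15|I-405 NB at Wilshire Blvd|-118.4489|34.0573")
text_only = parse_incident("1002|1999-06-01 08:20|SR-110 SB north of the tunnels||")
```

Keeping the coordinates optional lets downstream code decide whether an incident can be placed in a spatial layer without discarding the record itself.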
The dynamic incident layer in the Geo-Spatial Database instance is brought up to date as soon as data is added to or removed from the Data Warehouse. This is handled by a loading application that either copies data from selected columns of the business table in the Warehouse into the Geo-Spatial Database or calls SDE to delete the appropriate rows from the dynamic incident layer. While it would have been ideal to keep the dynamic incident layer in the Warehouse database and the remainder of the layers in the Geo-Spatial Database, we could not get SDE to operate across both database instances.
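The synchronization idea can be sketched as follows. This is not the FIRST loader, which is a C++ application built on the SDE API; it is a simplified stand-in that uses two in-memory SQLite databases in place of the two Oracle instances. The table and column names are hypothetical.

```python
import sqlite3

# Two separate connections stand in for the two database instances.
warehouse = sqlite3.connect(":memory:")
geo = sqlite3.connect(":memory:")

warehouse.execute(
    "CREATE TABLE incident (incident_id INTEGER PRIMARY KEY, "
    "reported_at TEXT, incident_type TEXT, lon REAL, lat REAL)"
)
geo.execute(
    "CREATE TABLE incident_layer (incident_id INTEGER PRIMARY KEY, "
    "reported_at TEXT, lon REAL, lat REAL)"
)


def sync_incident_layer(warehouse, geo):
    """Copy selected columns of new geocoded rows; delete rows gone from the warehouse."""
    source = {
        row[0]: row
        for row in warehouse.execute(
            "SELECT incident_id, reported_at, lon, lat FROM incident "
            "WHERE lon IS NOT NULL AND lat IS NOT NULL"
        )
    }
    source_ids = set(source)
    present = {row[0] for row in geo.execute("SELECT incident_id FROM incident_layer")}
    to_add = [source[i] for i in sorted(source_ids - present)]
    to_drop = [(i,) for i in sorted(present - source_ids)]
    geo.executemany("INSERT INTO incident_layer VALUES (?, ?, ?, ?)", to_add)
    geo.executemany("DELETE FROM incident_layer WHERE incident_id = ?", to_drop)
    geo.commit()
    return len(to_add), len(to_drop)


warehouse.executemany(
    "INSERT INTO incident VALUES (?, ?, ?, ?, ?)",
    [
        (1, "1999-06-01 08:15", "collision", -118.25, 34.05),
        (2, "1999-06-01 08:20", "stalled vehicle", -118.40, 34.10),
        (3, "1999-06-01 08:30", "debris", None, None),  # text-only location
    ],
)
first_sync = sync_incident_layer(warehouse, geo)

warehouse.execute("DELETE FROM incident WHERE incident_id = 1")
second_sync = sync_incident_layer(warehouse, geo)
layer_ids = [r[0] for r in geo.execute("SELECT incident_id FROM incident_layer")]
```

Only the key and the spatial reference columns are copied, which mirrors the text's point that the layer holds a small subset of the business table while IncidentID links back to the full record.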
Geo-Spatial Analyses
The SIM allows users to conduct geo-spatial analyses on incident data in the Data Warehouse and is implemented using ArcView on the client side and SDE on the server side. The SIM is made up of four subsystems compiled as ArcView extensions:
1. An SDE Layer and Information Manager (SLIM) package;
2. A Boundary Review and Maintenance (BRAM) package;
3. An Incident Loader package;
4. An Incident Spatial Analysis Mapping Manager (ISAMM) package.
The SLIM handles the administrative functions for the Geo-Spatial Database. The SLIM interface is designed to assist the SDE administrator, as much as possible, with basic SDE-SQL tasks and with incident loading. Only the most fundamental tasks are possible within SLIM; consequently, if high-end database changes are necessary in Oracle and SDE, the task must be completed at the Spatial Server. SLIM allows the SDE administrator to update and edit tabular data and the spatial features in layers, and to examine or modify the SDE parameters established for each layer. Additional tools for layer creation and database editing are included as well. Incident loading is bounded by start-end cutoffs defined either by date/time or by IncidentID.
The Boundary Review and Maintenance (BRAM) functions of the SIM are also administrative. To assist CHP staff with the ongoing manual task of reviewing and changing boundaries, the tools of ArcView were customized so that, on an as-needed basis, CHP can take a close look at current boundaries and make the necessary digital changes before conversion to CAD format. BRAM also provides the opportunity to identify locations where changes are needed, create Map Reports of proposed changes, generate a digital shapefile of alterations, and assist in the process of implementing new spatial referencing into the CAD system.
The Incident Loader package is a custom interface that allows user-definition of date/time or IncidentID start-end cutoffs before converting the Incident coordinates into an SDE layer.
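The cutoff selection can be sketched in a few lines. The paper does not say whether the cutoffs are inclusive, so the inclusive range below is an assumption, as are the function name and the record layout.

```python
from datetime import datetime


def select_for_load(incidents, start, end, by="incident_id"):
    """Return incidents whose chosen field lies in the inclusive range [start, end]."""
    if by not in ("incident_id", "reported_at"):
        raise ValueError("cutoffs must be by incident_id or reported_at")
    return [inc for inc in incidents if start <= inc[by] <= end]


# Illustrative records only.
incidents = [
    {"incident_id": 101, "reported_at": datetime(1999, 6, 1, 8, 15)},
    {"incident_id": 102, "reported_at": datetime(1999, 6, 1, 17, 40)},
    {"incident_id": 103, "reported_at": datetime(1999, 6, 2, 7, 5)},
]

by_id = select_for_load(incidents, 102, 103)
by_time = select_for_load(
    incidents, datetime(1999, 6, 1), datetime(1999, 6, 2), by="reported_at"
)
```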
The ISAMM application’s design is focused on ease of client use, while still offering a stand-alone GIS software application that meets the needs of the CHP end-users. ISAMM tools are designed to minimize the learning curve, yet offer the user maximum feedback on incident activity with display, analysis, and report functionality. ISAMM is built using the ArcView development environment (Avenue and Dialog Designer), so it is easily transferable between ArcView software installations. The ISAMM application provides the following functionality, implemented as a set of tools sharing a common user interface:
· Application Display Tools with Zoom, Pre-Set Zooms, Identify, and Layer Controls;
· Search and Visual Identification of Incidents;
· Visual and/or Interface Selection of Incidents;
· Scaled Display of Background features and Annotation;
· A pre-defined dictionary of Incident Requests;
· A Query builder allowing Request Dictionary additions, editing, and deletions;
· Selection of Incidents according to Date, Time, and Spatial Filters;
· Map Report Functions with layout templates, single and map series options, multiple data sources, plus pre-constructed symbology and legends;
· A Reverse Geocoder allowing for interactive address approximation;
· Proximity Reports providing both Map and Table, calculating distance to … ;
· Density Reports providing Map & Table, displaying gradients for and hot spots of Incident activity, according to a date/time filter;
· Incident Profile Reports with Milepost Data;
· Document Output to printer, file, raster image, or Web HTML file.
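As an illustration of the Density Reports item in the list above, the following sketch counts incidents per grid cell within a date/time window. It does not reproduce the ISAMM tool, which is written in Avenue; the grid cell size, the function name, and the treatment of a "hot spot" as the cell with the highest count are assumptions.

```python
import math
from collections import Counter
from datetime import datetime


def density_grid(incidents, start, end, cell_deg=0.01):
    """Count incidents per grid cell for reports in the window [start, end)."""
    counts = Counter()
    for when, lon, lat in incidents:
        if start <= when < end:
            cell = (math.floor(lon / cell_deg), math.floor(lat / cell_deg))
            counts[cell] += 1
    return counts


# Illustrative points only: (report time, longitude, latitude).
incidents = [
    (datetime(1999, 6, 1, 8, 0), -118.254, 34.052),
    (datetime(1999, 6, 1, 9, 0), -118.256, 34.058),
    (datetime(1999, 6, 1, 10, 0), -118.305, 34.105),
    (datetime(1999, 7, 1, 8, 0), -118.254, 34.052),  # outside the window
]

grid = density_grid(incidents, datetime(1999, 6, 1), datetime(1999, 6, 2))
hot_spot = grid.most_common(1)[0]
```

The per-cell counts could feed both outputs named in the list: a table of counts and a map that shades each cell by its count.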
Figure 4 below depicts a typical ArcView user-interface for the ISAMM package and Figure 5 depicts the interface to the tool for generating Area Reports.
Figure 4: ISAMM ArcView Interface
Figure 5: Area Reporting Tool Interface
Lessons Learned
Overall, SDE performed well as a spatial data engine. We found the SDE "C" language API relatively easy to work with, and the documentation was easy to use and reasonably accurate. To make the C API more application-friendly, we wrapped the API calls we needed in a C++ class that eased some of the issues of working between C++ and C (we recommend that Esri provide a C++ class library or a COM component in the future). We did experience some difficulties working with SDE across two instances of the Oracle database: we could not coerce SDE into having some layers in one instance and some in the other. We ended up copying a few columns of the business table from the Data Warehouse into the Geo-Spatial Database so that all layers were in the same database instance. Spatial queries were then run against the Geo-Spatial Database and the results joined back to the Data Warehouse using an appropriate key field.
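The query-then-join pattern described above can be sketched as follows. Two in-memory SQLite databases stand in for the two Oracle instances, and a simple bounding-box test stands in for an SDE spatial filter; the table names, column names, and sample rows are hypothetical.

```python
import sqlite3

geo = sqlite3.connect(":memory:")        # stand-in for the Geo-Spatial Database
warehouse = sqlite3.connect(":memory:")  # stand-in for the Data Warehouse

geo.execute(
    "CREATE TABLE incident_layer (incident_id INTEGER PRIMARY KEY, lon REAL, lat REAL)"
)
warehouse.execute(
    "CREATE TABLE incident (incident_id INTEGER PRIMARY KEY, incident_type TEXT)"
)
geo.executemany(
    "INSERT INTO incident_layer VALUES (?, ?, ?)",
    [(1, -118.25, 34.05), (2, -118.40, 34.10), (3, -118.26, 34.06)],
)
warehouse.executemany(
    "INSERT INTO incident VALUES (?, ?)",
    [(1, "collision"), (2, "debris"), (3, "stalled vehicle")],
)


def incidents_in_box(min_lon, min_lat, max_lon, max_lat):
    """Spatial step on the geo database, then join back by key on the warehouse."""
    ids = [
        row[0]
        for row in geo.execute(
            "SELECT incident_id FROM incident_layer "
            "WHERE lon BETWEEN ? AND ? AND lat BETWEEN ? AND ? "
            "ORDER BY incident_id",
            (min_lon, max_lon, min_lat, max_lat),
        )
    ]
    if not ids:
        return []
    marks = ",".join("?" * len(ids))
    return warehouse.execute(
        "SELECT incident_id, incident_type FROM incident "
        f"WHERE incident_id IN ({marks}) ORDER BY incident_id",
        ids,
    ).fetchall()


matches = incidents_in_box(-118.30, 34.00, -118.20, 34.10)
```

Passing only key values between the two steps keeps the spatial database small while the full incident attributes stay in the warehouse.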
We took advantage of Oracle's "partitioning" capability to structure the Data Warehouse; this facilitated loading and removal of data by partitions. Partitioning is not directly supported by SDE, although this does not represent a limitation of SDE itself; it is the reason we needed two separate database instances. However, we would have liked to be able to use partitions across SDE layers as well.
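The benefit of loading and removing data by partition can be illustrated with a small in-memory model. This is not Oracle partitioning; the monthly partition key, the class name, and the retention rule are assumptions chosen only to show why dropping a whole partition is simpler than deleting individual rows.

```python
from collections import defaultdict
from datetime import datetime


class PartitionedStore:
    """Toy model of a table partitioned by calendar month."""

    def __init__(self, keep_months):
        self.keep_months = keep_months
        self.partitions = defaultdict(list)

    @staticmethod
    def _key(when):
        # Months since year 0, so consecutive months differ by exactly 1.
        return when.year * 12 + (when.month - 1)

    def load(self, rows):
        """Append each (incident_id, timestamp) row to its month partition."""
        for incident_id, when in rows:
            self.partitions[self._key(when)].append((incident_id, when))

    def purge(self, now):
        """Drop whole partitions older than the retention window."""
        cutoff = self._key(now) - self.keep_months
        expired = [k for k in self.partitions if k <= cutoff]
        for k in expired:
            del self.partitions[k]
        return len(expired)


store = PartitionedStore(keep_months=24)
store.load(
    [
        (1, datetime(1998, 5, 10)),
        (2, datetime(1998, 7, 4)),
        (3, datetime(2000, 6, 1)),
    ]
)
dropped = store.purge(datetime(2000, 6, 15))
remaining = sorted(store.partitions)
```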
Because of the way the data in our Data Warehouse is partitioned, SDE did not present any performance problems in serving up spatial query results since, to a large extent, query performance is dictated by the Oracle RDBMS. However, one should not expect blazing performance when running spatial queries across a multi-gigabyte distributed database without extraordinary hardware support.
We found, as anticipated, that database tuning is critical to good performance, especially when creating features in SDE layers in real-time. Loading the Data Warehouse has to be as fast as possible especially when dealing with large data sets.
The ability to perform spatially-referenced queries on incident data was very useful as we used this feature to support an analysis of secondary incidents on freeways in Los Angeles County as part of a study being conducted by a local university.
Summary
SDE and ArcView proved to be a powerful combination for dealing with spatial data in a distributed database environment. Currently, the system is being used to perform spatial analysis on freeway incident data. We would not hesitate to use SDE again in a project that requires serving up spatial data quickly to a large number of users, especially with new tools that are now available to facilitate layer creation and manipulation in a spatially-enabled data warehouse.
At the present time the FIRST system is undergoing beta testing in the Los Angeles area and is expected to become a production system in the second half of 2000.
Acknowledgments
The author gratefully acknowledges the work of the FIRST development team in implementing the GIS architecture, particularly Bryan Parlee for developing the ArcView interface and generating SDE layers, and Jim Strzyzewski for database integration. The author also acknowledges the support of the MTA and CHP in making the FIRST project a reality.
Author Information
Baron O. Grey, Ph.D.