Anne M. Leibold, Michael J. Blaskowski, Robert J. Farnes
The management of environmental sampling data for use in a GIS is a complex task. Large amounts of data, sometimes including over 1 million rows of analytical results, must be entered, maintained, and analyzed using GIS technology in a timely manner. At PRC Environmental Management, Inc., an Oracle RDBMS serves as the repository for field and laboratory data at sites under investigation. Before data are accepted into the database, they are processed through constraints, triggers, and stored procedures to ensure data integrity and completeness. Using the database integrator, these quality-checked data may be accessed by ArcInfo and ArcView for attribute display, spatial analysis, and data modeling. This presentation describes the database management techniques and procedural data quality checks, as well as GIS analysis of the environmental data for purposes such as risk assessment, site characterization, and desktop access to the analytical data.
PRC Environmental Management, Inc. (PRC) is using Oracle, ArcInfo, ArcCAD, and ArcView software to support environmental restoration activities including site characterization, fate and transport modeling, human health and ecological risk assessment, remedial design, and long-term monitoring. The complex and multi-dimensional nature of data from potentially contaminated sites presents significant challenges for designers of environmental databases. PRC developed a rigorous yet flexible system for maintaining attribute data in a normalized data structure and promoting data integrity through encapsulation with stored triggers, procedures, methods, constraints, and views. Because environmental restoration programs frequently produce large volumes of chemical, geological, and geotechnical data, PRC uses sophisticated performance tuning measures to retrieve the needed data in a timely manner. Through Esri’s database integrator, the data stored in Oracle can be fully integrated with spatial data maintained in ArcInfo, such as sample locations, installation restoration boundaries, parcels, land use, habitat delineations, and digital aerial photographs.
Environmental restoration databases are designed for a wide variety of purposes and users. Data are derived through the field efforts of environmental scientists and engineers and the subsequent chemical analyses of water, soil, sediment, and tissue samples by analytical chemists. These data contributors and generators become the first users of the database as they strive to validate its content and assure adequate sampling coverage. Later, geologists access the database to determine the nature and extent of contamination in the affected media. At the same time, hydrologists, soil scientists, and atmospheric scientists estimate the fate and transport of contaminants in all potentially affected media. Eventually, ecological and human health risk assessors quantify risk based on a variety of scenarios for all potentially impacted species. If remediation is required, engineers access the data to develop feasible remedies and final remedial designs. Throughout this process, federal, state, and local regulatory agencies, potentially responsible parties, and the public might require access to the data to monitor the restoration process. The database must be designed with an understanding of each user’s needs.
The nature of chemical data and the multiple dimensions of time, media, phase, and geographic location, including depth, lead to very complex data sets and similarly complex data designs and applications. Databases in excess of one million analytical results and over 10 million data items are common. The work is further complicated by a mix of the following: point and non-point, in situ and ex situ, and purposive and random samples; field and laboratory methods; sample treatments and preparations; quantitative and non-quantitative results; quantitation limits and recovery methods; sample reanalyses and dilutions; tentatively identified compounds and target analyte results; validated and non-validated data; and field and laboratory quality assurance samples.
Chemical site characterization studies are inherently complex because of the intricate interplay of human activities and natural processes over time. Contaminated sites are typically affected by several compounds, and several media, such as soil, surface water, and groundwater, may be affected. Once a medium is contaminated, point sources of contamination can transform into non-point sources through both natural and human activity. For instance, point-source contamination can become non-point in nature by migration through and along utility conduits and tunnels. Point-source soil contamination can become area groundwater contamination through the interaction of these two media. Once impacted, the groundwater may develop into a very complex area-wide or regional plume and may affect multiple water-bearing zones. Soil and sediment contamination typically leads to plant and animal tissue contamination, depending on the bio-availability of the contaminant and the bio-accumulation capability of the species.
Chemical site characterization databases must be designed to accommodate and model all this natural and human complexity. Table structures must track chemical results as they relate to all potentially affected media and categorize results by source type such as point, non-point, area, mobile, and fugitive sources. Each medium and source type is analyzed and interpreted differently by the many users of the data. Proper use of the data should also be promoted. For instance, chemical concentration data from soils and chemical data from plant and animal tissue samples typically should not be grouped together for interpretation. Designing a chemical site characterization database to track the full range of data collected during a remedial investigation is essential, but proper structure does not automatically lead to the correct population of the database or to its proper use.
With the widespread availability of open database connectivity (ODBC), ActiveX controls, data access objects (DAO), remote data objects (RDO), and object linking and embedding (OLE), data integrity must now be promoted through data encapsulation methods rather than data access methods. Oracle relational database software supports the object-oriented concept of data encapsulation through database constraints, triggers, and stored procedures. To promote data integrity, PRC has implemented hundreds of stored constraints, triggers, and procedures. These subject data insertions, updates, and deletions to intricate data controls regardless of the data source. The triggers act as methods in the standard sense of the object-oriented programming model. When an insert, update, or delete message is sent to a database table (object) that is encapsulated with a constraint or trigger, the constraint is activated or the trigger executes to verify that the requested action does not violate data integrity parameters. For example, encapsulated triggers would not allow data from a groundwater sample to be relationally joined to a soil sample location record. Other triggers and constraints automatically check for inconsistencies such as duplicate sample records, incomplete sample depths or dates, and mismatched pairs of field duplicates and normal sample records.
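The medium-matching and duplicate-record checks described above can be illustrated with a minimal sketch. It is not PRC's implementation: SQLite (through Python's standard sqlite3 module) stands in for Oracle so that the example is self-contained, and the table names, column names, and sample values are all hypothetical.

```python
import sqlite3

# Hypothetical schema. SQLite stands in for Oracle; names are illustrative only.
SCHEMA = """
CREATE TABLE location (
    location_id INTEGER PRIMARY KEY,
    name        TEXT NOT NULL UNIQUE,
    medium      TEXT NOT NULL CHECK (medium IN ('soil', 'groundwater', 'sediment'))
);
CREATE TABLE sample (
    sample_id   INTEGER PRIMARY KEY,
    location_id INTEGER NOT NULL REFERENCES location(location_id),
    medium      TEXT NOT NULL,
    sample_date TEXT NOT NULL,
    -- Constraint: reject duplicate sample records.
    UNIQUE (location_id, medium, sample_date)
);
-- Trigger: a sample may only be attached to a location of the same medium.
CREATE TRIGGER sample_medium_matches_location
BEFORE INSERT ON sample
FOR EACH ROW
BEGIN
    SELECT RAISE(ABORT, 'sample medium does not match location medium')
    WHERE NEW.medium <> (SELECT medium FROM location
                         WHERE location_id = NEW.location_id);
END;
"""


def build_database():
    """Create an in-memory database with one hypothetical soil boring."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO location VALUES (1, 'SB-01', 'soil')")
    conn.commit()
    return conn


def try_insert_sample(conn, location_id, medium, sample_date):
    """Return None on success, or the database's error message on rejection."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "INSERT INTO sample (location_id, medium, sample_date) "
                "VALUES (?, ?, ?)",
                (location_id, medium, sample_date),
            )
        return None
    except sqlite3.DatabaseError as exc:
        return str(exc)
```

Because the checks live in the database rather than in the client, the same rejection occurs whether the insert arrives from a loader script, a desktop tool, or an interactive session.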
Similarly, data retrievals are facilitated by developing views, which present data from several joined tables, and snapshots, which refresh the data sets regularly. Because the views are developed by database administrators, all required joins and filters are pre-set for the user. If a view or snapshot does not meet the needs of a particular user, custom functions and procedures can be used to help promote proper use of the data. For instance, stored procedures may be developed to disallow the mixing of data from soil and water samples, to preferentially select validated results over non-validated results, or to force one analytical method to be returned when two or more analytical methods were used for the same compound.
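The selection rules mentioned above (validated results before non-validated ones, and a single analytical method per compound) can be sketched in a few lines. This is an illustration written in Python rather than as an Oracle stored procedure; the `Result` record, the `preferred_results` function, and the method ranking are assumptions introduced for the example.

```python
from typing import NamedTuple


class Result(NamedTuple):
    """One analytical result (hypothetical record layout)."""
    sample_id: int
    analyte: str
    method: str
    value: float
    validated: bool


def preferred_results(results, method_rank):
    """Keep one result per (sample, analyte).

    Validated results win over non-validated ones; ties are broken by
    method_rank, a dict mapping method name to priority (lower is preferred).
    Methods missing from method_rank are ranked last.
    """
    best = {}
    for r in results:
        key = (r.sample_id, r.analyte)
        # Smaller tuples are better: validated first, then preferred method.
        score = (not r.validated, method_rank.get(r.method, len(method_rank)))
        if key not in best or score < best[key][0]:
            best[key] = (score, r)
    return [r for _, r in best.values()]
```

Centralizing the rule in one routine means every user who retrieves data through it receives the same result for a given sample and compound.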
If care is not taken, database performance can be adversely affected when data are encapsulated by many constraints, triggers, views, functions, and stored procedures. Optimization techniques produce very satisfactory results for the size of environmental databases typically created. These techniques include table clustering (physically storing related tables around their common key), table striping (partitioning tables across multiple physical drives), and physically separating indexes and tables among drives and data channels.
Database management using Oracle in conjunction with ArcInfo, ArcCAD, and ArcView facilitates integration, analysis, and display of these diverse types of data on a single platform. To avoid sample naming update anomalies, all sample chemistry is linked with numeric keys to the appropriate sample location stored in ArcInfo. If it is necessary to change the name of a sample location, such as a monitoring well, the numeric key maintains the link between the Oracle tables and the ArcInfo coverages regardless of where the name was updated.
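The numeric-key linkage described above can be demonstrated with a small sketch. SQLite again stands in for Oracle, a plain `location` table stands in for the ArcInfo coverage attribute table, and the well name and concentration are made-up values.

```python
import sqlite3


def rename_keeps_link():
    """Rename a sample location and show that its chemistry stays attached."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        -- Hypothetical tables: results reference the location by numeric key,
        -- never by its display name.
        CREATE TABLE location (
            location_id INTEGER PRIMARY KEY,
            name        TEXT NOT NULL UNIQUE
        );
        CREATE TABLE result (
            result_id   INTEGER PRIMARY KEY,
            location_id INTEGER NOT NULL REFERENCES location(location_id),
            analyte     TEXT NOT NULL,
            value       REAL NOT NULL
        );
        INSERT INTO location VALUES (101, 'MW-07');
        INSERT INTO result (location_id, analyte, value)
            VALUES (101, 'benzene', 3.2);
    """)
    # Renaming the well touches one row; no result rows need updating.
    conn.execute("UPDATE location SET name = 'MW-07A' WHERE location_id = 101")
    return conn.execute("""
        SELECT l.name, r.analyte, r.value
        FROM result AS r
        JOIN location AS l ON l.location_id = r.location_id
    """).fetchall()
```

Had the results been keyed on the well name, the rename would have required a matching update to every chemistry row and to every copy of the name held elsewhere.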
PRC has used this database management system for attribute display, spatial analysis, contour or isopleth mapping, surface modeling, statistical interfacing, and pre-processing and post-processing for groundwater and air dispersion modeling. In addition, ArcView and Avenue are being used to develop query stations to provide both clients and PRC technical staff with direct access to the data in a graphic format.
For example, one of the sites PRC is currently under contract to investigate encompasses approximately 2,200 acres of land, and there are more than 600 soil and sediment sampling locations. In addition, there are more than 500 monitoring wells on the site, and approximately 500 more wells in the surrounding vicinity. Each round of soil, sediment, or groundwater sampling may yield chemical data for as many as 160 specific compounds, as well as extensive supporting chemistry used to meet the quality assurance/quality control objectives of the program.
At this site, PRC collected and evaluated data in support of a sitewide remedial investigation, including a human health risk assessment. Many maps depicting the spatial distribution of contaminants of concern in both the groundwater and the surface soil and sediment were used to evaluate the nature and extent of contamination. Human health risk assessment requires the integration of individual risk factors for each contaminant of concern, and the union of risk values in soil with risk values in groundwater. Groundwater contaminant plumes were identified by extracting pertinent chemical data from the Oracle tables and modeling or displaying the data in a spatial context. The various plumes were overlain to provide a composite representation of contaminants in the groundwater at a given location. Using chemical data sets extracted from the Oracle tables, the results of spatial analyses, and input of various parameters related to human health risk assessment, groundwater risks could then be estimated. Soil risks were estimated according to several specific land use scenarios, including residential, occupational, and recreational, and overlain with the groundwater risk representations. The ability to quickly retrieve appropriate chemical data sets and then implement spatial analysis functions such as polygon overlay, point-in-polygon identity, and nearest neighbor analysis enhanced the efficiency of data processing and presentation in support of the human health risk assessment.
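To make the point-in-polygon step concrete, the following sketch tags each sample location with the zones that contain it, using the standard ray-casting test. It is an illustration only and does not reproduce the ArcInfo overlay functions; the function names, coordinates, and zone names are hypothetical.

```python
def point_in_polygon(x, y, polygon):
    """Ray-casting test. polygon is a list of (x, y) vertices in order.

    A horizontal ray is cast from the point toward +x; an odd number of
    edge crossings means the point is inside. Points exactly on an edge
    are not treated specially in this sketch.
    """
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge straddles the ray's y value
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def tag_samples(samples, zones):
    """Map each sample name to the sorted names of the zones containing it.

    samples: dict of name -> (x, y); zones: dict of name -> vertex list.
    """
    return {
        name: sorted(z for z, poly in zones.items()
                     if point_in_polygon(x, y, poly))
        for name, (x, y) in samples.items()
    }
```

A sample that falls inside both a plume polygon and a land use polygon is tagged with both, which is the kind of combined attribute a soil and groundwater risk overlay needs.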
At another site, PRC used an environmental database and GIS technology to streamline landfill design and closure. On-going field investigations at this site produced large volumes of chemical, geological, hydrogeological, land survey, and well construction data. The databases maintained in Oracle and ArcCAD were first used to characterize the landfill site. The extent of contamination was evaluated using isopleth maps of chemical concentrations from soil, groundwater, and soil gas sampling. Potentiometric surface maps were generated using water level data, and lithologic cross-sections were created to define the subsurface geology.
As part of the remedial design process, PRC evaluated twelve capping scenarios for the engineered alternative to assess which configuration would best meet the client’s needs and regulatory requirements. With the use of GIS technology, landfill cap scenarios were modeled and displayed in plan view, profile, and 3-dimensional net mesh or triangulated irregular network (TIN) representation. Initially, a digital terrain model of the existing topographic surface was created to establish a baseline. Then, various cap designs were modeled, including one-, two-, three-, four-, and ten-mound configurations. Some configurations were modeled with several different minimum slope scenarios.
The models developed in the GIS were used to quickly calculate volumes of fill material needed for each configuration. This in turn determined the amount of fill material required from the adjacent borrow pit. Due to shallow groundwater levels, economic out-lease agricultural agreements related to borrow pit land, and the cost of moving fill material, minimization of the borrow pit extent was very important. Overall, landfill cap configurations were evaluated based on impact to adjacent agricultural out-lease land, volume of soil required for the cap, construction cost, and duration of construction. The selected remedial design included a soil cap, a passive landfill gas venting system, a surface waste drainage system, a groundwater monitoring network, and access roads.
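The volume comparison can be approximated by differencing two elevation surfaces. The sketch below assumes both surfaces have been sampled onto the same regular grid, which is a simplification of the TIN-based models described above; the function name and its inputs are hypothetical.

```python
def fill_and_cut_volume(existing, design, cell_area):
    """Estimate fill and cut volumes between two gridded surfaces.

    existing, design: equal-shaped lists of rows of elevations.
    cell_area: plan area represented by each grid cell.
    Returns (fill, cut) in elevation units times area units.
    """
    fill = 0.0
    cut = 0.0
    for row_existing, row_design in zip(existing, design):
        for z_existing, z_design in zip(row_existing, row_design):
            diff = z_design - z_existing
            if diff > 0:
                fill += diff * cell_area   # design is above existing ground
            else:
                cut += -diff * cell_area   # design is at or below existing ground
    return fill, cut
```

Running the same calculation for each candidate cap surface gives comparable fill figures, which can then be weighed against the borrow pit extent each one implies.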
An accelerated schedule was necessary to meet a funding deadline for the construction phase of the project. With the help of an effective relational and spatial database design to support PRC’s solid waste engineering expertise, hydrogeology expertise, and regulatory acumen, the client was able to expedite the remedial investigation and remedial design processes and respond to all regulatory agency concerns.
The database management system employed by PRC is highly effective for supporting environmental characterization and restoration activities. However, additional measures are being taken to increase utilization and efficiency in data compilation and analysis. Data generated from remedial investigations frequently require manual entry or reformatting of pertinent data collected at the time of sampling. Due to the complexity of the field measurements and sample tracking, essential data such as unique sample identifiers, requested analyses, and survey benchmarks may occasionally be omitted from the package submitted to the database management team. These problems may be compounded by concurrent sampling by multiple field teams and separate contractors.
PRC is currently developing pen-based computer field forms that prompt the field technician for key pieces of information and then directly transpose the field data into digital format suitable for upload to the main database tables. The forms will be filled out during the day’s field activities by the field team, and data will be electronically transferred by modem to the database coordinator on a pre-determined schedule. If data are missing or incorrectly entered, the database coordinator can notify the field team immediately, and request the missing information. Without this technology, the database specialists periodically have to retrace events or contact field personnel collecting the data, often months after the sampling event is over, to track down and enter all pertinent information into the database and GIS. The pen-based computers are designed to function under field conditions. This technology should make data collection easier for field personnel, and increase efficiency and data completeness for database management tasks.
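The completeness check that such a form or upload step might perform can be sketched as follows. The required fields are taken from the omissions named earlier (sample identifiers, requested analyses, and survey benchmarks); the field names, record layout, and function name are assumptions, not a description of the forms under development.

```python
# Hypothetical field names for items the text says are sometimes omitted.
REQUIRED_FIELDS = ("sample_id", "requested_analyses", "survey_benchmark")


def find_form_problems(records):
    """Return a list of (record_index, message) for incomplete field records.

    records: list of dicts, one per field form. A field counts as missing
    if it is absent or empty. Repeated sample identifiers are also flagged.
    """
    problems = []
    seen_ids = set()
    for index, record in enumerate(records):
        for field in REQUIRED_FIELDS:
            if not record.get(field):
                problems.append((index, f"missing {field}"))
        sample_id = record.get("sample_id")
        if sample_id:
            if sample_id in seen_ids:
                problems.append((index, f"duplicate sample_id {sample_id}"))
            seen_ids.add(sample_id)
    return problems
```

Running a check like this when the day's forms arrive lets the database coordinator send the list of problems back to the field team while the sampling event is still fresh.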
The ability to encapsulate attribute data with constraints, custom triggers, procedures, functions, and views in Oracle promotes data integrity and usefulness. Furthermore, the ability to effectively link the data stored in Oracle with spatial data maintained and accessed in ArcInfo, ArcCAD, and ArcView provides PRC with important capabilities needed for data management, data integration, data interpretation, graphic visualization, predictive modeling, and community relations. These tools have allowed PRC to turn large volumes of complex data into useful information.
Anne M. Leibold
GIS Coordinator, PRC Environmental Management, Inc.
1099 18th Street, Suite 1960
Denver, Colorado 80202
leibola@ttemi.com
Michael J. Blaskowski
MIS Coordinator, PRC Environmental Management, Inc.
1099 18th Street, Suite 1960
Denver, Colorado 80202
blaskom@ttemi.com
Robert J. Farnes
GIS Analyst, PRC Environmental Management, Inc.
1099 18th Street, Suite 1960
Denver, Colorado 80202
farnesb@ttemi.com