Environmental Decision Making: Integrating Analytical and GIS Data

July 2002, LAUR 02-4097

Steven Scherma1, Stephen Bolivar2, Alison Dorries2, and Khalil Nasser3

1 reVision, Inc., Santa Fe, New Mexico, U.S.A.
2 Los Alamos National Laboratory, Risk Reduction and Environmental Stewardship Division, Los Alamos, New Mexico, U.S.A.
3 reVision, Inc., Denver, Colorado, U.S.A.

ABSTRACT: The Environmental Restoration Project at Los Alamos National Laboratory designed a system for the cradle-to-grave management, tracking, validation, and storage of the organization's analytical chemistry data. The challenge was to provide a system that would allow project personnel to easily retrieve this analytical data as well as the project's spatial data, and that would provide the needed analysis, mapping, and reporting tools on the desktop. The result is the current prototyping of an ArcGIS-based environmental decision management tool that will be integrated with the sampling and analytical results tracking system. This paper explores design issues and challenges faced by the project and how these issues were resolved on both a technical and a reengineering level.

1.0 Introduction

The Los Alamos National Laboratory (LANL) Environmental Restoration (ER) Project recently initiated a top-to-bottom look at how its information management activities were handled. To better meet stakeholder and user needs and to create a more defensible Administrative Record, a major information management reengineering effort was initiated. This paper describes this reengineering effort. It will examine both the problem drivers for the effort and the solutions that management and the information management team have devised and are in the process of implementing. The first section will discuss the overall reengineering effort. The second section will focus on the GIS-based applications currently under development by the ER Project to assist in the environmental decision-making process and cleanup efforts.

1.1 General Background

Los Alamos National Laboratory (LANL) was founded in 1943 as part of the Manhattan Project to develop the United States’ first atomic weapon. The Environmental Restoration (ER) Project was established in 1989 by the Department of Energy (DOE) to remedy environmental problems that developed during the subsequent years of operations. The ER Project is governed by the corrective action process under the following regulations: the Resource Conservation and Recovery Act (RCRA), the Comprehensive Environmental Response, Compensation, and Liability Act (CERCLA), the National Environmental Policy Act (NEPA), and DOE orders. The ER Project also operates under extensive legal, public, and stakeholder scrutiny.

The ER Project is responsible for the characterization, clean up, and monitoring of over 2,124 identified potential release sites (PRS). These PRSs have resulted from operations largely associated with weapons research, development, and production which have been conducted since 1942. To accomplish mission goals, the ER Project conducts field sampling activities to determine possible types and levels of chemical contamination as well as their geographic extent.

These sampling activities result in the generation of significant amounts of geologic, hydrogeologic, geochemical, and contaminant information. To ensure accurate environmental analyses, data quality review, analysis, and evaluation are an integral part of the business processes conducted by the project. Creation of an infrastructure to provide for cost-effective management of the data and the associated activities has therefore been a high priority. This infrastructure will ultimately support the establishment of a defensible environmental Administrative Record generated and maintained with standard extraction, visualization, reporting, and knowledge dissemination tools and integrated with robust document tracking and control tools.

2.0 The Reengineering Effort

2.1 Goals and Needs Assessment

In addition to supporting the Administrative Record, ER Project management is committed to streamlining existing work processes to better support quality environmental decision analysis. One goal of this process improvement is to better meet the needs of the non-ER stakeholders with respect to the environmental decision issues facing the Lab and surrounding communities. A second goal is to support the requirements and performance standards set forth by Lab and ER Project management as well as state and federal regulators. ER Project management determined that improving decision analysis depended on improving three factors: data availability, data control, and data quality.

To accomplish these goals, several steps were followed. First, a management-level needs assessment, consisting of extensive interviewing, was performed. Driven by the organizational goals above, the assessment identified the needs summarized below:

-Need for formal processes to track information flow
-Need for an effective information management infrastructure for information storage and retrieval
-Need to create robust information audit trail processes
-Need for a qualified information management team
-Need for more effective data sharing with the stakeholders
-Need for a defined strategy to implement an information management system

2.2 Business Process Analysis

The next step in meeting the improvement goals and fulfilling the stated project needs was the performance of a Business Process Analysis (BPA). A BPA consists of an inventorying and mapping of existing work processes and data flow throughout an organization. It identifies the people (and their roles and responsibilities), the data, the processes (how data is passed around), and the tools (both manual and automated) that allow an organization to perform its work. The BPA allows an organization to identify gaps, inefficiencies and misalignment with stated goals in their existing processes. This analysis then presents opportunities for reengineering these existing work processes.

Based on the BPA performed at the ER project, it was determined that the following factors were hampering the organization, preventing them from fully realizing their goals:

-People
     -Data quality and control functions were scattered across the organization, making them difficult to manage effectively.
     -There was no centralized information management leadership.
-Data
     -Data were scattered across the organization and contained in multiple, sometimes incompatible, formats. There was little integration of the relevant technical, regulatory, GIS and management data.
     -Collection and reporting standards generally did not exist or were not followed.
     -The process for collection of data from initial generation to final deposit in a document repository was weak.
     -Data quality and integrity needed to be better documented.
-Processes
     -The sample management process was inefficient.
     -The data management process needed to integrate processes.
     -The document management process needed to integrate with the sample and data management processes.
-Tools
     -Current tools did not effectively communicate and support each other.
     -Current tools were insufficient.

2.3 Critical Success Factors

An IM management team assessed the results of the BPA and from those results extracted several critical success factors. Critical success factors are those sets of activities without which reengineering success cannot be achieved. While some factors may carry more weight than others, typically all are necessary for a successful outcome. The critical success factors identified by the ER Project are shown below:

-Information Management System Plan Development
-Information Management Technical Team Creation
-Reengineering of Business Processes
-Hardware/software Infrastructure Implementation
-Conceptual Design Development
-Database Design and GIS Integration
-Data Migration and Clean Up
-Software Definition and Development

The following sections give an overview of each critical success factor and describe some of the accomplishments relevant to it.

2.3.1 Information Management System Plan Development

The foundational task was to develop a strategic plan that outlined the creation of the required Information Management System (IMS). This IMS strategic plan was developed and integrated into the overall ER Project strategic plan (Canepa, 1999). It was then reviewed by the Department of Energy (DOE) and subsequently received DOE support. This represented a critical step: large-scale reengineering efforts such as this one can rarely succeed without strong management support and commitment. Integrating this effort into the overall strategic plan and obtaining stakeholder buy-off demonstrated such a commitment and put the plan on a sound footing.

The IMS strategic plan was developed as a phased approach plan. System components, new processes, new roles and responsibilities would be phased in so as not to negatively impact existing production processes. In other words, the new system had to work before the old system could be deleted. Elements of the IMS strategic plan included reengineering of work processes, development of a conceptual system design, a technical architecture design, infrastructure design, software development, reorganization of data duties, and creation of an IM team.

2.3.2 Information Management Technical Team Creation

Information management resources were evaluated, inventoried and centralized into a single team under senior leadership reporting directly to ER Project management. The team was charged with the sole responsibility of developing and maintaining an IM system that would meet the new technical and business requirements of the project. A major reorganization effort was initiated to ensure that all Information Management groups were organized under a single leadership with a well-defined mission.

Prior to initiating the reengineering effort, information management functions within the ER Project were decentralized and scattered across multiple functional groups. This decentralized information management structure allowed each of the functional groups to maintain and control their own data and information systems. Under the decentralized structure there was redundancy in many information management functions, conflicting and uncoordinated development efforts, redundant storage and maintenance of data, redundant data collection, and lack of coordination of staff skills and training issues. In the end, this decentralized structure made it difficult to coordinate efforts and prevented development of a system of the caliber required for the project.

2.3.3 Reengineering of Business Processes

The ER Project business processes were divided into Technical Processes and Management Processes. As part of the project’s IMS Strategic Plan, these processes were identified, inventoried, and prioritized. The goal was to eventually revisit all business processes and identify areas of improvement prior to automation. The strategy for process reengineering, however, called for a focus on those processes critical to the project’s major mission. A massive effort, building on the BPA, was initiated to define the ‘As-Is’ scenario for the priority processes and then to develop the desired ‘To-Be’ states. These ‘To-Be’ processes are then documented and integrated into the organization through the development and update of standard operating procedures, desk instructions, and quality procedures.

2.3.4 Hardware/Software Infrastructure Implementation

To meet stakeholder and ER Project needs, the Information Management Plan (Canepa, 1999) called for an enterprise-wide system to be created. An enterprise system is characterized by easy access to tools, data, and communication across an organization and by the elimination of “data silos” and incompatible technologies. Additionally, stakeholders would eventually need varying degrees of access both within the larger LANL environment and outside of the LANL security firewall.

To create an enterprise system for the ER Project, it was determined that the hardware and software infrastructure would have to be upgraded and standardized. While multiple platforms and legacy systems can be accommodated within an organization, a cost-benefit analysis may show that the cost of integrating and maintaining these systems outweighs the cost of upgrading and development. This was the conclusion reached by the ER Project.

The ‘As-Is’ state of the infrastructure consisted of multiple hardware platforms including Macintosh, Windows-based desktop and laptop computers, and Unix-based workstations. Users were running multiple incompatible spreadsheets and database management systems including 4D, Access, SQL Server, and Oracle. Users could choose between Microsoft Internet Explorer and Netscape browsers. The infrastructure was also characterized by multiple domains, with desktop storage of data that wasn’t captured in the project’s main database. Additionally, some users did not follow standard business practices, e.g., uniform and regular backups. In short, although everything worked, data accessibility was not optimal, development efforts were difficult, and data processes were not well documented.

State-of-the-art technology was implemented to create a single domain with a uniform compatible hardware platform. Software options were standardized. Centralized file, Internet and data servers were installed. Daily backups with system redundancies were also implemented to ensure reliable system operation.

2.3.5 Conceptual System Design Development

One way to look at the work performed at the ER Project is that Potential Release Site (PRS) closure (i.e., the removal of PRSs from the operating permit) is the purpose for which data are collected and environmental decisions are made. Figure 1 shows a representation of the conceptual design developed to support that process. The technical work processes required by the ER Project are shown in the outside boxes. For each of these technical work processes, the required process automation application is shown in the inside boxes.

The IM system conceptual design consists of a central database repository that stores both tabular and GIS data in a Relational Database Management System with a Spatial Database Engine (SDE) that serves the spatial data. Needed components fall into two general categories: data input applications and data output/decision applications.

After the conceptual design was approved, the data input applications for sample management and the central data repository design efforts were initiated. These two efforts were viewed as key, as the entry, maintenance, and effective storage of quality data was seen as the foundation for the entire system and would be required for all data output/decision applications.

Figure 1: IM System Conceptual Design

2.3.5 Database Design and GIS Integration

The ER Project Database (ERDB) was designed to support all phases of the ER processes. It is a comprehensive design that allows the integration of workflow data, regulatory data, GIS data and technical data into a single data warehouse. The workflow data consists of project status data, management data, and business rules that support the ER workflow process. The technical data consists of field data, results and analysis data. The regulatory data consists of tracking information on regulatory decisions and the supporting document repository. The GIS (spatial) data consists of the information necessary for data visualization and modeling.

The resultant database comprises over 300 tables with a complex set of relationships as well as implied and programmed business rules. Figure 2 shows the various database modules for the ERDB. The workflow management module integrates work processes. The list management module provides output tracking. The other modules are self-explanatory.

Figure 2: ERDB Conceptual Design

2.3.6 Data Migration and Cleanup

Every reengineering effort will more than likely be faced with a data migration and cleanup exercise. Moving data from old structures to new is not a straightforward or push-button task. Many issues will reveal themselves, including but not limited to data type inconsistencies, differing methods of storing data, and the need to constrain previously unconstrained data. GIS data has its own unique issues. The following two subsections discuss some of the data migration and cleanup issues encountered.

2.3.6.1 Tabular Data Migration and Cleanup

After deploying ERDB, the next step was to clean up legacy technical data and migrate it into the new database structure. A significant amount of environmental data had been collected over the years, but under differing regulatory requirements, standards, and accepted methodologies. This data was stored both in multiple databases and in hard copy documents. Data needed to be moved into the new standard structure and checked against the hard copies. The data migration and cleanup included approximately 2.5 million records of analytical data.

Moving data from the old system to the new structure proved to be a very time-consuming task. In preparing for and executing the migration, we soon discovered that data migration is more than simply transporting data from one system to another. Several questions had to be answered early on in the project:

-What data and how much data must be migrated (i.e., what data is relevant to the Administrative Record)?
-Where is the data?
-What quality standards are required of the data? How much data cleansing is required?
-How should the effort be prioritized?

The ER Project not only had to identify the number of data files to migrate, but also the number of different systems from which data would be migrated. Data sources were not limited to actual data processing systems since some employees maintained files on their own workstations. These sources also had to be captured in the ERDB. While global fixes and “crosstalk” tables did help in the migration effort, there was still a significant amount of time-consuming record-by-record cleanup required.
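
The role of a “crosstalk” table can be illustrated with a minimal Python sketch. It assumes such a table is a lookup from legacy spellings to one standard value; the paper does not describe the actual structure. The analyte spellings, field names, and sample identifiers below are hypothetical and are not taken from ERDB. Records whose legacy value is not in the table are set aside for the record-by-record review described above.

```python
# Hypothetical lookup from legacy analyte spellings to a standard name.
# None of these values come from ERDB; they only illustrate the idea.
CROSSWALK = {
    "pb": "Lead",
    "lead": "Lead",
    "lead, total": "Lead",
    "u-238": "Uranium-238",
    "u238": "Uranium-238",
}


def migrate(records, crosswalk):
    """Split legacy records into cleanly mapped rows and rows needing manual review."""
    migrated, needs_review = [], []
    for rec in records:
        key = rec["analyte"].strip().lower()  # normalize case and stray whitespace
        if key in crosswalk:
            # Global fix: replace the legacy spelling with the standard name.
            migrated.append({**rec, "analyte": crosswalk[key]})
        else:
            # No mapping known: leave untouched for record-by-record cleanup.
            needs_review.append(rec)
    return migrated, needs_review


legacy = [
    {"sample_id": "S-001", "analyte": "Pb ", "result": 12.4},
    {"sample_id": "S-002", "analyte": "U238", "result": 0.8},
    {"sample_id": "S-003", "analyte": "lead (diss.)", "result": 3.1},
]
migrated, needs_review = migrate(legacy, CROSSWALK)
```

In this sketch the first two records are standardized automatically, while the third is held back for manual cleanup because its spelling is unknown.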

2.3.6.2 GIS Data Migration and Cleanup

The GIS data migration and cleanup presented its own unique set of problems. A separate organization maintained the GIS data for the project. The ER Project was the “owner” of a number of layers stored at the GIS facility; however, data was also derived from other organizations within LANL. Unfortunately, these sources were not responsible for the data sitting in the repository. In the past, changes could be made to the data but then were not reflected back to the “owner” organization. Obtaining and tracking updates and then reconciling layers became more and more difficult as time progressed. Over time, hundreds of spatial layers were created. These layers quickly became untrackable, because their history was not well documented and sometimes lacked sufficient metadata. Attribute data contained in those layers were often out of date and/or did not correspond to data in ERDB.

Up to this point, GIS-related information in tabular databases was not in sync with the attribute data in the GIS repository. Identifiers and names were not uniform. Changes were often made outside of the GIS system (e.g., in Adobe Illustrator), and these changes never made it back into the GIS layers. Multiple layers were created, changed, or updated for specific projects, creating confusion over which layer or layers contained the latest information.

The ER Project therefore embarked on a course to more effectively manage its GIS resources. The current GIS facility, which has an excellent infrastructure, will continue to serve large data sets that are static in nature: orthophotos, DEMs, contours, LIDAR, multispectral, etc. The ER Project has identified the layers that are under its ownership and is currently setting up an internal SDE Server on top of ERDB to provide access to these layers. Many layers that are not properly documented have been archived and will now be used on an as-needed basis only. A stringent change control process has been put in place. Field teams within the ER Project are now taking ownership of the parts of layers that they can best define, such as PRSs and sampling locations. All attribute data currently stored with the layers is being purged, and attribute data will now only be stored, non-redundantly, in ERDB.

2.3.7 Software Definition and Development

Software definition followed a strict process. Based on the reengineered (i.e., “To-Be”) processes, a conceptual design for an individual application was developed. Once client buy-in was obtained, a detailed requirements generation and review process was undertaken. The subsequent requirements document allowed developers to understand the scope of the system they were to build and allowed the client to determine if the deployed system met specifications. Once developed, the requirements document is brought under change control. All changes must then be negotiated between client and developer, as changes to requirements at this point will impact schedule, budget, or resources. The ER Project implemented robust tracking software to manage this aspect of the project.

Once a requirements document receives user sign-off, a design document is developed. The design document contains screen shots, business rules, database table designs, and architecture designs. This document is dynamic as it is constantly revised as new issues arise and are resolved within the context of the requirements.

The next step requires a detailed design to be maintained by the developers. This documents the types of objects the system will need to function properly. At this point, coding, testing, and bug fixing and tracking occur. Deployment involves yet more testing, development of help files and training manuals, and actual user training. Once all parties feel confident of their products and processes, the move to production occurs.

Even after production is initiated, the development process is never absolutely complete. Despite everyone’s best efforts, gaps may remain, new enhancements and changes that will increase efficiency are discovered as production activities kick in, new and more efficient technologies are constantly coming on line, and organizational goals and missions shift. Knowing this will always be the case, organizations must still proceed forward with their efforts, the only alternative being paralysis and stagnation.

3.0 GIS Applications: A Case Study of SMO and SMART

This section presents an overview of two of the GIS-oriented applications under development by the ER Project. The Sample Management Office (SMO) application is where the primary association of GIS data with analytical data occurs. The other application, the Spatial Mapping, Analysis and Reporting Tool (SMART), helps in the extraction, visualization, and reporting of the analytical data.

3.1 Overview of the Information Management System

From an application development perspective, the Information Management System requires the following categories of tools, which have led to the development or proposed development of the applications listed below (Figure 3):

1)Data Input Application, including tools for:
     a.Sampling event planning
     b.Sample tracking
     c.Electronic Data Deliverable and data checkers
     d.Data Verification and Validation
     e.Data Quality Assessment
2)Data Reporting and Dissemination Application, including tools for:
     a.Data extraction, analysis, visualization and mapping
     b.Modeling management
     c.Web-based information portal and decision management
     d.Workflow management
     e.Document control system

Figure 3: Information Management System Overview

3.2 Sample Management Office (SMO) Application

The SMO application allows for the input, QA and tracking of sampling plans, chains of custody for physical samples, lab requests, shipping chains, lab deliverable receipt and order fulfilment, as well as validation and quality assurance procedures. In the past, these activities were captured in a series of systems that did not communicate with each other efficiently and could not provide the ER Project with both the tracking information and, more importantly, the data turnaround times it required. The new system provides a seamless, efficient method to process sampling activities from the planning stages to final disposition in the data repository and flagging for public release.

The analytical data generated during the SMO process is inherently tied to GIS data. Analytical data is collected at a particular location. It is usually associated with a particular investigative area. In the ER Project, investigative areas are designated potential release sites (PRS). Additionally, the ER Project performs investigations on stretches of river channels they designate as “reaches,” and they examine impacts on partial and/or entire watersheds. At issue was the fact that before implementation of the SMO system, the project had no centralized, reliable source for determining the association of analytical sample data with these GIS features. Much of the information was contained only in hard copy reports, spreadsheets and small desktop databases.

The solution to this issue was twofold. First, GIS feature identifiers are now managed in a uniform way in ERDB, and a data structure to support feature associations was created. Second, the SMO application requires users to establish these associations as they develop their sampling campaigns. While a seemingly simple and obvious step, the implementation of this solution vastly simplifies the reporting and analysis of data and will make association data more timely and more readily available.
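
A data structure for feature associations might be sketched as follows, using Python's built-in sqlite3 module purely for illustration. The table names, column names, and identifiers are hypothetical; the paper does not describe the actual ERDB tables. The sketch assumes one table of uniformly identified GIS features, one table of samples, and a link table that records which samples belong to which PRS, reach, or watershed.

```python
import sqlite3

# In-memory database for illustration only; all table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # ask SQLite to enforce the references below
conn.executescript("""
CREATE TABLE feature (
    feature_id   TEXT PRIMARY KEY,
    feature_type TEXT NOT NULL CHECK (feature_type IN ('PRS', 'REACH', 'WATERSHED'))
);
CREATE TABLE sample (
    sample_id   TEXT PRIMARY KEY,
    location_id TEXT NOT NULL
);
CREATE TABLE sample_feature (
    sample_id  TEXT NOT NULL REFERENCES sample(sample_id),
    feature_id TEXT NOT NULL REFERENCES feature(feature_id),
    PRIMARY KEY (sample_id, feature_id)
);
""")

conn.executemany("INSERT INTO feature VALUES (?, ?)",
                 [("PRS-A", "PRS"), ("REACH-1", "REACH")])
conn.executemany("INSERT INTO sample VALUES (?, ?)",
                 [("S-001", "LOC-10"), ("S-002", "LOC-11")])
# Associations are recorded when the sampling campaign is planned.
conn.executemany("INSERT INTO sample_feature VALUES (?, ?)",
                 [("S-001", "PRS-A"), ("S-001", "REACH-1"), ("S-002", "PRS-A")])


def samples_for_feature(conn, feature_id):
    """Return the IDs of all samples associated with one GIS feature."""
    rows = conn.execute(
        "SELECT sample_id FROM sample_feature WHERE feature_id = ? ORDER BY sample_id",
        (feature_id,),
    )
    return [row[0] for row in rows]


prs_samples = samples_for_feature(conn, "PRS-A")
reach_samples = samples_for_feature(conn, "REACH-1")
```

Because one sample can be linked to several features, a question such as "which samples were taken for this PRS?" becomes a single lookup rather than a search through reports and spreadsheets.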

3.3 SMART

The SMO application allows the ER project to produce a high quality, complete stream of analytical data. This data is now captured in a reliable, secure, and controlled data repository. The next stage of the IMS Strategic Plan called for the creation of an application that can extract, analyse, and report this analytical data. Critical to the success of such an application would be data accessibility, strong tracking functionality, ease of use and a high level of GIS integration.

Currently, maps have to be requested and specified from the GIS facility, creating high costs, long turnaround times, and often the need to request multiple iterations to get what was desired. Similarly, on the tabular side, data sets and standard reports need to be requested from and generated by the “data group.” This approach was appropriate when data was not centrally managed and “clean” data was not readily available. As these issues fade in importance, it is becoming increasingly feasible to allow direct access to data, automated reporting, and individual map production. The application currently being developed to accomplish these tasks is the Spatial Mapping, Analysis and Reporting Tool (SMART).

SMART is designed to give the field teams that generate the data direct access to that data and the tools to analyze it. SMART consists of several modules: Project Setup, Data Set Generation, Report Generation and Tracking (Figure 4). SMART is built on ArcView 8.1 technology. It is a customisation of the ArcView interface that will interact with a Visual Basic application. Additionally, a customized ArcCatalog will be used as a tracking and metadata mechanism.

The functionality of SMART can be described as follows. A data set needs to be extracted from the more than 2.5 million analytical result records contained in ERDB. SMART provides a user with a number of ways to accomplish this task. There is a straight Structured Query Language (SQL) query interface similar to the one provided by ArcView. There is a custom “Query Builder” with an array of pick lists that allows a user who is unfamiliar with the complex database structure, or who is not an SQL power user, to nonetheless develop complex queries on the data. Finally, the user may make location or PRS selections directly off the map. Once this data set is retrieved, the user can then view and further manipulate it.
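
The idea behind a pick-list query builder can be shown with a short Python sketch. This is not SMART's code (SMART is described as an ArcView customisation working with a Visual Basic application); the table name `analytical_result`, its columns, and the function `build_query` are all hypothetical. The sketch turns the user's selections into a parameterized SQL statement so that the user never has to write SQL by hand.

```python
def build_query(analytes=None, prs_ids=None, detects_only=False):
    """Turn pick-list selections into a parameterized SQL string and its parameters.

    Table and column names are hypothetical stand-ins for the real schema.
    """
    clauses, params = [], []
    if analytes:
        # One "?" placeholder per selected analyte, e.g. "analyte IN (?, ?)".
        clauses.append("analyte IN (%s)" % ", ".join("?" * len(analytes)))
        params.extend(analytes)
    if prs_ids:
        clauses.append("prs_id IN (%s)" % ", ".join("?" * len(prs_ids)))
        params.extend(prs_ids)
    if detects_only:
        clauses.append("detected = 1")
    sql = "SELECT sample_id, location_id, analyte, result FROM analytical_result"
    if clauses:
        sql += " WHERE " + " AND ".join(clauses)
    return sql, params


# Selections a user might make from the pick lists.
sql, params = build_query(analytes=["Lead", "Arsenic"], prs_ids=["PRS-A"])
```

Keeping the selected values in a separate parameter list, rather than pasting them into the SQL text, lets the database driver handle quoting and keeps the generated statement easy to store alongside the saved data set.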

Once an acceptable data set is created, a list of those records can be saved and audited. Metadata can be associated with the saved list, describing the data set preparation process. If this data set is then used in a report or map, an association to the list and metadata can be established. When maps and reports are now sent to colleagues, regulators, or other stakeholders, the relevant database records, selection criteria, and data preparation method information can be quickly accessed and understood. This effectively closes the loop on reporting and data activities.
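
One way to picture the saved list and its metadata is the following Python sketch. The field names, the use of a SHA-256 fingerprint, and the functions `save_data_set` and `link_product` are assumptions made for illustration; the paper does not specify how SMART stores lists or audits them. The fingerprint is simply a compact way to confirm later that a map or report still refers to exactly the same set of records.

```python
import hashlib
from datetime import datetime, timezone


def save_data_set(record_ids, selection_criteria, prepared_by, notes):
    """Freeze a list of record IDs together with metadata describing how it was made."""
    ids = sorted(record_ids)  # sort so the fingerprint does not depend on retrieval order
    fingerprint = hashlib.sha256("\n".join(ids).encode("utf-8")).hexdigest()
    return {
        "record_ids": ids,
        "fingerprint": fingerprint,
        "metadata": {
            "selection_criteria": selection_criteria,
            "prepared_by": prepared_by,
            "notes": notes,
            "saved_at": datetime.now(timezone.utc).isoformat(),
        },
    }


def link_product(saved_list, product_name):
    """Associate a map or report with the saved list it was produced from."""
    return {
        "product": product_name,
        "fingerprint": saved_list["fingerprint"],
        "record_count": len(saved_list["record_ids"]),
    }


# Hypothetical record IDs and criteria, for illustration only.
saved = save_data_set(
    ["R-0003", "R-0001", "R-0002"],
    selection_criteria="analyte IN ('Lead') AND prs_id IN ('PRS-A')",
    prepared_by="field team member",
    notes="Excluded rejected results after validation review.",
)
report_link = link_product(saved, "Lead results map, PRS-A")
```

A recipient of the map can then be pointed to the saved list and its metadata to see which records were used and how they were selected.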

Figure 4: SMART Application Overview

The mapping and analysis functions provided in SMART will allow users to quickly post analyte names, result values, depths, and detects on the map at their associated sampling location points. The application will also automate the placing of graphs on the map that describe one or many analytes with detection limits, again associated with their sampling locations.
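
The data preparation behind posting results on a map can be sketched in Python as below. This covers only the grouping of results into one text label per sampling location; placing the labels on the map would be done by the GIS software and is not shown. The field names, units, and the "(ND)" marker for non-detects are hypothetical choices, not SMART's actual conventions.

```python
from collections import defaultdict


def build_labels(results, analytes):
    """Build one multi-line text label per sampling location for the chosen analytes."""
    lines_by_location = defaultdict(list)
    for r in results:
        if r["analyte"] not in analytes:
            continue
        flag = "" if r["detected"] else " (ND)"  # mark non-detects
        lines_by_location[r["location_id"]].append(
            f'{r["analyte"]}: {r["result"]:g} {r["units"]} @ {r["depth_ft"]:g} ft{flag}'
        )
    return {loc: "\n".join(lines) for loc, lines in lines_by_location.items()}


# Hypothetical result records, for illustration only.
results = [
    {"location_id": "LOC-10", "analyte": "Lead", "result": 12.4,
     "units": "mg/kg", "depth_ft": 0.5, "detected": True},
    {"location_id": "LOC-10", "analyte": "Arsenic", "result": 3.0,
     "units": "mg/kg", "depth_ft": 2.0, "detected": False},
    {"location_id": "LOC-11", "analyte": "Lead", "result": 8.1,
     "units": "mg/kg", "depth_ft": 0.5, "detected": True},
    {"location_id": "LOC-11", "analyte": "Zinc", "result": 40.0,
     "units": "mg/kg", "depth_ft": 0.5, "detected": True},
]
labels = build_labels(results, analytes={"Lead", "Arsenic"})
```

Each label is keyed by location, so it can be attached to the corresponding sampling point regardless of how many results that location has.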

The goal of the application is to give the user an intuitive, easy-to-use interface. To that end, not all ArcView functionality will be exposed. Custom buttons and interfaces will allow the user to pick parameters and easily generate the desired outputs, whether those outputs are a table, a standalone graph, data posted on a map, or graphs posted on a map. The result will be a new level of data transparency for the user, immediate access to analytical and GIS data, an integration of analytical and GIS data, visualization and reporting tools that will allow users to interact with their data immediately, and tools to keep track of the decision-making process associated with data set, map, and report production.

4.0 Conclusions and Future Work

The ER Project study presented here is an example of a successful reengineering effort of business processes in an environmental restoration program. Although production work at Los Alamos National Laboratory is ongoing, the project has been able to build new work processes and electronic systems that meet their needs as well as the needs of the internal and external stakeholders. Readily accessible data, integration of GIS data, better data control and data quality will not only satisfy user needs, but will provide documentation and audit tracking capabilities that can be utilized in a legal arena, if necessary, to defend the environmental Administrative Record.

The reengineering effort has also been an important step toward creating an information technology infrastructure that will support other environmentally focused decision support systems. Two future applications (currently being prototyped) that the project will tackle next are a “decision and knowledge management portal” currently codenamed Zeus and a modelling management system known as GeoPro.

Zeus will be a web-based application based around ArcIMS that will bring together documents, GIS data, technical data and project workflow statuses to allow upper management to understand, manage and view the information surrounding all environmental issues. Zeus will provide multiple entry points to information: by organization, by regulatory driver, by geographic feature, etc.

GeoPro is an application that will integrate databases that store technical data (ERDB) with the data from numerical models and 3D framework modeling software. The result will be a seamless integration of data across systems, the streamlining of model evaluation, and the facilitation of management decision making. The ER Project is currently working with the US Geological Survey on jointly developing this aspect of the system.

In conclusion, the ER Project has developed a strategic plan, a robust data repository, and the primary building blocks of a series of applications to support environmental decision making. These tools will allow for more efficient use of resources and acceleration of complex environmental cleanup activities at LANL.

5.0 References

Canepa, J.A., 1999, Integrated Information Management Plan. Los Alamos National Laboratory memorandum E/ER:99.228 with attachment (ER ID 64099).

6.0 Author Information

Steve Scherma
Senior Consultant
1567 Luisa St.
Santa Fe, NM 87505

Preferred contact information:

505-501-4786 (cell)

scherma@lanl.gov

Additional contact information:

505-820-6501 (Santa Fe office)
303-388-7072 (Denver office)
303-388-7613 (Denver fax)

sscherma@revisioninc.com