Christoph Spoerri, Anne Marie Miller, Tod Dabolt

The Reach Indexing Tool for the National Hydrography Dataset: Functionality and Impacts on State Water Programs

Abstract

The U.S. Environmental Protection Agency (EPA) is developing a Reach Indexing Tool (RIT) for the National Hydrography Dataset (NHD). NHD is a spatial database containing national surface water features. The NHD-RIT is an ArcView-based tool that helps states georeference surface water entities to NHD using the dynamic segmentation data model. The tool is primarily used to index Clean Water Act Section 303(d) and 305(b) waterbodies and state Water Quality Standards. This paper discusses the evolution, purpose, and functionality of the NHD-RIT and its impacts on states' Clean Water Act programs.

Introduction and Background

Until 1998, most states in the United States did not have spatial datasets allowing them to analyze, display, and map their waterbodies for programs under the Clean Water Act (e.g., impaired waterbodies under Sections 303(d) and 305(b) and water quality standards). The lack or incompleteness of such datasets was a problem since an accurate determination of the environmental health of the nation's surface waters was not possible. The visualization of impaired waters would also have been beneficial to states and the federal government to make decisions regarding resource allocation to improve water quality.

In addition to the lack of datasets, datasets used by the states often vary in quality, scale, and origin. Some states use hydrographic datasets they developed, and others use EPA's standardized, national Reach File 3 (RF3) dataset or the U.S. Geological Survey's (USGS's) digital line graph (DLG) dataset. RF3 is a spatial digital dataset that was first compiled in 1992 with a scale of 1:100,000 (for more information see: http://www.epa.gov/owowwtr1/monitoring/rf/rfback.html). States that adopted RF3 or DLG sometimes made changes to the datasets (e.g. changing the scale or the reach structure depending on their needs). This diversity in spatial datasets reduces the confidence in results from analyses using those datasets.

To promote the use of a single, national dataset, EPA contracted the Research Triangle Institute (RTI) in 1997 to design and develop a tool that would enable states to georeference their water-related information to RF3 (referred to as reach indexing). The RF3 Reach Indexing Tool (RF3-RIT) is designed to allow users to assign attributes to entire or partial RF3 reaches by using Esri's dynamic segmentation model. Indexing only portions of a reach was a major improvement, since this allows tracking water-related information in its exact location along reaches without modifying the underlying spatial dataset.

In December 1999, the first (preliminary) version of the National Hydrography Dataset (NHD) was made available to the public. The NHD was developed through a joint effort of the USGS and EPA during the last few years and is the next Reach File version. It was created by integrating USGS's DLG spatial information and EPA's RF3 attribute information, and is intended to provide a standardized, national data model for the U.S. surface water network and any associated water features. The dataset contains information about surface water features such as lakes, ponds, streams, springs, and coastlines (USGS and EPA, 2000). Within NHD, homogenous hydrologic features are combined into reaches that are assigned unique identifiers (reach codes) and form a stream network containing directional flow information (stream routing). Reaches and their reach codes provide a framework that allows any water-related information to be linked to the surface water drainage network. In addition, NHD provides the following new features over RF3 (Figures 1 and 2):

Polygons are used to represent two-dimensional (2-D) waterbody features such as lakes, ponds, and wide rivers
Artificial paths through lakes, ponds, and wide rivers simplify modeling and size estimations
NHD will have a protocol in place for updating the database to include features that were not in the original 1:100,000 version.

Figure 1. Lake representation in RF3; the lake actually consists of three reaches (red, green, purple).

Figure 2. Lake representation in NHD; the lake consists of a polygon and a single reach (artificial path)

Because of the new features in NHD, EPA decided to develop a new version of the Reach Indexing Tool (NHD-RIT) that works with NHD. Design of the tool began in the summer of 1999, and development started shortly after the first release of NHD. The rest of this paper discusses some of the major functions of the tool and the benefits that may accrue to state water programs through the use of the NHD-RIT. More detail about the information presented in this paper is available from RTI (2000a and 2000b).

Reach Indexing Tool Functionalities

The following section discusses the functionality and the requirements of the NHD-RIT and describes the design approach used to satisfy them. The requirements for the new NHD version of the tool were collected over a 2-month period. Input was obtained from users of the RF3-RIT as well as from an EPA team that is involved in developing NHD applications and tools.

General Design

The NHD-RIT is designed as an ArcView extension. This allows the user to include the tool with any existing ArcView project as well as any other extension. As an extension, it also has the advantage of being easily upgraded without having to rebuild existing ArcView projects. This can be achieved by simply replacing the extension with a new version of the NHD-RIT, whenever one becomes available.

In order to avoid any conflicts with other extensions or project-specific scripts, all the scripts in the NHD-RIT are prefixed with "rit". In addition, the tool does not make use of global variables, since they can be easily overwritten by other applications. Furthermore, the use of global variables can lead to confusion, since developers/programmers can easily lose track of when, where, and by whom they were created or last modified. To avoid these problems, the NHD-RIT keeps track of application-wide information by storing it in a Script Editor (Sed) class as a script object. This also allows the user to keep track of the information across indexing sessions, meaning that after the user exits the ArcView project, the indexing can be resumed where it was left off the next time the project is opened again. Another advantage is that a project (including the necessary external data files) can be sent to another user, and this user can start working where the previous user stopped. This is especially useful when multiple people are working on the same project.

The information in the script object is stored in a hierarchical structure by keyword. For example:

<Keyword 1>
	<Keyword 1.1>
		Information line 1
		Information line 2
	<Keyword 1.2>
		Information line 1
<Keyword 2>
	<Keyword 2.1>
etc.

To retrieve data for a given keyword, the text string of the script is searched for the keyword. Any information under the keyword or any of its subkeywords is returned. New information or keywords can easily be added, and old information can be removed when the need arises. The manipulation and retrieval of information in the script object occurs through three scripts:

rit.readinfo	reads the desired information from the specified keyword
rit.writeinfo	writes new information to the desired keyword; if the keyword does not yet exist it will be created, otherwise the information will be overwritten
rit.deleteinfo	deletes the specified keyword or information
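
The behavior of these three scripts can be sketched outside ArcView. The following Python example is only an illustration under stated assumptions: the class name InfoStore, its method names, and the in-memory dictionary layout are hypothetical and are not the tool's actual Avenue implementation, which stores the information as text in a script object.

```python
def _new_node():
    # Each keyword holds its own information lines plus nested sub-keywords.
    return {"info": [], "children": {}}


class InfoStore:
    """Hypothetical stand-in for the hierarchical keyword store."""

    def __init__(self):
        self._root = _new_node()

    def _find(self, path, create=False):
        # Walk the keyword path (a list of keywords); optionally create missing levels.
        node = self._root
        for key in path:
            children = node["children"]
            if key not in children:
                if not create:
                    return None
                children[key] = _new_node()
            node = children[key]
        return node

    def write_info(self, path, lines):
        # Creates the keyword if it does not exist, otherwise overwrites its lines.
        self._find(path, create=True)["info"] = list(lines)

    def read_info(self, path):
        # Returns the lines under the keyword and under all of its sub-keywords.
        node = self._find(path)
        if node is None:
            return []
        collected = list(node["info"])
        for key in node["children"]:
            collected.extend(self.read_info(path + [key]))
        return collected

    def delete_info(self, path):
        # Removes the keyword together with everything stored beneath it.
        if not path:
            raise ValueError("a keyword path is required")
        parent = self._find(path[:-1])
        if parent is not None:
            parent["children"].pop(path[-1], None)


store = InfoStore()
store.write_info(["Session", "EventTable"], ["07080107_events.dbf"])
store.write_info(["Session", "User"], ["first user"])
print(store.read_info(["Session"]))
```

Because the store is a single object saved with the project, passing the project to another user also passes along the session state, which is the property the paper relies on.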

Assigning Attributes to Entire or Sections of Reaches

The ability to assign attributes to entire reaches or to portions of a reach without changing the underlying spatial data is essential to the tool. It allows the user to specify the exact location to which water-related information applies. By avoiding any structural changes to the spatial dataset, information from different users as well as across state water programs can be processed and compared in a consistent manner.

The following example shows how this functionality becomes crucial: a user wants to display the location where the effluent from a wastewater treatment plant enters a stream. At the same time the user wants to display the section of the stream where the water quality is impacted by the effluent wastewater. This can be easily accomplished by creating a point event for the end of the pipe and a linear event for the impacted stream section as described in the following paragraph.

To achieve this functionality, the NHD-RIT applies Esri's dynamic segmentation data model available in ArcView. This data model is based on the idea that attributes applying only to a section of a feature can be displayed by simply specifying the start (From measure) and end (To measure) points of the section as measures along the route feature (Figure 3). The start and end points and the unique identifier of the feature are then stored with the attribute data in a database table (Note: the record in the table is called an event). The same methodology is used when a point event is created, although instead of specifying both a start and end point a single point position (Point measure) is stored.

The NHD-RIT applies Esri's data model to the transport and coastline reach features in NHD (USGS and EPA, 2000). In the transport reach feature, each reach has a length of 100, and therefore the From and To measures range from 0 to 100 (with the exception of branched artificial paths in lakes, where the measures range from 0 to 200). The tool allows users to select one or more features (reaches) from the transport reach feature, determines the appropriate measures, and stores the identifier of the reach (reach code), the From and To measures, and some additional information in a dBASE table (the event table).

Figure 3. The From (F_meas) and To (T_meas) measures specify the exact location along a reach (Rch_code) to which a set of attributes apply. If the user specifies more than two sets of attributes to a reach, each set can be displayed offset from the original reach by a distance specified in the Eoffset field.
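
To illustrate how a measure translates into a map position, the following Python sketch interpolates a point along a reach. It is a simplified illustration, not the ArcView implementation: it assumes that measures are proportional to distance along the reach in the order its vertices are stored, and the reach code, coordinates, and function name are hypothetical.

```python
import math


def point_at_measure(vertices, measure, max_measure=100.0):
    """Return the (x, y) position at a measure along a reach polyline."""
    if not 0.0 <= measure <= max_measure:
        raise ValueError("measure lies outside the reach")
    segments = list(zip(vertices, vertices[1:]))
    lengths = [math.dist(a, b) for a, b in segments]
    # Convert the measure (0..max_measure) into a distance along the line.
    target = sum(lengths) * measure / max_measure
    walked = 0.0
    for (a, b), length in zip(segments, lengths):
        if length > 0 and walked + length >= target:
            t = (target - walked) / length
            return (a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1]))
        walked += length
    return vertices[-1]


# Hypothetical reach geometry and a linear event covering its middle half.
reach = [(0.0, 0.0), (10.0, 0.0), (10.0, 10.0)]
event = {"RCH_CODE": "07080107000001", "F_MEAS": 25.0, "T_MEAS": 75.0}

start = point_at_measure(reach, event["F_MEAS"])
end = point_at_measure(reach, event["T_MEAS"])
print(start, end)  # (5.0, 0.0) (10.0, 5.0)
```

The event record itself carries no geometry. Only the reach code and the two measures are stored, which is why the underlying spatial dataset never has to be modified.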

Indexing of 2-D Features

A feature in NHD that was not available in RF3 is the polygon topology and three associated region themes representing lakes, wide rivers, swamps, and other hydrological features (USGS and EPA, 2000). Of the three region themes, the waterbody reach feature has a unique identifier (reach code) and can exist for headwater, terminal, in-line, and isolated waterbodies. The waterbody reach provides a link to waterbodies for external information.

The NHD-RIT also allows the user to index regions from the waterbody reach theme. Since the dynamic segmentation model cannot be applied to polygon features, the tool creates a shapefile that contains the waterbody features selected by the user. The selection is made by drawing a polygon around the waterbodies to be indexed.

By using a shapefile, the user can also index only partial waterbodies. For this purpose, the user needs to draw a polygon around the portion of the waterbody that should be indexed. The tool then clips the desired section of the waterbody with the polygon, and saves the resulting shape in the waterbody shapefile (Figure 4).

Figure 4. The polygon (thin black line) created by the user is used to clip the 07080107_WBrch theme (blue) and create the waterbody shape file 07080107w.shp (orange).
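
The clipping step can be sketched as follows. The paper does not state which clipping method the tool uses, so this Python example substitutes the Sutherland-Hodgman algorithm, which assumes that the user-drawn polygon is convex and that its vertices are listed counter-clockwise. The function names and coordinates are hypothetical.

```python
def clip_polygon(subject, clip):
    """Clip `subject` against a convex, counter-clockwise `clip` polygon."""

    def inside(p, a, b):
        # On or to the left of the directed edge a->b counts as inside.
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersection(p, q, a, b):
        # Point where segment p-q crosses the line through a and b.
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        denom = dx1 * dy2 - dy1 * dx2
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / denom
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for a, b in zip(clip, clip[1:] + clip[:1]):
        if not output:
            break
        current, output = output, []
        prev = current[-1]
        for point in current:
            if inside(point, a, b):
                if not inside(prev, a, b):
                    output.append(intersection(prev, point, a, b))
                output.append(point)
            elif inside(prev, a, b):
                output.append(intersection(prev, point, a, b))
            prev = point
    return output


def area(polygon):
    # Shoelace formula; used here only to check the clipped result.
    pairs = zip(polygon, polygon[1:] + polygon[:1])
    return abs(sum(x1 * y2 - x2 * y1 for (x1, y1), (x2, y2) in pairs)) / 2.0


lake = [(0, 0), (4, 0), (4, 4), (0, 4)]        # hypothetical waterbody
drawn = [(2, -1), (5, -1), (5, 5), (2, 5)]     # hypothetical user polygon
partial = clip_polygon(lake, drawn)
print(partial, area(partial))
```

In this example the user polygon covers the eastern half of the lake, so the clipped shape has half the lake's area. A polygon drawn entirely around a waterbody returns the waterbody unchanged.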

Other Requirements/Functionalities of the NHD-RIT

Use the standard NHD event table format

EPA's NHD Team developed a standardized structure for event tables that are used with NHD. In addition to the default event table fields (F_meas, T_meas, Rch_code, and Eoffset), the tables include the following fields:

EVENT_ID: A unique identifier used for event maintenance

ENTITY_ID: Identifier used to link the event to an external data source

ATTR_PRG: Attribute describing the program/purpose/classification of the event table

ATTR_VAL: Value related to the Attr_Prg field

META_ID: Identifier used to link the event to metadata information

STATE: State abbreviation in which the reach is located

RCH_DATE: Date when the associated reach was created

DUU_ID: Digital update unit (DUU) identifier
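
As an illustration, one record of such an event table could be modeled as shown below. This is a sketch only: the field names follow the list above, but the Python types, the class name, the sample values, and the range check (based on the 0 to 100 and 0 to 200 measure ranges described earlier) are assumptions, not part of the published table definition.

```python
from dataclasses import dataclass, asdict


@dataclass
class LinearEvent:
    # Default dynamic segmentation fields.
    rch_code: str
    f_meas: float
    t_meas: float
    eoffset: float
    # Additional fields of the standard NHD event table.
    event_id: int
    entity_id: str
    attr_prg: str
    attr_val: str
    meta_id: int
    state: str
    rch_date: str
    duu_id: str

    def __post_init__(self):
        # Assumed check: measures run 0-100, or 0-200 on branched artificial paths.
        for measure in (self.f_meas, self.t_meas):
            if not 0.0 <= measure <= 200.0:
                raise ValueError("measure outside the expected 0-200 range")

    def as_row(self):
        # Upper-case keys mirror the field names of the event table.
        return {name.upper(): value for name, value in asdict(self).items()}


# All values below are made up for illustration.
event = LinearEvent("07080107000001", 0.0, 42.5, 0.0, 1, "ENTITY-001",
                    "303D", "IMPAIRED", 1, "TN", "1999-12-01", "07080107")
print(event.as_row())
```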

Create, modify, and delete events from the event table and waterbody shapefiles through a user-friendly interface

The NHD-RIT is designed to hide the complexity of the dynamic segmentation data model and its data structure from the user. This is accomplished by providing the user with tools for adding, modifying, and deleting linear and point events and waterbody shapefiles (Figure 5).

Figure 5. Menus, buttons, and tools available to the user. The buttons with blue symbols are the additional buttons and tools provided by the NHD-RIT

To add linear events, selection tools (see below) are available to the user. After selecting the desired reaches from the transport reach theme and providing an ID for the new events, the tool populates the fields in the event table with the appropriate data values.
To create a point event, the user has two choices: (1) a point located on the reach, and (2) a point offset from the reach. In either case, the user only has to click on the location where the point should be created.
Tools for editing spatial and attribute information in the event tables and waterbody shapefiles are provided as well. The user can change the extent of indexed entities by moving the end points, or an entity can be split into two if the need arises. Entity ID and attribute program (Attr_prg) and value (Attr_val) fields can be changed as well.
The NHD-RIT also allows the user to delete linear and point events by selecting them and clicking on the delete button.

Provide a set of selection tools to facilitate the selection of continuous sets of reaches

The NHD-RIT includes several selection tools to help the user select multiple stream reaches that make up a surface water entity connected by flow relations. Currently, only two are implemented: the "point-to-point" selection, which selects all reaches between two points along the transport route, and the "upstream" selection, which selects all reaches upstream of a selected reach. Additional selection tools will be added later, such as:

select only reaches along the main stem upstream of the selected reach

select all reaches along the main stem downstream of the selected reach

select all reaches between two reaches including tributaries.
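
The "upstream" selection can be thought of as a traversal of the flow relations. The following Python sketch shows the idea using a breadth-first search. It assumes the flow relations are available as simple (upstream reach, downstream reach) pairs; the reach labels and the function name are hypothetical, and the actual tool works against the NHD flow tables from within ArcView.

```python
from collections import defaultdict, deque


def select_upstream(flow, start):
    """Return `start` plus every reach that drains into it."""
    feeders = defaultdict(list)
    for upstream, downstream in flow:
        feeders[downstream].append(upstream)

    selected = {start}
    queue = deque([start])
    while queue:
        reach = queue.popleft()
        for candidate in feeders[reach]:
            if candidate not in selected:
                selected.add(candidate)
                queue.append(candidate)
    return selected


# Hypothetical network: A and B join to form C; C and D join to form E.
flow = [("A", "C"), ("B", "C"), ("C", "E"), ("D", "E")]
print(sorted(select_upstream(flow, "C")))  # ['A', 'B', 'C']
```

The planned main stem and downstream tools would follow the same pattern, with the traversal restricted to one branch or reversed in direction.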

Create and maintain meta data for event and waterbody tables

The tool maintains meta data that are compliant with the Federal Geographic Data Committee (FGDC) standard (Version 2 - 1998; FGDC-STD-001 June 1998) for every event and indexed waterbody. To minimize the amount of meta data, events share meta data entries. If events were created with the same source information and by the same user, they share the same meta data and therefore the information is stored only once.
Meta data are maintained in a dBASE table, which contains a separate field for each item of information that is required. Most of the information is automatically maintained by the tool. The user is only responsible for providing information about himself/herself and the sources that were used to index the entities (Figure 6).

Figure 6. Main entry screen for the meta data created by the NHD-RIT.
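
The sharing of meta data entries can be sketched as a lookup keyed by user and source. The Python example below is a minimal illustration under that assumption; the class and method names are hypothetical, and the real meta data table holds many more FGDC fields than the two shown here.

```python
class MetadataRegistry:
    """Hands out one META_ID per distinct (user, source) combination."""

    def __init__(self):
        self._ids = {}

    def meta_id_for(self, user, source):
        key = (user, source)
        if key not in self._ids:
            # First event with this user and source: store a new entry.
            self._ids[key] = len(self._ids) + 1
        return self._ids[key]


registry = MetadataRegistry()
first = registry.meta_id_for("first user", "1998 303(d) list")
second = registry.meta_id_for("first user", "1998 303(d) list")
third = registry.meta_id_for("first user", "field survey")
print(first, second, third)  # 1 1 2
```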

Provide the user with a list of IDs to facilitate indexing

To facilitate the task of assigning IDs to events, the tool provides the user with a list of IDs. The list is specified in one of the following ways:

As a predefined list in dBASE format that is loaded into the project

Created while the user indexes; in this case, IDs are added to the list as the user enters them

Extracted from an existing database through Open DataBase Connectivity (ODBC).

The list is displayed (Figure 7) whenever the user adds a new line/point entity or a new waterbody feature. If the user wishes to change entity information for one or more events, the list will be displayed as well.

Figure 7. This dialog box is displayed whenever a user adds a new entity or updates information related to an entity ID, attribute program, or attribute value.

Conflate information from user-specified coverages to event tables

To provide an easy way for new users to convert their existing data to event tables, an automatic conflation tool is integrated into the NHD-RIT. The conflation functionality creates events by creating buffers around features to be indexed. These buffers are then used to select the closest NHD reach, on which the event is created. Once the conflation process is completed, the user is encouraged to visually verify the results. For this purpose, the tool contains a quality assurance/quality control (QA/QC) utility that steps through the conflated entities and lets the user compare them to the original features.
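
The buffer-and-select step can be sketched for the simplest case, a point feature. The Python example below assumes that the feature to be indexed is a single point and that reaches are available as lists of vertices keyed by reach code; the reach codes, coordinates, and function names are hypothetical, and the tool itself also handles line features.

```python
import math


def _distance_to_segment(p, a, b):
    # Shortest distance from point p to the segment a-b.
    dx, dy = b[0] - a[0], b[1] - a[1]
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        return math.hypot(p[0] - a[0], p[1] - a[1])
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(p[0] - (a[0] + t * dx), p[1] - (a[1] + t * dy))


def closest_reach(point, reaches, buffer_distance):
    """Return the code of the nearest reach inside the buffer, or None."""
    best_code, best_distance = None, buffer_distance
    for code, vertices in reaches.items():
        distance = min(_distance_to_segment(point, a, b)
                       for a, b in zip(vertices, vertices[1:]))
        if distance <= best_distance:
            best_code, best_distance = code, distance
    return best_code


reaches = {
    "R1": [(0.0, 0.0), (10.0, 0.0)],
    "R2": [(0.0, 5.0), (10.0, 5.0)],
}
print(closest_reach((3.0, 1.0), reaches, buffer_distance=2.0))   # R1
print(closest_reach((3.0, 20.0), reaches, buffer_distance=2.0))  # None
```

A feature with no reach inside its buffer is left unindexed in this sketch. Cases like that, and features that fall between two nearby reaches, are the reason a visual QA/QC pass after conflation is worthwhile.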

Work on NHD coverages and shapefiles

To provide users with maximum flexibility, the tool was implemented to work with NHD in coverage or measure shapefile format.

Create a transaction file for updating a central database

Throughout the indexing process, the tool records any creation, modification, or deletion of an event in a transaction table. This table is used for event maintenance in a central event database such as EPA's Reach Addressing Database (RAD).
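
The role of the transaction table can be illustrated with a small replay routine. This Python sketch assumes that each transaction carries an action, an event ID, and the event's field values, and it uses a plain dictionary as a stand-in for the central database. The layout and names are hypothetical and do not describe the actual transaction table or RAD.

```python
def apply_transactions(central, transactions):
    """Replay recorded actions, in order, against a central event store."""
    for action, event_id, fields in transactions:
        if action in ("create", "modify"):
            central[event_id] = dict(fields)
        elif action == "delete":
            central.pop(event_id, None)
        else:
            raise ValueError("unknown action: " + action)
    return central


# Hypothetical log from one indexing session.
transactions = [
    ("create", 1, {"RCH_CODE": "07080107000001", "F_MEAS": 0.0, "T_MEAS": 50.0}),
    ("create", 2, {"RCH_CODE": "07080107000002", "F_MEAS": 0.0, "T_MEAS": 100.0}),
    ("modify", 1, {"RCH_CODE": "07080107000001", "F_MEAS": 0.0, "T_MEAS": 75.0}),
    ("delete", 2, {}),
]
central = apply_transactions({}, transactions)
print(central)
```

Recording actions rather than copying whole event tables means that only the changes need to be sent to the central database, and they can be applied in the order in which they occurred.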

Impact on State Water Programs

The benefits of using the NHD-RIT can be evaluated on two levels: technical and management.

Technical benefits

On the technical level, the following benefits can be identified:

The NHD-RIT automates the process of georeferencing surface water information. The users only need a working knowledge of ArcView and the surface waters they are indexing. This benefits not only states that often have a shortage of geographic information system (GIS) specialists, but also enables smaller agencies and groups to take advantage of NHD. This is accomplished by reducing the learning curve to index water-related information, because the tool does not require the user to learn the complex data structure of NHD.

Event tables provide a dynamic environment. Changes in water-related information can lead to frequent updates in waterbody delineation. Events can be easily added, modified, deleted, or copied with the NHD-RIT. Also, the offset feature in ArcView allows the user to display information that applies to the same reach at the same time in a clear manner.

Information for different water programs (e.g., Clean Water Act Sections 303(d) and 305(b)) can be displayed using the same spatial dataset, which reduces the storage space requirements. Also, event tables can be easily exchanged between users, because these tables are often only a fraction of the size of a spatial dataset.

One state that used the RF3-RIT version extensively is Tennessee. Before they adopted the RF3-RIT, the GIS staff had to visually identify reaches and look up their IDs in the attribute table of the RF3 coverage. The IDs were then stored with the water quality information and were used to link the information back to the coverage. This process involved various steps and was not automated, which made it fairly time consuming and prone to errors. In addition, it was not possible to assign attributes to only a section of a reach.

Through the use of the RF3-RIT (and now the NHD-RIT), Tennessee users are able to simply select the reaches that need to be indexed and assign IDs with a few mouse clicks. Also, they are now able to assign attributes to an exact location on the stream network, since the tool is not limited to indexing only entire reaches.

Another benefit noted by Tennessee users is that mapping of the information can be done much more quickly. Previously, creating a map involved various join operations among different datasets. This is no longer required since the event tables provide an easy-to-use structure that facilitates the linking of spatial and attribute information.

Tennessee recently indexed their Section 305(b) waters for the entire state. To do this, they had two staff members working together, one using EPA's 305(b) Assessment Database (ADB) and one using the RIT. While one person entered the data into the ADB, the second person delineated the waterbody and was able to provide additional spatial information for the database (e.g., length of waterbody). Through this method, a waterbody entity could be loaded into the database and indexed at the same time.

Management benefits

State and federal resource managers will benefit from spatial information in a consistent format that they have never had before.

Maps can be created easily and quickly from event tables, and this can provide management with excellent visual tools for decision making. For example, impaired waters and the potential sources of pollution can be easily located.

Event tables can also be used as a link between the different databases because they contain the reach code, which can be used to spatially relate information and make decisions based on this. For example, if point discharge information were mapped with the NHD-RIT (point events) and any streams or lakes with water quality problems were also indexed, a map displaying both could help management determine which facilities may have water quality problems and which may not.

In addition to improved mapping, connecting water resource information to the NHD expands the range of questions water quality managers can ask of their data. Using NHD's inherent network, a manager could evaluate the spatial distribution of pesticides in source water as a function of stream distance from a collection of drinking water intakes.

Conclusion

The Reach Indexing Tool allows users to assign and display water-related information easily and consistently. It is designed to work with the recently released NHD and provides a full set of functions that offer a user-friendly way to georeference any water-related information. The output of the indexing work (event tables) can be used for modeling and display purposes to improve decision making. EPA and several states are actively using the RIT.

In Tennessee, for example, the tool was used to create and submit a Section 303(d) list to EPA for the year 2000. The tool had been used to index their Section 303(d) list for the previous cycle (1998), which made the data easily accessible for the cycle in the year 2000. EPA is using the NHD-RIT to georeference several types of waterbodies to the NHD for national tracking and decision making.

References

USGS and EPA. 2000. The National Hydrography Dataset: Concepts and Content, [On-line]. Available: http://nhd.usgs.gov/chapter1/index.html. [Access date: 2000, May 26].
Research Triangle Institute. 2000a. Reach Indexing Tool for the National Hydrography Dataset (NHD-RIT): Requirements Document. Research Triangle Park, NC.
Research Triangle Institute. 2000b. Reach Indexing Tool for the National Hydrography Dataset (NHD-RIT): Design Document. Research Triangle Park, NC.

Acknowledgments

The work described in this paper was funded by the U.S. Environmental Protection Agency under Contract 68-C7-0056 with Research Triangle Institute (RTI). RTI gratefully acknowledges this support.

Disclaimer: Although the research described has been funded wholly or in part by the U.S. Environmental Protection Agency Contract No. 68-C7-0056 to Research Triangle Institute, it has not been subject to the Agency's review and therefore does not necessarily reflect the views of the Agency, and no official endorsement should be inferred. Mention of trade names or commercial products does not constitute endorsement or recommendation for use.

Christoph Spoerri
Environmental Scientist/System Developer
Research Triangle Institute
3040 Cornwallis Road
RTP, NC 27709
919-485-7771
spoerri@rti.org

Anne Marie Miller
GIS Specialist, Water Quality Program
Research Triangle Institute
3040 Cornwallis Road
RTP, NC 27709
919-485-7768
ammiller@rti.org

Tod Dabolt
Office of Water GIS Coordinator
U.S. EPA
Assessment and Watershed Protection Agency
401 M. Street, S.W.
Washington, DC 20460
202-260-3697
DABOLT.THOMAS@epamail.epa.gov
