GeoSpatial Data Sharing through the Exploitation of Metadata



David Stein, Technology, Management, and Planning Corporation (TPMC), On-Site Contractor at the National Oceanic and Atmospheric Administration (NOAA) Coastal Services Center (CSC)



Abstract: Metadata, or data about data, describe the content, quality, and condition of a dataset or product. The importance of metadata lies in their ability to maintain, organize, and provide information about spatial data. Just as the World Wide Web is viewed as the information superhighway, metadata are increasingly becoming the vehicle for finding spatial data over the Internet. The NOAA Coastal Services Center (CSC) has developed a search tool, the Coastal Information Directory, that uses metadata as a means for finding or sharing spatial data over the Internet. This paper gives an overview of metadata, their role as a search vehicle, and the tool developed by NOAA CSC to facilitate this effort.




1.0 INTRODUCTION

Geographic Information Systems (GIS) have proved to be efficient decision-making tools in both government and the private sector. However, a wealth of geospatial data must be available at any given time for a GIS to perform its decision-making functions effectively. Whether it is used to manage coastal resources, to manage mining resources, or to support market analysis, a GIS cannot function without spatial data. NOAA CSC is addressing this need through the development of a spatial data search and retrieval tool that promotes access to spatial data that can be utilized by any GIS designed for coastal management.


1.1 NOAA Coastal Services Center

In 1994, NOAA established the CSC in Charleston, South Carolina. The CSC is a coastal science and resource advisory center that draws on the expertise of NOAA and its partners to address critical coastal resource issues. The Center will serve to "bridge the gap" between coastal scientists and resource managers by bringing Center staff, technologies, and outside partner expertise to bear on national problems related to coastal ecosystems and economies.

In addition to developing data products that provide resource managers with the information they need to make responsible coastal decisions, the Center has also developed a way for this information to be shared with, or disseminated to, coastal managers, educators, policy makers, and the coastal resource management community. This data sharing initiative is a response to Executive Order 12906, Coordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure (NSDI), and to the Center's goal of "bridging the gap" between scientists and coastal managers.


1.2 The National Spatial Data Infrastructure

On April 11, 1994, President Clinton signed Executive Order 12906, Coordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure (NSDI). The NSDI is defined as "the technology, policies, standards, and human resources necessary to acquire, process, store, distribute, and improve utilization of geospatial data" (U.S. Executive Office of the President 1994). The NSDI is building the foundation to facilitate cooperation and interaction among various public and private sector organizations through policies, standards, and procedures. The vision of the NSDI strategy document is that "current and accurate geospatial data will be readily available to contribute locally, nationally, and globally to economic growth, environmental quality and stability, and social progress." Key actions underway are developing and implementing standards for framework and thematic data; producing framework and thematic data; implementing standards for geospatial data documentation and transfer; establishing procedures to use electronic networks to search for, access, and use geospatial data; and cooperating in the development of state and regional councils and private sector agreements to accomplish these actions (URL: http://www.fgdc.gov/nsdi2.html). The Federal Geographic Data Committee (FGDC) has been chosen by the Clinton Administration to lead the development and enhancement of the NSDI, a goal it will pursue in cooperation with state and local governments and the private sector.


1.3 Response to the Mandate

In response to Executive Order 12906, Coordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure (NSDI), the FGDC is in the process of implementing the Content Standards for Digital Geospatial Metadata and the National Geospatial Data Clearinghouse. The Content Standards for Digital Geospatial Metadata were approved June 8, 1994, and require federal agencies to use the standard to document data that they produce, beginning in 1995. The National Geospatial Data Clearinghouse comprises many distributed, electronically linked stores of information about geospatial data (FGDC 1995). The goal of the clearinghouse is to facilitate large-scale access to geospatial metadata.

NOAA CSC's response to Executive Order 12906 has been to adopt the FGDC Content Standards for Digital Geospatial Metadata, and to develop a search engine, built on metadata, that facilitates access to spatial data held by agencies that have a coastal interest.


2.0 METADATA

Metadata are data about data, or information that describes the content, quality, and condition of a dataset (Martin 1983). Metadata provide the user with the who, what, when, where, how, and why that are pervasive in every dataset. Metadata help a user to understand what data are about and what processes occurred in producing those data.

All datasets held at NOAA CSC are documented according to the FGDC metadata content standards, which provide a common set of terminology and definitions for the information in a metadata record. The standard also specifies the elements or fields needed to support the following three major uses of metadata: (1) maintaining an organization's internal investment in geospatial data, (2) providing information to data clearinghouses or catalogs, and (3) providing information needed to process and interpret data transferred from another organization (FGDC 1994). In addition, metadata are increasingly utilized as the foundation on which spatial data searches are performed.
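To make the "who, what, when, where, how, and why" of a metadata record concrete, the following Python sketch models a heavily simplified record. The class and field names (MetadataRecord, BoundingBox, missing_fields) are hypothetical and chosen for illustration only; they are not the element names of the FGDC standard, which defines a far larger set of fields. All sample values are invented.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List


@dataclass
class BoundingBox:
    """Geographic extent in decimal degrees (the 'where')."""
    west: float
    east: float
    south: float
    north: float


@dataclass
class MetadataRecord:
    """Simplified stand-in for a metadata record; not the full FGDC element set."""
    title: str                  # what
    originator: str             # who
    abstract: str               # what
    purpose: str                # why
    publication_date: date      # when
    extent: BoundingBox         # where
    process_steps: List[str] = field(default_factory=list)  # how
    distribution_contact: str = ""                           # whom to ask for the data

    def missing_fields(self) -> List[str]:
        """Return the names of descriptive text fields that were left empty."""
        required = ["title", "originator", "abstract", "purpose", "distribution_contact"]
        return [name for name in required if not getattr(self, name).strip()]


# Invented sample values, used only to exercise the structure.
example = MetadataRecord(
    title="Example estuary land-cover dataset",
    originator="Example Agency",
    abstract="Invented abstract describing land-cover change in an estuary.",
    purpose="",  # deliberately left empty
    publication_date=date(1997, 1, 1),
    extent=BoundingBox(west=-124.1, east=-123.3, south=46.1, north=46.4),
    process_steps=["Classified satellite imagery", "Compared two dates"],
    distribution_contact="Example Agency data desk",
)
print(example.missing_fields())
```

A completeness check of this kind reflects the point that a record is only useful to a clearinghouse or a search tool when its descriptive fields are actually filled in.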

As demand for digital geospatial data grows with advances in GIS technology and World Wide Web (WWW) accessibility, metadata offer an effective way to satisfy that demand. Metadata have a niche in the spatial data community as the only consistent element among diverse data. As a result of this consistency, metadata may be used, or exploited, as a search mechanism. In keeping with this concept, metadata not only describe data but also allow a user to access data. When a spatial data search is performed, it is the metadata that are searched rather than the actual data. To promote consistency, metadata records have to be created using the production rules stated in the FGDC Content Standards for Digital Geospatial Metadata. Figure 1, below, demonstrates the role metadata play in spatial data searches.

Figure 1. Metadata Search Model. To find data products using a search engine such as the Coastal Information Directory, the following steps occur: (1) a query is performed using the search engine interface, (2) the search results are returned in the form of metadata (usually FGDC compliant), and (3) the metadata either give the user direct access to the data or provide instructions for acquiring the data. As shown above, metadata provide the vital link between client and data product.
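The three steps of the search model can be sketched in a few lines of Python. This is an illustrative sketch only, not the Coastal Information Directory's implementation: the names Metadata, run_query, and access_instructions are hypothetical, the catalog entries and the example.org address are invented, and simple substring matching stands in for a real search engine.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Metadata:
    title: str
    abstract: str
    online_link: Optional[str]  # direct access, if the product is on-line
    contact: str                # whom to ask otherwise


def run_query(catalog: List[Metadata], term: str) -> List[Metadata]:
    """Steps 1 and 2: search the metadata (not the data) and return matching records."""
    needle = term.lower()
    return [m for m in catalog
            if needle in m.title.lower() or needle in m.abstract.lower()]


def access_instructions(record: Metadata) -> str:
    """Step 3: direct access when a link exists, otherwise acquisition instructions."""
    if record.online_link:
        return f"Download from {record.online_link}"
    return f"Contact {record.contact} to order"


# Invented catalog entries for illustration.
catalog = [
    Metadata("Estuary change detection", "Land-cover change near salmon habitat",
             None, "Example Agency data desk"),
    Metadata("Shoreline survey", "Beach profile measurements",
             "http://example.org/shoreline", "Example Agency data desk"),
]

for hit in run_query(catalog, "salmon"):
    print(hit.title, "->", access_instructions(hit))
```

The data product itself never appears in the sketch; only its description is searched and returned, which is the central idea of the model.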


3.0 THE COASTAL INFORMATION DIRECTORY

The Coastal Information Directory (CID), developed by NOAA CSC, is a coastal information search and retrieval tool that allows a user to search a variety of databases throughout the U.S. for descriptions of coastal data, information, and products. Some items are available on-line, while others must be ordered from the contact point given in the metadata. CID was designed to be accessible to users with basic or sophisticated computer communications, and was developed using free software (Web server, Web browser, freeWAIS-sf, PERL).

Figure 2. The Coastal Information Directory (CID) user interface.

CID allows for full text (keyword), geographical, and date searches. For example, if "salmon" were entered as a keyword and searched on, CID's search engine would search for the word "salmon" in every metadata file of every database that is accessible in the search network. In this particular search network there are fifteen searchable databases. The following graphic is CID's result page of the keyword search for "salmon".

Figure 3. The results of the keyword query "salmon".
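The three search modes that CID offers (full text, geographic, and date) can be illustrated with a short Python sketch that applies each one as an optional filter over several metadata databases. Everything here is an assumption made for illustration: the names Record, boxes_overlap, and search are hypothetical, the records are invented, and substring matching stands in for the indexed full-text search a WAIS server would perform.

```python
from dataclasses import dataclass
from datetime import date
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (west, south, east, north), decimal degrees


@dataclass
class Record:
    title: str
    text: str
    bbox: Box
    begin: date
    end: date


def boxes_overlap(a: Box, b: Box) -> bool:
    """True when two boxes share any area or edge (no date-line handling)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def search(databases: Dict[str, List[Record]],
           keyword: Optional[str] = None,
           area: Optional[Box] = None,
           period: Optional[Tuple[date, date]] = None) -> List[Tuple[str, str]]:
    """Return (database name, title) for every record passing all supplied filters."""
    hits = []
    for db_name, records in databases.items():
        for rec in records:
            if keyword and keyword.lower() not in (rec.title + " " + rec.text).lower():
                continue  # full-text filter
            if area and not boxes_overlap(rec.bbox, area):
                continue  # geographic filter
            if period and (rec.end < period[0] or rec.begin > period[1]):
                continue  # date filter
            hits.append((db_name, rec.title))
    return hits


# Invented records spread over two hypothetical databases.
databases = {
    "db_a": [
        Record("Estuary change detection", "Land-cover change near salmon habitat",
               (-124.1, 46.1, -123.3, 46.4), date(1989, 1, 1), date(1992, 12, 31)),
        Record("Shoreline survey", "Beach profile measurements",
               (-80.0, 32.6, -79.7, 32.9), date(1995, 6, 1), date(1995, 6, 30)),
    ],
    "db_b": [
        Record("Salmon stock report", "Annual stock assessment",
               (-125.0, 47.0, -122.0, 49.0), date(1996, 1, 1), date(1996, 12, 31)),
    ],
}

print(search(databases, keyword="salmon"))
print(search(databases, keyword="salmon", area=(-124.5, 46.0, -123.0, 46.5)))
```

Because each filter is optional, the same function serves a keyword-only query or a combined keyword, area, and date query.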

The search engine found three dataset descriptions containing the word "salmon". The third result is the Columbia River Estuary Change Detection Project, a project completed by the NOAA Coastal Services Center in cooperation with the Columbia River Estuary Study Task Force (CREST). This project is a combination of satellite imagery showing landscape change and GIS data layers that were integrated into a CD-ROM product. The metadata for this project can be viewed either in their native FGDC format or in a format that has been converted to more readable text (Figure 4).

Figure 4. Abstract section of the metadata for the Columbia River Estuary Change Detection Project

To acquire the project data (CD-ROM), the user has two options: contact the person listed in the distribution section of the metadata file (Figure 5), or order the CD-ROM directly from the distributor by clicking the "obtain" button at the top of the page shown in Figure 3 and filling out the order form.

Figure 5. Distribution Section of an FGDC Compliant Metadata Record.

An order form for a data product may only be available in some cases; however, all metadata created in compliance with the FGDC content standards will have a distribution section.
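Because every compliant record carries a distribution section, a program can locate the distributor without human help. The sketch below reads an indented "Element: value" text rendering of a record and collects the values nested under Distribution_Information. The element names follow the FGDC content standard; the function names (parse_indented, distribution_contact), the abridged record, and all of its values are invented for illustration, and the indented layout is assumed rather than taken from the CID itself.

```python
from typing import Dict, List, Tuple


def parse_indented(text: str) -> List[Tuple[int, str, str]]:
    """Split indented 'Element: value' lines into (indent, element, value) tuples."""
    rows = []
    for line in text.splitlines():
        if not line.strip():
            continue
        indent = len(line) - len(line.lstrip(" "))
        name, _, value = line.strip().partition(":")
        rows.append((indent, name.strip(), value.strip()))
    return rows


def distribution_contact(text: str) -> Dict[str, str]:
    """Collect the elements that carry a value under Distribution_Information."""
    contact: Dict[str, str] = {}
    section_indent = None
    for indent, name, value in parse_indented(text):
        if name == "Distribution_Information":
            section_indent = indent
            continue
        if section_indent is not None:
            if indent <= section_indent:
                break  # the distribution section has ended
            if value:
                contact[name] = value
    return contact


# Abridged, invented record in an assumed indented text layout.
SAMPLE = """\
Identification_Information:
  Citation:
    Citation_Information:
      Title: Example coastal dataset
Distribution_Information:
  Distributor:
    Contact_Information:
      Contact_Organization_Primary:
        Contact_Organization: Example Agency
      Contact_Voice_Telephone: 555-0100
  Resource_Description: CD-ROM
Metadata_Reference_Information:
  Metadata_Date: 19970101
"""

print(distribution_contact(SAMPLE))
```

Tracking the indentation of the section header is what lets the function stop at the next top-level section instead of reading to the end of the record.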


4.0 SUMMARY

What do search tools such as CID mean for Geographic Information Systems? They translate into easier and more efficient access to geospatial data while enhancing the NSDI. Built entirely on metadata, these search tools allow a user to search for geospatial data and, in some cases, to order those data over the Web with little effort. The only caveat is that data have to be documented consistently and according to FGDC standards. As long as metadata are associated with a dataset, and the data-producing organization or agency participates in a spatial data sharing network, numerous types of data can be found using a search tool such as CID or any tool with similar architecture.


References

Federal Geographic Data Committee. 1994. Content Standards for Digital Geospatial Metadata (June 8): Washington, Federal Geographic Data Committee.

Federal Geographic Data Committee. 1995. Development of a National Digital Geospatial Data Framework: Washington, Federal Geographic Data Committee.

Martin, J. 1983. Managing the Database Environment: Englewood Cliffs, NJ, Prentice-Hall.

U.S. Executive Office of the President. 1994. Coordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure (Executive Order 12906): Washington, Executive Office of the President.

URL: http://www.fgdc.gov/nsdi2.html. "FGDC Publication" US Department of Interior, Geological Survey. April 9, 1997.