Authors:
Chris Curlis, U.S. Bureau of Reclamation
Jeff Milliken, U.S. Bureau of Reclamation
Barbara Simpson, U.S. Bureau of Reclamation
David T. Hansen, U.S. Bureau of Reclamation
______________________________________________________________________________________________________________
Abstract: The U.S. Bureau of Reclamation and the U.S. Fish and
Wildlife Service are developing a land cover database to detect change
in the Central Valley of California. The main objective is to identify
and monitor change in habitats. An initial baseline layer was created
using 1993 data. This paper will detail the challenges of integrating
data from six pre-existing sources. The project also integrates some
of the unique capabilities of Arc/Info, ArcView, ERDAS Imagine and IPW
software to produce a seamless base layer which can be joined with similar
change detection programs being undertaken by other agencies for California.
______________________________________________________________________________________________________________
I. Introduction
The Central Valley of California is a critical tract of land that impacts
the economy of the United States and much of the world. The predominance
of agriculture in the Central Valley and the redistribution of water within
it to a state population of over 30 million has made it a landscape of
constant change. While the Central Valley is important to the economy
and well being of people, it is also a critical habitat to countless plant
and animal species. Despite all the aforementioned change there have
been few coordinated widespread efforts to monitor the nature and extent
of change and how it affects wildlife habitats.
The U.S. Bureau of Reclamation (USBR) and the U.S. Fish and Wildlife Service (USFWS), in cooperation with other interested parties, are developing a land cover database and the processes needed to monitor habitat change in the Central Valley on a regular basis. This project is known as the Central Valley Habitat Monitoring (CVHM) program. The baseline year of 1993 was chosen because land use / land cover data were available for that year and because of requirements associated with water contract renewals. The data for the study area came from six pre-existing sources with varying characteristics and extents. This paper will detail the challenges of integrating these data to produce a seamless base layer which can be joined with similar change monitoring programs already being undertaken by other state and federal agencies for the areas of California surrounding the Central Valley.
II. Project Area
The CVHM project area comprises approximately 31 million acres including
the entire Central Valley of California and surrounding lands (figure 1).
The boundary was determined from USBR federal irrigation districts and
USFWS areas of interest.
Figure 1. CVHM Project Boundary
III. Data
The data sets used to develop an approximate representation of the
1993 base layer map include:
Ducks Unlimited / California Department of Fish and Game (DU/CDF&G)
This widely used data set was developed from multi-date 1993 Landsat TM
imagery. It is generally based on 2 1/2 acre minimum mapping unit polygons
which were derived from various original source data as well as an unsupervised
pixel classification of the TM data.
California Department of Water Resources Land Use (DWRLU)
The DWRLU database is on a county basis. Counties are not mapped on an
annual basis. Where available, data were used for the years 1989-1995,
though the majority of counties used were within ± 2 years of the
1993 base year.
California Department of Conservation Farmland Mapping data (FMMP)
This database represents ground conditions for 1994 and also does not cover
the entire study area. This database primarily identifies agricultural
and urban areas.
California Department of Forestry Hardwoods Mapping (HDWD)
This database was developed in the early 1990's and covers only a portion
of the study area along the edges of the Central Valley floor.
Gap Analysis (GAP)
This database has data for virtually the entire project area, though it
represents ground conditions that predate 1993. It uses the Wildlife
Habitat Relationships (WHR) classification system.
U.S. Geological Survey National Land Cover Database (NLCD)
This large region based land cover classification is derived from early
1990's TM imagery and several ancillary data sets.
U.S. Forest Service Vegetation Mapping (USFS)
This database has 2 ½ acre mmu polygons which are labeled in the CALVEG
vegetation classification. The polygons are derived from Landsat TM data
using some of the same techniques described in this paper.
The first six data sets above were integrated and compared to develop
the 1993 CVHM base map. However, in portions of the study area where the
USFS database existed for the proper time frame, it was crosswalked and
incorporated directly into the CVHM base map.
Figure 2 shows the classification system and areal extent of the six data sources.
Figure 2. Areal Extent and Classification Legends of the Six
Data Sets to be Integrated
IV. Methodology
Classification System
A critical starting point for the integration of the six data sets
was the development of crosswalks which would convert every class in all
data sets to a common set of classes to be used across all areas. This
involved frequent review and revision based on the needs of all intended
users of the final 1993 base map. The process crosswalked six separate
classification legends into broad WHR categories. These categories
are under review and being modified in accordance with recommended standards
coming from state and national guidelines. See Hansen et al., 2001
for specifics on the classification systems and crosswalks used in the CVHM.
Data Conversion
The six existing data sets were crosswalked to the CVHM classification
through the use of AMLs in Arc/Info for the vector based polygon data as
well as a recoding program in Erdas Imagine for raster based data such
as the NLCD. All the crosswalked ARC coverages were next converted to raster
GRID data and then imported into Erdas Imagine .img files.
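The crosswalk step amounts to a lookup from each source legend to the common legend. The following Python sketch illustrates the idea only: the project itself used AMLs and an Erdas Imagine recode, and the source names, class codes, and class names in the tables below are hypothetical placeholders rather than the actual CVHM crosswalks (see Hansen et al., 2001).

```python
# Hypothetical crosswalk tables: source class code -> common class name.
# The codes and names are illustrative only, not the CVHM legend.
CROSSWALKS = {
    "nlcd": {1: "water", 2: "urban", 3: "agriculture"},
    "gap": {"AGS": "grassland", "URB": "urban"},
}

# Label given to any source code that has no crosswalk entry.
UNMAPPED = "unclassified"


def recode(source, values):
    """Recode a sequence of source class codes to the common legend."""
    table = CROSSWALKS[source]
    return [table.get(v, UNMAPPED) for v in values]


def recode_grid(source, grid):
    """Apply the crosswalk to a raster stored as a list of rows."""
    return [recode(source, row) for row in grid]
```

Mapping unknown codes to an explicit "unclassified" value, rather than raising an error, makes gaps in a crosswalk visible in the output so they can be reviewed.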
Image Segmentation
Overview - A unique aspect in this integration of the six non-conforming
data sources is the use of image segmentation algorithms (Woodcock and
Harward, 1992) to create spectral based polygons. It is often very difficult
to compare and contrast multiple sources of land cover data covering the
same area of interest. This is due to differences in classification systems,
minimum mapping unit size, and methods used for creating polygon boundaries.
Image segmentation offers a method for creating polygons from any digital
image based solely on spectral similarity. The resulting polygons carry
no label except a unique polygon ID. Land cover (or other) labels can be
given to each polygon based on any other ancillary data source (e.g. other
digital landcover layers, pixel-level spectral classification, etc.).
In the case of this project, after crosswalking all source data to the common CVHM classification system, grids were made for each source layer and “overlaid” with the spectral polygons. Histograms representing the distribution of classes for each source layer within each spectral polygon can then be generated. Based on this distribution, a single label can be assigned to the spectral polygon for each source layer. The result is a database containing a land cover label representative of each source layer for each spectral polygon (figure 9). This allows for a more direct comparison of databases regardless of differing minimum mapping units and mapping methods.
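The histogram and labeling step can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the project's implementation: both rasters are plain lists of rows with identical shape, the function name is hypothetical, and ties are broken by whichever class is encountered first.

```python
from collections import Counter, defaultdict


def zonal_plurality(region_ids, class_grid):
    """Return {region_id: most common class} for two aligned rasters.

    region_ids and class_grid are lists of rows with identical shape.
    Ties go to the class encountered first within the region, which is
    an arbitrary choice made for this sketch.
    """
    # One class histogram per spectral polygon (region).
    histograms = defaultdict(Counter)
    for id_row, class_row in zip(region_ids, class_grid):
        for region, cls in zip(id_row, class_row):
            histograms[region][cls] += 1
    # The plurality class is the most frequent entry of each histogram.
    return {
        region: hist.most_common(1)[0][0]
        for region, hist in histograms.items()
    }
```

Running the function once per source layer against the same region raster yields one label column per source, which is the structure of the comparison database described above.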
Determining proper parameters for the image segmentation algorithm is critical to getting acceptable results. Parameters may vary as a function of the type and resolution of the source imagery as well as other considerations such as the level of detail in the classification system being used.
Parameters and source imagery – Image segmentation is an iterative
algorithm which aggregates digital image pixels into contiguous groups
of spectrally similar pixels (regions). Image segmentation algorithms for
the CVHM project were developed by the Boston University Center of Remote
Sensing and function within the Image Processing Workbench (IPW) public
domain software (Frew, 1990). The algorithm produces a single raster layer
whose pixels each carry a unique number identifying which region they belong
to. ARC programs “imagegrid” and “gridpoly” can be used to convert the
raster region coverage into an ARC polygon coverage. Region boundaries
tend to conform to real boundaries in the landscape, much like polygons
derived from traditional aerial photo interpretation. Depending on the
type and resolution of the source imagery, minimum region size specified,
and spectral thresholds, regions may also represent much more subtle changes
in the landscape than what is typically delineated using more traditional
methods. However, these are often aggregated to coarser polygons as a function
of the classification system and nature of the source data being used for
labeling the regions. The user can define both spectral and spatial thresholds
to control the size and nature of regions, as well as merging parameters
for pixels during the multi-pass process.
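To make the roles of the spectral and spatial thresholds concrete, the Python sketch below performs a much simplified region merging on a single band. It is not the Boston University / IPW algorithm: it assumes 4-connected neighbors, compares region means only, merges one best pair at a time, and uses hypothetical parameter names (threshold, min_size).

```python
def segment(image, threshold, min_size):
    """Group pixels of a single-band image into 4-connected regions.

    image is a list of rows of numbers. Returns a raster of region ids
    numbered from 1 in raster order.
    """
    rows, cols = len(image), len(image[0])
    parent = list(range(rows * cols))                 # union-find forest
    total = [float(v) for row in image for v in row]  # value sum per root
    count = [1] * (rows * cols)                       # pixel count per root

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path compression
            i = parent[i]
        return i

    def mean(root):
        return total[root] / count[root]

    def neighbors():
        """Yield (root_a, root_b) for adjacent pixels in distinct regions."""
        for r in range(rows):
            for c in range(cols):
                a = find(r * cols + c)
                if c + 1 < cols and find(r * cols + c + 1) != a:
                    yield a, find(r * cols + c + 1)
                if r + 1 < rows and find((r + 1) * cols + c) != a:
                    yield a, find((r + 1) * cols + c)

    def merge_until_stable(eligible):
        """Repeatedly merge the most similar eligible pair of regions."""
        while True:
            best = None
            for a, b in neighbors():
                if eligible(a, b):
                    d = abs(mean(a) - mean(b))
                    if best is None or d < best[0]:
                        best = (d, a, b)
            if best is None:
                return
            _, a, b = best
            parent[b] = a
            total[a] += total[b]
            count[a] += count[b]

    # Spectral pass: merge neighbors whose means differ by <= threshold.
    merge_until_stable(lambda a, b: abs(mean(a) - mean(b)) <= threshold)
    # Spatial pass: absorb regions smaller than min_size into their most
    # similar neighbor, regardless of the spectral threshold.
    merge_until_stable(lambda a, b: min(count[a], count[b]) < min_size)

    ids = {}
    return [
        [ids.setdefault(find(r * cols + c), len(ids) + 1) for c in range(cols)]
        for r in range(rows)
    ]
```

For example, `segment([[10, 11, 50], [10, 12, 52], [10, 11, 90]], threshold=5, min_size=2)` groups the two left columns into one region, and the isolated bright pixel is absorbed into the right-hand region because it falls below the minimum size. The exhaustive best-pair search is far too slow for full TM scenes and is used here only for clarity.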
Landsat 5 Thematic Mapper (TM) imagery from 1993 was used for
this project. A 2 ½ acre minimum mapping unit (mmu) was used with
the segmentation algorithm. This mmu is also consistent with other statewide
mapping efforts. The computational constraints created with this mmu necessitated
the use of 58 processing areas for image segmentation. Landsat TM imagery
was subset for each processing area. TM bands 3, 4, and a texture band
derived from band 4 were used for the segmentation process. This combination
of bands (3, 4, and texture of 4) has proven effective in vegetation
mapping efforts such as the USFS vegetation mapping program in Region 5,
California (Miller et al., 1994). Figure 3 displays segmentation polygons
generated from the Landsat TM data. Figures 4-8 show the polygons from
several of the pre-existing data sets for the same area as Figure 3.
Differences are apparent in the polygon labels and extent.
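As a rough check on what a 2 ½ acre mmu means in pixel terms, the Python sketch below converts an area in acres to a pixel count. It assumes a 30 m pixel, the nominal resolution of the TM reflective bands; the pixel size actually used after any resampling is not stated here, and the function name is illustrative.

```python
SQ_M_PER_ACRE = 4046.8564224  # international acre in square meters


def mmu_in_pixels(mmu_acres, pixel_size_m):
    """Return the number of square pixels covering mmu_acres."""
    return mmu_acres * SQ_M_PER_ACRE / (pixel_size_m ** 2)


# Assumed 30 m pixels: 2.5 acres works out to roughly 11 pixels.
pixels = mmu_in_pixels(2.5, 30.0)
```

Under this assumption each spectral polygon contains at least about 11 TM pixels, which gives a sense of how many regions a full scene can produce and why the study area was divided into processing areas.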
Figure 3. Image segmentation polygons overlaying Landsat TM imagery
Figure 4. DU polygons within area shown in Figure 3.
Figure 5. DWR polygons within area shown in Figure 3.
Figure 6. DOC polygons within area shown in Figure 3.
Figure 7. GAP polygons within area shown in Figure 3.
Figure 8. NLCD pixel based polygons within area shown in Figure 3.
Attributing polygons
After generating 2 1/2 acre mmu spectral polygons for each processing
area, polygons were attributed based on raster grids of the six pre-existing
data sets. The "Zonal Attributes" function in Erdas Imagine was used to
examine all pixel values which underlie each spectral polygon and,
based on a plurality rule, create a new Arc coverage item and populate
this item with the label for each polygon. This process was repeated
for each of the 58 processing areas and for each of the six
data sets. The resulting database contains six new items, one for each
of the six data sets. Each item is populated with its corresponding
land use / land cover label. Figure 9 shows part of a data table:
each record represents a single polygon that has been assigned land cover
/ land use labels based on this process.
Figure 9. New attributes created for plurality of each of six data
sets.
Du = Ducks Unlimited
Dwr = Department of Water Resources Land Use
Doc = Department of Conservation Farmland Mapping
Hdwd = CDF Hardwoods Mapping
Gap = Gap Analysis
Nlcd = USGS National Land Cover Database
Rmap29-id = Polygon ID
Developing Labels
Evaluation and development of rules to create final labels - The
evaluation of the newly attributed coverages allows for the development
of rules which are used to decide which of the six data sets will be
used to determine the final label for each spectral polygon in the CVHM
base map. Many cases of systematic misclassification were detected
through this method. For example, the DU data clearly confused many fallow
agricultural fields with the barren classification (figure 10). A
set of rules was developed to correct this deficiency and use the other
data sets to correctly identify the polygons with this particular combination
of labels. A new item for the final CVHM label was added to the database
and attributed based on a combination of labels represented in the six
source data sets (see Hansen et al., 2001 for a discussion of the labeling
rules). All the processing polygons, and with them the spectral polygons,
are mosaicked together into a complete project wide coverage. At the
time of this writing the project wide coverage is undergoing final development.
Once completed it will serve as a base map for change detection studies
in the Central Valley and undergo refinements in an ongoing monitoring
program.
Figure 10. Systematic misclassification detected through CVHM
attribute analysis.
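A labeling rule of the kind described above can be expressed as a small function over the six per-source labels of one polygon. The Python sketch below is illustrative only: the rule, the class names, and the fallback priority order are hypothetical, and the actual CVHM labeling rules are discussed in Hansen et al., 2001. The dictionary keys follow the item names listed with Figure 9.

```python
# Hypothetical priority order used when no specific rule applies.
PRIORITY = ("du", "dwr", "doc", "hdwd", "gap", "nlcd")


def final_label(labels):
    """Choose a final label from per-source labels for one polygon.

    labels maps a source name (du, dwr, doc, hdwd, gap, nlcd) to its
    plurality label, or to None where the source has no coverage.
    """
    # Hypothetical rule: DU calls the polygon barren, but a source that
    # focuses on agriculture calls it agriculture (e.g. a fallow field).
    agriculture_sources = (labels.get("dwr"), labels.get("doc"))
    if labels.get("du") == "barren" and "agriculture" in agriculture_sources:
        return "agriculture"
    # Hypothetical fallback: first available label in priority order.
    for source in PRIORITY:
        if labels.get(source) is not None:
            return labels[source]
    return "unclassified"
```

For instance, a polygon labeled barren by DU and agriculture by DWR would receive the final label agriculture under this hypothetical rule, while a polygon covered only by GAP would keep its GAP label.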
V. Summary
In order to create a seamless 1993 database of land use / land cover
for the CVHM project area, a technique had to be developed which would
integrate all the non-conforming data available and put it into a common
format. Image segmentation has proven to be the critical tool which
accommodates this task, enabling the comparison of multiple data sets in
common geographical areas. This allows for improved mapping efforts.
As work is progressing toward standardizing geographic data, the development
of innovative techniques to integrate data will be a key element
in achieving this goal.
References
Frew, J.E. 1990. The Image Processing Workbench, PhD Dissertation, University of California, Santa Barbara p. 303.
Hansen, David T., B. Simpson, C. Curlis and J. Milliken, Legend Development for a Land Cover / Habitat Classification Project for the Central Valley of California, Twenty First Annual Esri International User Conference, San Diego, CA., July, 2001
Miller, Susan C., H. Eng, M. Byrne, J. Milliken and M. Rosenberg, Northeastern California Vegetation Mapping: A Joint Agency Effort, Fifth Forest Service Remote Sensing Applications Conference, Portland, OR., April, 1994
Woodcock, C.E., and J. Harward, 1992. "Nested-Hierarchical Scene Models and Image Segmentation". International Journal of Remote Sensing, 13(16):3167-3187.
For more information on the Central Valley Habitat Monitoring Program please refer to the following papers in these proceedings:
Framework Land Cover Monitoring of California's Central Valley - Primary Author: Barbara D. Simpson
Legend Development for a Land Cover / Habitat Classification Project for the Central Valley of California - Primary Author: David T. Hansen
Acknowledgements
My thanks to the CVHM project team: Elena Robisch and Bart Prose,
U.S. Fish and Wildlife Service
Jeff Milliken, Barbara Simpson and Dave Hansen, U.S. Bureau
of Reclamation
Contacts:
Chris Curlis, U.S. Bureau of Reclamation
Jeff Milliken, U.S. Bureau of Reclamation
Barbara Simpson, U.S. Bureau of Reclamation
David T. Hansen, U.S. Bureau of Reclamation
MPGIS
U.S. Bureau of Reclamation
Mid Pacific Region
2800 Cottage Way
Sacramento, CA. USA 95825-1898
Phone: (916) 978-5030
Fax: (916) 978-5055
Email: ccurlis@mp.usbr.gov