The California Natural Diversity Database:

A Spatial Model for Cataloging Biodiversity

Patrick Gaul

Abstract: The Nature Conservancy's "Heritage Methodology", used in North and South America by affiliated Heritage Programs and Conservation Data Centres, has become a standard method of cataloging biodiversity. This methodology provides tools and guidelines for storing locational and qualitative information about sensitive plant and animal species and natural communities. Although the information has a very strong spatial component, the complications involved in defining and storing biological data sets have made the development of spatial models to manage and analyze this information slow and uncoordinated. With the ever-increasing emphasis on bioregional planning, however, the need to apply flexible spatial analysis tools on a broad level has driven the search for a suitable geo-spatial data model.

GIS Solution: In the Heritage model, an "element occurrence" has long been defined as a data management tool, or abstraction, which describes an extant or historical population, part of a population, small group of populations, or natural community. An element occurrence most commonly depicts, but is not limited to, rare, threatened and/or endangered taxa or natural communities. These records contain various attributes, both spatial and aspatial. In the pre-GIS model, spatial attributes consisted of a centrum (latitude and longitude) and a radius (precision) around it indicating how accurately the feature has been mapped. This combination of centrum and radius of confidence defined the location and extent of the element occurrence. The fact that many of these spatial features overlapped each other greatly complicated the development of GIS representations. The California Natural Diversity Database (CNDDB), California's Heritage Program, has used the recently introduced ArcInfo "region" feature class to design and implement a new spatial model not previously possible.

Methodology: The CNDDB model allows an element occurrence portrayed in a geographic information system to be represented by a spatial feature with areal extent, as opposed to a point or line. To accurately depict the complex biological situations inherent in the Natural Heritage element occurrence model, these features are:

· Capable of overlapping with other features without loss of unique identity.

· Capable of containing voids or "doughnut holes".

· Capable of representing complex situations containing several spatial components, or parts, while still being considered a single occurrence.

· Capable of simultaneously representing the location of several element occurrences which share the same geographic location.

Software: An ArcInfo Forms application has been developed to standardize and partially automate the process of entering, editing, and querying element occurrences and the automation of sub-set creation. The spatial model was designed in such a way as to allow for its use on ArcInfo as well as ArcView clients.

This paper discusses in detail the biological and geographical justifications for the spatial model, and some of the technical aspects involved in the development of the application which supports and enforces it.


Introduction

An element occurrence record (EOR) is the central storage unit employed in the Biological Conservation Database (BCD) used by most Heritage programs to store information about element occurrences. Along with many other attributes, each record for an element occurrence is assigned a spatial component (albeit in an aspatial tabular sense) with a centrum (latitude and longitude) and a radius (precision) around it indicating how accurately the feature has been mapped. This combination of centrum and radius of confidence defines the location of the element occurrence.
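To make the tabular representation concrete, the following minimal Python sketch treats each record's centrum and radius as a circle and tests whether two such circles could overlap. The class name `TabularEO`, its field names, and the spherical-Earth haversine distance are assumptions made for illustration only; they are not part of the BCD.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius; an assumption of this sketch


@dataclass(frozen=True)
class TabularEO:
    """Hypothetical stand-in for the spatial fields of a tabular record."""
    lat: float       # centrum latitude, decimal degrees
    lon: float       # centrum longitude, decimal degrees
    radius_m: float  # radius of confidence, meters


def centrum_distance_m(a, b):
    """Great-circle (haversine) distance between two centrums, in meters."""
    phi1, phi2 = math.radians(a.lat), math.radians(b.lat)
    dphi = phi2 - phi1
    dlam = math.radians(b.lon - a.lon)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(h))


def circles_overlap(a, b):
    """True if the two radius-of-confidence circles touch or overlap."""
    return centrum_distance_m(a, b) <= a.radius_m + b.radius_m
```

With only a centrum and a radius available, every spatial comparison has to be reduced to this kind of pairwise arithmetic on circles.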

The GIS model presented here does not seek to supplant this time-tested methodology. Rather, the goal is to start with this concept and to update the model to allow it to take full advantage of GIS technology. The need to be able to select, compare or analyze element occurrences based on their spatial characteristics is the driving force behind this effort.

It is not the intention of this document to resolve basic questions regarding the definition of an "element occurrence". This ongoing effort is being addressed by The Nature Conservancy's Element Occurrence Design Committee. For the purposes of this model, however, an element occurrence has been defined as a data management tool, or abstraction and not a physical feature on the ground (known by some programs as an "EOR", or element occurrence record, see the definition below).

Also, confusion over the terms accuracy and precision, both used here to describe how closely a feature has been mapped to its actual real-world location, should not be allowed to become an issue. The intended meaning of the terms should be clear in the context of this document.

I) An element occurrence (EO) is a data management tool, or abstraction, which describes an extant or historical population, part of a population, small group of populations, or natural community. An element occurrence has both spatial and tabular components, represented by a mappable feature and its supporting database record. An element occurrence most commonly depicts, but is not limited to, rare, threatened and/or endangered taxa or natural communities.

II) An element occurrence portrayed in a geographic information system should be represented by a spatial feature with areal extent, as opposed to a point or line. To be able to accurately depict the complex situations inherent in the Natural Heritage element occurrence model, these features should:

· Be capable of overlapping with other features without loss of unique identity.

· Be capable of containing voids or "doughnut holes".

· Be capable of representing complex situations containing several spatial components, or parts, while still being considered a single occurrence.

· Be capable of simultaneously representing the location of several element occurrences which share the same geographic location.

Note: This will require the use of a software model which will allow for these situations. The ArcInfo regions feature class or the ArcView object as contained in a shapefile meets these requirements. Additional software models may also comply.
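The four requirements above can be sketched in plain Python without any GIS library. This is only an illustration of the data structure, not the ArcInfo region implementation: the names `Part`, `Region`, `occurrences_at`, the sample coordinates, and the occurrence identifiers (101, 202, 203) are all hypothetical.

```python
from dataclasses import dataclass, field


def point_in_ring(pt, ring):
    """Ray-casting test: True if pt lies inside the ring of (x, y) vertices."""
    x, y = pt
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        if (y1 > y) != (y2 > y):  # edge crosses the horizontal line through pt
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


@dataclass
class Part:
    outer: list                                # outer boundary ring
    holes: list = field(default_factory=list)  # voids ("doughnut holes")

    def contains(self, pt):
        return point_in_ring(pt, self.outer) and not any(
            point_in_ring(pt, hole) for hole in self.holes)


@dataclass
class Region:
    map_index: str  # identity of the spatial feature
    parts: list     # several parts still form a single feature
    eo_ids: list    # all occurrences sharing this location

    def contains(self, pt):
        return any(part.contains(pt) for part in self.parts)


def occurrences_at(regions, pt):
    """Overlapping regions are all reported; none loses its identity."""
    found = []
    for region in regions:
        if region.contains(pt):
            found.extend(region.eo_ids)
    return found


def square(x0, y0, size):
    return [(x0, y0), (x0 + size, y0), (x0 + size, y0 + size), (x0, y0 + size)]


# Region A: two parts, the first with a hole. Region B overlaps A and
# carries two occurrences at the same location.
REGIONS = [
    Region("A", [Part(square(0, 0, 10), [square(4, 4, 2)]),
                 Part(square(20, 0, 10))], [101]),
    Region("B", [Part(square(0, 0, 3))], [202, 203]),
]
```

Querying `occurrences_at(REGIONS, (1, 1))` returns the occurrences of both overlapping regions, while a point inside the hole of region A returns none.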

III) Element occurrence spatial features should represent the full geographic extent, or footprint, over which the occurrence can be said to have a possible influence.

This implies that the spatial feature as stored in a GIS represents not just a point or a line, but rather a sphere of influence around those simple features. The size and shape of this footprint are based on how accurately the occurrence can be located, or upon other biological, ecological, or geographical considerations. Thus, this sphere of influence is the result of the combination of two possible supporting components:

1) Source features

2) Buffers

1) Source Features

Source features cartographically represent real world situations and act as the mappable source for an element occurrence. Source features can be either points, lines or areas and belong to one of two accuracy types: specific or non-specific.

· Specific source features are those which accurately represent the location and extent of an element occurrence.

· Non-specific source features are those which approximately represent the location and extent of an element occurrence.

Note: Because an element occurrence spatial feature should represent the full geographic extent, or footprint, over which the occurrence can be said to have a possible influence, the physical size of a non-specific source feature will typically be larger than that of a specific source feature. This effect is moderated by the fact that non-specific source features can be weighted differently for purposes of analysis because of their lower spatial accuracy (see the discussion of accuracy class below).

Points. Descriptive or mapped source information tying an occurrence to a discrete x,y coordinate location.

· A specific point source feature would be a highly accurate coordinate location, such as accurately mapped features or GPS coordinates.

· A non-specific point source feature would be a vague descriptive approximation, such as a section.

S, M, and G precision occurrences (as defined and used in the BCD) are examples of point source features.

Lines. Descriptive or mapped source information tying an occurrence to a line such as a stream, canal, canyon or road.

· A specific line source feature would represent a particular line segment (or segments) shown as a single line on a map, or a detailed narrative description tying the occurrence to a feature on a map or a geo-spatial data source.

· A non-specific line source feature would describe the same as above in situations where the exact position of the occurrence is not known. Non-specific, in this case, does not imply that the physical location of the linear segments is in question, but rather that the position of the occurrence along those segments is uncertain. In this case it would be appropriate to include all likely segments in the occurrence.

Note: In situations where the physical location of the line segments is in question (for example, when no mapped features are available or only vague locational information is given) the element occurrence would be better represented by using a non-specific area source feature (see below).

Line features are currently not represented as such using BCD.

Areas. Descriptive or mapped source information tying an occurrence to an areal feature.

· A specific area source feature would be a lake, marsh, a stream feature represented as a double line on a map, a stand of vegetation or any other regular or irregular shaped area recognizable as such on a map or on provided source materials of the scale at which the entire data set is standardized.

· A non-specific area source feature usually could be described as a general boundary encompassing an area of occupied or suitable habitat for an element occurrence for which an exact boundary is not known.

Area features are currently not represented as such using BCD.

2) Buffers

To allow the source feature to more closely depict the reality which the element occurrence attempts to represent, a buffer may be applied. Three different types of buffers are allowed: procedural, biological, and spatial accuracy.

It must be remembered that representation of the location, extent and, to some degree, the accuracy, of an element occurrence is the primary function of the source feature, not the buffer. The buffer is merely a tool used in concert with the source feature to ensure that all element occurrences conform to the GIS model by being composed of spatial features with areal extent which represent the sphere of influence or "footprint" of the occurrence.

Buffers can also allow for a shorthand method of creating spatial features in certain situations where a set of biological requirements can be standardized (see examples below).

Procedural buffer

Applies to: Specific point features and specific or non-specific line source features (Required).

The landscape feature that these source feature types represent actually has areal extent (a length and width) even though it may appear as a point or line on a map. Because our data set contains, by definition, only features with areal extent, a buffer must be added to the point or line to create such a feature. The amount of this buffer is procedurally set to the minimum mappable distance for the scale of the data set (in California's case, an 80 meter radius).

Does not apply to: Non-specific point features and specific or non-specific area source features.

Procedural buffering is not allowed for non-specific point features. Being more general in their location, these features will already be larger than the minimum mappable unit.

Procedural buffering is not allowed for any area features because an area feature is presumed to be large enough to not require it (areas smaller than the minimum mappable unit should be mapped as specific point source features).
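As a rough illustration of a procedural buffer, the sketch below turns a specific point into a feature with areal extent by approximating an 80 meter circle with a regular polygon. It assumes coordinates in a projected system measured in meters; the function names and the choice of 64 segments are hypothetical, and a GIS would normally supply its own buffer operation.

```python
import math


def buffer_point(x, y, radius_m=80.0, segments=64):
    """Approximate a circular buffer around (x, y) by a regular polygon."""
    return [
        (x + radius_m * math.cos(2 * math.pi * i / segments),
         y + radius_m * math.sin(2 * math.pi * i / segments))
        for i in range(segments)
    ]


def ring_area(ring):
    """Shoelace formula; positive for counter-clockwise rings."""
    total = 0.0
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return total / 2.0
```

The polygon's area is slightly smaller than that of the true circle, and the difference shrinks as the number of segments grows.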

Examples of occurrences using procedural buffers:

Specific Points

⇒ An accurately located Swainson's Hawk nest.

⇒ Any plant occurrence representing a small restricted population mapped at a known location.

Specific Line

⇒ Lahontan cutthroat trout observed on a known stream reach or segment.

Non-Specific Lines

⇒ Information linking rough sculpin observations to one or more uncertain points on a stream, with no significant barriers to movement dividing them. All likely segments, and those connecting them, would be included.

⇒ Winter run chinook salmon on a known stream system but unknown segments. All stream segments would be included.

⇒ Bakersfield cactus along an uncertain segment of a known aqueduct.

Biological buffer

Applies to: Specific point source features, specific or non-specific line source features and to specific area source features (optional).

A biologically indicated buffer is used in situations where certain biological considerations based on the circumstances of that occurrence or guidelines contained in the element occurrence specifications for that element indicate its use. Biologically indicated buffers are infrequently used.

Does not apply to: Non-specific point features and non-specific area source features.

Biological buffering is not allowed for non-specific point features or non-specific area source features because their general location makes detailed buffering meaningless.

Examples of occurrences using biologically indicated buffers:

Specific Points

⇒ A Goshawk occurrence generated by using a 200 meter buffer around a nest site to encompass foraging area.

Specific Lines

⇒ A pond turtle occurrence generated by buffering a mapped stream location to encompass a 150 meter distance from the stream to include nests.

⇒ An occurrence for the Sacramento splittail using a 100 meter buffer on a stream to take into consideration the fact that the fish spawns among shore plants flooded in high water.

Non-Specific Lines

⇒ A plant occurrence in a canyon on Santa Rosa Island, buffered to include the upper slopes of the canyon to account for the fact that this plant is known to occur up to an elevation of 300 meters. Although the location of the canyon is known, the exact location in the canyon is not.

Specific Areas

⇒ An aquatic bird occurrence generated by buffering a lake to encompass the lake and a 100 meter shore area.

Spatial accuracy buffer

Applies to: Non-specific point source features only (required).

The spatial accuracy indicated buffer represents, in some linear distance (meters, feet, etc., depending upon the parameters of the geographic projection used by the program), the positional accuracy of the element occurrence (this does not attempt to address issues relating to map accuracy, scale or geographic projection). This will be included as a physical buffer around the source feature depicting its accuracy as plus or minus a given distance. S, M, and G precisions from the BCD essentially record spatial accuracy in a very coarse, limiting way. It would be best for this buffer distance to allow increments to be determined by each Heritage program to best suit the biology of the elements that they are tracking or the mapping methods being used (see the California example of accuracy class below).

Does not apply to: Any other source feature type.

Spatial accuracy buffers are not allowed for specific point source features or any line or area source features because their spatial accuracy is implied to be the same as for all other features from the same data source, and is implicit in the map.

Examples of occurrences using spatial accuracy indicated buffers:

Non-Specific Points

⇒ A plant occurrence created using information from an herbarium label which locates the plant to a vague location, such as a town.

Buffer Summary

The effects of buffers vary based on the source feature type:

Specific point source features utilize a required procedural buffer with an optional biological buffer. In cases where a biological buffer is used, and it exceeds the procedural buffer amount, the total buffer amount is equal to the biological buffer alone, not a combination of the two. Use of a biological buffer thus replaces the need for a procedural buffer. This would result in circular element occurrence features with a radius equal to the minimum mappable unit or the biological buffer.

Non-specific point source features utilize a required spatial accuracy buffer and result in circular element occurrence features with a radius equal to the spatial accuracy buffer, decreasing in accuracy as they increase in size.

Specific line source features utilize a required procedural buffer with an optional biological buffer. Similar to specific point source features, in cases where a biological buffer is used, and the buffer amount exceeds the procedural buffer amount, the total buffer amount is equal to the biological buffer alone, not a combination of the two. This would result in an element occurrence with spatial extent but linear (if sausage-like) appearance.

Non-specific line source features utilize a required procedural buffer with an optional biological buffer and result in features similar in appearance to specific line source features. Because a non-specific line source feature does not imply that the physical location of the line segments is in question, but rather that the position along those segments is uncertain, these features generally encompass more segments than specific line source features.

Specific area source features may utilize an optional biological buffer, but no other buffering is allowed. Specific area source features result in element occurrences with areal extent, defined as bounded areas.

Non-specific area source features may not use buffers of any kind. Procedural buffering is not allowed for the above mentioned reason. Because of the general nature of non-specific area source features, both biological and spatial accuracy buffers are implied. For purposes of comparison, a spatial accuracy buffer is assigned to a point source feature to represent a circular feature. The resulting size and shape of this circular feature encompasses all area over which the occurrence has possible influence. Likewise, the size and shape of the non-specific area source feature is chosen in such a way as to include all area over which the occurrence may have a possible influence. In this case, however, the source feature is already an area, and no further processing is required.
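The buffer rules summarized above can be expressed as a small lookup, sketched here in Python. The function and constant names are hypothetical, the codes P/L/A and S/N follow the SOURCETYPE and ACC_TYPE conventions used in this document, and taking the larger of the procedural and biological distances is this sketch's reading of "the biological buffer alone, not a combination of the two"; the case of a biological buffer smaller than the procedural one is an assumption.

```python
from enum import Enum


class Rule(Enum):
    REQUIRED = "required"
    OPTIONAL = "optional"
    NOT_ALLOWED = "not allowed"


R, O, X = Rule.REQUIRED, Rule.OPTIONAL, Rule.NOT_ALLOWED

# (source type, accuracy type) -> rules for each buffer type
BUFFER_RULES = {
    ("P", "S"): {"procedural": R, "biological": O, "spatial_accuracy": X},
    ("P", "N"): {"procedural": X, "biological": X, "spatial_accuracy": R},
    ("L", "S"): {"procedural": R, "biological": O, "spatial_accuracy": X},
    ("L", "N"): {"procedural": R, "biological": O, "spatial_accuracy": X},
    ("A", "S"): {"procedural": X, "biological": O, "spatial_accuracy": X},
    ("A", "N"): {"procedural": X, "biological": X, "spatial_accuracy": X},
}


def buffer_distance(source_type, acc_type, biological_m=None,
                    spatial_accuracy_m=None, min_mappable_m=80.0):
    """Return the total buffer distance in meters for one source feature."""
    rules = BUFFER_RULES[(source_type, acc_type)]
    if biological_m is not None and rules["biological"] is X:
        raise ValueError("biological buffer not allowed for this feature type")
    if spatial_accuracy_m is not None and rules["spatial_accuracy"] is X:
        raise ValueError("spatial accuracy buffer not allowed for this feature type")
    if rules["spatial_accuracy"] is R:
        if spatial_accuracy_m is None:
            raise ValueError("spatial accuracy buffer is required")
        return spatial_accuracy_m
    procedural = min_mappable_m if rules["procedural"] is R else 0.0
    # The buffers do not add up: the larger one stands alone.
    return max(procedural, biological_m or 0.0)
```

For example, a specific point with a 200 meter biological buffer yields 200 meters rather than 280, and a non-specific area yields no buffer at all.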

IV) Tabular information referring to the GIS characteristics of the element occurrence should be stored in an internal feature attribute table. These data are distinct from all other tabular information stored about the element occurrence such as scientific name, global rank, federal listing status, etc. in that they are tied directly to the GIS characteristics of element occurrences. Other supporting information should be stored in a separate database which can be linked to the GIS spatial data set.

Columns contained in the GIS feature attribute table serve two functions. First, they may be used as a "primary key" to provide a direct link to other data tables. Second, they may provide specific information about GIS properties. Examine the following table structure from the California Natural Diversity Database:

COLUMN  ITEM NAME    WIDTH  OUTPUT  TYPE  N.DEC  DESCRIPTION
     1  AREA             8      18  F     5      Standard ArcInfo item
     9  PERIMETER        8      18  F     5      Standard ArcInfo item
    17  EO#              4       5  B     -      Standard ArcInfo item
    21  EO-ID            4       5  B     -      Standard ArcInfo item
    25  PARTS            7       7  I     -      Number of parts for the occurrence
    32  MAPNDX           5       5  C     -      Primary key for spatial features
    37  EONDX            6       6  I     -      Primary key for element occurrences
    43  ELCODE          10      10  C     -      Primary key for elements
    53  EONUM            3       3  I     -      Compound key for occurrences
    56  SOURCETYPE       1       1  C     -      Source feature type (P, L or A)
    57  ELTYPE_CODE      1       1  I     -      Element type (1, 2, 3 or 4)
    58  ACC_CLASS        2       2  I     -      Accuracy class (1-10)
    60  EOCOUNT          2       2  I     -      No. of occurrences at this location

** REDEFINED ITEMS **
    57  LUCODE           5       5  I     -      Used for map symbology



In this example, items MAPNDX, EONDX and ELCODE are primary keys, allowing connection to external tables. ACC_CLASS and SOURCETYPE contain values which refer to GIS properties of the features. The inclusion of all of these columns is not mandatory, but ACC_CLASS and SOURCETYPE would be required for the GIS model to be implemented as presented in this document (see the complete California Natural Diversity Database metadata at http://www.dfg.ca.gov/Nddb/meta.html for information on the other items listed here).
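The item listing can be checked mechanically. The short Python sketch below assumes that COLUMN is a 1-based starting position and WIDTH a storage width, as the listing suggests; the helper names are hypothetical. Under that reading, the redefined item LUCODE (starting at column 57, width 5) spans exactly ELTYPE_CODE, ACC_CLASS and EOCOUNT, which is an inference from the numbers rather than something stated in the metadata.

```python
# (starting column, item name, width), copied from the listing above
ITEMS = [
    (1, "AREA", 8), (9, "PERIMETER", 8), (17, "EO#", 4), (21, "EO-ID", 4),
    (25, "PARTS", 7), (32, "MAPNDX", 5), (37, "EONDX", 6), (43, "ELCODE", 10),
    (53, "EONUM", 3), (56, "SOURCETYPE", 1), (57, "ELTYPE_CODE", 1),
    (58, "ACC_CLASS", 2), (60, "EOCOUNT", 2),
]


def is_contiguous(items):
    """True if every item starts where the previous one ends."""
    expected = 1
    for column, _name, width in items:
        if column != expected:
            return False
        expected += width
    return True


def record_length(items):
    """Last position occupied by any item."""
    return max(column + width - 1 for column, _name, width in items)


def items_within(items, start, width):
    """Names of the items that lie entirely inside a redefined span."""
    end = start + width
    return [name for column, name, w in items
            if column >= start and column + w <= end]
```

Calling `items_within(ITEMS, 57, 5)` lists the items that a redefined item at column 57 with width 5 would overlay.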

Because the actual accuracy value assigned to a feature is dependent on the source feature type, and applied differently based upon it, this information is best stored in a separate lookup table, using the ACC_CLASS field as a relating item:

COLUMN  ITEM NAME    WIDTH  OUTPUT  TYPE  N.DEC  DESCRIPTION
     1  ACC_CLASS        2       2  I     -      Accuracy class (1-10)
     3  ACC_TYPE         1       1  C     -      Accuracy type (S or N)
     4  ACC_VALUE        4       4  I     -      Spatial accuracy (in meters)
     8  ACC_TEXT        50      50  C     -      Description



The above referenced items are described as follows:

Accuracy Class Description


  1. Specific bounded area with an 80 meter radius
  2. Specific bounded area
  3. Non-specific bounded area
  4. Circular feature with a 150 meter radius (1/10 mile)
  5. Circular feature with a 300 meter radius (1/5 mile)
  6. Circular feature with a 600 meter radius (2/5 mile)
  7. Circular feature with a 1000 meter radius (3/5 mile)
  8. Circular feature with a 1300 meter radius (4/5 mile)
  9. Circular feature with a 1600 meter radius (1 mile)
  10. Circular feature with an 8000 meter radius (5 miles)
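The relate between the feature attribute table and the accuracy lookup table can be sketched as a dictionary keyed by ACC_CLASS. The descriptions come from the list above; the ACC_TYPE and ACC_VALUE entries are this sketch's reading of the rest of the document (class 1 is specific with an 80 meter value, classes 2 and 3 have no defined value, and classes 4 to 10 are non-specific) and should be treated as assumptions. The function name and the sample record are hypothetical.

```python
# ACC_CLASS -> (ACC_TYPE, ACC_VALUE in meters or None, ACC_TEXT)
ACC_LOOKUP = {
    1: ("S", 80, "Specific bounded area with an 80 meter radius"),
    2: ("S", None, "Specific bounded area"),
    3: ("N", None, "Non-specific bounded area"),
    4: ("N", 150, "Circular feature with a 150 meter radius (1/10 mile)"),
    5: ("N", 300, "Circular feature with a 300 meter radius (1/5 mile)"),
    6: ("N", 600, "Circular feature with a 600 meter radius (2/5 mile)"),
    7: ("N", 1000, "Circular feature with a 1000 meter radius (3/5 mile)"),
    8: ("N", 1300, "Circular feature with a 1300 meter radius (4/5 mile)"),
    9: ("N", 1600, "Circular feature with a 1600 meter radius (1 mile)"),
    10: ("N", 8000, "Circular feature with an 8000 meter radius (5 miles)"),
}


def with_accuracy(feature):
    """Return a copy of a feature record joined to its accuracy lookup row."""
    acc_type, acc_value, acc_text = ACC_LOOKUP[feature["ACC_CLASS"]]
    return {**feature, "ACC_TYPE": acc_type,
            "ACC_VALUE": acc_value, "ACC_TEXT": acc_text}


# Hypothetical feature record carrying only the relating item and a key.
example = with_accuracy({"EONDX": 1, "SOURCETYPE": "P", "ACC_CLASS": 5})
```

Keeping the accuracy details in one place means a program can adjust its increments by editing the lookup table rather than every feature record.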

Spatial Accuracy Values

The spatial accuracy value expressed in meters for any source feature cannot be greater (expressed as a lower number) than the published accuracy of the source map used as a base. Like buffers, the application of the spatial accuracy value varies based on source feature type.

Presently, California has chosen to apply spatial accuracy values to point source features only. The definitions listed here for line and area source features represent possible but unimplemented solutions.

Points.

· A specific point source feature would have a spatial accuracy value equal to the minimum mappable unit. In California this has been set to 80 meters.

· A non-specific point source feature would have a spatial accuracy value based on how accurately the feature has been mapped and expressed as a metric distance, plus or minus. Although the distances used could have been at any increment, California has standardized on the following: 80, 150, 300, 600, 1000, 1300, 1600 and 8000 meters.
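One way to apply the standardized increments is to round an estimated positional uncertainty up to the next available distance. The Python sketch below does this with a binary search; the rounding-up rule and the function name are assumptions of this example, not a documented CNDDB procedure.

```python
import bisect

# Standardized distances in meters, as listed above
STANDARD_RADII_M = [80, 150, 300, 600, 1000, 1300, 1600, 8000]


def standardized_radius(uncertainty_m):
    """Smallest standardized distance that is >= the estimated uncertainty."""
    i = bisect.bisect_left(STANDARD_RADII_M, uncertainty_m)
    if i == len(STANDARD_RADII_M):
        raise ValueError("uncertainty exceeds the largest standardized distance")
    return STANDARD_RADII_M[i]
```

Under this rule an estimated uncertainty of 200 meters would be recorded with the 300 meter increment.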

Lines.

· A specific line source feature would have a spatial accuracy value equal to the minimum mappable unit.

· A non-specific line source feature would also have a spatial accuracy value equal to the minimum mappable unit. Recall that in this case, non-specific does not imply that the physical location of the stream segments is in question, but rather that the position of the occurrence along those segments is uncertain. For this reason the source feature itself will include all likely line segments, making the occurrence longer than a specific occurrence might be. Put another way, the variability in buffers for non-specific line source features increases in a linear direction only, as opposed to radially in all directions as in a buffer for a non-specific point source feature.

Areas.

· A specific area source feature would have a spatial accuracy value equal to the minimum mappable unit. This is true even though no buffer has been applied, because the feature represented has the same accuracy value as that published for the base map.

· A non-specific area source feature would have a spatial accuracy value based on how accurately the feature has been mapped and expressed as a metric distance, plus or minus. These distances should be the same as those used for non-specific point source features (see above). Actual buffers are not allowed for non-specific area source features because their general nature implies that they are built in. This permits a range of values with which to address the issue of "how non-specific is this non-specific feature?"


A visual representation of an element occurrence is produced using this formula:


Observed Reality
  (cartographically symbolized as a Point, Line or Area Source Feature,
  and optionally modified or enhanced by a Procedural, Biological or Spatial accuracy Buffer)
= Element Occurrence

The following table summarizes, by source feature, which type of buffers apply in EO representation:

Source feature type         Point                       Line                        Area
Accuracy type (ACC_TYPE)    Specific      Non-specific  Specific      Non-specific  Specific      Non-specific

Procedural buffer           Required      Not allowed   Required      Required      Not allowed   Not allowed
Biological buffer           Optional      Not allowed   Optional      Optional      Optional      Not allowed (implied)
Spatial accuracy buffer     Not allowed   Required      Not allowed   Not allowed   Not allowed   Not allowed

Accuracy value (ACC_VALUE)  Same as       Same as       Undefined     Undefined     Undefined     Undefined
                            procedural    spatial
                            buffer        accuracy
                            (80 meters)   buffer
                                          (150-8000
                                          meters)

Accuracy class (ACC_CLASS)  1             4-10          2             3             2             3

Resulting EO spatial type   Circular      Circular      Specific      Non-specific  Specific      Non-specific
                            specific      non-specific  bounded       bounded       bounded       bounded
                            bounded       bounded       area          area          area          area
                            area          area

Accuracy types by source feature:

· Point, specific: x,y coords or known location.

· Point, non-specific: uncertain or vague location.

· Line, specific: digitized, described, or captured from an accurate source map.

· Line, non-specific: uncertain or vague location along an accurately described line.

· Area, specific: digitized, described, or captured from an accurate source map.

· Area, non-specific: uncertain or vague location.

This is a constantly evolving model. Because of this, it is quite likely that key components have changed by the time of publication. I encourage interested persons to contact me for an updated copy.


Patrick Gaul, Applications Development Coordinator
California Natural Diversity Database
California Department of Fish and Game, Natural Heritage Division
1220 "S" Street
Sacramento, CA 95814
Telephone: (916) 322-1950
Fax: (916) 324-0475
Email: pgaul@dfg.ca.gov
URL: http://www.dfg.ca.gov