Richard J. Aspinall, Gretchen Burton and Lisa Landenburger

Mapping and Modeling Wildlife Species Distribution for Biodiversity Management

 

Abstract

ARCINFO is used to manage species distribution data in conjunction with environmental data. Maps of species records and three types of model of species distribution are constructed using ARCINFO. The models use i) a simple index of habitat preference, ii) statistical description of the environmental conditions with which the species is associated, and iii) a spatial model based on Bayes' theorem. The appropriate use of these methods is discussed in the context of biodiversity management using examples from the Cairngorm Mountains in Scotland and Yellowstone National Park in the USA.


Introduction

Mapping and modelling species distribution is increasingly important for a variety of management and decision-support purposes. Management of biodiversity, species protection, species re-introduction, control of noxious weeds, and prediction of possible impacts of land use or climate changes are all areas of land management that require detailed information based on knowledge of the distribution of organisms and the relationships between distribution and environmental variables. GIS are important both for managing distribution records and environmental data and as an environment within which to develop models of distribution that contribute to these management activities.

The construction of GIS databases for managing and mapping species distribution records and environmental data is relatively straightforward, and biological recording schemes routinely use GIS and/or relational database technology to manage and maintain their data resources. Once species distribution and environmental data are integrated within a GIS environment they can potentially be used to develop predictive models of species distribution. There is, however, a very large variety of methods available for modelling distribution, many of which are not fully integrated with GIS. Statistical and spatial methods have been developed and applied to wildlife-environment relationships both to interpret environmental associations from patterns of distribution and to model distribution from limited survey and incomplete biological records. These methods include: Artificial Intelligence (Folse et al 1989; Davey and Stockwell 1991; Saarenmaa et al 1988, 1994), Bayesian models (Milne et al 1989; Pereira and Itami 1991; Aspinall 1992, 1994), boolean logic (Jensen et al 1992), canonical correspondence analysis (Hill 1991), decision-trees (Walker and Moore 1988; Walker 1990; Grubb and King 1991; Lees and Ritman 1991; Moore et al 1991), generalised linear models (Austin et al 1990; Buckland and Elston 1993), habitat preference indices (Duncan 1983), habitat suitability models (Breininger et al 1991), monotonic functions (Mackey 1993), and a variety of types of multivariate analysis (Livingstone et al 1990; Clark et al 1993). A number of purpose-written software packages have also been developed to apply particular forms of modelling and analysis, including BIOCLIM (Lindenmayer et al 1991), based on statistical description of climatic data and indices, and HABITAT (Walker and Cocks 1991), which identifies the envelope of environmental conditions within which a species is located.

Three of these approaches are used in the analyses presented in this paper. These are:

  1. a habitat preference index,
  2. statistical description of environmental characteristics, and
  3. Bayesian modelling.

These provide alternative but complementary approaches to the question of modelling species distribution and illustrate the importance of selecting a method that respects the data inputs being used and that provides results that can be interpreted by the user.

Study Areas and Data

Two study areas are used. The first is the Cairngorm Mountains in Scotland, UK; the second is Yellowstone National Park (YNP) within the Greater Yellowstone Ecosystem (GYE) in Montana, Idaho and Wyoming, USA. The Cairngorms study area covers 11,465 sq km while YNP covers about 8,500 sq km.

Species data

Species data for each area are compiled from biological survey and from casual (opportunistic) sightings reported to biological recorders. For the GYE, range maps from atlases and identification guides as well as data from GAP analyses in Idaho, Montana and Wyoming, are used to supplement the species records from survey and sightings. The spatial units for recording species distribution are 1km squares in the Cairngorms and 10km squares in the GYE.

Each of these data sources has characteristics that impact on analyses. In general, biological records can be considered a random sample, although they also record the distribution of observer/recorder effort. A particular issue with biological records based on casual sightings, however, is that they record only presence; there is no record of areas that were visited or surveyed but in which the species was not found.

Distribution data for Red Squirrel (Sciurus vulgaris) are used as a test dataset for the Cairngorms; these data have a 1km resolution. Distribution data for Bison (Bison bison) are used for the GYE; these data have a 10km resolution.

Environmental data

Scotland

Three main environmental datasets were used in the Cairngorms. A classification of the environmental characteristics of the Cairngorms into eight types provides an ecoregion dataset. Baseline climate data describing the 30-year mean monthly temperature (1951-80) and rainfall (1941-70) were calculated from meteorological station data (Aspinall and Matthews, 1994). The dates differ because these were the latest periods for which data were available when the climatic datasets were being developed (between 1989 and 1992). Mapping was carried out with trend surface analysis and geostatistics, using topography and distance from the sea as covariates, to produce a climate database with 1km spatial resolution for the whole of Scotland (Aspinall and Miller, 1990; Aspinall and Matthews, 1994). The method describes both large-scale trends in the data and more local regional variation. Since the Cairngorms area contains very few meteorological stations, models of climate based only on the local data would be unreliable; the digital climate data for the Cairngorms are therefore extracted from the Scotland-wide dataset. The data are available for each month as a series of raster surfaces with a geographic resolution of 1km.

Land cover types for the Cairngorms are from the Land Cover of Scotland (1988) dataset. This dataset is described in MLURI (1993) and the quality of the data has been subject to extensive analyses (Aspinall and Pearson, 1995). The base data contain over 1300 categories, although many of these are of limited extent and not important nationally. The key to the land cover types is hierarchical and a 34-class reorganisation of the key was used in these analyses.

Yellowstone National Park

Environmental data for YNP are USGS digital elevation data and LANDSAT thematic data classified to provide a map of land cover. A spatially detailed climate database is currently being developed for the whole GYE by the Geographic Information and Analysis Center and Mountain Research Center at Montana State University, Bozeman. Land cover data for the GYE are extracted from LANDSAT imagery. No ecoregion typology currently exists for the GYE although one will be constructed from the base environmental data once they are complete.

Analytical Methods

Habitat Preference Index

The habitat preference index was proposed by Duncan (1983) as a normalised version of an index developed by Hunter (1962). The index is the percentage of an environmental attribute in the area occupied by the species divided by the percentage of that attribute in the whole area being considered. The value is transformed by adding one and taking the logarithm (base 10).

The value provides an index of the importance of each category of the environmental variable to the distribution of the species. The minimum value is 0.0. Values less than 0.3 (= log10 2.0) reflect a category that is under-represented, suggesting avoidance. Values close to 0.3 represent occurrence approximately in proportion to availability, as is expected if use is random in relation to the area of the category. Values in excess of 0.3 indicate over-representation and suggest a preference for that category of the environmental variable.
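The calculation can be illustrated with a minimal Python sketch. It is not the AML implementation used in this study; the function name, the category names and the counts of squares are hypothetical and chosen only to show the arithmetic of the index.

```python
import math


def habitat_preference_index(occupied_area, total_area):
    """Return log10(ratio + 1) for each category of an environmental variable.

    ratio = (% of the category in the area occupied by the species)
            / (% of the category in the whole study area)
    """
    occupied_sum = sum(occupied_area.values())
    total_sum = sum(total_area.values())
    index = {}
    for category, area in total_area.items():
        if area == 0:
            continue  # category absent from the study area: index undefined
        pct_total = 100.0 * area / total_sum
        pct_occupied = 100.0 * occupied_area.get(category, 0.0) / occupied_sum
        index[category] = math.log10(pct_occupied / pct_total + 1.0)
    return index


# Hypothetical numbers of 1km squares per land cover category.
total = {"woodland": 200, "moorland": 600, "grassland": 200}
occupied = {"woodland": 60, "moorland": 20, "grassland": 20}

hpi = habitat_preference_index(occupied, total)
for category, value in sorted(hpi.items()):
    print(f"{category}: {value:.3f}")
```

In this invented example grassland is used in proportion to its availability and so scores log10 2.0, woodland scores above that value and moorland below it.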

Statistical description of environmental characteristics

Species distribution is associated with climatic conditions for each month of the year (mean monthly mean temperature and mean monthly rainfall) using statistical descriptors. These associations are expressed as a description of the statistical properties of the climatic conditions in the locations where a species is recorded. Minimum, maximum, mean and standard deviation are reported.

The statistical descriptors may also be used to model species distribution. The minimum and maximum for each variable can be used to set range limits on distribution through overlay. The mean and standard deviation can be used to develop a probabilistic model of distribution through overlay of climate data, in which the conditions in a given location are compared with the mean for each variable using statistics based on the normal distribution.
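Both uses of the descriptors can be sketched in a few lines of Python. The temperatures below are hypothetical, and the two-tailed normal probability used as a similarity score is one plausible reading of "normal distribution based statistics", not necessarily the statistic computed by the AMLs.

```python
import math
import statistics


def describe(values):
    """Minimum, maximum, mean and sample standard deviation of one variable."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "sd": statistics.stdev(values),
    }


def within_range(value, stats):
    """Range-limit model: is the value inside the observed envelope?"""
    return stats["min"] <= value <= stats["max"]


def normal_similarity(value, stats):
    """Two-tailed normal probability of a value at least this far from the mean.

    Returns 1.0 at the mean and approaches 0.0 far from it (an assumed score).
    """
    z = abs(value - stats["mean"]) / stats["sd"]
    return math.erfc(z / math.sqrt(2.0))


# Hypothetical July mean temperatures (deg C) in squares with species records.
july_temperature = [11.0, 12.0, 13.0, 12.5, 11.5]
july_stats = describe(july_temperature)

for candidate in (12.2, 9.0, 14.0):
    print(candidate, within_range(candidate, july_stats),
          round(normal_similarity(candidate, july_stats), 3))
```

Applied to every cell of a climate surface, the first function gives a presence/absence envelope and the second a continuous score; results for several variables would then be combined by overlay.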

There are a number of issues associated with this approach, however, not least that there are 24 climate variables, which permit a total of almost 17 million combinations of any number of the variables that may be used to construct models. Choosing which variables to use from these 17 million combinations would have to be based on other knowledge, for example using only the months that correspond to the breeding season or wintering period for the species. In order to provide a distribution map as an output from the climate data, a model of distribution is developed from the mean annual mean temperature and the mean total annual rainfall using the Bayesian method described next.
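The figure of almost 17 million follows from counting every non-empty subset of the 24 monthly variables, as the short Python check below shows. The split into 12 temperature and 12 rainfall surfaces is taken from the text; the variable names are illustrative.

```python
import math

n_variables = 24  # 12 monthly temperature + 12 monthly rainfall surfaces

# Every variable is either in or out of a model; exclude the empty set.
non_empty_subsets = 2 ** n_variables - 1

# The same total obtained by summing the combinations of each size.
by_size = sum(math.comb(n_variables, k) for k in range(1, n_variables + 1))

print(non_empty_subsets)
```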

More detailed climatic associations based on monthly climatic records are also described using the mean, standard deviation, minimum and maximum of each variable in the total area where the species has been recorded as present; these relationships are not mapped since there are a large number of permutations of the 24 input variables.

Bayesian modelling

A model of distribution is developed through use of a Bayesian method that has been fully described in Aspinall (1992, 1994). The method has been specifically designed to operate with biological records for rare and under-recorded species and with environmental data of differing spatial resolutions (Aspinall 1994). It is based on calculating conditional probabilities from the relative frequency of association between the species data and attributes of the environmental data (eg climate, land cover). In this it is similar to the habitat preference index, although the associations are expressed as conditional probabilities of finding (and not finding) the species given that value of the environmental variable. When the species data and environmental data have different spatial resolutions, as is frequently the case when biological records are related to environmental data, the coarse resolution units of the species data are used as a sampling frame, so that the model is generated at the spatial resolution of the environmental data. The conditional probabilities are considered to describe ecological relationships between the species and the environmental data. The output model is produced by applying Bayes' theorem to the conditional probabilities for the different environmental datasets. In this study, two Bayesian models are produced for the Cairngorms. The first relates species distribution to land cover; the second uses mean annual mean temperature and mean total annual rainfall to provide a distribution output from climate data to accompany the tabular output from the statistical description.
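The core of the calculation can be sketched in Python as follows. This is a generic illustration, not the procedure of Aspinall (1992): it assumes equal prior probabilities, treats squares without records as absences, assumes the environmental datasets are conditionally independent, and does not attempt the handling of differing spatial resolutions. All names and data are hypothetical.

```python
from collections import Counter


def conditional_probabilities(classes, present):
    """Relative frequency of each class among squares with and without records.

    Returns {class: (P(class | present), P(class | absent))}.
    """
    with_records = Counter(c for c, p in zip(classes, present) if p)
    without_records = Counter(c for c, p in zip(classes, present) if not p)
    n_present = sum(with_records.values())
    n_absent = sum(without_records.values())
    return {
        c: (with_records[c] / n_present, without_records[c] / n_absent)
        for c in set(classes)
    }


def posterior_presence(evidence, prior=0.5):
    """Bayes' theorem for presence given one (P(e|present), P(e|absent)) pair
    per environmental dataset, assuming conditional independence."""
    like_present = prior
    like_absent = 1.0 - prior
    for p_given_present, p_given_absent in evidence:
        like_present *= p_given_present
        like_absent *= p_given_absent
    total = like_present + like_absent
    return like_present / total if total > 0 else prior


# Hypothetical land cover of ten squares and whether the species was recorded.
land_cover = ["pine"] * 4 + ["heather"] * 6
recorded = [True, True, True, False, True, False, False, False, False, False]

cover_probs = conditional_probabilities(land_cover, recorded)
p_pine = posterior_presence([cover_probs["pine"]])

# Adding evidence from a second, hypothetical dataset (a temperature class).
temperature_evidence = (0.6, 0.3)
p_pine_warm = posterior_presence([cover_probs["pine"], temperature_evidence])
print(round(p_pine, 3), round(p_pine_warm, 3))
```

Evaluating the posterior for the classes present in each cell of the environmental data gives a probability surface at the resolution of those data.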

A Bayesian model is developed for Bison in Yellowstone National Park based on elevation and land cover.

Data requirements of the methods

The three methods examined each have different requirements for the environmental data. The habitat preference index is for categorical data. Each of the categories in an environmental dataset is analysed to determine the degree of association between that category and the species. The results are themselves best treated as categorical but may be used in rule-based models to identify areas where the species will be present and absent based on over- and under-representation of a category. The method based on statistical description of environmental data is for continuous measurement on interval or ratio scales. The Bayesian method can use both continuous and categorical data at the same time. This is an important property of the method, and is particularly useful for combining categorical coverages (eg land cover, soils) with surface or field data (eg temperature, rainfall, elevation) in a single model of distribution.
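A rule-based use of the habitat preference index might look like the Python sketch below. The tolerance band around log10 2.0, the category names and the index values are all hypothetical; the text does not specify how close to 0.3 a value must be to count as proportional use.

```python
import math

NEUTRAL = math.log10(2.0)  # index value when use is proportional to availability


def classify(index_value, tolerance=0.05):
    """Label an index value; the tolerance band is an assumed parameter."""
    if index_value > NEUTRAL + tolerance:
        return "over-represented"
    if index_value < NEUTRAL - tolerance:
        return "under-represented"
    return "proportional"


def predict_presence(cell_categories, index_by_category):
    """Rule: predict presence only in cells whose category is over-represented."""
    return [
        classify(index_by_category[category]) == "over-represented"
        for category in cell_categories
    ]


# Hypothetical index values per land cover category and a row of map cells.
index_by_category = {"woodland": 0.602, "moorland": 0.125, "grassland": 0.301}
cells = ["woodland", "moorland", "grassland", "woodland"]
predicted = predict_presence(cells, index_by_category)
print(predicted)
```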

Implementation

These methods are implemented as an integrated series of Arc Macro Language (AML) programmes that analyse the data within ARCINFO GRID. A simple menu interface links these programmes to allow data entry, update or analysis to be chosen and carried out. There are two sets of outputs from the AMLs. Three maps and tables, in standard format, are produced for each species and these can be viewed on screen or printed. The output models are also available for further analysis in ARCINFO.

Results

The results of analyses for Red Squirrel in the Cairngorms are given as a series of figures, including both maps and tables, that are a standard output from the AML-based analyses.

Figure 1 shows a map of the distribution of records for Red Squirrel and reports the number of 1km squares where presence is recorded in the study area.

Figure 1. Red Squirrels in Cairngorms area. Records.

Figure 2 shows the associations between the species records and climate. The map is the probability of species presence in each 1km square based on use of mean annual temperature and mean total rainfall. The table reports the mean, standard deviation, minimum and maximum temperature and rainfall for each month of the year in the set of locations from which the species has been recorded.

Figure 2. Statistical and Bayesian models for Red Squirrels in Cairngorms area from climate data.

Figure 3 shows associations between the species records and land cover. The map is the probability of species presence; the table reports the probabilities of species occurrence in each land cover class as a percentage. The table is reported in decreasing order of probability and only probabilities greater than 60% are shown.

Figure 3. Bayesian Model for Red Squirrels in Cairngorms area from land cover data.

Figure 4. Bayesian Model for Bison in Yellowstone National Park from land cover and elevation data.

Discussion

General Observations

The data available in existing databases provide a source on which to base a first estimate of the possible distribution and climate/land cover associations of species. Repeated modelling as new data are collected, field validation, and feedback from models into data collection will improve both the data available and their use for modelling habitat changes. Biological records are usually incomplete and, at best, give only an indication of the likely distribution of a species in an area. Models developed from climate, land cover and other environmental data may, however, serve a useful purpose as a framework for targeting further data collection. Models also provide a first indication of other areas with similar environmental conditions to those in which the species have already been recorded. Models can also be used to target surveys for individual species or, used in combination and with the environmental data, to design sampling schemes for survey of all species. All of these may find valuable application in systematic analysis and management of biodiversity.

Data Management: Biological Records

The species data available for this study vary widely in condition. For example, many of the data have unknown georeferencing accuracy and describe distribution at a very coarse scale, certainly at scales that are not of use for land management. With increased use of GIS and wide availability of accurately georeferenced environmental databases, it is increasingly important that accurate georeferencing becomes established as a fundamental component of biological recording (even though animals move). This will allow increased benefit to be gained from biological records. Additionally, date-stamping should be encouraged as part of a data standard for biological records as this adds to the value of records, particularly in studies of change in distribution over time.

Ecological Studies of Habitat Relations

The models produced suggest climatic, land cover and other environmental associations for the different species studied. All three methods allow rigorous identification of important ranges of environmental conditions. The use of 1km georeferencing as a basic unit in the Cairngorms and 10km in Yellowstone National Park limits the extent to which studies of the impact of habitat fragmentation can be carried out, since this spatial unit is far larger than the basic mapping unit of most environmental data. In landscape ecological studies, for example, it is normal practice to use detailed descriptions of land cover patches as the basis for studies of fragmentation. These descriptions are at a much finer spatial resolution than the species data recorded in biological recording schemes. Models such as those produced here do, however, indicate possible associations between species and environmental conditions as a set of hypotheses that can be tested through systematic field survey. The relationships can also form the basis for models that predict how distribution of species may change as environmental conditions change in response to a range of social, economic, political and environmental stimuli. Perhaps the most appropriate use of the output of models of the distribution of species is to target and focus further study and to help visualise the importance of individual sites and locations within the study areas.

Summary

Use of these methods, and related techniques, will grow in future as databases for GIS become more widely available, as techniques are developed that run within GIS, and as the need grows for detailed information about the interrelationships of species and environment. The information will be increasingly needed for input to discussion of sustainability and biodiversity management as well as environmental decision making. GIS can play an important role in these debates. Development of suitable methods for analysis and their comparative evaluation in different geographic regions and for different taxonomic groups will be necessary for GIS to provide information with sufficient detail and reliability.

Acknowledgements

The work in the Cairngorms was developed while RJA was employed at the Macaulay Land Use Research Institute in Aberdeen, Scotland.

References

Aspinall, R. J. (1992) An inductive modelling procedure based on Bayes' theorem for analysis of pattern in spatial data. International Journal of Geographical Information Systems, 6(2), 105-121.

Aspinall, R. J. (1994) Exploratory spatial analysis in GIS: generating geographical hypotheses from spatial data. In: Worboys, M. F. (ed) Innovations in GIS 1. Taylor and Francis, London. pp. 139-147.

Aspinall, R. J. and Matthews, K. (1994) Climate Change Impact on Distribution and Abundance of Wildlife: An analytical approach using GIS. Environmental Pollution 86, 217-223.

Aspinall, R. J. and Miller, D. R. (1990) Mixing climate change models with remotely-sensed data using raster based GIS. In: Coulson, M. G. (ed.) Remote Sensing and Global Change. Proceedings of the 16th Annual Conference of the Remote Sensing Society. pp. 1-11.

Aspinall, R. J. and Pearson, D. M. (1995) Describing and managing uncertainty of categorical maps in GIS. In: Fisher, P. F. (ed) Innovations in GIS 2. Taylor and Francis, London. pp. 71-83.

Austin, M. P., Nicholls, A. O. and Margules, C. R. (1990) Measurement of the realized quantitative niche: environmental niche of five Eucalyptus species. Ecological Monographs 60(2): 161-177

Breininger, D. R., Provancha, M. J. and Smith, R. B. (1991) Mapping Florida Scrub Jay habitat for purposes of land management. Photogrammetric Engineering and Remote Sensing 57(11): 1467-1474

Buckland, S. T. and Elston, D. A. (1993) Empirical models for the spatial distribution of wildlife. Journal of Applied Ecology 30: 478-495.

Clark, J. D., Dunn, J. E. and Smith, K. G. (1993) A multivariate model of female Black Bear habitat use for a Geographic Information System. Journal of Wildlife Management 57(3): 519-526

Davey, S. M. and Stockwell, D. R. B. (1991) Incorporating wildlife habitat into an AI environment: concepts, theory and practicalities. AI Applications 5(2): 59-104

Duncan, P. (1983) Determinants of the use of habitat by horses in a Mediterranean wetland. Journal of Animal Ecology, 52, 93-109.

Folse, L. J., Packard, J. M. and Grant, W. E. (1989) AI modelling of animal movements in a heterogeneous habitat. Ecological Modelling 46: 57-72

Grubb, T. G. and King, R. M. (1991) Assessing human disturbance of breeding Bald Eagles with classification tree models. Journal of Wildlife Management 55(3): 500-511

Hill, M. O. (1991) Patterns of species distribution in Britain elucidated by canonical correspondence analysis. Journal of Biogeography 18: 247-255

Hunter, R. F. (1962) Hill sheep and their pasture: a study of sheep grazing in south east Scotland. Journal of Ecology, 50, 651-80.

Jensen, J. R., Narumalani, S., Weatherbee, O. and Morris, K. S. (1992) Predictive modelling of Cattail and Waterlily distribution in a South Carolina Reservoir using GIS. Photogrammetric Engineering and Remote Sensing 58(11): 1561-1568

Lees, B. G. and Ritman, K. (1991) Decision-tree and rule induction approach to integration of remotely sensed and GIS data in mapping vegetation in disturbed hilly environments. Environmental Management 15(6): 823-831

Lindenmayer, D. B., Nix, H. A., McMahon, J. P., Hutchinson, M. F. and Tanton, M. T. (1991) The conservation of Leadbeater's possum Gymnobelideus leadbeateri (McCoy): a case study of the use of bioclimatic modelling. Journal of Biogeography 18: 371-383

Livingstone, S. A., Todd, C. S., Krohn, W. B. and Owen, R. B. (1990) Habitat models for nesting Bald Eagles in Maine. Journal of Wildlife Management 54(4): 644-653

Mackey, B. G. (1993) A spatial analysis of the environmental relations of rainforest structural types. Journal of Biogeography 20: 303-336

Milne, B. T., Johnston, K. M. and Forman, R. T. T. (1989) Scale-dependent proximity of wildlife habitat in a spatially-neutral Bayesian model. Landscape Ecology 2(2): 101-110

Moore, D. M., Lees, B. G. and Davey, S. M. (1991) A new method for predicting vegetation distributions using decision tree analysis in a geographic information system. Environmental Management 15(1): 59-71

MLURI (1993) The Land Cover of Scotland (1988). The Macaulay Land Use Research Institute, Aberdeen.

Pereira, J. M. C. and Itami, R. M. (1991) GIS-based habitat modelling using logistic multiple regression: a case study of the Mt Graham Red Squirrel. Photogrammetric Engineering and Remote Sensing 57(11): 1475-1486

Saarenmaa, H., Perttunen, J., Väkevä, J. and Nikula, A. (1994) Object-oriented modelling of tasks and agents in integrated forest health management. AI Applications 8(1): 43-59

Saarenmaa, H., Stone, N. D., Folse, L. J., Packard, J. M., Grant, W. E., Makela, M. E. and Coulson, R. N. (1988) An artificial intelligence modelling approach to simulating animal/habitat interactions. Ecological Modelling 44: 125-141

Walker, P. A. (1990) Modelling wildlife distributions using a geographic information system: kangaroos in relation to climate. Journal of Biogeography 17: 279-289

Walker, P. A. and Cocks, K. D. (1991) HABITAT: a procedure for modelling a disjoint environmental envelope for a plant or animal species. Global Ecology and Biogeography Letters 1: 108-118

Walker, P. A. and Moore, D. M. (1988) SIMPLE: an inductive modelling and mapping tool for spatially-oriented data. International Journal of Geographical Information Systems 2(4): 347-363


Richard J. Aspinall, Director
Gretchen Burton and Lisa Landenburger, GIS Specialists
Geographic Information and Analysis Center
Montana State University
Bozeman
MT 59717
Tel: (406) 994 2374
FAX: (406) 994 5122
e-mail: aspinall@sun2.giac.montana.edu
WWW: http://sun1.giac.montana.edu