Neil Brooker

THE DEVELOPMENT OF A MULTI-RESOLUTIONAL INTEGRATED LAND COVER DATABASE FOR SCOTLAND AND THE APPLICATION OF GIS IN PREDICTIVE SPATIAL MODELLING.

In Scotland a wide variety of agencies collect data and compile statistics on land cover. They have tended to adopt differing data collection methodologies, at different resolutions and with differing classification schemes, each reflecting available resources and varying policy objectives and data needs. The result has been a multiplicity of uncoordinated, contradictory land cover statistics. This paper outlines work aimed at coordinating these disparate sources of information by creating a multi-resolutional integrated land cover database, and describes procedures which utilise customised GIS capabilities to add value to land cover knowledge by applying predictive spatial modelling algorithms to the integrated database.


BACKGROUND

The aim of this paper is to describe a collaborative research project in the U.K. between the
Macaulay Land Use Research Institute (MLURI) and the Institute of Terrestrial Ecology (ITE)
in which the main objective was to compare, quantitatively and spatially, data from three land
cover surveys: the Land Cover of Scotland 1988 (LCS88), the Countryside Survey 1990 (CS90),
and the Land Cover Map of Great Britain from satellite imagery (LCMGB). The underlying
rationale of the work was to:

1) Provide classification comparisons and thus to move towards standardised land cover
descriptions and statistics for Scotland.
2) Develop an integrated database with GIS functionality in order to enhance the value of each
survey.
3) Provide a framework for recompiling CS90 results independently of the ITE land class
stratification by using the LCS88 census as a carrying surface.

The project addressed three main questions:

1) To what extent can the methods and results of the surveys be integrated to provide enhanced
land cover information?
2) To what extent can the detailed CS90 information be applied to LCS88 categories?
3) To what extent are the classification systems comparable?

SURVEY OUTLINES

The surveys differ in data collection methodology and scale, from large scale field mapping
of sample 1 km squares (CS90), through medium scale census mapping from aerial
photographs (LCS88), to semi-automated census mapping using satellite technology (LCMGB).
Each offers distinct benefits, which integration should be able to increase.

The Countryside Survey 1990

The Institute of Terrestrial Ecology (ITE) has carried out three major surveys of Great Britain, in
1977/78 (Bunce and Heal, 1984), 1984 (Barr et al., 1985) and 1990 (Barr et al., 1993), using a
sampling approach based on the ITE Land Classification. This uses the national grid as a sampling 
frame for the large scale collection of environmental variables within 1 kilometre squares (Coppock 
and Kirby, 1987) assigned to one of 32 land classes. In Scotland the vegetation in 193 of these 
sample squares was mapped and national stock estimates obtained by extrapolation using the land class 
framework.

The Land Cover Map of Great Britain from Satellite Imagery

The field survey data of the Countryside Survey 1990 have been integrated with a land cover
dataset obtained from satellite imagery to produce a digital database of land cover for the whole
of Great Britain. Unlike the field survey, this is the first such coverage and therefore only
provides stock information, not change statistics (Barr et al., 1993). 

The satellite data were aggregated to 25 consistently recognised (or target) land cover classes
using computer classifications of summer and winter data, with a baseline date of 1990. The
classification is hierarchical; a large number of spectrally unique sub-classes were used to define
the target classes which, in turn, can be aggregated to 17 Key cover types or 9 Major cover
types. User-specific generalisation or detailed enquiry of the data can be obtained by aggregation
or disaggregation of classes.

The Land Cover of Scotland 1988
 
The Macaulay Land Use Research Institute has produced a national census of land cover to
determine the extent and distribution of land cover, from air photo interpretation, field validation
and computer-aided interrogation of digitised data. These data will serve both as an inventory of
land cover and as a baseline against which changes may be measured and monitored (Aspinall
et al., 1991). 
                                    
The project was based upon the interpretation from 1:24,000 scale photography of a land cover
classification which includes 6 principal land cover categories and 127 associated land cover
classes, with approximately 1,300 mosaic (combination) classes. Areal, linear and point features were
identified with stipulated minimum mappable units: 10 ha for most areal units, 2 ha for woodland, 5 ha
for built areas and 200 m in length for linear features (MLURI, 1993).
 
The surveys differ in the resolution of data capture, their coverage (census versus sample) and
the classification systems used. However, the differences are not simply a matter of mapping
resolution and idiom of data capture; they result from a complex interaction between
survey objectives, notably which data are seen as relevant, which data can be collected
consistently and unambiguously, and what level of detail is required to meet the objectives of the
survey. Details of the surveys are summarised in Table 1.


METHODS

The comparisons used the digital map data from the respective surveys and were undertaken
in two parts:

1) The comparison of the LCS88 and the CS90 field survey using the CS90 sample of 193 kilometre
squares in Scotland.
2) The comparison of LCS88 with the LCMGB satellite survey using the 193 kilometre squares of the CS90
field survey plus four additional blocks of 40 x 40 km.

The research used ArcInfo GIS software on a Unix platform to overlay the spatial data from the
respective surveys. Arc Macro Language (AML) routines (Esri, 1991) were written for the spatial 
overlays and subsequent spatial analysis (Brooker, 1993). The research attempted to take
account of some of the errors associated with the collection and digital representation of map
information (for the LCS88 / CS90 comparison). These ranged from errors of interpretation
(referential error) and data input, to errors of positioning (absolute errors) and errors associated
with data manipulation (relative errors) (MLURI, 1993). The analysis of errors or uncertainty
relating to polygon boundaries was included in the AMLs by using epsilon bands, or buffers
(Burrough, 1986). These were applied around the LCS88 polygon boundaries, at a width of 20 metres,
to compensate for any positional or recording errors. The choice of this width was related to
the relative and absolute errors, as well as the width of delineation lines associated with digitised
features at 1:25,000 scale (Aspinall et al., 1993). All subsequent data processing thus
concentrated on areas deemed to be 'more certain' (i.e. excluding buffered areas).
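
The exclusion of boundary zones can be illustrated on a raster version of the data. The following Python fragment is a minimal sketch only and is not the AML used in the project: the grid, the cell size and the function name certain_mask are assumptions, and the 20 metre width is treated here as a distance between cell centres on either side of a boundary.

```python
from math import ceil, hypot


def certain_mask(grid, cell_size_m, band_width_m=20.0):
    """Flag cells lying outside the epsilon band around category boundaries.

    grid is a list of rows of category labels. A cell is 'uncertain' (False)
    when any cell of a different category has its centre within band_width_m.
    """
    rows, cols = len(grid), len(grid[0])
    reach = ceil(band_width_m / cell_size_m)  # search radius in cells
    mask = [[True] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            for dr in range(-reach, reach + 1):
                for dc in range(-reach, reach + 1):
                    rr, cc = r + dr, c + dc
                    if not (0 <= rr < rows and 0 <= cc < cols):
                        continue  # neighbour falls outside the grid
                    if hypot(dr, dc) * cell_size_m > band_width_m:
                        continue  # neighbour lies beyond the band
                    if grid[rr][cc] != grid[r][c]:
                        mask[r][c] = False
    return mask


# Hypothetical strip of 10 m cells crossing one boundary between categories.
strip = [["A", "A", "A", "A", "B", "B", "B", "B"]]
mask = certain_mask(strip, cell_size_m=10.0)
```

In this example the two cells on each side of the A/B boundary are excluded, and later processing would use only the cells where the mask is True.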

The results of the comparisons were derived from two statistical analyses:

1) The description of relatively small scale LCS88 categories in terms of the large scale CS90
field survey categories, expressed as percentages with error limits.
2) The analysis of one-to-one category agreements from an agreement matrix, data for which
were derived from the overlaying of the spatial data in raster (Grid) format.
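
The second analysis can be sketched as a cross-tabulation of two co-registered rasters. The Python below is an illustrative assumption, not the project's Grid procedure: the function names, the tiny grids and the category labels are hypothetical, and the optional mask stands for the 'more certain' areas left after buffering.

```python
from collections import Counter


def agreement_matrix(grid_a, grid_b, mask=None):
    """Count cells for every (category in A, category in B) pairing."""
    counts = Counter()
    for r, (row_a, row_b) in enumerate(zip(grid_a, grid_b)):
        for c, (cat_a, cat_b) in enumerate(zip(row_a, row_b)):
            if mask is not None and not mask[r][c]:
                continue  # skip cells inside the uncertainty band
            counts[(cat_a, cat_b)] += 1
    return counts


def row_percentages(counts):
    """Express each pairing as a percentage of its category in survey A."""
    totals = Counter()
    for (cat_a, _), n in counts.items():
        totals[cat_a] += n
    return {pair: 100.0 * n / totals[pair[0]] for pair, n in counts.items()}


# Hypothetical 2 x 2 overlay: survey A (LCS88-like) against survey B (CS90-like).
lcs88 = [["moor", "moor"], ["moor", "pasture"]]
cs90 = [["wet heath", "wet heath"], ["bog", "grass"]]
counts = agreement_matrix(lcs88, cs90)
shares = row_percentages(counts)
```

A high percentage in a single cell of a row indicates a one-to-one agreement; a row spread over several cells indicates a one-to-many relationship.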

While the research used available data from Scotland, the methods and developed uses of results
could be more widely applicable. More detail on methods may be found in Brooker (1995).

RESULTS

The results from the above procedures were expressed as simple comparison tables. An example is given
in Table 2, which describes LCS88 category 'Wet heather moor' in terms of the CS90 categories found
within the LCS88 category for the whole sample, the mean percentage of the LCS88 category covered by
each CS90 category, and the 95% confidence limits for these means.

                                                          
Table 2 shows that LCS88 category 'Wet heather moor' has been described as consisting of 59.1% CS90 
category 'Wet heaths and saturated bogs', with high confidence, and by various other CS90 categories. 
It should be noted that 92% of the LCS88 categories were represented in the sample, at varying frequencies. 
This has implications for the use of the data.
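
The form of the statistics in such a table can be sketched as follows. The paper does not state how the confidence limits were calculated, so the normal approximation (mean plus or minus 1.96 standard errors), the function name and the per-square percentages below are all assumptions made for illustration; with few sample squares a t-based interval would be wider.

```python
from math import sqrt
from statistics import mean, stdev


def mean_with_limits(percentages, z=1.96):
    """Mean percentage cover and approximate 95% confidence limits.

    percentages holds one value per sample square in which the LCS88
    category occurs. Limits are clipped to the valid range 0 to 100.
    """
    m = mean(percentages)
    if len(percentages) < 2:
        return m, None, None  # limits are undefined for a single square
    half_width = z * stdev(percentages) / sqrt(len(percentages))
    return m, max(0.0, m - half_width), min(100.0, m + half_width)


# Hypothetical percentages of one CS90 category in three sample squares.
m, lower, upper = mean_with_limits([50.0, 60.0, 70.0])
```

Narrow limits around the mean correspond to the 'high confidence' descriptions referred to in the text.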

Results showed some high one-to-one category agreements between LCS88 and the other surveys, although
some level of category aggregation was necessary to achieve this. Further, there were many one (LCS88)
to many (CS90 field survey and LCMGB) agreements, where agreement is based on confidence statistics.
Finally, there were a few categories which showed no direct comparisons. This is due either to
absent or low occurrence in the samples or to the lack of similar categories in the other surveys.

Value can potentially be added to the surveys in three main areas:

1) High one-to-one agreements between categories of different surveys mean that these
categories can be cross-referenced and area statistics from the individual surveys can be
reconciled.
2) High one-to-many agreements further define LCS88 categories by using the small mapping
threshold data of the field survey and satellite land cover information.
3) Where agreements are high the LCS88 can be used as an alternative to the ITE Land Class
system as an extrapolation framework for CS90 field survey data.

Results of the comparisons are given fully in Brooker (1995).

PREDICTIVE SPATIAL MODELLING

The comparisons show both strong and weak links between the datasets. These can be modelled
using the data storage, manipulation and visualisation capabilities of a GIS.

The CS90/LCS88 data are stored in a GIS database in the form shown in Table 2 and are one
of the main data inputs into a menu-driven user interface designed to display and model
information. The interface incorporates the primary ArcInfo map display and analysis tools and
many customised routines and network capabilities. In brief, these capabilities, allied with the
customised modelling routines outlined below, represent a powerful tool for data analysis.
Further, the interface is connected to a library of the LCS88 dataset, enabling the extraction of
any particular area or region of interest in raster and vector formats.

Several routines have been developed which, by using the LCS88 dataset as a response surface
in combination with the integrated dataset described, can provide spatial and quantitative data
enhancement for any chosen area. A major defining capability in all these enhancements and
predictions is the ability of the user to apply their own confidence thresholds to the data.

Predicting Total CS90 Category Coverage

This routine allows the prediction of the spatial distribution of all CS90 reporting or species
categories for a selected area. Visualisation of the data is based on dominance, with the highest
constituent CS90 cover type in each LCS88 category being spatially represented as the
distribution of the LCS88 category. Figure 1 illustrates this capability for CS90 categories for
Scotland. As many of the 'dominant' constituents actually make up less than 50% of the LCS88
categories, the routine supplies a 'scroll coverage' of the area of interest which can be
manipulated to visualise the next highest constituent CS90 category, as many times as required
by the user. The full range of detailed CS90 information can thus be interrogated and quantified.
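
The dominance display and the 'scroll' to the next highest constituent can be sketched in Python as below. This is an assumed illustration rather than the routine itself: the function names and the grid are hypothetical, and of the percentages only the 59.1% for 'Wet heaths and saturated bogs' comes from the text; the remaining values and some category names are invented.

```python
def ranked_constituents(composition, rank=0):
    """For each LCS88 category return its rank-th largest CS90 constituent.

    rank=0 gives the dominant constituent, rank=1 the next highest, and so
    on. Categories with fewer than rank + 1 constituents map to None.
    """
    result = {}
    for lcs88_cat, parts in composition.items():
        ordered = sorted(parts.items(), key=lambda item: item[1], reverse=True)
        result[lcs88_cat] = ordered[rank] if rank < len(ordered) else None
    return result


def relabel(grid, constituents):
    """Replace each LCS88 label in the grid by the chosen CS90 category."""
    return [[constituents[cell][0] if constituents.get(cell) else None
             for cell in row]
            for row in grid]


# Mean percentage composition; values other than 59.1 are invented.
composition = {
    "Wet heather moor": {"Wet heaths and saturated bogs": 59.1,
                         "Moorland grass": 22.0,
                         "Open-canopy heath": 18.9},
    "Improved pasture": {"Managed grassland": 88.0, "Tilled land": 12.0},
}
lcs88_grid = [["Wet heather moor", "Improved pasture"]]
dominant = relabel(lcs88_grid, ranked_constituents(composition, rank=0))
third = relabel(lcs88_grid, ranked_constituents(composition, rank=2))
```

Increasing rank by one each time corresponds to one step of the scroll; a user-defined confidence threshold could be applied by filtering the composition table before ranking.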


Predicting Distribution and Abundance of CS90 Categories

Data in the integrated dataset can provide predicted distributions of individual CS90 cover types. 
All data for the requested cover type are extracted from the CS90/LCS88 dataset and applied,
using the distributions of LCS88 categories, to the selected area as indicators of abundance.
Figure 2 is an example of this capability for the Dwarf heath shrub 'Calluna vulgaris', predicting
distribution from LCS88 and abundance from CS90, in this case in 25% ranges, for Scotland.
Summary statistics are available for each percentage range. The confidences associated with
making the predictions are also available for query or for displaying in map form.
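
A minimal Python sketch of the abundance classification follows. The exact definition of the ranges is not given in the text, so the half-open ranges used here (above 0 to 25, above 25 to 50, and so on), the function names and the percentages are assumptions for illustration.

```python
from math import ceil


def abundance_range(percent, width=25.0):
    """Return the (lower, upper) percentage range containing percent.

    Ranges are assumed to exclude their lower bound and include their upper
    bound. A percentage of zero or less means the cover type is not predicted.
    """
    if percent <= 0:
        return None
    upper = min(100.0, ceil(percent / width) * width)
    return (upper - width, upper)


def predict_abundance(grid, composition, cs90_category, width=25.0):
    """Map an LCS88 grid to abundance ranges of one CS90 cover type."""
    classes = {lcs88_cat: abundance_range(parts.get(cs90_category, 0.0), width)
               for lcs88_cat, parts in composition.items()}
    return [[classes.get(cell) for cell in row] for row in grid]


# Invented mean percentages of one CS90 cover type within two LCS88 categories.
composition = {
    "Dry heather moor": {"Calluna vulgaris": 62.0},
    "Improved pasture": {"Managed grassland": 88.0},
}
lcs88_grid = [["Dry heather moor", "Improved pasture"]]
predicted = predict_abundance(lcs88_grid, composition, "Calluna vulgaris")
```

Summary statistics per range would then follow from counting cells, or summing polygon areas, within each class.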


Describing LCS88 Cover Types By CS90 Categories

The main value of integrating LCS88 and CS90 data is the addition of information not available
from the air-photos alone. However, because of the one-to-many nature of the LCS88/CS90
comparison, visual representation of such data can be difficult. This routine addresses the
problem by allowing the data to be 'looked through' to obtain the full characterisation of
LCS88 categories in terms of CS90 categories. Figure 3 gives an example of this GIS capability,
as it would appear on the screen. The dataset has first been queried for where the LCS88
'Improved pasture' category occurs in Grampian Region. The CS90 constituent categories of this
LCS88 category are then shown in graphical form above the distribution, together with the total
area coverage and the estimated proportional areas of the CS90 categories. This provides a
method of displaying complex multi-scale data both spatially and statistically.


Searching for Occurrence of CS90 Categories

For any selected area the GIS can search for the predicted occurrence of a selected CS90
category within LCS88 categories. The routine then displays that distribution and gives
abundance estimates by LCS88 category. Figure 4 gives an example of this, as it would appear
on screen, for Calluna vulgaris in Grampian Region. Also given are estimates of gross area (from
LCS88) and total proportional area (based on the CS90 percentage information) and similar
breakdowns according to LCS88 categories and CS90 use and condition information, in this case
heather heights.


Area Calculation

This is a non-visual data capability which allows the estimation of areas for CS90 cover types
by specified map area. In brief the percentage information in the integrated dataset is related to
estimates of area for LCS88 categories using the LCS88 census results (MLURI, 1993). In
essence the LCS88 dataset is used as an alternative carrying framework for the CS90 data.
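
The calculation can be sketched as a weighted sum over LCS88 categories. In the Python below the function name, the areas and the percentages are hypothetical, and summing the lower and upper confidence limits to obtain minimum and maximum areas is an assumption about how the limits were carried through.

```python
def cs90_area_estimate(lcs88_areas_km2, composition, cs90_category):
    """Estimate the area of one CS90 cover type from LCS88 census areas.

    composition maps LCS88 category -> CS90 category -> a tuple of
    (mean %, lower limit %, upper limit %). Returns (mean, min, max) in km2.
    """
    mean_area = min_area = max_area = 0.0
    for lcs88_cat, area in lcs88_areas_km2.items():
        parts = composition.get(lcs88_cat, {})
        mean_pct, low_pct, high_pct = parts.get(cs90_category, (0.0, 0.0, 0.0))
        mean_area += area * mean_pct / 100.0
        min_area += area * low_pct / 100.0
        max_area += area * high_pct / 100.0
    return mean_area, min_area, max_area


# Invented census areas (km2) and composition for two LCS88 categories.
areas = {"A": 1000.0, "B": 500.0}
composition = {
    "A": {"X": (50.0, 40.0, 60.0)},
    "B": {"X": (20.0, 10.0, 30.0)},
}
mean_km2, min_km2, max_km2 = cs90_area_estimate(areas, composition, "X")
```

Restricting the area dictionary to the LCS88 categories present in a chosen map area gives the estimate for that area.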


Table 3 shows an application of this by providing predicted area estimates for example CS90
categories for Scotland. The table gives estimates of areas of each CS90 category in terms of a
mean and a minimum and maximum possible area based on confidence limits. Initial estimates
using this extrapolation routine provided some poor comparisons with estimates using the ITE
land classes. For example, the integrated CS90/LCS88 technique initially gave estimates for Wet
heaths and saturated bogs of 7408 km2 compared to 14900 km2 using the ITE land classes, and
3288 km2 against 5600 km2 for Open-canopy heath. There were two possible reasons for this:

1) Under-representation of ITE land classes in the sample.
2) The LCS88 categories in the comparison sample only cover 92% of the total land area of
Scotland.

Of these, the most significant in producing the spurious results was thought to be the 8% sample
shortfall. All of the LCS88 categories missing from the sample were mosaics (combinations of
two single categories), most with small national occurrences, according to LCS88. The GIS routine
for using LCS88 for CS90 data area estimation included estimates for LCS88 categories not
occurring in the sample by taking the LCS88 assumption of a 60:40 mosaic category ratio and
using the comparison data from the single LCS88 categories within the missing mosaics in those
ratios.
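
The treatment of missing mosaics amounts to a weighted combination of the two component compositions. The Python below is a sketch under assumptions: the function name and percentages are hypothetical, and the first-named component of a mosaic is assumed to take the 60% share.

```python
def mosaic_composition(primary, secondary, composition, ratio=(0.6, 0.4)):
    """Estimate the CS90 composition of an unsampled LCS88 mosaic.

    The mosaic is treated as 60% primary and 40% secondary single category,
    and the CS90 percentages of the two components are combined in that ratio.
    """
    w_primary, w_secondary = ratio
    first, second = composition[primary], composition[secondary]
    return {cat: w_primary * first.get(cat, 0.0) + w_secondary * second.get(cat, 0.0)
            for cat in set(first) | set(second)}


# Invented CS90 percentages for two single LCS88 categories.
composition = {
    "A": {"X": 50.0, "Y": 50.0},
    "B": {"X": 100.0},
}
mosaic = mosaic_composition("A", "B", composition)
```

Here the mosaic is estimated to contain 70% of category X and 30% of category Y; these derived percentages can then enter the area calculation alongside the sampled categories.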

It can be seen from Table 3 that the estimates are improved to the extent that most of the
estimates from the two techniques are, at least, within each other's standard errors or confidence
limits. The estimate for Wet heaths and saturated bogs has been improved to 14217 km2, with a high
estimate from the confidence limits of 14833 km2. This high estimate is within 100 km2 of the CS90
estimate and falls well within the standard error of the latter.

From the results it may be concluded that the LCS88, in conjunction with the integrated LCS88
/ CS90 dataset from the comparison, provides an alternative framework for determining spatial
distributions of CS90 data at the resolution of LCS88.

CONCLUSIONS

This report has described research to compare and integrate land cover information from the
Land Cover of Scotland 1988 (MLURI), the Countryside Survey 1990 (ITE) and the Land Cover Map 
of Great Britain from satellite imagery (ITE). The main aims have been to provide classification
comparisons from spatial analysis, to develop the ability to cross-reference land cover categories
from the respective surveys, and to develop an integrated database with GIS modelling
capabilities to enhance the value of the respective surveys.

Value has been added to the surveys in three main areas:

1) High one-to-one agreements between categories mean that the individual categories can be
cross-referenced and area statistics from the individual surveys can be readily integrated.
2) One-to-many agreements, where confidence limits are close to the means, add value to LCS88
by further defining its relatively small scale, large mapping threshold land cover data using
the large scale, small mapping threshold field survey and satellite land cover information.
3) Where agreements are high the LCS88 can be used as an alternative extrapolation framework
to the ITE Land Classes to provide more accurate spatial distributions of CS90 vegetation types
throughout Scotland.

The results of the research have allowed the initial development of an integrated database with
GIS modelling functionality to enhance the surveys. Thus, it is possible to predict both the
distribution and abundance of CS90 categories using the CS90 field data in conjunction with the
LCS88 census information. Further, LCS88 categories can be spatially represented in terms of
large scale vegetation community data, information not available from medium scale air photos.
Finally, estimates of sample data can be made using LCS88 as the carrying surface.

The research has answered some questions on the comparability of surveys, the potential
for integration and the use of GIS in predictive spatial modelling of land cover data. Many other
potentially useful land cover / land use datasets exist which might enhance, and be enhanced by,
this integrated approach, and work continues on such data to provide a complex, functional
multi-resolutional land cover database.


REFERENCES
Aspinall, R.J., Miller, D.R., and Birnie, R.V., 1991. From data source to database: acquisition
of land cover information for Scotland. Proceedings of Remote Sensing of the Environment,
Image Processing '91, Birmingham, 131 - 152.

Aspinall, R.J., Miller, D.R., and Richman, A.R., 1993. Data quality and error analysis in GIS:
measurement and use of metadata describing uncertainty in spatial data. Proceedings of the
Thirteenth Annual Esri User Conference, Palm Springs.

Barr, C.J., Ball, D.F., Bunce, R.G.H. and Whittaker, H.A., 1985. Rural land use and landscape
change. Annual report of the Institute of Terrestrial Ecology 1984, pp. 133-135.

Barr, C.J., Bunce, R.G.H., Clarke, R.T., Fuller, R.M., Furse, M.T., Gillespie, M.K., Groom, G.B.,
Hallam, C.J., Howard, D.C. and Ness, M.J., 1993. Countryside Survey 1990: Main Report. Report
to the Department of the Environment. Institute of Terrestrial Ecology and the Institute of
Freshwater Ecology.

Bunce, R.G.H. and Heal, O.W., 1984. Landscape evaluation and the impact of changing land use
on the rural environment: the problem and an approach. Planning and ecology. Roberts, R.D. and
Roberts, T.M. (Eds.), pp. 164-188. London, Chapman and Hall.

Burrough, P.A., 1986. Principles of Geographic Information Systems for Land Resources
Assessment. Oxford University Press.

Brooker, N.A., 1993. Addressing the problems of integrating data from land cover mapping
projects held in different GIS environments. Proceedings of the Thirteenth Annual Esri User
Conference, Esri, Redlands.

Brooker, N.A., 1995. The integration of land cover and ecological information from the
Countryside Survey 1990 and the Land Cover of Scotland 1988. Main Report (internal).
Macaulay Land Use Research Institute, Aberdeen.

Coppock, J.T. and Kirby, R.P., 1987. Review of approaches and sources for monitoring change
in the landscape of Scotland. Consultancy report for the Scottish Development Department.
Scottish Office.

Esri, 1991. Environmental Systems Research Institute, Inc. AML's users guide.  Esri, Redlands.

Macaulay Land Use Research Institute, 1993. The Land Cover of Scotland 1988. Final report.
The Macaulay Land Use Research Institute, Aberdeen.

Neil Brooker
Higher Scientific Officer,
Macaulay Land Use Research Institute,
Craigiebuckler,
Aberdeen, AB9 2QJ
U.K.
Telephone: (01224) 318611
Fax: (01224) 311556