David T. Hansen
The Sacramento - San Joaquin Delta is a major estuary on the west coast of the United States. The Delta lies at the confluence of the Sacramento and San Joaquin Rivers and receives drainage from about 40 percent of California. The watershed receives flows from the western slopes of the Sierra Nevada and the eastern slopes of the coastal mountain ranges, which drain into the Central Valley of California. This delta is unusual in that it is constricted at its outlet into San Francisco Bay by a series of parallel coastal mountains and valleys. The Delta is bordered by alluvial fans of the Sierra Nevada and Coast Ranges. During the Quaternary period, the Delta was subject to two main cycles of deposition and erosion. During interglacial periods, organic-rich sediment was actively deposited as tidewater invaded the area with the rise in sea level. During glacial periods, shorelines retreated and these sediments were actively eroded while glacial outwash from the Sierra Nevada was deposited as alluvium; eolian sands also appear to have been deposited during these periods. During the Holocene, wetlands expanded with the rise in sea level until about 1850. Active development of the Delta began around 1850 with the conversion of lands into farms and homesteads (Atwater and Belknap).
The Delta region in this study covers approximately 2,300 sq km (230,000 hectares). Prior to 1850, approximately half of the Delta consisted of subtidal wetlands. These wetlands are bordered by floodplains and alluvial fans of the tributary rivers, particularly the Sacramento, Mokelumne, and San Joaquin Rivers. It is an area of low relief. Over 74 percent of the area is under 10 feet (3.0 meters) in elevation, with 46 percent of the area at sea level or below. Approximately 17 percent of the area is between 0 and 5 feet (1.5 meters), and 11 percent is between 5 and 10 feet (3.0 meters) in elevation.
The Sacramento River historically delivers about three times as much water as the San Joaquin River system, of which the Mokelumne River is a part. Flow regimes for these river systems undergo large seasonal variations, with high flows occurring in winter and spring from rainfall and snowmelt and low flows typically occurring in late summer and fall. The Delta is subject to catastrophic floods, with extreme events raising water elevations to as much as 13 feet (4.0 meters) above sea level (Atwater and Belknap).
Higher areas within the wetland portion of the Delta are typically represented by supertidal natural levee deposits, chiefly along the Sacramento River, and by relict eolian deposits or sand mounds. The Mokelumne River on the east side of the Delta carries approximately 22 percent of the runoff for the San Joaquin River system. This river has built an extensive fan on the eastern side of the Delta that represents an older surface (West, Hansen, Welch). Figure 1 shows the overall Delta region and the major geomorphic units as identified by Atwater.
A total of 178 prehistoric sites have been recorded within the study area (see Figure 1). About 80 percent of these sites were recorded prior to 1960. Of these sites, a majority were recorded before 1950. Most of the sites are described as occurring on low, gently sloping mounds with elevations a few feet above the surrounding land surface. They are assumed to occur on natural rises that were enlarged by the accumulation of midden materials (West, Hansen, Welch). Sites are assumed to occur upslope of areas affected by tides. In tidal areas, they are associated with slight rises such as eolian deposits or supertidal natural levees. A majority of the sites (65 percent) are between 0 and 10 feet (3 meters) above mean sea level (West, Hansen, Welch).
Since 1850, the Delta has been extensively modified by artificial levees, drainage ditches, canals, dredging, and other developments. By 1930, about 1,800 sq km (178,000 hectares) had been converted to agricultural land. Since that time much of the remaining area of the Delta has been converted. Agricultural development on the peat and muck soils of the Delta has resulted in subsidence, decomposition, and deflation of these highly organic soils. Many of the former tidal areas are now several meters below sea level, with elevations ranging from -25 feet (-7.6 meters) to present sea level (West, Hansen, Welch). The Delta itself serves as a major conduit for fresh water for irrigation and municipal supply in the State of California. Approximately 70 percent of the water used for irrigation and municipalities flows through the Delta. This development has placed major stresses on the ecological health of the Delta region, and efforts to protect the environment of the Delta have placed constraints on the operation of the Federal and State water projects that supply water. To address these issues, a joint Federal and State program, CALFED, has been developing a long-range comprehensive plan for the Delta region. Because most of the archaeological sites were recorded before systematic survey methods came into use, a site density model cannot easily be developed. USBR archaeologists are interested in developing a model for predicting site occurrence that will support the CALFED planning effort.
Archaeological site data were collected from the California State Historic Preservation Office (SHPO). This information included site location as well as attributes identifying the date the site was recorded, the site elevation as reported by the observer, and categories for features observed at the site. These categories include habitation debris, burials, rock shelters, and unknown or other. Site locations were digitally captured and verified against the U.S. Geological Survey 7.5-minute quadrangle sheets of site locations held by SHPO. Delta surficial geology and geomorphology as mapped by Brian Atwater between 1979 and 1981 were also digitally captured (Atwater). Major elements from Atwater's study that were captured as separate themes include the 1850 tidal line identifying the maximum extent of wetland areas, geologic units, perennial lakes at about 1910, and former tidal and nontidal drainage channels. Many of the boundaries or lines developed by Atwater have an inferred locational uncertainty of up to 500 meters. Examination and treatment of this uncertainty are discussed in a paper presented at the 1998 Esri International User Conference (Hansen).
Examination of the archaeology site distribution with the geomorphic data indicated a relationship between site locations and the 1850 tide line as identified by Atwater. Over half of the sites are within 1000 meters of this tide line and 75 percent are within 2000 meters. Examination of the sites with the major geologic and geomorphic units mapped by Atwater indicated a disproportionate distribution of sites among certain map units. The main units associated with the archaeology sites are:
GIS provides a graphic display of the conceptual model for the site distribution. The number of sites meeting characteristics identified in the conceptual model and the percentage of the area represented can be reported. GIS display alone, however, does not estimate the probability of a site occurring where one or more of these characteristics are met. Weights of evidence is a Bayesian approach for combining data to predict the occurrence of events. It is based on the presence or absence of a characteristic or pattern and the occurrence of an event. In spatial analysis, it has been used extensively in the mineral and mining fields. An estimate of the prior probability of the occurrence of an event is based on the total number of events distributed over the area. A posterior probability is calculated for a characteristic of a data theme based on this prior probability and the presence or absence of this characteristic with the events. The odds of occurrence are calculated for the event of interest. In the weights of evidence method, these odds are converted to natural logarithms (logits) to produce weights for the characteristic of the theme. For our archaeology sites, weights for a theme attribute such as geology map unit can be calculated based on the presence or absence of sites in each geologic unit. For a particular spatial pattern or characteristic, weights are calculated based on:
This method depends on several assumptions. The event of interest, such as an archaeology site, is assumed to be a point that is recorded only once and is not represented by multiple points. Calculations are based on a unit area for measuring the total study area and for calculating weights. This unit area is used as a GRID to calculate the total study area and the areas containing or lacking the spatial patterns of interest, with and without sites. Where weights are calculated for several individual features or themes and then combined, it is assumed that these features are conditionally independent of each other. Weights are based on the areal extent of features in the themes used as evidence. Linear features can be buffered to generate an area based on the distance from or to the feature. Bonham-Carter provides an excellent description of weights of evidence and its application in Geographic Information Systems for Geoscientists.
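The weight calculation described above can be sketched in a few lines of Python. This is a minimal sketch, assuming the standard formulation given by Bonham-Carter, in which W+ and W- are natural logarithms of ratios of conditional probabilities (pattern present or absent, given a point or no point) and the contrast is W+ minus W-. The function name, argument names, and the example counts are hypothetical and are not taken from the ArcView extension.

```python
import math


def weights_of_evidence(pattern_units, total_units, pattern_points, total_points):
    """Return (W+, W-, contrast) for one binary pattern, or None if undefined.

    Areas are counts of unit cells; each point is assumed to occupy one unit cell.
    """
    # Unit cells with and without a point, inside and outside the pattern.
    inside_with = pattern_points
    inside_without = pattern_units - pattern_points
    outside_with = total_points - pattern_points
    outside_without = (total_units - pattern_units) - outside_with

    if min(inside_with, inside_without, outside_with, outside_without) <= 0:
        return None  # a zero or negative count leaves a logarithm undefined

    no_point_units = total_units - total_points
    # W+ = ln[ P(pattern | point) / P(pattern | no point) ]
    w_plus = math.log((inside_with / total_points) / (inside_without / no_point_units))
    # W- = ln[ P(no pattern | point) / P(no pattern | no point) ]
    w_minus = math.log((outside_with / total_points) / (outside_without / no_point_units))
    return w_plus, w_minus, w_plus - w_minus


# Hypothetical counts: 1000 unit cells, 50 points, pattern covers 100 cells with 20 points.
result = weights_of_evidence(100, 1000, 20, 50)
if result is not None:
    w_plus, w_minus, contrast = result
    print(f"W+ = {w_plus:.3f}, W- = {w_minus:.3f}, contrast = {contrast:.3f}")
```

A pattern that contains no points, or all of them, returns None here, which mirrors the undefined entries that a weights table reports for such patterns.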
Weights of evidence is planned as part of the spatial statistics package for Arc/Info. Bonham-Carter and others have been instrumental in developing a weights of evidence extension to ArcView. This extension has been available on the web at http://gis.nrcan.gc.ca/software/arcview/wofe. The extension requires Spatial Analyst. Although the method is used in the minerals industry, few applications of weights of evidence have been developed in the natural resources area. Gary Raines, who assisted in the development of the ArcView extension, and Mark Mihalasky, who has applied the extension in the prediction of mineral deposits, provided our staff with training and assistance in using the extension to evaluate the data associated with the archaeology sites for the Delta.
The initial application of weights of evidence for archaeology sites in the Delta examined some characteristics of the conceptual model. A smaller area for the Delta was selected to evaluate the data themes generated from Atwater's study of the geology and geomorphology for the Delta.
This study area encompasses approximately 2,100 sq km (210,000 hectares) instead of the original 2,300 sq km. The number of archaeology sites in this portion of the Delta dropped from 178 to 151. The major themes developed from Atwater's mapping and modeled in this application of the extension are:
The geologic map units in this study area include the following units.
The first step in using weights of evidence is to define the study area for analysis, which must be a grid. A digital elevation model (DEM) was developed for the area based on the U.S. Geological Survey 30-meter DEMs. Major waterways in the Delta were captured from 1:24,000-scale U.S. Geological Survey 7.5-minute quadrangles. This hydrography was used with the effective area of the DEM data to produce a study area grid theme. Areas identified as water were set to no data (a value of -999) to eliminate them from analysis.
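The masking step can be illustrated on a toy grid. This sketch assumes the DEM and the rasterized hydrography are aligned grids of equal size. The function names, the use of None for cells outside the effective DEM area, and the value 1 for study-area cells are illustrative assumptions; only the -999 value comes from the text.

```python
NODATA = -999  # value used in the text for cells removed from analysis


def build_study_area(dem, water):
    """Return a grid with 1 for study-area cells and NODATA elsewhere.

    dem: rows of elevations, with None where the DEM has no effective coverage.
    water: rows of booleans, True where the cell is a waterway.
    """
    study_area = []
    for dem_row, water_row in zip(dem, water):
        study_area.append([
            NODATA if elevation is None or is_water else 1
            for elevation, is_water in zip(dem_row, water_row)
        ])
    return study_area


def count_study_cells(study_area):
    """Number of cells that remain available for analysis."""
    return sum(cell != NODATA for row in study_area for cell in row)


# Toy 2 x 3 example: one water cell and one cell with no DEM coverage.
dem = [[3, 2, None], [1, -4, 5]]
water = [[False, True, False], [False, False, False]]
grid = build_study_area(dem, water)
print(grid, count_study_cells(grid))
```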
Half of the 151 archaeological sites were randomly selected as a training set of 76 points to develop a probability surface. The unit area of analysis must then be defined. The extension calculates a suggested unit area based on the size of the study area and the number of points available. This unit area is the basis for tabulating the area of the patterns of features and the occurrence of events with or without those patterns when weights are calculated. The user also has the option of writing the selected options out to a text file for documentation. Finally, the value used to identify missing data must be specified. Since calculations of weights depend on the areal extent of features, any theme that is incomplete for the study area must have its missing areas recognized. For areas where an evidence theme is missing data, the weight is set to and reported as zero.
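The selection of a training set and the role of the unit area can be sketched as follows. The text does not say how the random selection was made or which unit area was chosen, so the seeded shuffle, the function names, and the 1 sq km unit area below are assumptions for illustration. The prior probability is taken here as the number of training points divided by the number of unit cells in the study area.

```python
import random


def split_sites(site_ids, seed=0):
    """Randomly split site identifiers into a training half and a held-out half."""
    rng = random.Random(seed)            # fixed seed so the split can be repeated
    shuffled = list(site_ids)
    rng.shuffle(shuffled)
    n_train = (len(shuffled) + 1) // 2   # round up: 151 sites give 76 training points
    return shuffled[:n_train], shuffled[n_train:]


def prior_probability(n_training_points, study_area_sqkm, unit_area_sqkm):
    """Prior probability of a site per unit cell."""
    return n_training_points / (study_area_sqkm / unit_area_sqkm)


training, held_out = split_sites(range(1, 152))
print(len(training), len(held_out))
# Hypothetical 1 sq km unit area over the roughly 2100 sq km study area.
print(prior_probability(len(training), 2100.0, 1.0))
```

A smaller unit area increases the number of unit cells and therefore lowers the prior, which is one reason the choice is worth documenting.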
Weights for individual themes can then be calculated. Table 1 shows the weights and contrast for the geology map units.
Map Group | Map Unit | Area (sqkm) | # of Points | W+ | W- | Contrast |
---|---|---|---|---|---|---|
1 | Qds | 10.9837 | 0 | -- | -- | -- |
2 | Qpm | 441.4515 | 2 | -2.092 | 0.213 | -2.305 |
3 | Qfp | 478.0372 | 23 | 0.314 | -0.111 | 0.425 |
3 | Ql | 107.243 | 19 | 1.764 | -0.243 | 2.008 |
3 | Qb | 207.7777 | 8 | 0.082 | -0.009 | 0.091 |
4 | Qm2e | 56.6033 | 10 | 1.761 | -0.118 | 1.879 |
4 | Qoe | 0.0428 | 1 | -- | -0.0137 | -- |
5 | Qm | 330.8605 | 3 | -1.394 | 0.133 | -1.527 |
6 | Qr | 1.0365 | 0 | -- | -- | -- |
6 | Qry | 57.9160 | 3 | 0.3926 | -0.013 | 0.405 |
6 | Qro | 201.9517 | 4 | -0.602 | 0.047 | -0.648 |
7 | Qyp | 5.7911 | 0 | -- | -- | -- |
7 | Qop | 0.0386 | 0 | -- | -- | -- |
7 | Qym | 0.3144 | 0 | -- | -- | -- |
7 | Qymc | 45.8683 | 1 | -0.504 | 0.008 | -0.512 |
7 | Qch | 28.5997 | 0 | -- | -- | -- |
7 | Qcr | 153.7973 | 2 | -1.029 | 0.049 | -1.079 |
8 | Qmz | 7.7961 | 0 | -- | -- | -- |
Once a table of weights for the attributes of a theme has been calculated, the user can evaluate the weights to determine which attributes are associated with the sites and which are not. As indicated in the documentation for the extension, for mineral analysis weights between 0.1 and 0.5 are mildly predictive, between 0.5 and 1 moderately predictive, between 1 and 2 strongly predictive, and greater than 2 extremely predictive. Weights for the geology map units indicate that natural levee deposits (Ql) and eolian deposits (Qm2e) are strongly predictive for the set of training points. The Riverbank formation (Qry), with a positive weight of 0.39, is mildly predictive on this scale. In addition, organic peat and muck of tidal wetlands have a strongly negative weight. This supports the conceptual model.
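The ranges quoted from the extension documentation can be applied mechanically to a column of weights. The sketch below is illustrative only: the function name, the treatment of values that fall exactly on a boundary, the label used for magnitudes below 0.1, and the use of the absolute value for negative weights are assumptions rather than rules stated in the documentation. The sample values are positive weights from Table 1.

```python
def predictive_strength(weight):
    """Map a weight to (direction, strength) using the ranges quoted in the text."""
    if weight is None:
        return ("undefined", "undefined")
    magnitude = abs(weight)
    if magnitude > 2:
        strength = "extremely predictive"
    elif magnitude > 1:
        strength = "strongly predictive"
    elif magnitude > 0.5:
        strength = "moderately predictive"
    elif magnitude >= 0.1:
        strength = "mildly predictive"
    else:
        strength = "not predictive"   # assumed label for magnitudes below 0.1
    direction = "positive" if weight > 0 else "negative" if weight < 0 else "neutral"
    return (direction, strength)


# Positive weights (W+) for selected geology map units from Table 1.
W_PLUS = {"Ql": 1.764, "Qm2e": 1.761, "Qfp": 0.314, "Qb": 0.082, "Qm": -1.394}
for unit, weight in W_PLUS.items():
    print(unit, predictive_strength(weight))
```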
In addition to calculating weights for polygon and grid themes, the extension has tools for calculating weights based on a buffered distance or direction from a linear feature. This tool was used to calculate weights from the 1850 tide line, and the tidal and nontidal channels identified by Atwater. These linear features were buffered at the following intervals and distances:
Once weights have been calculated for a set of themes, attributes are reclassified or grouped into binary classes. For this analysis, a value of 2 was used for inside (associated with the points) and a value of 1 for outside (not associated with the points). The following classification was used for the binary themes.
The binary classes assigned to these separate themes can then be combined to generate a probability surface. In this process a unique conditions grid is generated over the study area. This integer grid contains every condition or combination represented by the binary themes. The value attribute table (VAT) for this analysis contains items identifying each unique condition represented by the set of themes, the number of training points occurring in each condition, and the calculated probability and uncertainty represented by the data. Weights are reported for each theme that was combined and are listed in Table 2.
Theme | Outside | Inside | Contrast | Confidence |
---|---|---|---|---|
Geology Units | -1.215 | 0.672 | 1.887 | 6.187 |
1850 Tide Line | -2.816 | 0.235 | 3.050 | 3.029 |
Tidal Channels | -0.216 | 0.812 | 1.028 | 4.106 |
Nontidal Channels | -1.534 | 0.374 | 1.907 | 4.481 |
These data are the basis on which the extension calculates an overall test of conditional independence for the themes and a chi-squared test for each pair of themes. The overall test of conditional independence for this particular model is 0.76 (on a scale of 0.0 to 1.0, where 1.0 indicates conditional independence). The pairwise chi-squared test did not suggest that the null hypothesis of conditional independence should be rejected for any pair of themes.
The particular combination of these themes resulted in very low probabilities being calculated. The probability surface generated has a maximum value of 0.028. Figure 6 shows the probability surface resulting from combining these binary themes.
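The way binary themes combine into a probability for each unique condition can be sketched as follows. This assumes the usual weights of evidence rule under conditional independence: the weight for each theme (inside or outside) is added to the prior log odds, and the sum is converted back to a probability. The theme weights are those listed in Table 2. The prior used in the study is not reported, so the value below is a placeholder, and the printed probabilities are not expected to reproduce the surface in Figure 6. Function and variable names are hypothetical.

```python
import math
from itertools import product

# (outside weight, inside weight) for each binary theme, from Table 2.
THEME_WEIGHTS = {
    "Geology Units": (-1.215, 0.672),
    "1850 Tide Line": (-2.816, 0.235),
    "Tidal Channels": (-0.216, 0.812),
    "Nontidal Channels": (-1.534, 0.374),
}


def posterior_probability(prior, weights):
    """Add the weights to the prior log odds and convert back to a probability."""
    logit = math.log(prior / (1.0 - prior)) + sum(weights)
    return 1.0 / (1.0 + math.exp(-logit))


def unique_conditions(theme_weights, prior):
    """Posterior for every combination of binary classes (1 = outside, 2 = inside)."""
    themes = list(theme_weights)
    table = {}
    for classes in product((1, 2), repeat=len(themes)):
        weights = [theme_weights[t][c - 1] for t, c in zip(themes, classes)]
        table[classes] = posterior_probability(prior, weights)
    return table


PLACEHOLDER_PRIOR = 0.01  # hypothetical; not the prior used in the study
conditions = unique_conditions(THEME_WEIGHTS, PLACEHOLDER_PRIOR)
ranked = sorted(conditions.items(), key=lambda item: item[1], reverse=True)
for classes, probability in ranked[:3]:
    print(classes, round(probability, 4))
```

Four binary themes give 16 possible conditions, and the highest probability always falls where every theme is in class 2.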
The weights of evidence extension provides the opportunity to explore other data themes and calculate weights based on attributes of those themes. After the initial analysis of the distribution of archaeology sites with the geology data, additional data themes were incorporated into the analysis. One of these themes is the Natural Resources Conservation Service (NRCS) detailed soil surveys (SSURGO) for Sacramento, San Joaquin, Yolo, and Contra Costa Counties. These NRCS soil surveys represent field mapping of soils for a major portion of the Delta area at a scale of 1:24,000. In addition, the digital elevation model (DEM) and grids derived from it, such as slope and flow direction, were included. Major waterways in the Delta were captured from 1:24,000-scale U.S. Geological Survey 7.5-minute quadrangles. Geology was included from the initial weights of evidence model.
The binary classification of the geology map units from the first combination of themes was used in this model. Figure 7 shows the binary classification for the geology theme and the distribution of archaeology sites.
Analysis of the NRCS soils data relied on the soil taxonomic classification of the individual soil components that make up the soil map units. For the study area, most soil map units consist of a single named soil series. A few map units contain two named soils or a named soil and a miscellaneous land type. In these cases, the dominant named soil was used to represent the map unit. Water was one major miscellaneous land type represented in the combined surveys and often occurred along the major rivers. The study area grid theme largely eliminated water from the analysis area. Soil taxonomic classes are composed of several parts and provide a fairly simple method for grouping the soil map units by dominant characteristics. Table 3 lists the soil great groups, with their associated weights, that were placed in binary class 2 (inside the predictive area).
Symbol | Great Group | Area (sqkm) | # of Points | W+ | W- | Contrast |
---|---|---|---|---|---|---|
AXENA | Natrixeralfs | 2.4908 | 2 | 4.66 | -0.02 | 4.68 |
EAQFL | Fluvaquents | 72.5631 | 3 | 0.11 | -0.00 | 0.11 |
EAQHA | Haplaquents | 22.6426 | 4 | 1.72 | -0.04 | 1.76 |
EFLXE | Xerofluvents | 97.3195 | 16 | 1.63 | -0.19 | 1.82 |
IAQHP | Haplaquepts | 37.1426 | 6 | 1.61 | -0.06 | 1.67 |
MXEDU | Durixerolls | 37.3700 | 4 | 1.13 | -0.03 | 1.16 |
MXEHA | Haploxerolls | 270.1878 | 19 | 0.67 | -0.14 | 0.81 |
In addition to these soil great groups, two soil subgroups were included that had not been captured at the great group level. Table 4 identifies these two subgroups, which were carried as class 2 (inside) in the binary classification of soils.
Symbol | Subgroup | Area (sqkm) | # of Points | W+ | W- | Contrast |
---|---|---|---|---|---|---|
DU02 | Duric | 58.3015 | 4 | 0.65 | -0.02 | 0.67 |
FL02 | Fluvaquentic | 21.2899 | 4 | 0.65 | -0.02 | 0.67 |
Figure 8 shows the binary classification of the soil theme and the distribution of the archaeology sites.
A buffered distance from the major water features captured in the hydrology layer was used in this iteration rather than the buffered distance from the 1850 tide line. The line features from hydrology were buffered at an interval of 30 meters out to a distance of 900 meters. Weights were then calculated for these buffers. Figure 9 charts the positive weight, negative weight, and contrast for a portion of these distances. Based on a drop in the contrast at a distance of about 100 meters, areas within 90 meters of a hydrologic feature were classed as 2 (inside the predictive surface) in the buffered hydrology theme.
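Choosing a buffer cutoff from the contrast curve can be sketched as below. The sketch accumulates area and points outward from the feature, band by band, computes the contrast for each cumulative distance, and takes the distance with the highest contrast as the cutoff. Picking the maximum is one simple rule and is an assumption here; the text describes choosing the cutoff from a drop in the charted contrast. The band counts are hypothetical and are not the Delta values.

```python
import math


def contrast(pattern_units, total_units, pattern_points, total_points):
    """Contrast (W+ minus W-) for one binary pattern; areas are counts of unit cells."""
    no_point_units = total_units - total_points
    w_plus = math.log((pattern_points / total_points)
                      / ((pattern_units - pattern_points) / no_point_units))
    outside_points = total_points - pattern_points
    outside_units = total_units - pattern_units
    w_minus = math.log((outside_points / total_points)
                       / ((outside_units - outside_points) / no_point_units))
    return w_plus - w_minus


def cumulative_contrasts(bands, total_units, total_points):
    """bands: (outer distance in meters, unit cells in band, points in band), nearest first."""
    results = []
    units = points = 0
    for distance, band_units, band_points in bands:
        units += band_units
        points += band_points
        results.append((distance, contrast(units, total_units, points, total_points)))
    return results


# Hypothetical 30 m bands out to 150 m; these are NOT the Delta values.
BANDS = [(30, 60, 14), (60, 60, 12), (90, 60, 10), (120, 60, 3), (150, 60, 2)]
curve = cumulative_contrasts(BANDS, total_units=2000, total_points=76)
best_distance, best_contrast = max(curve, key=lambda pair: pair[1])
for distance, value in curve:
    print(distance, round(value, 3))
print("cutoff:", best_distance)
```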
Figure 10 shows the binary theme for the buffered hydrology with the binary theme for soils in the background.
The DEM elevation data were converted into an integer grid for use with weights of evidence. For the study area, elevations ranged from -25 feet to 164 feet. Weights were calculated in both ascending and descending mode to see what values might be effective in developing a binary classification. All of the training points had elevations between -13 and 20 feet. Positive values in the contrast occurred between elevations of 1 and 18 feet. In ascending mode, positive contrast values occurred between 10 and 20 feet. In descending mode, positive values occurred between 10 and -12 feet. For an initial comparison, elevations between 5 and 20 feet were classed as 2 (inside), with all other elevations set to 1. Another iteration of the binary classes for elevation would be run for elevations of 1 to 18 feet. Figure 11 shows the results of this binary classification with surface hydrology.
A variety of alternatives were evaluated for representing surface slope and for identifying local topographic highs. Most of the study area has very little change in elevation. Both natural and man-made features such as levees, roads, and dikes cause a large number of sinks when the DEM is processed for flow direction. Initial processing identified over 1200 sinks. In examining the DEM and the elevation changes associated with these sinks, it appeared that a relationship existed between local elevation changes and the site locations. A variety of outputs from processing the DEM were evaluated to try to represent this. Weights were calculated for:
Figure 13 shows the resulting binary classification for slope drop.
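To illustrate what a sink is in a DEM, the sketch below flags single-cell pits: interior cells that are lower than all eight of their neighbors. This is a deliberate simplification and an assumption for illustration. Flow direction processing in a GIS also treats flat or multi-cell depressions as sinks, and the function name and toy grid are hypothetical.

```python
def find_pits(dem):
    """Return (row, col) of interior cells strictly lower than all eight neighbors."""
    pits = []
    for r in range(1, len(dem) - 1):
        for c in range(1, len(dem[r]) - 1):
            neighbors = [
                dem[r + dr][c + dc]
                for dr in (-1, 0, 1)
                for dc in (-1, 0, 1)
                if (dr, dc) != (0, 0)
            ]
            if all(dem[r][c] < neighbor for neighbor in neighbors):
                pits.append((r, c))
    return pits


# Toy 5 x 5 surface (feet) with two isolated low cells and two isolated high cells.
dem = [
    [5, 5, 5, 5, 5],
    [5, 2, 5, 6, 5],
    [5, 5, 5, 5, 5],
    [5, 6, 5, 1, 5],
    [5, 5, 5, 5, 5],
]
print(find_pits(dem))
```

On nearly flat terrain, small features such as levees and road embankments can enclose many cells of this kind, which is consistent with the large number of sinks reported for the Delta DEM.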
These themes were then combined to produce a probability surface. Figure 14 shows the resulting surface, which produced much higher probability values for predicting the occurrence of sites. Using the ranges from Figure 14, Table 5 lists the probability intervals and the number of sites associated with each.
Probability Interval | Number of Archeology Sites | Percentage of Total Sites |
---|---|---|
0.001 to 0.007 | 0 | 0 |
0.007 to 0.014 | 21 | 13.6 |
0.014 to 0.033 | 17 | 11.0 |
0.033 to 0.056 | 12 | 7.9 |
0.056 to 0.141 | 31 | 21.1 |
0.141 to 0.385 | 36 | 23.3 |
0.385 to 0.778 | 34 | 23.1 |
Table 6 identifies the contribution of each theme in this combination of weights.
Theme | Outside | Inside | Contrast | Confidence |
---|---|---|---|---|
Geology Units | -1.319 | 0.721 | 2.04 | 6.419 |
Soil Units | -1.171 | 1.024 | 2.195 | 7.848 |
Slope Drop | -0.325 | 0.695 | 1.02 | 4.295 |
Elevation | -0.710 | 0.708 | 1.418 | 5.703 |
Distance to Water Ways | -0.315 | 1.407 | 1.722 | 6.698 |
The overall test of conditional independence for the themes combined in this model is 0.86. The table of chi-squared values for paired themes has values of 8.3 for the geology and soil themes and 4.6 for the elevation and slope drop themes. This indicates that conditional independence between these pairs of themes may be violated. The slope drop theme and the elevation theme used in the model are both derived from the DEM data, so this is expected. Although the geology and soil maps were developed independently, both themes describe surficial materials and their depositional environments. This illustrates one of the pitfalls as well as one of the strengths in applying weights of evidence. Figures 7 and 8 show the areas for geology and soils included in the model. These areas are very similar. Combining these themes in the analysis inflates the combined weights for the sites, but it also increases the chi-squared statistic. The soil and geology units involved in the binary classes need to be further examined. Those units that represent the same physical feature need to be identified. It may be possible to combine elements from one theme with the other where the conflict exists. If not, one of these themes should be excluded.
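A pairwise chi-squared check of this kind can be sketched as follows. The sketch assumes the test is the ordinary chi-squared statistic on a 2 x 2 table of training-point counts cross-classified by two binary themes, without a continuity correction, and compared against the critical value for one degree of freedom. Whether the extension computes it in exactly this way is an assumption, and the example counts are hypothetical. The values 8.3 and 4.6 are those reported above.

```python
CRITICAL_95_1DF = 3.841  # chi-squared critical value, 1 degree of freedom, 0.05 level


def chi_squared_2x2(both, only_a, only_b, neither):
    """Chi-squared statistic for point counts cross-classified by two binary themes.

    both: points inside theme A and theme B; only_a: inside A, outside B;
    only_b: outside A, inside B; neither: outside both.
    """
    total = both + only_a + only_b + neither
    inside_a, outside_a = both + only_a, only_b + neither
    inside_b, outside_b = both + only_b, only_a + neither
    observed = (both, only_a, only_b, neither)
    # Expected counts if the two themes were independent at the training points.
    expected = (
        inside_a * inside_b / total,
        inside_a * outside_b / total,
        outside_a * inside_b / total,
        outside_a * outside_b / total,
    )
    if min(expected) == 0:
        raise ValueError("an expected count is zero; the statistic is undefined")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for 76 training points under two strongly overlapping themes.
print(round(chi_squared_2x2(30, 10, 10, 26), 2))

# Reported pairwise values compared with the critical value.
for pair, value in {"geology and soils": 8.3, "elevation and slope drop": 4.6}.items():
    print(pair, value, value > CRITICAL_95_1DF)
```

Both reported values exceed 3.841, which is why independence is questioned for those two pairs.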
Weights of evidence is available as an extension to ArcView and will be included as part of the spatial statistics package available with ArcInfo. It is a useful tool for evaluating data themes against the occurrence of events in the natural resources field. It provides the opportunity to explore attributes of a theme and generate weights based on the pattern of those attributes and the occurrence of the event of interest. The extension can combine the weights from these separate themes to produce a probability surface for predicting the events. In the process, it performs tests of conditional independence for the themes used in the analysis. In this example, the location of known archaeology sites could be evaluated against a set of commonly available GIS themes. Relationships identified in a conceptual model for site distribution could be tested, quantified, and evaluated for the study area.
Geology map units representing alluvium from supertidal flood plains, eolian deposits, and the Riverbank formation were shown to have a positive weight in predicting the occurrence of known archaeology sites. A relationship with the 1850 tidal line was also identified, but it is much less clear. This is consistent with the initial examination of the archaeology site distribution as reported by West, Hansen, and Welch. Long, winding linear features such as the 1850 tide line and present-day waterways are fairly common in natural resource data. Buffering such features to calculate distance to an event may not be the best method for representing the relationship of the feature to events.
Additional data themes can be added into the extension and explored. Including soils, distance to present-day waterways, and elevation data assisted in further development of a predictive surface. The soil taxonomic classes with positive weights are soils on fairly stable surfaces. These soils have developed some clear diagnostic horizons. The analysis also provides the opportunity to identify and understand characteristics that may be useful for inclusion in the model. How do the soils in these taxonomic classes differ from nearby or adjacent soils that did not show positive weights with the test points?
As more data are included in weights of evidence, spatial correlation between data themes increases the possibility of violating conditional independence between themes. These violations may not be as clear-cut as the inclusion of data themes developed from the same source, like the DEM data. What is the relationship between soil taxonomic classes and the geology map units that are spatially correlated? Can attributes or characteristics between these themes be combined to improve the model? Eolian sands in the geology units were shown to have positive weights but sands in the soils theme did not. What are the differences in the characteristics of soils identified in areas mapped as eolian deposits?
Weights of evidence is a very effective tool for identifying and quantifying relationships between features in a data theme and events of interest. It provides a rich environment for identifying relationships that should be further explored and understood. It raises additional questions about both the themes used in weighting and the event theme. In many mineral applications, the mineral deposits used as training points have a well-understood depositional environment. Deposits with different depositional environments are usually not mixed in the training set. The archaeology sites need to be examined to see if they represent the same or different environments.
Atwater, Brian, Geologic Maps of the Sacramento - San Joaquin Delta, California, Miscellaneous Field Studies Map MF-1401, U.S. Geological Survey, Denver, CO, 1982.
Atwater, Brian F. and Daniel F. Belknap, Tidal Wetland Deposits of the Sacramento - San Joaquin Delta, California, Quaternary Depositional Environments of the Pacific Coast, Pacific Coast Paleogeography Symposium 4, Editors: M. F. Field, A. H. Bouma, I. P. Colburn, R. G. Douglas, J. C. Ingle, April 9, 1980.
Bonham-Carter, Graeme F., Geographic Information Systems for Geoscientists, Pergamon Press, Elsevier Sciences Inc, Tarrytown, New York, 10591-5153, 1994.
Hansen, David T., Visualizing Uncertainty Captured from Source Documents with GRID, Proceedings of the Eighteenth Annual Esri International User Conference, San Diego, CA, July 1998.
Natural Resources Conservation Service, Soil Survey Geographic (SSURGO) Data Base, U.S. Department of Agriculture, NRCS, National Soil Survey Center, Miscellaneous Publication Number 1527, P.O. Box 6567, Fort Worth, TX 76115-0567, January 1995.
West, G. James, David Hansen, and Patrick Welch, A Geographic Information System Based Analysis of the Distribution of Prehistoric Archeological Sites in the Sacramento - San Joaquin River Delta, California: A CALFED Planning Study, Paper prepared for publication, March 2000.