Dave Connett and Bryan Mayhan, CARES, University of Missouri-Columbia

DIGITAL SOIL DEVELOPMENT AND APPLICATION

Abstract: The National Cooperative Soil Survey (NCSS) is a cooperative undertaking of the United States Department of Agriculture (USDA) and other federal, state, and local agencies currently working to map the soils of the United States. In Missouri, the Department of Natural Resources (MDNR) is participating with the Natural Resources Conservation Service (NRCS) in the NCSS. The MDNR has begun to transfer all published soil surveys into a computer data base. The Center for Agricultural, Resource and Environmental Systems (CARES) is cooperating with MDNR to digitize 15 counties during this fiscal year. At present, both CARES and NRCS are working to translate Missouri county soil surveys into a digital data base. Digital soils maps and digital soils information can be linked with other ancillary information to produce time-efficient analyses. The purpose of this paper is to detail the processes involved in digital data base construction of published NCSS soil surveys and to discuss potential uses of this information. At CARES, a geographic information system, high-resolution scanning, raster-to-vector translation software, and optical character recognition technologies are being utilized to produce a digital soils data base. These methods are proving to be faster and more cost-effective than previously employed methods. This information has numerous applications in natural resources management.


Introduction

The Missouri Department of Natural Resources (MDNR), in participation with the Natural Resources Conservation Service (NRCS) in the National Cooperative Soil Survey (NCSS), contracted with the Center for Agricultural, Resource and Environmental Systems (CARES) at the University of Missouri - Columbia to develop a statewide digital soils layer. At CARES, a methodology was developed to streamline the digitization process by utilizing scanning technology, implementing raster-to-vector and optical character recognition (OCR) software, and developing menus and macros in ArcInfo's Arc Macro Language (AML).

Source Material Acquisition and Registration

Mylar text and line work separates, obtained through NRCS, on either full or one-third quad sheets at a 1:24,000 map scale are registered to a Universal Transverse Mercator (UTM) coordinate system through computer generated United States Geological Survey (USGS) 7.5' quadrangle boundaries. Corner quadrangle registration marks are drawn on each full or one-third quadrangle mylar sheet.

Scanning Hardware and Software

Scanning for this project is performed on a Contex FSS 8200 E-size scanner connected to an IBM RS/6000 workstation. The software used to scan the mylar separates is Contex's CADImage/Scan. A key feature of the software is its threshold setting: all gray tones lower than the threshold value are represented as white pixels, and all gray tones above it are represented as black pixels. This works as a thinning and filtering process. Mylar line work separates are scanned at 300 dpi. The text mylar separates are scanned at 400 dpi, allowing for better OCR.
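
The thresholding step described above can be illustrated with a minimal Python sketch. This is not the CADImage/Scan implementation; the function name, the sample gray tones, and the threshold value are all hypothetical. It assumes gray tones are stored as darkness values from 0 to 255, and that a tone exactly equal to the threshold is treated as white, a detail the scanning software may handle differently.

```python
def threshold_scan(gray_rows, threshold):
    """Convert rows of gray tones to a bilevel image.

    Tones above the threshold become black (1); all other tones
    become white (0). Treating a tone equal to the threshold as
    white is an assumption made for this sketch.
    """
    return [[1 if tone > threshold else 0 for tone in row]
            for row in gray_rows]


# Hypothetical 3 x 4 patch of a scanned line work separate:
# a dark vertical line with lighter smudges beside it.
scan = [
    [10, 40, 200, 35],
    [12, 180, 220, 30],
    [15, 90, 210, 20],
]

bilevel = threshold_scan(scan, 100)
for row in bilevel:
    print(row)
```

Raising the threshold drops more of the lighter pixels along the edge of a line, which is why the setting acts as both a thinning and a filtering control.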

Raster-to-Vector / Optical Character Recognition (OCR)

Both text mylar separates and line work mylar separates are scanned in as TIF images. These raster images must be converted to a vector format before becoming ARC coverages. Many companies offer raster-to-vector conversion software packages. The field was narrowed down to two vendors that were able to meet requirements and supply the project with a demo copy for testing on sample areas. Both software programs were PC based. Due to limited functionality and availability, IBM AIX OCR software versions were not given much consideration. There are, however, UNIX OCR software packages that run on SUN workstations, but these were not considered because CARES uses only IBM RS/6000 workstations.

Hitachi's Tracer and Recognizer was chosen over Ideal's I/Vector. Both were evaluated as very good, but Hitachi's software was selected due to the increased editing functions available with AutoCAD.

The Hitachi software is an application that runs in conjunction with AutoCAD and converts scanned images into vectors and text which can be used by a GIS. Tracer provides the tools for semi-automatic conversion, while Recognizer provides automatic recognition of both graphics and text. The OCR software translates the TIF file into a text layer in AutoCAD; this text layer can then be brought into ArcInfo as a point coverage with the text string as its attribute. Once the vectorization is completed, the file is converted to DXF format. The DXF file is then converted to an ARC coverage using the DXFARC command.

Transformation

A series of AML programs and menus have been written to aid in the editing and error checking process. The first step in the process, after ARC coverage conversion, is to move tic features to their correct locations. The coverages can then be registered to a real world coordinate system. A computer generated USGS 7.5' quadrangle boundary provides the correct coordinate location for each soil sheet. One-third quadrangle boundaries also exist in the database for the registration of the one-third quadrangle soil sheets. Through an AML driven menu, the operator is prompted for MDNR region, USGS 7.5' quadrangle name, and soil line or text coverage name. A new coverage is created, registered to the UTM coordinates of its corresponding USGS 7.5' full or one-third quadrangle.
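
The idea behind registering a scanned sheet to UTM through its corner tics can be sketched in Python. This is an illustration only, not the AML used at CARES: it assumes an affine transformation fitted by least squares to the four tics, and the function names, scanner coordinates (in inches), and UTM values are all hypothetical. Real quadrangle corners do not form a perfect rectangle in UTM; a rectangle is used here to keep the arithmetic easy to follow.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gauss-Jordan elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(3):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [v - factor * p for v, p in zip(m[r], m[col])]
    return [m[i][3] / m[i][i] for i in range(3)]


def fit_affine(src, dst):
    """Least-squares fit of x' = a*x + b*y + c and y' = d*x + e*y + f."""
    rows = [(float(x), float(y), 1.0) for x, y in src]
    # Normal equations (A^T A) p = A^T t, solved once per output axis.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)]
           for i in range(3)]
    atx = [sum(r[i] * t[0] for r, t in zip(rows, dst)) for i in range(3)]
    aty = [sum(r[i] * t[1] for r, t in zip(rows, dst)) for i in range(3)]
    return solve3(ata, atx), solve3(ata, aty)


def apply_affine(params, point):
    """Transform one (x, y) point with the fitted parameters."""
    (a, b, c), (d, e, f) = params
    x, y = point
    return (a * x + b * y + c, d * x + e * y + f)


# Hypothetical tic locations on the scanned sheet (inches) and the
# matching quadrangle corners in UTM meters (609.6 m per inch at 1:24,000).
tics_scanned = [(0, 0), (0, 22), (17, 22), (17, 0)]
tics_utm = [(500000.0, 4300000.0), (500000.0, 4313411.2),
            (510363.2, 4313411.2), (510363.2, 4300000.0)]

params = fit_affine(tics_scanned, tics_utm)
print(apply_affine(params, (8.5, 11.0)))  # center of the sheet in UTM
```

With four tics and six unknowns the fit is overdetermined, so the residuals at the tics give a quick indication of whether a tic was misplaced before the coverage is transformed.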

Edgesnapping

Even though the editing process takes place on a quad by quad basis, a county wide seamless database is needed to produce the final product. The soils data was gathered at the county level, but since each full or one-third quad was scribed onto a separate mylar sheet, the soil boundaries do not tie together perfectly. The edgesnap AML prompts the user for an edit coverage and a back coverage, then calls up an edit menu. Links are established between the edit coverage and the back coverage. The extent of arc movement can be managed by defining an area of adjustment, thus restricting the area of interpolation. Linked end nodes of arcs from adjoining coverages are snapped to the midpoint between the two nodes. Additional editing can be performed using this menu to produce a smooth transition from one coverage to the next.
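
The midpoint snapping rule can be sketched as follows. This is a simplified Python illustration rather than the edgesnap AML itself: the function name and coordinates are hypothetical, and a single maximum link distance stands in for the area of adjustment, which in the actual process also controls how far along each arc the movement is interpolated.

```python
import math


def edgesnap(links, max_distance):
    """Snap linked end nodes to the midpoint between them.

    links is a list of (edit_node, back_node) pairs, each node an
    (x, y) tuple. Pairs farther apart than max_distance are left
    unchanged so an operator can review them by hand.
    """
    snapped = []
    for edit_node, back_node in links:
        gap = math.hypot(edit_node[0] - back_node[0],
                         edit_node[1] - back_node[1])
        if gap <= max_distance:
            mid = ((edit_node[0] + back_node[0]) / 2.0,
                   (edit_node[1] + back_node[1]) / 2.0)
            snapped.append((mid, mid))
        else:
            snapped.append((edit_node, back_node))
    return snapped


# Hypothetical links across a shared quad edge (UTM meters).
links = [
    ((1000.0, 500.0), (1004.0, 503.0)),   # 5 m apart: snapped
    ((1000.0, 900.0), (1060.0, 980.0)),   # 100 m apart: left alone
]

result = edgesnap(links, max_distance=10.0)
for pair in result:
    print(pair)
```

Moving both nodes to the midpoint splits the adjustment evenly between the two sheets instead of treating either one as the authoritative copy.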

Append and Identity

All full or one-third quad line coverages are appended into one county wide line coverage. A county boundary, digitized from USGS 7.5' quadrangles, is added to the county wide line coverage. An IDENTITY is performed to create new coverages containing both soil polygons and soil type labels. The input coverages are the soil label point coverages; the identity coverage is the county wide soil line coverage. This produces full quadrangle seamless county wide soil coverages. Computer generated 7.5' quadrangle boundaries are added to each full quad soil coverage.
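
The pairing of label points with soil polygons can be illustrated with a point-in-polygon test. The Python sketch below is not ArcInfo's IDENTITY command; it only shows the underlying geometric idea using a standard ray-casting test. The function names, polygon coordinates, and soil symbols are hypothetical.

```python
def point_in_polygon(point, ring):
    """Ray-casting test. ring is a list of (x, y) vertices, not closed."""
    x, y = point
    inside = False
    n = len(ring)
    for i in range(n):
        x1, y1 = ring[i]
        x2, y2 = ring[(i + 1) % n]
        # Count edges that a ray running to the right of the point crosses.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def label_polygons(polygons, labels):
    """Collect the soil symbols of the label points inside each polygon.

    polygons maps a polygon id to its ring; labels is a list of
    (x, y, symbol) tuples. Returns {polygon id: [symbols]}.
    """
    result = {pid: [] for pid in polygons}
    for x, y, symbol in labels:
        for pid, ring in polygons.items():
            if point_in_polygon((x, y), ring):
                result[pid].append(symbol)
                break
    return result


# Two hypothetical adjacent soil polygons and their OCR label points.
polygons = {
    1: [(0, 0), (10, 0), (10, 10), (0, 10)],
    2: [(10, 0), (20, 0), (20, 10), (10, 10)],
}
labels = [(5, 5, "10B"), (15, 5, "22C")]

labeled = label_polygons(polygons, labels)
print(labeled)
```

A polygon that ends up with an empty list, or with more than one symbol, is a candidate for the error checking pass.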

Error Checking

In arcedit, each coverage is built and duplicate label points are deleted. An AML was written to find polygons with label errors (i.e., no labels) and run IDEDIT. That AML then highlights polygon labels with incorrect attributes (i.e., attributes not matching any soil types for that county). An edit menu, called up by the AML, containing buttons for each soil type in that county is used to make corrections. Another AML was written to locate adjacent polygons with the same soil attribute. Some of these so-called 'neighbor errors' can be traced back to the published county soil survey books.
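
The three checks described above can be sketched in Python. The data structures here are hypothetical stand-ins for the coverage attribute tables, not the AML programs themselves: a dictionary of labels per polygon, a dictionary of one soil symbol per polygon, the set of valid symbols for the county, and a list of polygon pairs that share an arc.

```python
def find_missing_labels(polygon_labels):
    """Polygon ids that have no label point."""
    return sorted(pid for pid, syms in polygon_labels.items() if not syms)


def find_invalid_symbols(polygon_symbol, valid_symbols):
    """Polygon ids whose symbol is not a soil type in this county."""
    return sorted(pid for pid, sym in polygon_symbol.items()
                  if sym not in valid_symbols)


def find_neighbor_errors(polygon_symbol, adjacency):
    """Adjacent polygon pairs that carry the same soil symbol."""
    errors = []
    for a, b in adjacency:
        sym_a = polygon_symbol.get(a)
        if sym_a is not None and sym_a == polygon_symbol.get(b):
            errors.append((a, b))
    return sorted(errors)


# Hypothetical county data; soil symbols are made up for illustration.
polygon_labels = {1: ["10B"], 2: [], 3: ["99Z"], 4: ["10B"]}
polygon_symbol = {1: "10B", 3: "99Z", 4: "10B"}
valid_symbols = {"10B", "22C"}
adjacency = [(1, 2), (1, 4), (3, 4)]

print(find_missing_labels(polygon_labels))
print(find_invalid_symbols(polygon_symbol, valid_symbols))
print(find_neighbor_errors(polygon_symbol, adjacency))
```

Each function only reports candidates; as in the process described above, the correction itself is left to an operator.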

Printouts are made of each soil quadrangle. These are checked by hand against the published county soil survey. When all errors have been corrected, all of the full quad soil coverages are mapjoined into a final county wide soil coverage. A DISSOLVE is done on the final county coverage to eliminate quad boundaries found on the full quadrangle soil coverages. This serves as a good error check procedure, since non-dissolution of a quadrangle boundary means that a discrepancy exists as to the correct soil type between two adjacent soil quadrangles. These errors are usually traced back to the published county soil survey books. The published errors are noted and sent back to NRCS, where a soil scientist looks at the problem and returns the solution. When these errors have been fixed, the county is considered ready for certification.

Applications

Database development must have utility to warrant the investment. The SSURGO certified soils database has applications in both the public and private sectors. These uses include planning, management, and research. This section provides thumbnail sketches of some of the applications either in progress or proposed for the Missouri soils database.

The soil surveys were initiated in the early 1900s for use in determining crop productivity of particular regions of land at the county level. Six to twelve soil types were identified during these early surveys. A modern soil survey usually identifies between 30 and 60 different soils based on type, slope, and state of erosion. Information from the early surveys dealt purely with crop yields. Over the decades since the early surveys, the need for differing types of information expanded the information collected by soil scientists and, in turn, the number of soil survey users. This project is solving the inherent inconvenience of using a graphics page and separate data table, since both are combined into one GIS layer. Queries can be either tabular or graphic, but the GIS query is much more convenient in both time and resources.

The completed counties of the Missouri soils database are in the process of being SSURGO certified. In compliance with SSURGO standards, a common link is present in both the coverages and in the various digital data files which have been created for the county. Soils data was obtained via the Internet from the National Soils Data Access Facility located at the Survey Section Statistical Laboratory, Iowa State University. These data files include yields per acre of crops and pasture; woodland management and productivity; limitations for windbreaks, environmental plantings, recreational development, wildlife habitat, building site development, and sanitary facilities; uses as construction materials; general engineering properties; physical and chemical properties; and water management. The data sets provided for the current series of soil surveys allow for a wide array of uses for this type of data.

Agricultural Uses

The soil survey contains information about expected crop yields for the various soil types. The survey also contains information on flood potential, erosion potential, and suggested management practices to mitigate some of the potential problems associated with cultivation of a particular soil type. Using querying tools provided by the various GIS packages available, information can be easily extracted from the soils data. One can easily find the prime farmland, flood hazard cropland, erosion prone tracts, or land better suited to hay production or grazing. This information can then be presented in a graphic format. A more complex query would include some additional data gathering. If crop futures are included in the database, these data can be used with yield potentials to estimate which crop will bring the greatest financial return for a particular tract of land. A farmer can then choose which crop to plant based on probable yield-price parameters.
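
The yield-price comparison can be sketched in a few lines of Python. Every value here is hypothetical: the crops, the expected yields in bushels per acre for one soil type, and the prices per bushel are made up for illustration, and the function name is not part of any soils database. The sketch compares gross return only and ignores production costs.

```python
def best_crop(yields_per_acre, prices):
    """Return (crop, gross return per acre) for the highest return.

    Only crops present in both dictionaries are compared.
    """
    returns = {crop: yields_per_acre[crop] * prices[crop]
               for crop in yields_per_acre if crop in prices}
    crop = max(returns, key=returns.get)
    return crop, returns[crop]


# Hypothetical expected yields (bushels per acre) for one soil type
# and hypothetical futures prices (dollars per bushel).
soil_yields = {"corn": 120.0, "soybeans": 40.0, "wheat": 50.0}
futures = {"corn": 2.50, "soybeans": 6.00, "wheat": 3.50}

crop, gross = best_crop(soil_yields, futures)
print(crop, gross)
```

In a GIS, the same calculation would be run for each soil polygon in a tract, with the result weighted by polygon acreage.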

Forestry

Soil surveys can be used by the forest industry for planning purposes. Much like agricultural crops, expected growth rates and yields of various tree species common to the county are presented by soil type. Management limitations are addressed as well. This information can be used to determine the species planted for commercial use or even restoration of the original tree cover.

Floodplain Management

As the losses due to catastrophic flooding events increase, floodplain management is becoming ever more important. Allowing recently farmed lands to revert to wetland status will become an important tool in future flood damage mitigation. Soils data allows a management organization to determine the best areas for wetland regeneration. Relevant factors include flooding frequency and duration, permeability and available water capacity, vegetation potential, and suitability of a soil for use as a flood control structure. Once suitable areas for wetland development have been found, programs for buyout or cooperation can be pursued to accomplish the planning goals.

Water Quality

The physical and chemical properties of soils can greatly affect the water quality within a watershed. Soil data can be used to determine natural phenomena which lower water quality and at the same time allow identification of man-made water pollution. This water pollution may be apparent in both the surface runoff and the ground water. The ground water pollution potential is directly related to the permeability of the soil layers above the bedrock. Models can also be produced which calculate runoff containing pesticides and herbicides and the subsequent effects on water quality. Included in this is the use and type of sanitary facilities which best suit a given soil type.

Sanitary Landfill

Sanitary landfills must now meet stringent EPA standards. Soil data is essential in determining where likely landfill sites exist. Flood potential, soil composition, permeability rates, and suitable daily cover materials all play an important part in finding a suitable site for this landuse. The digital soil survey can be used to quickly narrow the areas which could contain potential landfill sites. The narrowed search parameters allow for a more detailed ground survey of the more suitable areas and lead to a more environmentally sound decision.

Engineering

Soil and soil components are used in construction. Soil information contains generalized engineering attributes. Potential uses and limitations of the various soils are contained in the soil layer. Using the soil survey can help civil engineers to determine which areas may be more suitable for road construction, thus guiding further detailed sampling in a smaller area. This helps to foster road construction in the areas with the fewest engineering limitations, resulting in a lower construction cost. Where suitable, soils can be mined for construction and fill materials. Areas prone to shrink-swell can be quickly discerned by a land developer seeking a construction site and either avoided or addressed with proper steps, such as piering, to mitigate the problem. Sanitation problems are also considered. Levee districts may use the soil survey as a means of siting levees and dikes by locating the structure on a soil of low permeability and good packing qualities. Many potential engineering uses for soils data are contained in the digital soil survey.

Hazards Management and Mitigation

Denuded slopes are generally highly prone to erosion. A denuded slope becomes more susceptible to creep, slumps and slides. Soil surveys note these susceptibilities and allow for mitigation and management of these hazards. Soil permeability, plasticity, liquid limits, composition and water table depth are all factors in determining the susceptibility of the soil to seismic damage. The study of mass movements is well suited to the digital soil survey.

Wildlife Habitat and Recreation

Some soils are suitable only to be left in a natural state. These areas still serve an important use for people, namely recreation, while maintaining a place for wildlife. Recreational activities are the basis of a large tourism industry in many states. Identification of areas suitable for wildlife refuges and recreational activities is an important function that can mean dollars to a local economy.

Conclusion

In the past the USDA-NRCS soil survey has been a useful tool in determining landuse and land management. Placing the soil survey into a digital format allows much easier access to the mountain of information contained within. An ArcInfo based soil database construction program is in place and running in the state of Missouri. The methodology is based on SSURGO standards issued by NRCS. CARES has completed eighteen counties in a single fiscal year using two full time positions and two to three part time positions. Once completed, the digital soils database will be useful in many different disciplines. It will be up to persons in various fields to create new uses for these data sets. Some tools are in development which will incorporate these data sets. The potential for new and different types of query and management tools is endless.


Dave Connett and Bryan Mayhan
Center for Agricultural, Resource and Environmental Systems (CARES)
College of Agriculture, Food and Natural Resources
200 Mumford Hall
University of Missouri-Columbia
Columbia, MO 65211
Telephone: (573)882-1644
Fax: (573)882-3958
Email: connett@cares.missouri.edu
mayhan@cares.missouri.edu
http://www.missouri.edu/~careswww