Steven J. Vance

Missouri Soils Map Digitization

The Missouri Department of Natural Resources (MDNR) participates with the Natural Resources Conservation Service (NRCS) in the National Cooperative Soil Survey (NCSS). This involves identifying, documenting, and mapping soils; developing this information for publication in the official survey books (usually by county); interpreting soils information for specific applications; and conducting related research.

Soils maps and manuscripts have been developed and published for most of Missouri's 114 counties. The maps and interpretive information in the manuscripts represent a substantial investment (each county represents about 12 to 18 person-years of work) and are very valuable resources for natural resource planning, research, and development. Several agencies and other interested parties would like access to this information in an electronic format that is consistent with Geographic Information System (GIS) technology.

The overall task of this project is to develop in digital format individual county soil coverages for the state of Missouri using available Geographic Information System (GIS) and scanning technology, including raster-to-vector and optical character recognition software.




Introduction



In October of 1995, the Missouri Department of Natural Resources (MDNR)

contracted with the Center for Agricultural Resource and Environmental

Systems (CARES) at the University of Missouri-Columbia (MU), to develop a

statewide digital soils layer.  This contract is designed to develop pilot

projects, evaluate existing source materials, and define the standards and

methodology to be used throughout the digitization of statewide soils

information.



As the number of Geographic Information Systems (GIS) and GIS users continues

to grow at an astonishing rate, so does the demand for digital map data.  A

growing number of GIS managers are recognizing the cost effectiveness and

quality of scanning compared with conventional table digitization.  For the

soils data, conventional table digitizing would be slow, and the quality of

the data produced would vary from person to person.  Scanning and automatic

vectorization can produce consistent, high-quality line work with a minimum

of user interaction.  Hitachi's Tracer and Recognizer produce high-quality

vectors from raster data much more consistently than manual digitizing.



Background



One of the first and most important tasks of the soils project was

inventorying the source materials.  Several days were spent working with

Natural Resources Conservation Service (NRCS) personnel at the state office

in Columbia, Missouri, going through each individual county drawer.  The end

result was a comprehensive catalogue of soil map source materials,

documenting scale, number of quads, format, and completeness.  A majority of

Missouri's counties have mylar text and line separates as source materials,

at either full or one-third quad 1:24,000 scale, with a few counties mapped

at 1:20,000 scale.



Standards for the soils map digitization are being developed over the course

of the contract period and will be finalized at the end of the contract.

The pilot projects will be used to determine the final standards. 

CARES will strive to meet Soil Survey Geographic (SSURGO) standards in its

effort to define the final standards for this project.



Two counties and a watershed were selected as pilot projects.  Stoddard

County, located in the southeast part of the state, was selected because its

source materials are full 1:24,000 quadrangle maps.  Bates County, located

along the Kansas state line in the west-central part of the state, was

selected because its source materials are one-third 1:24,000 quadrangle

maps.  Loose Creek was selected as the watershed; it is located in Osage

County and covers about 45,000 acres.



Scanning Hardware and Software



Scanning will be accomplished using a Contex FSS 8200 E-size scanner

connected to an IBM RISC/6000 workstation.  The software used to scan the

map separates is Contex's CADImage/Scan.  The available scanner resolutions

range from 50 dpi to 800 dpi.  CADImage presently supports more than fifty

industry-standard file formats.  One of the key features of the software is

the threshold setting: all gray tones lower than the threshold value entered

are represented as white pixels, and all gray tones over the threshold are

represented as black pixels.  This works as a thinning and filtering

process.  In general, it has been determined that the mylar line separates

will be scanned at 300 dpi and the mylar text separates at 500 dpi.
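
The threshold behaviour described above can be illustrated with a minimal Python sketch. It is not CADImage/Scan's implementation; the function name `apply_threshold`, the 0 to 255 tone range, the convention that larger values are darker, and the treatment of a tone exactly equal to the threshold as white are all assumptions made for illustration.

```python
from typing import List


def apply_threshold(gray_tones: List[List[int]], threshold: int) -> List[List[int]]:
    """Convert a grid of gray tones to a bitonal image.

    Returns 1 for a black pixel and 0 for a white pixel.
    Assumptions (not from the scanner documentation): tones run from
    0 (lightest) to 255 (darkest), and a tone equal to the threshold
    is treated as white.
    """
    return [[1 if tone > threshold else 0 for tone in row] for row in gray_tones]


# Hypothetical scan fragment: a faint smudge (40, 90) next to dark line work (200, 255).
scan = [
    [0, 40, 200, 255],
    [128, 129, 10, 90],
]
bitonal = apply_threshold(scan, threshold=128)
```

In this sketch, raising the threshold turns more of the lighter tones white, which is the sense in which the setting filters out faint marks and thins line work.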



Raster-to-Vector / Optical Character Recognition (OCR)



Information was collected on software companies that offer raster-to-vector

and Optical Character Recognition (OCR) software.  The field was narrowed to

two vendors that met the project requirements and could supply a demo copy

for testing on sample areas.  Both software programs are PC based.  No IBM

AIX versions able to do OCR were found.  There are, however, UNIX OCR

packages that run on Sun workstations, but they were not considered because

CARES uses only IBM RISC/6000 workstations.



Hitachi's Tracer and Recognizer were chosen over Ideal's I/Vector.  Both

packages were rated very good, but Hitachi's software was selected because

of the additional editing functions available with AutoCAD.



The Hitachi software is an application that runs in conjunction with

AutoCAD and converts scanned maps into vectors and text that can be used by

a GIS.  Tracer provides tools for semi-automatic conversion, while

Recognizer provides automatic recognition of both graphics and text.  Once

the vectorization is completed, the file is exported in DXF format.  The DXF

file is then converted to an ARC coverage using the DXFARC command.  Various

AML programs and menus have been written to aid in the editing and

error-checking process needed to get the coverage into its final form.
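
As an illustration of the kind of check that can be run on vectorized line work, the Python sketch below reads straight LINE entities from ASCII DXF text and reports dangling ends, that is, endpoints that belong to only one segment. This is not the project's AML code, and the source does not say which checks those programs perform; the function names `read_dxf_lines` and `find_dangles` are hypothetical, and the reader is deliberately simplified to handle LINE entities only (polylines and other entity types are skipped).

```python
from collections import Counter
from typing import Dict, List, Optional, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]

# DXF group codes for a LINE entity: 10/20 = start x/y, 11/21 = end x/y.
LINE_CODES = ("10", "20", "11", "21")


def read_dxf_lines(text: str) -> List[Segment]:
    """Extract LINE entities from ASCII DXF text.

    ASCII DXF is a sequence of two-line pairs: a group code, then its value.
    Simplification: only LINE entities are read; everything else is ignored.
    """
    rows = [row.strip() for row in text.splitlines()]
    segments: List[Segment] = []
    current: Optional[Dict[str, float]] = None  # fields of the LINE being read

    def flush() -> None:
        # Keep the entity only if all four coordinates were present.
        if current is not None and all(code in current for code in LINE_CODES):
            segments.append(((current["10"], current["20"]),
                             (current["11"], current["21"])))

    for code, value in zip(rows[0::2], rows[1::2]):
        if code == "0":  # group code 0 starts the next entity or marker
            flush()
            current = {} if value == "LINE" else None
        elif current is not None and code in LINE_CODES:
            current[code] = float(value)
    flush()  # covers input that ends without a closing 0/EOF pair
    return segments


def find_dangles(segments: List[Segment], precision: int = 3) -> List[Point]:
    """Return endpoints used by exactly one segment (candidate dangles).

    Coordinates are rounded to `precision` decimals so that endpoints that
    differ only by floating-point noise are treated as the same node.
    """
    counts: Counter = Counter()
    for start, end in segments:
        for x, y in (start, end):
            counts[(round(x, precision), round(y, precision))] += 1
    return sorted(point for point, uses in counts.items() if uses == 1)
```

A soil polygon boundary should close on itself, so any endpoint reported by `find_dangles` marks a place where the line work may need an overshoot trimmed or a gap closed before polygons can be built.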



Summary



The use of raster-to-vector and OCR software has greatly reduced the time

needed to convert paper/mylar maps to usable GIS coverages.  Working with

good input separates, ARC coverages can be generated that are typically of

better quality than the same product digitized by hand.  Selecting

appropriate conversion parameters is a must; a slight variation can greatly

change the results.  Scanning densities of 300 dpi for line work and 500 dpi

for text appear to offer the best repeatable results from scanned data

sets.  High-quality GIS hardware and software are now available at realistic

prices.  Many GIS managers are now realizing the fiscal advantages and

superior results of using raster-to-vector and OCR conversion software over

conventional table digitization.  Given a well-researched approach to

database design, scanning will take less time and result in higher-quality

data for the creation of many GIS layers.
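
To relate the scanning densities above to map detail, the ground distance covered by one pixel can be worked out from the map scale: one pixel spans 1/dpi inch on the map, which corresponds to (scale denominator)/dpi inches on the ground. The short Python sketch below does this arithmetic. The function name `pixel_ground_size_m` is hypothetical, and the calculation is idealized: it ignores scanner optics and any stretch in the mylar.

```python
INCH_TO_M = 0.0254  # exact definition of the inch in metres


def pixel_ground_size_m(scale_denominator: float, dpi: float) -> float:
    """Ground distance in metres spanned by one scanned pixel.

    scale_denominator: 24000 for a 1:24,000 map, 20000 for 1:20,000.
    dpi: scanning density in dots (pixels) per inch.
    """
    if scale_denominator <= 0 or dpi <= 0:
        raise ValueError("scale denominator and dpi must be positive")
    return scale_denominator / dpi * INCH_TO_M


# Scales and densities mentioned in the text.
for scale, dpi in [(24000, 300), (24000, 500), (20000, 300)]:
    size = pixel_ground_size_m(scale, dpi)
    print(f"1:{scale:,} at {dpi} dpi: {size:.2f} m per pixel")
```

Under these assumptions, a pixel on a 1:24,000 separate covers about 2.03 m of ground at 300 dpi and about 1.22 m at 500 dpi.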