

Track: Database Design and Automation

Yecheng Wu
Able Software Co.
5 Appletree Lane
Lexington, MA 02173


Telephone: 617-862-2804
Fax: 617-862-2640
E-mail: ywu@ablesw.com



Automated Map Digitizing: Developments in Raster-to-Vector Conversion Technology

Creating a spatial database often involves acquiring huge amounts of data from paper maps. The acquisition is usually performed with hand-operated digitizing tablets, following procedures that are time-consuming, costly, and error-prone. Efforts over the past twenty years to develop systems and effective techniques for automatic input of paper maps have met with limited success. Only recently have substantial advances in both computer hardware and software been achieved in this field. We have developed a software system, R2V for Windows and Windows NT, for automatic paper map input and raster-to-vector conversion. Image processing techniques have been developed for automated digitizing of contour maps, land use maps, parcel maps, tax maps, and utility maps, as well as natural source images such as aerial photos and satellite imagery. In this paper, we focus on techniques developed for the automated interpretation of scanned maps and on practical examples of using the technology.

1. Scanning and Preprocessing - A typical map consists of different types of lines, text, and symbols in color or black and white. It is scanned as one of the following image types, depending on the map quality and the capability of the scanner: monochrome, gray-scale, or color. The sample contour map is scanned as a gray-scale image at 400 dots per inch (dpi). Algorithms have been developed to de-skew an image when scanning distortion occurs, to remove dark background from the scanned image using a specially designed band-pass filter, and to classify and separate colors using a clustering-based unsupervised classification method when a color map is to be digitized.

2. Automatic Vectorization - An optimal thresholding algorithm has been developed to convert a gray-scale image to binary form for vectorization. All lines, text, and symbols are automatically vectorized and recorded in vector form (i.e., each line is represented by the X and Y coordinates of its center along the line). Accuracy is maintained at the original scanning resolution (in our example, 400 dpi) because the center pixel is always traced and recorded. Text is first vectorized as lines and then recognized using R2V's trainable optical character recognition (OCR) engine.

3. Vector Editing and Cleaning - Broken lines are connected and closed, and lines are smoothed when noise exists in the image. Text lines are marked using a text block detection algorithm. For maps with polygons, such as parcel or tax maps, polygons are closed with a polygon-generating algorithm to create polygon topology and remove redundant lines.

4. Automatic or Interactive Vector Labeling - If text attributes exist in the map, they can be automatically converted to text strings with the OCR functions in R2V. The text strings can be used directly to label polygons or lines, an approach often used in tax map digitizing and labeling. Codes or attributes can also be assigned to lines and points using the interactive line labeling function. When digitizing a contour map, elevation values are assigned automatically by going from a lower-elevation contour line to a higher-elevation contour line, given a starting elevation and an elevation increment.

5. Save to ArcInfo Generate or ArcView Shapefile - The final digitized map, with lines and points labeled, is then saved in ArcInfo Generate or ArcView shapefile format for use in a GIS or mapping application. A digitized contour map with all its elevation values labeled can be used directly as a sparse DEM.
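
As a rough illustration of the contour labeling in step 4 and the export in step 5, the following Python sketch assigns elevations from a starting value and an increment and then writes the labeled lines as ArcInfo Generate line input. It is not R2V's implementation. The function names `assign_elevations` and `to_generate_lines` and the sample coordinates are hypothetical, the contours are assumed to be already ordered from lowest to highest, and using the integer elevation as the line ID is an assumption made for this example.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, float]


def assign_elevations(n_contours: int, start: float, increment: float) -> List[float]:
    """Label contours ordered from lowest to highest (assumed ordering).

    The i-th contour receives start + i * increment.
    """
    return [start + i * increment for i in range(n_contours)]


def to_generate_lines(lines: Sequence[Sequence[Point]], ids: Sequence[int]) -> str:
    """Serialize polylines as ArcInfo Generate line input.

    Each line is written as: an integer ID, one "x,y" pair per vertex,
    then END. A final END terminates the file.
    """
    out: List[str] = []
    for line_id, points in zip(ids, lines):
        out.append(str(line_id))
        out.extend(f"{x},{y}" for x, y in points)
        out.append("END")
    out.append("END")
    return "\n".join(out) + "\n"


# Hypothetical sample data: three vectorized contour centerlines,
# listed in the order they are crossed going uphill.
contours = [
    [(0.0, 0.0), (10.0, 2.0), (20.0, 0.0)],
    [(2.0, 5.0), (10.0, 7.0), (18.0, 5.0)],
    [(5.0, 10.0), (10.0, 11.0), (15.0, 10.0)],
]

elevations = assign_elevations(len(contours), start=100, increment=20)

# Assumption for this sketch: the elevation doubles as the integer line ID.
generate_text = to_generate_lines(contours, [int(e) for e in elevations])
print(generate_text)
```

Carrying the elevation as the line ID keeps the label with the geometry in a format that has no separate attribute table. A shapefile export would instead store the elevation as an attribute field.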



Copyright 1997 Environmental Systems Research Institute