A Sampling-Based Method for Inspecting Third Party-Produced Maps

Alan Freiden, Island System Design and MRJ Technology Solutions, Inc.

Mark Johnson, CIA Map Library

April 8, 1997

Abstract

This paper describes methods for reducing data conversion costs by implementing statistical quality assurance procedures. The sampling approach identifies systematic data conversion problems at much lower cost than 100% inspections. This information may then be used to help data conversion contractors in process improvement efforts.

1. The Problem

The National Imagery and Mapping Agency (NIMA) and the Central Intelligence Agency's Map Services Center (MSC) have an enormous inventory of paper maps that must be converted into digital form. The data capture process is slow and quality assurance of the digital products is costly. Therefore, these agencies require an improved process for data conversion.

This paper is concerned with the quality assurance (QA) of digital map data. It describes efforts to reduce the cost of QA through the introduction of statistical sampling techniques. Sampling cannot achieve the same levels of data confidence and accuracy as 100% checking, so this is an exercise in optimization. That is, how can we maximize data quality with limited resources for QA?

Optimization implies that cost measures are being traded off against some set of requirements. The NIMA digital map data are produced for a user community with varying data content requirements. Although many different map products are being produced, this paper uses the conversion of the 1:250,000 scale Joint Operations Graphics as an example. The 1:250,000 sources are converted into a digital product called Vector Smart Map Level 1 or VMap1. VMap1 is a digital data product specification conforming to the Vector Product Format Standard. VMap1 contains a number of feature classes categorized as follows:

Category         Feature Classes
Boundaries       Markers, Coastline, Political Boundaries, and Magnetic Areas
Data Quality     Metadata
Elevation        Contours, Depth, and Spot Elevations
Hydrography      Ditch, Well, Canal, River/Stream, Lake/Pond, Reservoir, Dam, Aqueduct, Rapids, Waterfall, and Jetty
Industry         Grain Silo, Mine, Chimney, Crane, Treatment Plant, Tank, Water Tower, and Disposal Site
Physiography     Ice Pack, Caves, Pass, Bluff/Cliff, Crevice, Fault, Ground Surface, Glacier, and Dunes
Population       Buildings, Built-up Area, Fortification, Monument, Stadium, Ruins, and Park
Transportation   Airport, Runway, Track, Trail, Roads, Bridge, Ferry, Tunnel, Ford, Pier, Wharf, and Railroad
Utilities        Pump Station, Power Plant, Pipeline, Power Line, and Telephone Line
Vegetation       Oasis, Trees, Cropland, Grassland, Orchards, Swamp, and Tundra

The VMap1 data are used for planning and some operations. The following table lists feature classes that users feel are critical for their missions. Some feature classes are more important than others and a sampling-based QA approach must take advantage of this.

Category
Army
Air Force
Marines
Intelligence Agencies
Boundary
X
Data Quality
Elevation
X
Hydrography
X
X
Industry
X
Physiography
X
X
Population
X
X
X
X
Transportation
X
X
X
X
Utilities
X
Vegetation

2. Statistical Background

Formal statistical sampling theory deals with the formulation of parameter estimators that minimize a function, the loss function, that describes how costly it is for estimates to be in error. Textbook statistics has a symmetric loss function in the background, so that traditional measures of central tendency and dispersion are correct (that is, they minimize the symmetric loss function). The classic example of an asymmetric loss function is in the estimation of the size of a reservoir. An overestimate is less costly than an underestimate, so a biased population estimator is called for. Also, the loss function may show that certain components of a complex or aggregated estimator are more important than others. Then, stratified or other heterogeneous sampling strategies can mitigate the importance of the sampling error introduced into the parameter estimates.

2.1 The Loss Function

The following charts illustrate the trade-off in optimizing the loss function. Here, the term "Cost" is a measure of the resources needed to check data, and "Confidence" is some measure of data accuracy; its exact meaning is discussed later. The first chart shows that confidence is a concave function of the sampling rate and that some error remains even if all features are checked. The relation is concave because succeeding increments to the sampling rate do not contribute as much to enhancing confidence as did earlier increments. In part, this is because sampling error varies inversely with the square root of the sample size. The next chart indicates that increasing the sampling rate costs more and that the relation is roughly proportional. Therefore, as shown in the third chart, confidence is a concave function of QA costs.

Now, add a fourth dimension to the problem: map complexity. It is harder to check a complex map than a simple one. This is because it is harder to distinguish features on a complex map, not because there are more features. The chart indicates that it costs more to reach a given level of confidence with a complex map than with a simple one. The trade is to sample complex maps (or complex areas within a single map) at a higher rate in order to reach a pre-selected level of confidence.

Figure 1 - Cost Relationships

The loss function for estimating the quality of the VMap1 digital data is subjective and hard to quantify. However, it is clear that not all feature categories are equally important to users. Therefore, QA procedures can weight categories differently so that more features are inspected in the important categories than in the others.

With good quantitative data on the costs of QA it would be possible to construct a procedure that would allow a QA manager to select a confidence level and then to generate automatically a feature sampling scheme that would minimize the cost of achieving this level of confidence. In practice, there are several conceptual difficulties to be resolved before a fully automatic process can be developed. Some of these issues are described in Section 3.

2.2 Sampling

Suppose we are evaluating a production process. We want to be sure that our machine to make #10 hex nuts is performing satisfactorily. The machine makes 1,000,000 nuts per day. It is too costly to measure each nut to ensure that it meets specifications so we are going to sample the production runs to measure the proportion of nuts that fail to meet specs.

There are two basic forms of sampling: probability samples and judgment samples. For probability samples, sampling errors can be calculated, and biases in selection and estimation are nonexistent. The biases and sampling errors of judgment samples cannot be calculated from the sample but must be determined by expert judgment.

Probability samples are purely objective; a nut is selected at random and measured. We record whether or not it met specifications. Because the sampling is random, the selection process is unbiased. The proportion of faulty nuts is measured and the sampling error of this measurement can be calculated. Now, we know a statistic (the measured proportion of faulty nuts) and we know its sampling error. Therefore, we can formulate a hypothesis test that the proportion of faulty nuts is lower than some acceptable threshold and we can assign a confidence level to this measure. For example, we might be able to say that the proportion of faulty nuts is less than 0.01 percent with a confidence of 99 percent. That is, we are 99 percent certain that less than 0.01 percent of the nuts are faulty.

The sampling procedure is random. We could put 1,000,000 nuts in a barrel, stir up the barrel, and then pick out the predetermined number of nuts (that is, the number of nuts that need to be measured to make the sampling error low enough to achieve the 99 percent confidence level) to be measured. Alternatively, we could compute a random sequence and use this to pick the nuts. Computed random sequences are really pseudo-random but this is good enough.
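The procedure above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `estimate_defect_rate`, the representation of the production run as a list of booleans, and the use of a one-sided normal-approximation upper limit are all choices made here, not part of the original procedure.

```python
import math
import random
from statistics import NormalDist


def estimate_defect_rate(population, sample_size, confidence=0.99, seed=None):
    """Draw a simple random sample and estimate the proportion of faulty items.

    population  : sequence of booleans, True meaning "fails to meet specs"
                  (a hypothetical stand-in for one production run).
    Returns (estimate, sampling_error, upper_limit), where upper_limit is a
    one-sided upper confidence limit based on the normal approximation.
    """
    rng = random.Random(seed)                      # pseudo-random sequence
    sample = rng.sample(population, sample_size)   # sampling without replacement
    p_hat = sum(sample) / sample_size              # measured proportion faulty
    std_err = math.sqrt(p_hat * (1.0 - p_hat) / sample_size)
    z = NormalDist().inv_cdf(confidence)           # one-sided critical value
    upper = min(1.0, p_hat + z * std_err)
    return p_hat, std_err, upper
```

If the returned upper limit falls below the acceptable threshold, the run passes at the chosen confidence level. The normal approximation is weak when very few faulty items appear in the sample, so an exact binomial limit would be a safer choice in that situation.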

Sampling-based quality checking is the key to process improvement since it reveals systematic problems in the production process. Suppose that a daily QA procedure shows that the proportion of faulty nuts is higher on Mondays than on other days of the week. This may imply that some maintenance problem occurs over the weekend or that the machine (or its operators) needs extra time to reach its efficient operating level.

Judgment samples depend on selecting "typical" or "representative" measurands or on selecting weighting factors that make allowances for characteristics of the population that are not accounted for by the sampling itself. A stratified sample is a mix of the two forms. The population is divided into segments, the segments are assigned weights judgmentally, and then the segments are sampled randomly.

2.3 Sampling Spatial Databases

The quality assurance of spatial databases is concerned with measuring the proportion of spatial features that are "correct." We want to know the correctness of features over space and across the various categories of features. The sampling of a spatial database is judgmental because we know two things. First, data conversion is more difficult for regions of a map that are more dense. Second, not all feature categories are equally important to the users of the database. Therefore, the QA sampling scheme needs to be stratified over space and across feature categories. Then, we need methods to select random regions of the map to check and random features to check within categories.

2.4 Sample Size for Estimating a Proportion

Each feature is either correct or incorrect according to a set of measurement criteria. Therefore, the statistical problem is to estimate the proportion, p, of correct features from a large population of features. The binomial distribution is used for this case. The estimate of p is the proportion of correct features found in the sample, and the sampling error is $\sigma_p = \sqrt{p(1-p)/n}$, where n is the sample size. If n is large, then the binomial distribution may be approximated by a normal distribution.

Suppose we have a specification for the required value of p, the proportion of correct features. Then, for any desired level of confidence (the $\alpha$-level of a hypothesis test) we can calculate the sample size needed to achieve this level of confidence. This calculation is based on simplifying assumptions that are not hard to accept.
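As a rough illustration of such a calculation, the sketch below uses the normal approximation to the binomial. It assumes a two-sided interval and introduces a tolerable margin of error (`margin`) as an extra input; both the function name and that parameter are assumptions made for this example rather than details given in the text.

```python
import math
from statistics import NormalDist


def sample_size_for_proportion(p_expected, margin, confidence=0.95):
    """Sample size n such that the normal-approximation interval for a
    proportion has half-width `margin` at the given confidence level.

    p_expected : anticipated proportion of correct features (0.5 is the
                 most conservative choice because p(1-p) peaks there).
    margin     : tolerable error in the estimated proportion (assumed input).
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2.0)   # two-sided critical value
    n = (z ** 2) * p_expected * (1.0 - p_expected) / margin ** 2
    return math.ceil(n)


# Worst case p = 0.5, margin of 5 percentage points, 95 percent confidence:
n_features = sample_size_for_proportion(0.5, 0.05, 0.95)   # 385
```

Because the required n depends on the margin squared, halving the margin roughly quadruples the number of features to inspect, which is the diminishing-returns effect described in Section 2.1.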

3. What is Being Measured?

The quality of a spatial database is the accuracy and completeness of its spatial features and attributes. NIMA and MSC's goal in converting paper maps is to capture in digital form all the information represented on the paper map. Then, the definition of a high-quality spatial database is one that could be used to replicate the paper map. The elements of the spatial database need to be assessed with respect to spatial accuracy, attribute accuracy, and feature completeness.

The single QA measure is the proportion of spatial features that are represented correctly in the database. To be correct, a feature has to be located in the right place, be categorized properly, and have all of its attribute values correct. Errors of commission occur if a feature is in the wrong place (or has the wrong shape), is denoted as being in the wrong feature class, or has an incorrect set of attribute values. Errors of omission occur because a feature has been overlooked and is not in the database.

There are two parts to the definition of digital map database correctness. First, the formal Product Specification describes the data dictionary for the database. The data dictionary defines feature classes, feature types, required attributes, and the valid domains of these attributes. The Product Specification rigidly constrains the feature definitions allowed in the database. Unfortunately, the data dictionary is not enough. Data capture contractors also need a set of Digitizing Guidelines that spell out in as much detail as possible the identification and interpretation of cartographic symbolization on the source maps. Together, these two documents comprise the definition of what is correct in the digital database.

It is useful to define the terms "validation" and "verification" for use in the discussion of QA. A digital database is valid if it conforms to the Product Specification's data dictionary. Conformance is necessary for correctness but it is not sufficient. A feature may match the data dictionary and still be incorrect either because it is in the wrong place or because it has valid but incorrect attributes. Verification is defined as the process of checking data dictionary-compliant features to make sure they are attributed correctly and that they replicate the feature on the map.

4. Sampling-Based QA Methods

This section describes sampling-based QA methods. Sampling is needed because a full population census is too costly. We are concerned here with the time of human QA technicians. Any automated QA method, regardless of its computational difficulty, will avoid sampling because it can be executed during off hours and can, therefore, economize on the technician's time.

Validation can be automatic. A program can read the digital database and compare its content to the data dictionary. In addition, an automated checker can quality assure the format of a delivered database. That is, a program can validate both the content and structure of digital map data.
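A minimal sketch of such a content check follows. The data dictionary shown is hypothetical: the feature class names, attribute names, and domains are invented for illustration and are not taken from the VMap1 Product Specification.

```python
# Hypothetical data dictionary: feature class -> {attribute: allowed values}.
DATA_DICTIONARY = {
    "road": {"surface": {"paved", "unpaved"}, "lanes": {1, 2, 4}},
    "bridge": {"type": {"fixed", "draw"}},
}


def validate_feature(feature, dictionary=DATA_DICTIONARY):
    """Return a list of validation problems; an empty list means the
    feature conforms to the dictionary (valid, but not yet verified)."""
    fclass = feature.get("class")
    if fclass not in dictionary:
        return [f"unknown feature class: {fclass!r}"]
    allowed = dictionary[fclass]
    attrs = feature.get("attributes", {})
    problems = []
    for name in allowed:                       # required attributes present?
        if name not in attrs:
            problems.append(f"missing attribute: {name}")
    for name, value in attrs.items():          # values inside their domains?
        if name not in allowed:
            problems.append(f"unexpected attribute: {name}")
        elif value not in allowed[name]:
            problems.append(f"value {value!r} outside domain of {name}")
    return problems
```

A feature that passes this check is valid in the sense of Section 3; whether its attributes actually match the source map still requires verification by a technician.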

4.1 Spatial Sampling

We want to stratify our sampling according to map complexity. This is because we anticipate that error rates are higher where map features are more dense. If we ignore the fact that errors of omission are also more likely where features are dense, then we can define a spatial sampling scheme weighted by feature density from the digital database itself.

ArcInfo has commands to extract the vertices of each kind of feature (point, line, polygon, etc.) and to write these extracted points into a point coverage. ArcInfo also has a command to combine all the point features in overlapping coverages into one coverage. Then we can create a single coverage that contains all the vertices in the database. Now, construct an array of rectangles (at some resolution) that covers the area of the database being checked and tabulate the number of points from the all-inclusive point coverage that fall within each rectangle in the array. This gives us a polygon coverage where the density of the source map is (or, at least, is close enough to) an attribute of each polygon. Next, sort the rectangles by number of vertices and then normalize the density measure so that it sums to one over all the rectangles. A uniform (0, 1) random number is compared to the cumulative distribution of rectangles to select a rectangle to be included in the sample. Note that the probability of a rectangle being selected is roughly proportional to the feature density of the area of the source map it covers. Now, check features within this area manually.
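The selection step can be sketched outside ArcInfo as follows. This is an illustrative Python version, not the ArcInfo procedure itself: the function names and the flat list of (x, y) vertices are assumptions, and the grid is taken to be a regular array of equal rectangles.

```python
import bisect
import itertools
import random


def count_vertices_per_cell(vertices, xmin, ymin, cell_w, cell_h, ncols, nrows):
    """Tabulate how many vertices fall in each rectangle of a regular grid.
    Cells are numbered row by row: index = row * ncols + col."""
    counts = [0] * (ncols * nrows)
    for x, y in vertices:
        col = int((x - xmin) // cell_w)
        row = int((y - ymin) // cell_h)
        if 0 <= col < ncols and 0 <= row < nrows:
            counts[row * ncols + col] += 1
    return counts


def pick_cell(counts, rng=random):
    """Select one cell with probability proportional to its vertex count by
    comparing a uniform (0, 1) number to the cumulative distribution."""
    total = sum(counts)
    if total == 0:
        raise ValueError("no vertices in any cell")
    cumulative = list(itertools.accumulate(c / total for c in counts))
    u = rng.random()
    index = bisect.bisect_right(cumulative, u)
    return min(index, len(counts) - 1)   # guard against rounding at the top end
```

Sorting the rectangles first, as the text describes, does not change the selection probabilities, so the sketch keeps the cells in grid order. Calling `pick_cell` repeatedly yields a sample with replacement; a production version would probably discard repeats.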

A second form of spatial sampling makes use of random points. Random points can be generated by drawing two uniformly distributed random numbers, one for longitude (or the horizontal axis) and one for latitude. A (0,1) uniform random number can be transformed into the range of each axis. Now, draw a circle around each random point and manually check the features found within the circle. ArcInfo commands can do this selection programmatically. The sampling rate is controlled by the number of points selected and the radius of the search circle.
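A small sketch of the random-point approach is given below. It assumes planar coordinates and represents each feature simply as a list of vertices keyed by a feature identifier; both are simplifications chosen for this example, and the names are hypothetical.

```python
import math
import random


def random_points(n, xmin, xmax, ymin, ymax, rng=random):
    """Generate n points by scaling two uniform (0, 1) numbers to each axis."""
    return [(xmin + rng.random() * (xmax - xmin),
             ymin + rng.random() * (ymax - ymin)) for _ in range(n)]


def features_within(features, center, radius):
    """Return the ids of features with at least one vertex inside the circle.

    features : dict mapping feature id -> list of (x, y) vertices.
    This vertex-based test is an approximation: a long segment that passes
    through the circle without a vertex inside it will be missed.
    """
    cx, cy = center
    return [fid for fid, verts in features.items()
            if any(math.hypot(x - cx, y - cy) <= radius for x, y in verts)]
```

With geographic coordinates, the Euclidean distance used here would distort the circle at high latitudes, so the data would normally be projected first.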

4.2 Feature Sampling

Feature classes differ with respect to how important they are to users and how difficult they are to verify. Therefore, a stratified feature sampling scheme is called for. We want to select features to check randomly but with a probability of selection higher for more important feature classes. A simple sampling scheme is to give each feature in a particular class a weight and then normalize the weights over all features to sum to one. Then sort the weights and select a (0, 1) uniform random number to select a feature to check manually.
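One way to sketch this scheme in Python is shown below. The category weights are hypothetical placeholders, and the sketch draws features without replacement so that no feature is checked twice, which is a design choice made here rather than something the text specifies.

```python
import random

# Hypothetical importance weights per feature category (not from the paper).
CLASS_WEIGHTS = {
    "transportation": 4.0,
    "population": 4.0,
    "hydrography": 2.0,
    "vegetation": 1.0,
}


def sample_features(features, class_weights, k, rng=random):
    """Draw up to k distinct features, each draw weighted by class importance.

    features : list of (feature_id, feature_class) pairs.
    Classes missing from class_weights get a default weight of 1.0.
    """
    pool = list(features)
    weights = [class_weights.get(fclass, 1.0) for _, fclass in pool]
    chosen = []
    for _ in range(min(k, len(pool))):
        # random.choices normalizes the weights internally.
        i = rng.choices(range(len(pool)), weights=weights, k=1)[0]
        chosen.append(pool.pop(i))
        weights.pop(i)
    return chosen
```

Because every feature in a class carries the same weight, a class with many features still dominates the sample unless its weight is scaled down by the class size; whether that is desirable depends on how the QA manager wants to define importance.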

This approach needs to be refined to handle Air Force requirements. Any vertical obstruction, spot elevation, tower, silo, etc. is very important to pilots. However, these feature types are in different feature classes so the simple approach described above is not adequate. We need to evaluate sampling schemes to handle this special case.

4.3 Correlated Errors

Over time, QA technicians notice patterns and correlation in errors. Sometimes, this correlation may be traced back to an element of the data capture contractor's process. This kind of insight is important for process improvement and may add to the power of a sampling approach. Adaptive sampling uses error correlation to modify feature or area selection probabilities.

At this time, little information exists to formulate an adaptive sampling approach. However, if we are to make use of this technique in the future, we must capture key information about possible error sources so correlated errors can be identified and explained. This has implications for the way data capture contractors document their work and for the way QA errors are annotated.

5. QA Workstation Concept of Operations

Optimal sampling is one way to reduce the cost of inspecting digital maps; enhancing QA technicians' productivity is another. This section describes a workstation environment for digital map database inspection. Both the concept of operations and the functionality for the workstation are preliminary. This is only a concept and not a design.

The key to the workstation concept is to give the QA technician rapid access to information and visualizations that will support human pattern recognition. Also, the workstation will serve as the executive for launching non-real time tasks (such as automatic validation routines) and for reviewing the results of such tasks. Interactive functions are to be performed as fast as possible.

5.1 Spatial Comparisons

Figure 2 shows the workstation's user interface. The screen is divided into three areas. The image area on the left displays two overlapping images. One image is the scanned source map. This is shown as a full color image but it may also be a scanned separate. A cartographic representation of the digital database being inspected is overlaid on the scanned map. The cartographic image is the output of a map production preprocessing step. It is either a raster map image or a graphics metafile. It uses a simplified symbology that has been designed to encode features in a way meaningful to an expert user. The overlay is easier to see on the computer screen than on the printed page. The images are selected by picking from a drop down list. The images must be registered so that the QA technician may evaluate feature coding and spatial accuracy very rapidly. There are controls for rapid pan and zoom (in and out).

Figure 2 - QA Workstation User Interface

The "Fade" control determines a degree of transparency for the overlaid cartographic product. If the fade control is at its maximum, the cartographic image is completely transparent and only the scanned source map appears. Sliding the control downward brings out the cartographic image in stages until the scanned map disappears when the control is at the bottom. Image fading is a powerful tool used in image analysis for feature discrimination and change detection. Here, fading is used for checking spatial accuracy, finding features that have been left out, and for high-level feature coding checks. The mouse cursor is pointing to a brown line that should be the center line of the black railroad. These segments are in error since they do not line up properly.

Image fading has to be instantaneous to be effective. A preliminary feasibility study showed that fade can be implemented as described using graphics display hardware that is supplied with today's high-end Windows-based platforms. The two images are converted to a simpler form using a reduced set of colors (16 colors for the scanned map and eight colors for the cartographic metafile) and then the fade operation is performed using color lookup table animation.
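The following sketch shows one plausible reading of how lookup table animation could produce the fade; the actual implementation is not described in detail, so the scheme below is an assumption. Each screen pixel stores a combined index built from its scanned-map color index and its cartographic color index (16 x 8 = 128 palette entries), and moving the fade control only rewrites the palette, never the pixels.

```python
def combined_index(scan_index, carto_index, n_carto):
    """Palette slot for a pixel with the given pair of color indices."""
    return scan_index * n_carto + carto_index


def build_fade_palette(scan_colors, carto_colors, fade):
    """Build the blended palette for one position of the fade control.

    scan_colors, carto_colors : lists of (r, g, b) tuples, values 0..255.
    fade : 1.0 shows only the scanned map, 0.0 only the cartographic image.
    """
    palette = []
    for sr, sg, sb in scan_colors:
        for cr, cg, cb in carto_colors:
            palette.append((
                round(fade * sr + (1.0 - fade) * cr),
                round(fade * sg + (1.0 - fade) * cg),
                round(fade * sb + (1.0 - fade) * cb),
            ))
    return palette
```

Recomputing 128 palette entries per slider movement is trivial work, which is consistent with the requirement that the fade appear instantaneous.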

5.2 Attribute Checking

The second area of the screen holds a different kind of cartographic representation of the digital database. It is a full geographic information system (GIS) display. This screen area is drawn using ArcPlot (ArcEdit, ArcView, or a MapObjects application may be better). It uses a more elaborate feature symbolization than the cartographic overlay in the first screen area and, therefore, conveys more attribute information. A feature "identify" tool is active on this window so the operator can view the complete attributes for a selected feature or set of features. This area of the screen is used for checking feature attribution.

The two screen areas are implemented as child windows. Therefore, they can be positioned on the screen independently. For example, they do not have to be overlapped if there is enough screen area for both. The spatial extents displayed by the two windows are coordinated. That is, if the active window is panned or zoomed, the other window will adjust automatically to display the same spatial extent.

5.3 Toolbar Functions

The third screen area is a toolbar. Eleven tools are defined here. They are:

  1. Validate - Launch the automatic validation routines to be executed as idle state tasks under Windows NT. These tasks can run overnight or in the background.
  2. Register - Invoke a user interface to facilitate overlapping the scanned map image and the simplified cartographic metafile. These images are in the fade window.
  3. Sample - Initiate the semi-automatic sampling procedures. Some of these procedures may be able to adjust automatically the screen displays such that the QA technician is guided quickly to the map regions or features that need to be evaluated. A start up procedure for the sampling processes will support user determination of the weighting scheme for spatial and feature set stratified samples and for an overall confidence level.
  4. Pan - This tool is an electronic light table function to change the currently displayed region of the map.
  5. Zoom In - Reduce the extent of the currently displayed region of the map to magnify a selected area.
  6. Zoom Out - Expand the extent of the currently displayed region of the map to show a larger spatial extent.
  7. Identify - This tool works with the GIS display. The QA technician uses the Identify tool to select one or more features and then the attributes of these features are displayed in tabular form.
  8. DR Form - A DR is a Discrepancy Report. This button activates a database form to be filled in to document an error in the database. It is similar to the Identify tool in that it can be used to select a feature that is in error. Then, several fields in the DR record are filled in automatically (location, feature code, coverage, sheet number, and feature type) and the QA technician supplies a textual description of the discrepancy.
  9. DD Help - This button opens a standard Microsoft Windows "Help" file that contains a hypertext version of the Product Specification's Data Dictionary. This help file can be searched for feature and attribute definitions and values.
  10. Dig Spec - This button opens a hypertext version of the data capture project's Digitizing Specifications and Guidelines document. The Guidelines describe the project's interpretations of the paper map product's Cartographic Specification and any additional material used by data capture contractors to resolve feature ambiguity.
  11. Next Error - This button moves the center point of the displays to the location of the next error that has been found automatically.

The QA Workstation has a standard set of menus. The Reports menu contains commands for reviewing various kinds of database reports such as the results of automatic validation routines. In addition, it is a way to access an ad hoc database reporting capability.

5.4 Automated Verification

The literature on GIS database QA says little about automatic spatial accuracy or attribute correctness checking. However, some research results reported by the Ohio State University GISOM Project are interesting. They verify some attribute correctness and topological consistency constraints automatically. For example, contour lines cannot intersect. An intersection test goes beyond the usual attribute verification tests (are all the values permissible?) and also beyond general topological consistency checks such as looking for open polygons.
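As an illustration of the contour example, the sketch below tests two contour polylines for crossings with a standard orientation test. It is not the GISOM Project's method; it is a brute-force comparison of every segment pair that reports only proper crossings, so contours that merely touch or overlap collinearly are not flagged.

```python
def _orientation(a, b, c):
    """Signed area test: > 0 if c lies left of a->b, < 0 if right, 0 if collinear."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def segments_cross(p1, p2, q1, q2):
    """True if segment p1-p2 and segment q1-q2 cross at an interior point."""
    d1 = _orientation(q1, q2, p1)
    d2 = _orientation(q1, q2, p2)
    d3 = _orientation(p1, p2, q1)
    d4 = _orientation(p1, p2, q2)
    return d1 * d2 < 0 and d3 * d4 < 0


def find_contour_crossings(contour_a, contour_b):
    """Return (i, j) index pairs of segments from the two contours that cross.
    Each contour is a list of (x, y) vertices."""
    crossings = []
    for i in range(len(contour_a) - 1):
        for j in range(len(contour_b) - 1):
            if segments_cross(contour_a[i], contour_a[i + 1],
                              contour_b[j], contour_b[j + 1]):
                crossings.append((i, j))
    return crossings
```

The pairwise loop grows with the product of the two vertex counts, so a full map sheet would likely need a spatial index or a sweep-line algorithm to restrict the comparison to nearby segments.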

An automated accuracy and completeness check can be used for linear features. The technique is based on the vector-to-raster conversion algorithms in computer graphics. The idea is to compare the lines generated by plotting the line features in the digital database to the source pixels that defined the line on the source map. Line quality is measured quantitatively as follows. First, establish a rule for defining centerline pixels on the source image. Then, compute the proportion of centerline pixels that are intersected by the vector line. Note that each pixel has a finite spatial extent and that the vector line has no thickness. If the proportion of centerline pixels crossed by the vector line does not exceed a threshold value, then flag the segment where this occurred for later operator evaluation. See the example in Figure 3. The centerline quality of this line is one.
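The measure can be sketched as follows, assuming the centerline pixels have already been identified by some rule. In this illustration a pixel (col, row) covers the unit square from (col, row) to (col + 1, row + 1), and the pixels crossed by the vector line are approximated by sampling points densely along it rather than by an exact rasterization algorithm; both are assumptions of this example.

```python
import math


def pixels_crossed(polyline, step=0.1):
    """Approximate the set of unit pixels touched by a zero-width polyline
    by sampling points along each segment at most `step` apart."""
    hit = set()
    for (x0, y0), (x1, y1) in zip(polyline, polyline[1:]):
        length = math.hypot(x1 - x0, y1 - y0)
        n = max(1, math.ceil(length / step))
        for i in range(n + 1):
            t = i / n
            hit.add((math.floor(x0 + t * (x1 - x0)),
                     math.floor(y0 + t * (y1 - y0))))
    return hit


def centerline_quality(centerline_pixels, polyline, step=0.1):
    """Proportion of centerline pixels intersected by the vector line."""
    centerline = set(centerline_pixels)
    if not centerline:
        raise ValueError("no centerline pixels supplied")
    return len(centerline & pixels_crossed(polyline, step)) / len(centerline)


# A horizontal centerline four pixels long, and a vector line through its middle:
quality = centerline_quality([(0, 0), (1, 0), (2, 0), (3, 0)],
                             [(0.5, 0.5), (3.5, 0.5)])   # 1.0
```

A segment whose quality falls below the chosen threshold would be flagged for the technician. The sampling step should be well below one pixel, otherwise the line can skip over pixels it clips at a corner.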

Figure 3 - Automatic Centerline Evaluation

6. Process Improvement

Quality assurance is the empirical foundation for process improvement steps. The systematic errors found while inspecting a number of digital maps reveal aspects of the data development process that require investments in new or enhanced technology and practice. The goal of process improvement is to produce more digital data at lower cost quicker than before. In addition, insights from analyses of paper map conversions are applicable to the formulation of new data capture methods to be applied to future data sources.

Our experience to date indicates four broad areas for process improvement efforts. They are:

Acknowledgments

The authors would like to thank Andrea Keilholtz, David White, and Norm Peck of MRJ Technology Solutions, Inc. for many helpful comments. This paper has not been subjected to Agency review and, therefore, does not reflect the views of the Agency and no official endorsement should be inferred.

References

  1. Clelland, Richard C., et al., Basic Statistics with Business Applications (John Wiley, 1966).
  2. Deming, W. Edwards, Some Theory of Sampling (Dover Publications, 1966 - Reprint of 1950 edition).
  3. James, Davis E., "Spatial Data Quality Assessment Tools for Environmental Applications," Proceedings of the Thirteenth Annual Esri User Conference, 1993.
  4. Knuth, Donald, Seminumerical Algorithms, Second Edition (Addison-Wesley, 1981).
  5. MIL-STD-2407, Vector Product Format.
  6. MIL-U-89035, Urban Vector Smart Map.
  7. MIL-V-89033, Vector Smart Map Level 1.
  8. Phuyal, Bishnu P., Robert W. Schmidley, and J. Raul Ramirez, "Automated Quality Assurance for GIS Data Conversion," GIS/LIS Annual Conference, 1996.

Alan Freiden
Island System Design
311 Fort Howell Drive
Hilton Head Island, SC 29926-2765
Telephone: (803) 342-3830
email: afreiden@digitel.net

Mark Johnson
CIA Map Library
Telephone: (703) 742-8071