Adrien Litton

Quantifying Scanner Accuracy

How do you know your scanner is accurate? Scanner accuracy is a major issue when automating data for GIS. It is important to know that you are not bringing inaccurate data into your database.

There are many factors that affect the accuracy of your database. The original cartography itself must be accurate. The availability of stable media is vital. These things are relatively easy to determine and control. If you are scanning paper, you know that the document will stretch and shrink as the outside environment changes, so you can take that into account when you input your data, or try to obtain a more stable medium, such as mylar, to scan. If you are working with hand-drawn maps that are not precisely to scale, you are aware of the inaccuracies inherent in the data and can decide whether they are acceptable or whether you should try to obtain a more accurate source.

But what about the scanner? How do you know that the scanner is maintaining the accuracy of the original once you get the data into digital form? Most scanner vendors boast an accuracy of plus or minus 0.04%. But what, exactly, does this mean? Generally speaking, a scanner accuracy rating of plus or minus 0.04% means that a 36 inch long document will produce a 36 inch long raster image (once the pixels are converted into inches) within 0.04% of the overall 36 inches. In other words, the overall length of the document will be within 0.0144 inches of 36 inches. This sounds pretty good, but it says nothing about a point in the center of the document.

For GIS purposes, you must be able to prove the accuracy of a coordinate on the scanned image relative to its true location on the ground. For most GIS, a difference of 0.018 inches between a point's digital location and its true location at the scale of the map is acceptable, although most government GIS require the difference to be less than or equal to 0.005 inches. These are pretty strict requirements and not all scanners are up to the task. You must be able to test your scanner and prove that it is accurate enough to meet your GIS accuracy requirements.

Accuracy in GIS. 

Accuracy has always been one of the key issues 
involved with data capture in GIS. Whether you are 
purchasing data, digitizing features from a base 
map, or just key punching in coordinates, you are 
probably very concerned about the accuracy of the 
data you are inputting. Different GIS have differ-
ent accuracy requirements, but nearly everyone 
wants their input data to be as accurate as possi-
ble. A number of factors affect accuracy: avail-
able source data, media stability, etc. These 
problems are well known, well understood and eas-
ily handled; either you accept the accuracy level 
available, or you obtain better source material.



But what if you have very accurate source material 
on very stable media, such as mylar, and you want 
to maintain that accuracy by scanning, rather than 
digitizing? How do you know that you aren't going 
to introduce an unacceptable level of error into 
your database with the scanner? How do you quantify
the accuracy of the scanner itself? These are very
important questions, and very few GIS users have
the answers to them. Many GIS users actually have
misinformation about the accuracy of their scanner,
inadvertently provided to them by the scanner
vendors.



GIS requires more stringent accuracy standards 
than most other large volume scanning applica-
tions. For extreme precision applications, such as 
medical and scientific scanning or photogrammetry, 
special (and expensive) scanners are manufactured, 
but this is not practical with GIS. You can pur-
chase a drum scanner and be secure that it is accu-
rate enough for your GIS, but the price is higher 
than most GIS annual budgets, and the operating 
expenses tend to be prohibitive as well. The more 
cost-effective solution is a direct feed scanner, 
which has a low operating overhead and can be pur-
chased for somewhere between $9,000 and $30,000, 
depending on resolution and accuracy requirements. 
Feed scanners are much more practical from a cost 
perspective, but their very design has a tendency 
to bring accuracy into question.



For most product types, prospective hardware buyers
can rely on the manufacturer's specifications to
help them with such questions, but this is not the
case with the accuracy of a feed-type scanner. The
manufacturers' specifications are usually not only
unhelpful; they can be misleading as well. This is
not intentional misdirection on the part of the
manufacturers. The common standard for measuring
the accuracy of a scanner is simply inappropriate
to GIS.



Scanners.

Since scanner manufacturers don't test accuracy in
a manner that is appropriate to GIS, GIS users who
integrate scanned data must be better educated than
the manufacturer in this regard. You must know how
to measure accuracy for yourself and how to
interpret your measurements. You will need to take
these measurements before purchasing a scanner, or
scanned data from a service bureau, and then take
periodic measurements to ensure that an accurate
scanner remains accurate. In order to do this, you
must be fully aware of how the scanner actually
works.



Feed-type scanners essentially consist of a row of
cameras, a light source and a roller mechanism.
Documents are fed into the scanner, and data are
captured when light is reflected off the document
into the cameras as the document moves across the
camera row. The cameras must be precisely aligned
in order to eliminate error across the width of the
image, or in the X direction of the scan. Any
misalignment of the camera row will produce
incorrect merging of data between cameras and
horizontal stretch of the data due to the curvature
of the lens. The roller mechanism is also a
precision device. It controls the speed at which
the document passes over the cameras. If the
rollers cannot feed the document over the camera
row consistently, there will be error in the
vertical axis of the scan, or Y direction. In order
to obtain usable results, you must measure the
accuracy in both the X and Y directions. You must
also measure the overall distance of a given pixel
from its correct location, since the X and Y errors
alone may not indicate the maximum amount of error.



Testing Scanner Accuracy.

Scanner accuracy can only be tested effectively by
measuring actual pixel locations systematically
within the output data. To do this, you need a test
grid with a consistent pattern that is known to be
accurate. An 8 mil mylar grid from Bishop Graphics,
Westlake Village, CA, which is guaranteed to be
accurate, is ideal for this. The grid should be
replaced every five years. Measurements must be
taken digitally, since a comparison of plots will
not produce quantitative results and plotters tend
to introduce a number of additional variables into
the accuracy equation.



You can use ArcInfo to do all of the raster
processing, report creation, and evaluation. The
test requires both ArcInfo and ArcInfo's Grid
module.



The first step of testing the scanner is to scan
the grid. The Bishop Graphics test grid has 1/10"
grid lines as well as 1" lines. The 1" lines are
much bolder. Scan the document at the scanner's
highest available optical resolution, and set the
threshold so that the 1/10" lines drop out and all
that remains are the 1" lines. Different makes of
scanners have varying methods of thresholding, but
most of them can do this with relative ease.



To properly register the image, perform a
first-order polynomial warp on the raster. You must be 
precise. Display the image on the screen and view 
the corner intersections so that you can see indi-
vidual pixels to determine the exact pixel loca-
tion of the intersection. The pixel locations of 
the corner intersections must be linked to the 
actual inch location on the original grid. With a 
perfectly accurate scanner this would produce a 
new grid with whole number coordinates for the 
grid intersections.
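
As a rough illustration of what a first-order
polynomial warp does with the four corner links, the
following Python sketch fits the six coefficients by
least squares and prints the residual at each link.
It is not ArcInfo's implementation. The function
names (solve3, fit_first_order, warp) are
hypothetical, and the sample links are the values
from the test.lnk listing in Appendix 1.

```python
def solve3(m, v):
    """Solve the 3x3 linear system m * c = v by Gauss-Jordan elimination."""
    a = [row[:] + [rhs] for row, rhs in zip(m, v)]
    for col in range(3):
        # Partial pivoting keeps the elimination numerically stable.
        pivot = max(range(col, 3), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(3):
            if r != col:
                f = a[r][col] / a[col][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [a[i][3] / a[i][i] for i in range(3)]


def fit_first_order(links):
    """Fit inch = c0 + c1*x + c2*y for each axis from (px, py, inch_x, inch_y) links."""
    n = len(links)
    # Centering the pixel coordinates keeps the normal equations well conditioned.
    mx = sum(link[0] for link in links) / n
    my = sum(link[1] for link in links) / n
    rows = [(1.0, link[0] - mx, link[1] - my) for link in links]
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    coeffs = []
    for k in (2, 3):  # column 2 holds inch_x, column 3 holds inch_y
        atu = [sum(r[i] * link[k] for r, link in zip(rows, links)) for i in range(3)]
        coeffs.append(solve3(ata, atu))
    return mx, my, coeffs[0], coeffs[1]


def warp(model, px, py):
    """Convert a pixel location to inches with a fitted model."""
    mx, my, cx, cy = model
    x, y = px - mx, py - my
    return (cx[0] + cx[1] * x + cx[2] * y,
            cy[0] + cy[1] * x + cy[2] * y)


# Pixel x, pixel y, true inch x, true inch y (from the Appendix 1 listing).
links = [
    (4998.673, 4657.284, 1, 1),
    (4986.451, 25009.861, 1, 42),
    (19949.672, 24997.232, 30, 42),
    (19987.211, 4499.944, 30, 1),
]
model = fit_first_order(links)
for px, py, ix, iy in links:
    wx, wy = warp(model, px, py)
    print(f"link ({ix}, {iy}): residual {wx - ix:+.4f}, {wy - iy:+.4f} in")
```

With four links and only three coefficients per axis,
the fit cannot absorb every distortion, so some
residual may remain even at the corners themselves.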



All extraneous data must then be processed out of
the warped image, including the lines between
intersections. The desired output is a centroid of
each intersection with no additional points. The
ArcInfo Grid commands FOCALSUM, REGIONGROUP,
ZONALCENTROID, and GRIDPOINT can accomplish this.
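
The idea behind this chain of commands can be
sketched in plain Python on a toy raster. This is
only an illustration of the technique, not the
ArcInfo code: the lines here are one cell wide, the
threshold of 11 was chosen for that toy grid, regions
are grouped with four-way connectivity, and the
function names are hypothetical.

```python
def focal_sum(grid, radius):
    """Sum the cell values inside a circular neighborhood around each cell."""
    rows, cols = len(grid), len(grid[0])
    offsets = [(dr, dc)
               for dr in range(-radius, radius + 1)
               for dc in range(-radius, radius + 1)
               if dr * dr + dc * dc <= radius * radius]
    return [[sum(grid[r + dr][c + dc] for dr, dc in offsets
                 if 0 <= r + dr < rows and 0 <= c + dc < cols)
             for c in range(cols)]
            for r in range(rows)]


def intersection_centroids(grid, radius, threshold):
    """Return the (x, y) centroid of each group of cells with a high focal sum."""
    rows, cols = len(grid), len(grid[0])
    sums = focal_sum(grid, radius)
    # Cells near a crossing see more line cells than cells along a single line.
    keep = [[sums[r][c] >= threshold for c in range(cols)] for r in range(rows)]
    seen = [[False] * cols for _ in range(rows)]
    centroids = []
    for r in range(rows):
        for c in range(cols):
            if not keep[r][c] or seen[r][c]:
                continue
            # Flood fill one connected region (the region-grouping step).
            stack, cells = [(r, c)], []
            seen[r][c] = True
            while stack:
                cr, cc = stack.pop()
                cells.append((cr, cc))
                for nr, nc in ((cr + 1, cc), (cr - 1, cc), (cr, cc + 1), (cr, cc - 1)):
                    if 0 <= nr < rows and 0 <= nc < cols and keep[nr][nc] and not seen[nr][nc]:
                        seen[nr][nc] = True
                        stack.append((nr, nc))
            # Mean cell position of the region (the zonal-centroid step).
            cx = sum(cell[1] for cell in cells) / len(cells)
            cy = sum(cell[0] for cell in cells) / len(cells)
            centroids.append((cx, cy))
    return centroids


# Toy raster: grid lines on rows 5 and 15 and on columns 5 and 15.
size = 21
grid = [[1 if r in (5, 15) or c in (5, 15) else 0 for c in range(size)]
        for r in range(size)]
print(intersection_centroids(grid, 3, 11))
```

On a real scan the lines are several pixels wide, so
the neighborhood radius and threshold would need to
be tuned to the line weight and scan resolution.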



Once you have the centroid information of each 
intersection, you can perform database queries to 
create a report of the centroid distances from 
their ideal locations. Select the X coordinate of 
each point and determine the difference between 
the actual location and the nearest whole number. 
This will give you the actual difference in X (DX) 
for that point from its true location. The same 
principle applies to the Y coordinates for each 
point to determine the difference in Y (DY). 
Determining the maximum DX and the maximum DY will 
be helpful in selecting out any extraneous points 
that were not eliminated in the raster processing.
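
The query described above amounts to simple
arithmetic, shown here as a short Python sketch. The
sample coordinates are made up for illustration, and
the function names are hypothetical. Rounding is done
by adding 0.5 and taking the floor, which assumes
positive coordinates in inches.

```python
import math


def nearest_whole(value):
    """Round to the nearest whole number by adding 0.5 and taking the floor."""
    return math.floor(value + 0.5)


def offsets(points):
    """Build one report row per centroid: ideal location plus DX and DY."""
    report = []
    for x, y in points:
        true_x, true_y = nearest_whole(x), nearest_whole(y)
        report.append({"true_x": true_x, "true_y": true_y,
                       "dx": x - true_x, "dy": y - true_y})
    return report


def max_abs(report, key):
    """Largest error magnitude in the report for 'dx' or 'dy'."""
    return max(abs(row[key]) for row in report)


# Hypothetical centroid coordinates in inches after the warp.
points = [(1.003, 1.998), (2.001, 1.004), (29.994, 42.006)]
report = offsets(points)
for row in report:
    print(row)
print("max |DX|:", max_abs(report, "dx"))
print("max |DY|:", max_abs(report, "dy"))
```

A maximum that is far larger than the rest of the
values, for example several tenths of an inch, points
to a stray centroid rather than to scanner error.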



Once you have created your report you will have 
all the error in the scanner and the precise loca-
tion of that error in tabulated form. This is 
really all you need to determine whether the maxi-
mum error of the scanner is within the tolerance 
of your GIS, but it isn't very helpful when it 
comes to troubleshooting the problems with the 
scanner. For that you will want a surface, which 
is a little easier to render visually.



Create a continuous surface of the X and Y errors
independently. Use the DX and DY values as spot
elevation items for each surface. ArcInfo's ARCTIN
and TINLATTICE commands are helpful for creating
the surfaces. A lattice cell size of 0.1" is
sufficient resolution for displaying the surface.
Also, create an overall error surface by combining
the X and Y lattices with the distance formula,
d = sqrt(dx^2 + dy^2). The ArcInfo Grid commands
SQRT and SQR can perform this function.
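
The cell-by-cell combination can be sketched in
Python as follows. The two small lattices hold
made-up DX and DY values in inches, and overall_error
is a hypothetical helper name.

```python
import math


def overall_error(dx_lattice, dy_lattice):
    """Combine DX and DY lattices cell by cell with the distance formula."""
    return [[math.hypot(dx, dy) for dx, dy in zip(row_x, row_y)]
            for row_x, row_y in zip(dx_lattice, dy_lattice)]


# Hypothetical 2 x 2 lattices of X and Y error, in inches.
dx_lattice = [[0.003, -0.004],
              [0.000, 0.004]]
dy_lattice = [[0.004, 0.004],
              [0.002, -0.003]]

for row in overall_error(dx_lattice, dy_lattice):
    print([round(d, 5) for d in row])
```

In the upper right cell both components are 0.004
inches, inside a 0.005 inch tolerance, yet the
combined distance is larger than 0.005 inches.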



Evaluating the Results.

Once the surfaces are created, reclassify them 
into five classes to indicate acceptable to unac-
ceptable value ranges. This will aid in displaying 
each surface. A color representation of the clas-
sified lattice will provide an excellent visual 
reference for identifying problem areas in the 
scanner. Green, blue, cyan, white and red are very 
good colors for indicating the error in a display, 
with green being the most accurate and white and 
red being unacceptable.
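
A reclassification of this kind might look like the
following Python sketch. The class breaks are
assumptions chosen for a 0.005 inch tolerance, not
values from the test itself, and classify is a
hypothetical helper name.

```python
from bisect import bisect_left

# Assumed upper limits (inches) of the first four classes; anything
# above the last break falls in the fifth class.
BREAKS = [0.002, 0.004, 0.005, 0.010]
COLORS = ["green", "blue", "cyan", "white", "red"]


def classify(error):
    """Return the display color for one error value, ignoring its sign."""
    # bisect_left places a value equal to a break in the lower class.
    return COLORS[bisect_left(BREAKS, abs(error))]


def reclassify(lattice):
    """Reclassify a whole lattice of error values into color classes."""
    return [[classify(value) for value in row] for row in lattice]


# Hypothetical error lattice in inches.
lattice = [[0.001, -0.0045],
           [0.005, 0.02]]
for row in reclassify(lattice):
    print(row)
```

With these breaks, green, blue and cyan cover errors
up to and including 0.005 inches, while white and red
mark cells that exceed the tolerance.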



Look at the views of each lattice individually,
starting with the X lattice. This view shows you
where there are problems with the cameras. Camera
alignment problems appear as vertical lines running
down the view. In most scanners, all error is very
easy to see. You will have green or blue bands
along the center of each camera, moving out to
cyan, white or even red where two cameras meet.
This is typical. The pattern appears this way
because each camera has a curved lens, which tends
to be very accurate at its center and gradually
degrades toward the outer edges. It will be easy to
tell if one camera is misaligned because it will
have a displaced center band and greater error at
the edges. Sometimes the entire row is a little
off: beginning with the first camera, each camera
has more error than the last. This is because the
first camera is a bit off, the second is adjusted
to the first, and so on. This type of error appears
as a gradual shift in error from one side of the
view to the other. Nearly all camera row errors on
feed-type scanners can be adjusted, although a
scanner repair technician may have to be called to
align the cameras properly.



Next look at the Y axis view. This view shows you 
what is happening to the document as it moves 
through the scanner. Every time there is a change 
in the speed of the rollers or the document slips 
in the feed, a new band will appear. These bands 
may show slight inconsistencies or dramatic 
errors. If the document pulls more on one side or 
another due to poor roller design, there will be a 
large amount of error on one side of the view with 
horizontal peaks indicating roller speed inconsis-
tencies or document slipping.

There is not a lot that can be done about errors in
the Y axis. Unlike the camera row, the roller
mechanism is a relatively fixed entity. Some
scanners have software or firmware that compensates
for changes in roller speed, and adjustments can be
made, but these adjustments are limited. Large
documents may slip in even the best constructed
roller mechanisms and have to be held throughout
the scan to keep the weight of the mylar off the
rollers. All feed-type scanners have a certain
amount of inaccuracy; this is an inherent aspect of
feed scanning. It is up to the user to determine
whether the amount of error is acceptable.
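
The vertical bands in the X view and the horizontal
bands in the Y view can also be checked numerically.
The Python sketch below is not part of the test
procedure itself; it simply averages the error
magnitude down each column and across each row of a
lattice. The function names and sample values are
hypothetical.

```python
def column_means(lattice):
    """Mean error magnitude per column; peaks in the X lattice suggest camera seams."""
    n_rows = len(lattice)
    return [sum(abs(row[c]) for row in lattice) / n_rows
            for c in range(len(lattice[0]))]


def row_means(lattice):
    """Mean error magnitude per row; jumps in the Y lattice suggest feed slips."""
    return [sum(abs(value) for value in row) / len(row) for row in lattice]


# Hypothetical DX lattice (inches): the third column sits near a camera seam.
dx_lattice = [[0.001, 0.002, 0.006, 0.002],
              [0.001, 0.003, 0.007, 0.001],
              [0.002, 0.002, 0.006, 0.002]]
print("DX by column:", [round(v, 4) for v in column_means(dx_lattice)])

# Hypothetical DY lattice (inches): the last row follows a slip in the feed.
dy_lattice = [[0.001, 0.001, 0.002, 0.001],
              [0.001, 0.002, 0.001, 0.001],
              [0.006, 0.007, 0.006, 0.007]]
print("DY by row:", [round(v, 4) for v in row_means(dy_lattice)])
```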



The final view that you must look at is the overall
error. This is the view created by combining X and
Y errors. Because the distance formula combines the
magnitudes of the X and Y errors, the overall error
at a point is never smaller than the error in
either axis alone. Acceptable error in both X and Y
may therefore total up to unacceptable error
overall.



Conclusion.

Once you have determined the amount of error and 
know exactly where it is, you can decide what to do 
about it. If the error is along the camera row, it 
can probably be fixed by a technician. If the 
error is in the roller, you may require a scanner 
with a more sophisticated roller mechanism. 
Whether your scanner proves accurate or not, you 
now have all the information you need to determine 
whether to integrate the scanned data into your 
GIS.



Acknowledgements

Jay Magenheim, President, IDEAL SCANNERS AND
SYSTEMS, Rockville, MD

Dean Dietrich, Scan-Graphics, Inc., Broomall, PA



Appendix 1.

Potential Workflow:



1. Scan test document at 500 dpi.

2. Convert image to Grid:

	Arc: IMAGEGRID test.rlc test_gd

3. Create links to warp the Grid to Digitizer 
inches:

	Arc: GRID

	Grid: MAPE TEST_GD

	Grid: GRIDPAINT TEST_GD

	Create windows around the corners of the Grid 
with the Pan/Zoom button. Be sure you are close 
enough to see individual pixels.

	Grid: GRIDPAINT TEST_GD

	Grid: &RUN GP.AML test.lnk

	Place the cursor at each corner intersection 
and click on the first mouse button. When you get 
to the last intersection, hold down the control 
key and click on the third mouse button.

	Edit the file test.lnk so that you have the 
true inch corner locations after the pixel corner 
locations. When you are done the file should look 
like this:



	4998.673 4657.284 1 1

	4986.451 25009.861 1 42

	19949.672 24997.232 30 42

	19987.211 4499.944 30 1



4. Warp the Grid to Digitizer inches:

	Grid: TEST_WP = GRIDWARP (TEST_GD, TEST.LNK, 1)

5. Process out the lines so that only points 
remain:

	Grid: test_pt = gridpoint (zonalcentroid (regiongroup (con (focalsum (test_wp, circle, 3) ge 20, 1, setnull (test_wp)))))

6. Create an INFO report from the new point coverage:

	Arc: &run scantest.aml test point

7. Create surfaces for DX and DY:

	Arc: arctin test_pt test_tnx point dx

	Arc: arctin test_pt test_tny point dy

	Arc: tinlattice test_tnx test_ltx

	Arc: tinlattice test_tny test_lty

8. Create a surface for the overall error:

	Grid: test_dst = sqrt (sqr (test_ltx) + sqr (test_lty))



Appendix 2.

AMLs



/* GETPOINT.AML

&args gp_file

&if [null %gp_file%] &then ~

 &return USAGE: GP 



&set fil = [open %gp_file% stat -append]

&if %stat% = 0 &then ~

 &type File %gp_file% has been opened.



&type

&type Please select points...

&type



&getpoint

&if [write %fil% [quote %PNT$X% %PNT$Y%]] = 0 &then ~

 &type %PNT$X% %PNT$Y% written to file: %gp_file%.



&do &while %pnt$key% ne 9

 &getpoint

 &if [write %fil% [quote %PNT$X% %PNT$Y%]] = 0 &then ~

 &type %PNT$X% %PNT$Y% written to file: %gp_file%.

&end



&if [close %fil%] = 0 &then ~

 &type %gp_file% has been closed.



&ret



/* SCANTEST.AML



&args scanner feat

&severity &error &ignore

&type %feat%

&if [null %scanner%] &then ~

 &RETURN USAGE: SCANTEST.AML  {Feature Type}

&if [null %feat%] &then &set feat = line

&if %feat% = line &then

 &do

 &if ^ [exists %scanner%_tr -cover] &then ~

 &RETURN You must have a transformed grid coverage.

 &end



&set scanner = [trans %scanner%]



&call CHECKS



build %scanner%_pt point

addxy %scanner%_pt 

additem %scanner%_pt.pat %scanner%_pt.pat dx 4 12 f 3

additem %scanner%_pt.pat %scanner%_pt.pat dy 4 12 f 3

additem %scanner%_pt.pat %scanner%_pt.pat x 3 3 i

additem %scanner%_pt.pat %scanner%_pt.pat y 3 3 i

&set reportscanner = Report For %scanner% Scanner

&data arc info

 arc

 FORMAT $NUM1,4,12,F,5

 FORMAT $NUM2,4,12,F,5

 FORMAT $NUM3,4,12,F,5

 FORMAT $NUM4,4,12,F,5

 SELECT %scanner%_PT.PAT

 CALC X = X-COORD + .5

 CALC Y = Y-COORD + .5

 CALC DX = X-COORD - X

 CALC DY = Y-COORD - Y

 ASEL

 RESEL DX GT .005

 RESEL DX LT .05

 ASEL DX LT -.005

 RESEL DX GT -.05

 ASEL DY GT .005

 RESEL DY LT .05

 ASEL DY LT -.005

 RESEL DY GT -.05

 OUTPUT ../[LOCASE %scanner%.rpt] INIT

 REPORT %scanner%.RPT 

 N

 X

 6

 TRUE X

 ======

 X-COORD

 [UNQUOTE `']

 ACTUAL X

 ========

 DX

 A

 DIFF IN X

 =========

 $NUM1

 10,TEXT GRAND $NUM1

 MAX X:

 ======

 Y

 6

 TRUE Y

 ======

 Y-COORD

 [UNQUOTE `']

 ACTUAL Y

 ========

 DY

 A

 DIFF IN Y

 =========

 $NUM2

 10,TEXT GRAND $NUM2

 MAX Y:

 ======

 [UNQUOTE `']

 [QUOTE %scanner% SCANNER ACCURACY TEST]

 N

 PROGRAM %scanner%.PRG

 CALC $NUM1 = 0

 CALC $NUM2 = 0

 PROGRAM SECTION TWO

 CALC $NUM3 = $NUM1 * -1

 IF DX GT $NUM1 XOR DX LT $NUM3

 CALC $NUM1 = DX

 ENDIF

 CALC $NUM4 = $NUM2 * -1

 IF DY GT $NUM2 XOR DY LT $NUM4

 CALC $NUM2 = DY

 ENDIF

 PROGRAM SECTION THREE

 REPORT %scanner%.RPT Y 55

 ~

 COMPILE %scanner%.PRG

 RUN %scanner%.PRG

 DELETE %scanner%.RPT

 Y

 DELETE %scanner%.PRG

 Y

 Q STOP

&end 

&return



/********************************************

/* ROUTINE CHECKS

/* Checks existence of redundant info files

/* or items and deletes them.

/********************************************



&routine CHECKS

&if [exists %scanner%_cn -cover] &then ~

 kill %scanner%_cn

&if %feat% = line &then 

 &do

 &if [exists %scanner%_pt -cover] &then ~

 kill %scanner%_pt

 &end

&if [exists [locase %scanner%].rpt -file] &then ~

 &sys rm [locase %scanner%].rpt

&if [exists %scanner%.RPT -info] &then ~

 &set report = .true. 

&else &set report = .false.

&if [exists %scanner%.PRG -info] &then ~

 &set program = .true.

&else &set program = .false. 

&if %program% = .true. or %report% = .true. &then

 &do

 &data arc info

 ARC

 &if %report% &then

 &do 

 ERASE %scanner%.RPT

 Y 

 &end

 &if %program% &then

 &do 

 ERASE %scanner%.PRG 

 Y 

 &end

 Q STOP 

 &end

 &end

&return

Adrien Litton
Scanning Specialist
Environmental Systems Research Institute
380 New York St.
Redlands, CA 92373
Telephone: (909) 793-2853
Fax: (909) 793-5953
Email: alitton@Esri.com