There are many factors that affect the accuracy of your database. The original cartography itself must be accurate. The availability of stable media is vital. These things are relatively easy to determine and control. If you are scanning paper, you know that the document will stretch and shrink as its environment changes, so you can take that into account when you input your data, or try to obtain a more stable medium, such as mylar, to scan. If you are working with hand-drawn maps that are not precisely to scale, you are aware of the inaccuracies inherent in the data and can decide either that they are acceptable or that you should try to obtain a more accurate source.
But what about the scanner? How do you know that the scanner is maintaining the accuracy of the original once you get the data into digital form? Most scanner vendors boast an accuracy of plus or minus 0.04%. But what, exactly, does this mean? Generally speaking, a scanner accuracy rating of plus or minus 0.04% means that a 36 inch long document will produce a raster image whose length (once the pixels are converted into inches) is within 0.04% of the overall 36 inches. In other words, the overall length of the scanned document will be within 0.0144 inches of 36 inches. This sounds pretty good, but it says nothing about a point in the center of the document.
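The arithmetic behind the rating can be checked directly. The short Python sketch below is an illustration only; the function name is hypothetical, and it assumes the rating applies to the overall length of the document, as described above.

```python
def overall_length_tolerance(length_in, rating_pct):
    """Allowed end-to-end error, in inches, for a percentage accuracy rating."""
    return length_in * rating_pct / 100.0


# Values from the text: a 36 inch document and a rating of plus or minus 0.04%.
tolerance_in = overall_length_tolerance(36.0, 0.04)
print(f"Overall length may be off by up to {tolerance_in:.4f} inches")
```

The result, 0.0144 inches, bounds only the end-to-end length. It places no limit on how far any interior point may be displaced.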
For GIS purposes, you must be able to prove the accuracy of a coordinate on the scanned image relative to its true location on the ground. For most GIS a difference of 0.018 inches between a point's digital location and its true location at the scale of the map is acceptable, although most government GIS require that the difference be less than or equal to 0.005 inches. These are pretty strict requirements and not all scanners are up to the task. You must be able to test your scanner and prove that it is accurate enough to meet your GIS accuracy requirements.
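To get a feel for how strict these tolerances are, it can help to convert them into pixels and into ground distance. The Python sketch below is illustrative only. It assumes a scan resolution of 500 dpi (the resolution used in the workflow in Appendix 1) and a map scale of 1:24,000, which is an assumption chosen for the example and does not come from the text. The function names are hypothetical.

```python
def tolerance_in_pixels(tolerance_in, dpi):
    """Convert a tolerance in map inches to pixels at a given resolution."""
    return tolerance_in * dpi


def tolerance_on_ground_ft(tolerance_in, scale_denominator):
    """Convert a tolerance in map inches to feet on the ground."""
    return tolerance_in * scale_denominator / 12.0


ASSUMED_DPI = 500        # resolution used in Appendix 1
ASSUMED_SCALE = 24000    # assumption for illustration: a 1:24,000 map

for label, tol in (("most GIS", 0.018), ("most government GIS", 0.005)):
    px = tolerance_in_pixels(tol, ASSUMED_DPI)
    ft = tolerance_on_ground_ft(tol, ASSUMED_SCALE)
    print(f"{label}: {tol} in = {px:.1f} pixels = {ft:.0f} ft on the ground")
```

Under these assumed values the stricter tolerance is only 2.5 pixels wide, or 10 feet on the ground, which is why the position of individual pixels matters.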
Accuracy in GIS.

Accuracy has always been one of the key issues in data capture for GIS. Whether you are purchasing data, digitizing features from a base map, or just keying in coordinates, you are probably very concerned about the accuracy of the data you are inputting. Different GIS have different accuracy requirements, but nearly everyone wants their input data to be as accurate as possible. A number of factors affect accuracy: available source data, media stability, etc. These problems are well known, well understood and easily handled; either you accept the accuracy level available, or you obtain better source material.

But what if you have very accurate source material on very stable media, such as mylar, and you want to maintain that accuracy by scanning, rather than digitizing? How do you know that you aren't going to introduce an unacceptable level of error into your database with the scanner? How do you quantify the accuracy of the scanner itself? These are very important questions that very few GIS users can answer. Many GIS users actually have misinformation about the accuracy of their scanner, inadvertently provided to them by the scanner vendors.

GIS requires more stringent accuracy standards than most other large volume scanning applications. For extreme precision applications, such as medical and scientific scanning or photogrammetry, special (and expensive) scanners are manufactured, but these are not practical for GIS. You can purchase a drum scanner and be secure that it is accurate enough for your GIS, but the price is higher than most GIS annual budgets, and the operating expenses tend to be prohibitive as well. The more cost effective solution is a direct feed scanner, which has a low operating overhead and can be purchased for somewhere between $9,000 and $30,000, depending on resolution and accuracy requirements.
Feed scanners are much more practical from a cost perspective, but their very design tends to bring accuracy into question.

For most product types, prospective hardware buyers can rely on the manufacturer's specifications to help them with such questions, but this is not the case with the accuracy of a feed type scanner. The manufacturers' specifications are usually not only unhelpful, they can be misleading as well. This is not an intentional misdirection on the part of the manufacturers; it is simply that the common standard for measuring the accuracy of a scanner is inappropriate to GIS.

Scanners.

Since scanner manufacturers don't test accuracy in a manner that is appropriate to GIS, GIS users who integrate scanned data need to be more educated than the manufacturer in this regard. You must know how to measure accuracy for yourself and how to interpret your measurements. You will need to take these measurements before purchasing a scanner, or scanned data from a service bureau, and then periodically afterward to ensure that an accurate scanner remains accurate.

In order to do this, you must be fully aware of how the scanner actually works. Feed type scanners essentially consist of a row of cameras, a light source and a roller mechanism. Documents are fed into the scanner and data is captured when light is reflected off the document into the cameras as it moves across the camera row. The cameras must be precisely aligned in order to eliminate error across the width of the image, or in the X direction of the scan. Any misalignment of the camera row will produce incorrect merging of data between cameras and horizontal stretch of the data due to the curvature of the lens.

The roller mechanism is also a precision device. It controls the speed at which the document passes over the cameras.
If the rollers cannot feed the document over the camera row consistently, there will be error in the vertical axis of the scan, or Y direction.

In order to obtain usable results, you must measure the accuracy in both the X and Y directions. You must also measure the overall distance of a given pixel from its correct location, since the X and Y errors alone may not indicate the maximum amount of error.

Testing Scanner Accuracy.

Scanner accuracy can only be tested effectively by measuring actual pixel locations systematically within the output data. To do this, you need a test grid with a consistent pattern that is known to be accurate. An 8 mil mylar grid that is guaranteed to be accurate, available from Bishop Graphics, Westlake Village, CA, is ideal for this. The grid should be replaced every five years. Measurements must be taken digitally, since comparison of plots will not produce quantitative results and plotters tend to introduce a number of additional variables into the accuracy equation.

You can use ArcInfo to do all of the raster processing, report creation, and evaluation. You will need ArcInfo and ArcInfo's Grid module to perform the test this way.

The first step of testing the scanner is to scan the grid. The Bishop Graphics test grid has 1/10" grid lines as well as 1" lines. The 1" lines are much bolder. Scan the document at the scanner's highest available optical resolution, with the threshold set so that the 1/10" lines drop out and only the 1" lines remain. Different makes of scanners have varying methods of thresholding, but most of them can do this with relative ease.

To properly register the image, perform a first order polynomial warp on the raster. You must be precise. Display the image on the screen and zoom in on the corner intersections until you can see individual pixels, so that you can determine the exact pixel location of each intersection. The pixel locations of the corner intersections must be linked to the actual inch locations on the original grid.
With a perfectly accurate scanner this would produce a new grid with whole number coordinates for the grid intersections.

All extraneous data must then be processed out of the warped image, including the lines between intersections. The desired output is a centroid of each intersection with no additional points. The ArcInfo Grid commands FOCALSUM, REGIONGROUP, ZONALCENTROID, and GRIDPOINT can accomplish this.

Once you have the centroid information of each intersection, you can perform database queries to create a report of the centroid distances from their ideal locations. Select the X coordinate of each point and determine the difference between the actual location and the nearest whole number. This will give you the actual difference in X (DX) for that point from its true location. The same principle applies to the Y coordinate of each point to determine the difference in Y (DY). Determining the maximum DX and the maximum DY will be helpful in selecting out any extraneous points that were not eliminated in the raster processing.

Once you have created your report you will have all the error in the scanner and the precise location of that error in tabulated form. This is really all you need to determine whether the maximum error of the scanner is within the tolerance of your GIS, but it isn't very helpful when it comes to troubleshooting the problems with the scanner. For that you will want a surface, which is easier to interpret visually.

Create a continuous surface of the X and Y errors independently. Use the DX and DY values as spot elevation items for each surface. ArcInfo's ARCTIN and TINLATTICE commands are helpful for creating the surfaces. A lattice cell size of 0.1" is sufficient resolution for displaying the surface. Also, create an overall error surface by combining the X and Y lattices with the distance formula, d = sqrt(dx^2 + dy^2). The ArcInfo Grid commands SQRT and SQR can perform this function.
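The DX, DY and overall distance calculations described above can be sketched in a few lines of Python. This is a minimal illustration and not the INFO report or the ArcInfo lattice arithmetic itself. The function names and the sample centroids are hypothetical, and the sketch assumes every error is well under half an inch, so that the nearest whole number really is the intersection's true location.

```python
import math


def point_error(x, y):
    """Return (dx, dy, d) in inches for one intersection centroid.

    Assumes the error is far smaller than half an inch, so rounding to the
    nearest whole number recovers the true grid intersection.
    """
    dx = x - round(x)
    dy = y - round(y)
    return dx, dy, math.hypot(dx, dy)  # d = sqrt(dx^2 + dy^2)


def max_errors(centroids):
    """Return the largest |DX|, |DY| and overall distance over all points."""
    errors = [point_error(x, y) for x, y in centroids]
    return (
        max(abs(dx) for dx, _, _ in errors),
        max(abs(dy) for _, dy, _ in errors),
        max(d for _, _, d in errors),
    )


# Hypothetical centroids in warped inch coordinates, not real measurements.
centroids = [(1.001, 1.002), (2.004, 1.004), (1.998, 2.003)]
tolerance_in = 0.005

max_dx, max_dy, max_d = max_errors(centroids)
print(f"max |DX| = {max_dx:.4f}, max |DY| = {max_dy:.4f}, max d = {max_d:.4f}")
print("within tolerance" if max_d <= tolerance_in else "exceeds tolerance")
```

In this made-up sample no point is off by more than 0.004 inches in X or in Y, yet the second point lies about 0.0057 inches from its true location, which exceeds a 0.005 inch tolerance. This is why the overall distance has to be computed in addition to DX and DY.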
Evaluating the Results.

Once the surfaces are created, reclassify them into five classes to indicate acceptable to unacceptable value ranges. This will aid in displaying each surface. A color representation of the classified lattice will provide an excellent visual reference for identifying problem areas in the scanner. Green, blue, cyan, white and red are very good colors for indicating the error in a display, with green being the most accurate and white and red being unacceptable.

Look at the views of each lattice individually, starting with the X lattice. This view shows you where there are problems with the cameras. Camera alignment problems appear as vertical lines running down the view. In most scanners, all error is very easy to see. You will have green or blue bands along the center of each camera, moving out to cyan, white or even red where two cameras meet. This is typical. The pattern appears this way because each camera has a curved lens, which tends to be very accurate at its center and gradually degrades toward the outer edges. It will be easy to tell if one camera is misaligned because it will have a displaced center band and greater error at the edges. Sometimes the entire row is a little off: beginning with the first camera, each camera has more error than the one before it. This is because the first camera is a bit off, the second is adjusted to the first, and so on. This type of error appears as a gradual shift in error from one side of the view to the other. Nearly all camera row errors on feed type scanners can be adjusted, although it may require calling a scanner repair technician to align the cameras properly.
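The reclassification into five color classes can be sketched as follows. The text specifies only the number of classes, the colors, and that white and red are unacceptable. The break values used here (one third, two thirds, one and two times the tolerance) are an assumption made for illustration, and the function names are hypothetical.

```python
from bisect import bisect_left

# Ordered from most accurate to least accurate, as in the text.
COLORS = ["green", "blue", "cyan", "white", "red"]


def class_breaks(tolerance_in):
    """Upper bounds of the first four classes (assumed, not from the text)."""
    return [tolerance_in / 3, 2 * tolerance_in / 3, tolerance_in, 2 * tolerance_in]


def classify(error_in, tolerance_in):
    """Return the display color for one error value, ignoring its sign."""
    # bisect_left puts a value equal to a break in the lower class, so an
    # error exactly equal to the tolerance is still acceptable (cyan).
    return COLORS[bisect_left(class_breaks(tolerance_in), abs(error_in))]


# Hypothetical error values in inches, checked against a 0.005 inch tolerance.
for error in (0.001, 0.003, 0.005, -0.007, 0.02):
    print(f"{error:+.3f} in -> {classify(error, 0.005)}")
```

With these assumed breaks, green, blue and cyan cover everything up to the tolerance, while white and red mark cells that exceed it.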
Next look at the Y axis view. This view shows you what is happening to the document as it moves through the scanner. Every time there is a change in the speed of the rollers or the document slips in the feed, a new band will appear. These bands may show slight inconsistencies or dramatic errors. If the document pulls more on one side than the other due to poor roller design, there will be a large amount of error on one side of the view, with horizontal peaks indicating roller speed inconsistencies or document slipping.
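The banding described above can also be read from the tabulated report. As a rough supplement to the color views, and not part of the procedure described in this paper, the Python sketch below averages the absolute DX along each vertical grid line and the absolute DY along each horizontal grid line. The record layout, the function name and the sample values are all hypothetical.

```python
from collections import defaultdict


def band_profile(records, key_index, value_index):
    """Mean absolute error per grid line.

    records: tuples of (grid_x, grid_y, dx, dy), where grid_x and grid_y are
    the whole-inch intersection coordinates and dx, dy are errors in inches.
    """
    sums = defaultdict(float)
    counts = defaultdict(int)
    for record in records:
        key = record[key_index]
        sums[key] += abs(record[value_index])
        counts[key] += 1
    return {key: sums[key] / counts[key] for key in sorted(sums)}


# Hypothetical report rows: (grid_x, grid_y, dx, dy).
records = [
    (1, 1, 0.001, 0.001), (2, 1, 0.003, 0.001),
    (1, 2, 0.001, 0.006), (2, 2, 0.003, 0.006),
]

dx_by_column = band_profile(records, 0, 2)  # vertical bands: camera row
dy_by_row = band_profile(records, 1, 3)     # horizontal bands: rollers

print("mean |DX| by column:", {k: round(v, 4) for k, v in dx_by_column.items()})
print("mean |DY| by row:   ", {k: round(v, 4) for k, v in dy_by_row.items()})
```

In this made-up data the jump in mean |DY| at row 2 is the kind of horizontal band that would suggest a roller speed change or a slip at that point in the feed.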
There is not a lot that can be done about errors in the Y axis. Unlike the camera row, the roller mechanism is a relatively fixed entity. Some scanners have software or firmware that compensates for changes in roller speed, and adjustments can be made, but these adjustments are limited. Large documents may slip in even the best constructed roller mechanisms and may have to be held throughout the scan to keep the weight of the mylar off the rollers. All feed type scanners have a certain amount of inaccuracy; this is an inherent aspect of feed scanning. It is up to the user to determine whether the amount of error is acceptable.

The final view that you must look at is the overall error. This is the view created by combining the X and Y errors. Because the overall error is computed with the distance formula, it is never smaller than the larger of the X and Y errors at a given point. Acceptable error in both X and Y may therefore still total up to unacceptable error overall, so this view must be checked even when the X and Y views look acceptable.

Conclusion.

Once you have determined the amount of error and know exactly where it is, you can decide what to do about it. If the error is along the camera row, it can probably be fixed by a technician. If the error is in the roller, you may require a scanner with a more sophisticated roller mechanism. Whether your scanner proves accurate or not, you now have all the information you need to determine whether to integrate the scanned data into your GIS.

Acknowledgements

Jay Magenheim, President, IDEAL SCANNERS AND SYSTEMS, Rockville, MD
Dean Dietrich, Scan-Graphics, Inc., Broomall, PA

Appendix 1. Potential Workflow:

1. Scan test document at 500 dpi.
2. Convert image to Grid:
     Arc: IMAGEGRID test.rlc test_gd
3. Create links to warp the Grid to Digitizer inches:
     Arc: GRID
     Grid: MAPE TEST_GD
     Grid: GRIDPAINT TEST_GD
   Create windows around the corners of the Grid with the Pan/Zoom button. Be sure you are close enough to see individual pixels.
     Grid: GRIDPAINT TEST_GD
     Grid: &RUN GP.AML test.lnk
   Place the cursor at each corner intersection and click on the first mouse button. When you get to the last intersection, hold down the control key and click on the third mouse button. Edit the file test.lnk so that you have the true inch corner locations after the pixel corner locations. When you are done the file should look like this:
      4998.673   4657.284    1   1
      4986.451  25009.861    1  42
     19949.672  24997.232   30  42
     19987.211   4499.944   30   1
4. Warp the Grid to Digitizer inches:
     Grid: TEST_WP = GRIDWARP (TEST_GD, TEST.LNK, 1)
5. Process out the lines so that only points remain:
     Grid: test_pt = gridpoint (zonalcentroid (regiongroup (con (focalsum (test_wp, circle, 3) ge 20, 1, setnull (test_wp)))))
6. Create an INFO report from the new point coverage:
     Arc: &run scantest.aml test
7. Create surfaces for DX and DY:
     Arc: arctin test_pt test_tnx point dx
     Arc: arctin test_pt test_tny point dy
     Arc: tinlattice test_tnx test_ltx
     Arc: tinlattice test_tny test_lty
8. Create a surface for the overall error:
     Grid: test_dst = sqrt (sqr (test_ltx) + sqr (test_lty))

Appendix 2. AMLs

/* GETPOINT.AML
&args gp_file
&if [null %gp_file%] &then ~
  &return USAGE: GP