David T. Hansen
In ArcView, a variety of digital sources can be displayed for a visual assessment of data quality. These include GIS data themes, images, and primitive graphic objects or shapes. ArcView recognizes that GIS themes from different sources have different display requirements. When the units of a View are identified, the display window calculates a display scale. Based on the display scale, we can set thresholds for display as a theme property.
Figure 1 shows the dialog box for the theme display property in ArcView.
Scale thresholds help avoid clutter in the display for the ArcView application. They also give the application developer the opportunity to prevent display of themes at scales that are not appropriate for the data. For many users of the application, this may be the first clue that some of the data may not be useful for the analysis that they want to perform. The ArcView application developer must assess the spatial data quality for the theme and set appropriate display scales.
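The threshold test itself is simple to sketch. The following Python fragment is only an illustration of the idea, not ArcView code; the function name `theme_is_displayed` and its parameters are hypothetical, and scales are given as denominators (24000 for 1:24,000).

```python
def theme_is_displayed(display_scale, largest_scale=None, smallest_scale=None):
    """Return True when a theme should draw at the current display scale.

    All scales are denominators: 24000 means 1:24,000. A larger
    denominator is a smaller scale (the View is zoomed further out).
    A threshold of None means that no limit is set on that side.
    """
    if largest_scale is not None and display_scale < largest_scale:
        return False  # zoomed in further than the data supports
    if smallest_scale is not None and display_scale > smallest_scale:
        return False  # zoomed out so far that the theme only adds clutter
    return True


# A theme limited to display between 1:24,000 and 1:250,000
print(theme_is_displayed(50000, largest_scale=24000, smallest_scale=250000))
print(theme_is_displayed(10000, largest_scale=24000, smallest_scale=250000))
```

The first call falls inside the thresholds and the second is zoomed in past the larger-scale limit, so the theme would be drawn only in the first case.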
The Content Standards for Digital Geospatial Metadata identify a report on the positional accuracy of the stored coordinate values in representing the true location of the feature on the ground. Location of a feature represents one of several spatial characteristics that may be important for spatial representation of the feature. These characteristics include actual dimensions of the feature, boundary conditions of the feature, or topological relationship of the feature to other objects. This information represents spatial data quality. Measured or estimated accuracy values for these characteristics can be useful in evaluating the GIS data theme for display, use, and analysis. Where an estimate of accuracy cannot be made, it may be possible to identify the amount of uncertainty associated with the location, dimensions, or boundary conditions of the feature.
Visual display of spatial data quality is one method for communicating data quality information. Criteria for effective visual communication of data quality to the user have been identified and described by Beard and Mackaness. These include (Beard and Mackaness, 1993):
ArcView provides a set of relatively easy to use tools for displaying and evaluating spatial data quality.
An evaluation of GIS data themes is a comparison of the digital representation against our model of the real world features. Real world geographic features are represented in GIS as points, lines, polygons, or raster cells based on their representation in source documents, on the digital capture method used, and on geoprocessing required for the data. Our digital representation is constrained by the geometric characteristics of these features.
Increasingly, data is being incorporated into GIS which is not strictly coordinate based. This includes image files, remote sensing data, surveys, and other measurement based systems. Bringing data sets captured from these different sources together for both display and analysis is part of the process in GIS. Information from these other data sources is used to adjust and to update traditional GIS data themes and is used in spatial analysis. Effectively capturing and displaying information on the accuracy and uncertainty in these spatial characteristics is important in the application and use of that data.
One method for identifying positional accuracy of GIS data themes is comparison of the digital data with independent sources of higher accuracy (FGDC Workbook, 1995).
Figure 2 displays an image in ArcView with GIS themes of roads and railroads. The digital raster graphic (DRG) for the Courtland, California 7.5-minute quad is displayed as a background image. The GIS data themes were captured from a 1:100,000 scale source. Displacement of features between the data sources can be clearly seen in the ArcView display.
One surrogate for identifying the display resolution for GIS data themes is source scale. This can be based on the minimum resolution at which features can be identified or on map accuracy standards for that scale. The former National Map Accuracy Standards require that 90 percent of well defined points be within 0.085 cm (1/30 inch) of their true location for map scales larger than 1:20,000 or 0.051 cm (1/50 inch) for smaller map scales. Estimates based on minimum resolution or source scale have been used as guides where no other information has been available.
At a scale of 1:100,000, the maximum expected accuracy of well defined points is about 50 meters following the former National Map Accuracy Standards. ArcView permits the measurement of features in the View display. The area displayed shows a displacement in these digital features of about 20 to 40 meters. This displacement is less than might be expected for linear features captured at this scale. The 1:100,000 scale map series for the United States was compiled from the best available source for the area. For this portion of California, the best available source was the 1:24,000 scale map series.
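The arithmetic behind these figures can be written out as a short Python sketch. The function name `nmas_horizontal_accuracy_m` is hypothetical and the code is an illustration of the standard's two tolerances as quoted above, not part of any ArcView tool.

```python
CM_PER_INCH = 2.54


def nmas_horizontal_accuracy_m(scale_denominator):
    """Ground distance in meters of the former National Map Accuracy
    Standards tolerance for a map at 1:scale_denominator.

    The tolerance on the map is 1/30 inch for scales larger than
    1:20,000 and 1/50 inch for 1:20,000 and smaller scales.
    """
    if scale_denominator < 20000:
        tolerance_in = 1.0 / 30.0
    else:
        tolerance_in = 1.0 / 50.0
    tolerance_cm = tolerance_in * CM_PER_INCH
    # Scale the map distance to the ground, then convert cm to meters.
    return tolerance_cm * scale_denominator / 100.0


for denominator in (24000, 100000):
    print(denominator, round(nmas_horizontal_accuracy_m(denominator), 1))
```

For 1:100,000 this gives about 50.8 meters, and for 1:24,000 about 12.2 meters, in line with the rounded values of 50 and 12 meters used in the text.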
Figure 3 is an ArcView display of a portion of the digital orthoquad (DOQ) for the Courtland, California area with surface hydrography and roads. The orthoquad image was developed in 1998 from NAPP photography flown in June of 1993 using a 1:24,000 scale digital elevation model (DEM) developed in 1998. The blue lines shown in this display are from a surface hydrography GIS theme captured from the USGS 7.5-minute quad for Courtland, California. This 1:24,000 scale quad was initially prepared in 1973 with a photo revision in 1993. The red lines shown are roads from the 1:100,000 scale digital files shown in previous figures. The source of the roads theme is the USGS digital line graph (DLG) files developed from the 1:100,000 scale map series. The 1:100,000 scale map for this area was compiled in 1978 and this particular area is based on the 1968 1:24,000 scale map for Courtland.
Comparison between these sources for this portion of the Courtland quad shows a displacement of 8 to 12 meters for the hydrography theme and 15 to 30 meters for the road theme. Arrows in the figure indicate areas showing the maximum displacement. The DOQ was constructed based on the North American Datum of 1983 (NAD1983). The GIS data themes were both developed using the former datum (NAD1927). From the visual inspection of the data, it is not possible to identify what displacement is due to differences between the two datums, changes in the river channel or road location between the dates of the various sources, scale differences in the sources, or errors in digital data development. However, it is possible to estimate the accuracy or uncertainty in the location of the roads and hydrography relative to the DOQ image. This estimate provides a guide as to what the minimum scale should be for display of these two themes. Estimating the accuracy of position based on source scale alone can be misleading.
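One possible way to turn a measured displacement into a display scale guide is to ask at what scale that displacement would shrink to the map tolerance of 0.0508 cm (1/50 inch). This rule is an assumption made here for illustration; the text does not state how the guide is derived, and the function name `largest_supported_scale` is hypothetical.

```python
def largest_supported_scale(displacement_m, tolerance_cm=0.0508):
    """Scale denominator at which a ground displacement (meters) is
    drawn no larger than tolerance_cm on the display.

    Assumption for illustration: the 1/50 inch (0.0508 cm) map
    tolerance is reused as the acceptable size on the display.
    """
    displacement_cm = displacement_m * 100.0
    return displacement_cm / tolerance_cm


# Maximum displacements measured against the DOQ
print(round(largest_supported_scale(12)))  # hydrography theme
print(round(largest_supported_scale(30)))  # road theme
```

Under this assumption the hydrography theme would be suited to display at about 1:24,000 and smaller scales, and the road theme at about 1:59,000 and smaller scales.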
Another method of identifying positional accuracy is comparison of the digital data to the source documents (FGDC Workbook, 1995). This has usually been interpreted as a visual comparison of plots of the digital theme against the features shown on the source map. It can also include the evaluation of information contained in the source documents such as symbol size, legend descriptions, and map unit descriptions. The source documents can provide information on the accuracy or uncertainty in the position of features.
Historically, compiled maps often used symbology to indicate accuracy or uncertainty in spatial characteristics such as feature position. This can include symbol size, pattern, and color. Accuracy in location is only one of many feature characteristics that the symbols may be indicating. McGranaghan identifies a variety of graphical methods for displaying data quality (McGranaghan, 1993). Uncertainty in feature position may be indicated by a wide dark symbol or a fine very light symbol. Proper interpretation of symbology requires other information such as a descriptive legend, map unit descriptions, or other information in the source documents.
Atwater used symbology to indicate uncertainty in the position of features in his study of geology in the Sacramento - San Joaquin Delta of California (Atwater, 1982). Atwater recognized uncertainty in the location of late Holocene stream channels, tidal boundaries at about 1850, and boundaries in geomorphic units. He described this uncertainty in his map legend and report accompanying the maps. The map compilation uses different line weights and line symbols to indicate this uncertainty. Line symbols represent the ambiguity or uncertainty in boundary placement as follows:
| Map Feature | Line Symbology | Uncertainty (meters) |
|---|---|---|
| Boundaries between Geologic Units | Heavy Solid Line | Within 150 |
| Boundaries between Geologic Units | Heavy Dashed Line | Greater than 150 |
| Boundaries between Geologic Units | Dotted Line | Inferred much greater than 150 |
| Late Holocene Channel | Light Solid Line | Within 450 |
| Late Holocene Channel | Light Dashed Line | Greater than 450 |
| Tidal Line at about 1850 | Wide Hachured Line | Within 300 |
| Tidal Line at about 1850 | Hachured Line with ? | Greater than 300 |
Features from Atwater's compiled maps were digitally captured including attributes for the uncertainty associated with each of the symbols.
Figure 4 is a display in ArcView of a portion of the digital data with the arcs symbolized to represent the uncertainty indicated by Atwater in the map legend and report.
Atwater compiled his maps on 1:24,000 scale USGS topographic maps for the area. Compilation on controlled base maps at a scale of 1:24,000 could lead to an assumed accuracy of about 12 meters for features that could clearly be located on the base map. Map compilation at this scale and the size of the map symbols provide a basis for another estimate in the positional accuracy of the features. Table 2 identifies the approximate accuracy of feature location based on symbol line width and the uncertainty indicated by Atwater.
| Map Feature | Symbol Size on Map (cm) | Symbol Width on Ground (meters) | Uncertainty from Atwater (meters) |
|---|---|---|---|
| Geologic Units | 0.10 | 24 | 150 |
| Holocene Stream Channels | 0.035 | 9 | 450 |
| Tidal Line at 1850 | 0.20 | 48 | 300 |
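The ground widths in Table 2 follow from multiplying the symbol width on the map by the scale denominator. The Python sketch below shows that conversion for the 1:24,000 compilation scale; the function name `symbol_width_on_ground_m` is hypothetical.

```python
def symbol_width_on_ground_m(symbol_width_cm, scale_denominator):
    """Ground distance in meters covered by a map symbol of the given
    width (cm) on a map at 1:scale_denominator."""
    return symbol_width_cm * scale_denominator / 100.0


SOURCE_SCALE = 24000  # Atwater compiled on 1:24,000 scale base maps

for feature, width_cm in [
    ("Geologic Units", 0.10),
    ("Holocene Stream Channels", 0.035),
    ("Tidal Line at 1850", 0.20),
]:
    ground_m = symbol_width_on_ground_m(width_cm, SOURCE_SCALE)
    print(feature, round(ground_m, 1))
```

The conversion gives 24 and 48 meters for the geologic unit and tidal line symbols. For the stream channel symbol it gives 8.4 meters, which the table lists as 9 meters.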
These different estimates of accuracy or uncertainty in feature position were combined for
display with the spatial accuracy extension. Figure 5 shows a portion of the
Courtland quad displaying these features.
This display illustrates that the features with the greatest uncertainty, late Holocene stream channels, were compiled on the source map with a pen size reflecting greater accuracy than the other features. The late Holocene stream channels are displayed in the wide blue line with the dashed pink line. The width of the blue line in the display is about 450 meters. The boundaries between the geologic units have the least uncertainty in their location. These are the black lines in the display with a narrow purple line. The width of the black line in the display is about 150 meters. The width of the purple line is about 24 meters. The tidal line is represented by the dark green line with the dashed red and yellow lines superimposed on top. The width of the green line is about 300 meters. The red dashed line is about 48 meters representing the symbol size on the compiled source map. The yellow line is about 15 meters wide which was the accuracy assigned in the spatial accuracy extension based on the source scale of 1:24,000.
Other GIS data sets often have attributes with measured values for spatial characteristics such as width, length, or size. Real world dimensions of features in GIS themes provide the opportunity to evaluate our digital representation against actually measured values. Where features have repeated measured values, an accuracy value can be reported for that measurement. Global positioning systems (GPS) provide the ability to report feature coordinate position with an accuracy value. Operated in survey mode, GPS receivers can provide distance and area with estimates of accuracy for these derived values. On-the-ground surveys provide internally consistent measured or derived values for distance, direction, and area. New remote sensing and geophysical techniques are additional sources of measured values. For example, they are being used to capture elevations for surfaces and vegetation canopies. They are being used to determine relative contacts and areas in the land and water interface. Surveys and other measurement values reside independently of a GIS coordinate system. These values can be used to generate the features in a new coordinate system (Durgin, 1993).
Figure 6 shows the road and rail GIS themes from 1:100,000 scale DLG files with the symbol size set to approximate the width of the features and the uncertainty in their position. The ArcView display scale for this figure is 1:10,000 with the Courtland DOQ in the background. The spatial accuracy extension set the symbol size for these themes for this display scale. The lines representing rail lines have an outer red line about 50 meters wide representing the uncertainty in location with a black line representing the U.S. standard gauge width of 1.4 meters. The secondary road shown in the display has a red line representing a road width of 10 meters on top of a blue line representing the uncertainty in the position of the road of 50 meters. The road theme contains multilane highways also represented as a single line. Multilane highways in this area have a width of about 50 to 300 meters including the road median and pavement.
Figure 7 shows a U.S. Coast Guard navigation light along the Sacramento River on the DRG. The spatial accuracy extension has been used to set the marker size for the GIS data layer representing these navigation aids. Three sizes have been set. The outer brown circle represents an assumed accuracy in capturing the USCG coordinates for the light of 20 meters (Actual display diameter is about 15 meters). The inner blue circle represents an assumed accuracy of 10 meters (Actual display diameter is about 8.5 meters). The solid red circle represents the assumed size of the light at 2 meters (Actual display diameter is about 0.8 meters). The open circle is the symbol on the DRG identifying the position of the light and is about 26 meters in diameter.
For several local areas, underlying surveys control the development of GIS data bases. High quality information from these sources can serve to control and update our other GIS themes. Part of the underlying control for most map series in the United States is the geodetic network reported by the National Geodetic Survey (NGS).
Figure 8 shows the NGS control station, Courtland, along the Sacramento River as a red triangle on the DRG and DOQ and a picture of the marker. The Courtland marker has been in place since 1931. It has served as reference point for repeated geodetic surveys for both horizontal and vertical control in the Delta area. In 1998, the horizontal and vertical coordinates of NGS stations for the Delta were updated to current horizontal and vertical datums. This high order survey achieved an accuracy of 2 cm for the network. These stations and the survey network are the basis for updating other locations within the area. The red triangles are the locations of the Courtland station based on NGS coordinates on the separate images. The red triangles are about 12 meters in size.
Figures 5, 6, 7, and 8 were prepared using the spatial accuracy extension. Properties of View scale, View units, and symbol size provide the basis for assessing and assigning accuracy, uncertainty, or feature dimensions. This information is stored by the extension as comments for the theme. The assigned values are used to set the symbol size for display of the themes.
This ArcView extension is designed to assess and to report the accuracy, uncertainty, or actual dimensions of features that we are representing as digital themes. It operates in the View document window. Themes that have point, line, or polygon topology may be assigned accuracy values with the extension. The extension is designed to display the theme with a symbol size that approximates the values assigned for accuracy, uncertainty, or real world dimensions. The View units must be set and the values assigned to the theme feature must be in the same units as the View units.
The extension is run from the View GUI. Items in the Spatial Accuracy menu initiate all actions for the extension. Figure 16 shows the options under this menu.
Figure 9 shows the initial dialog for assigning an accuracy value.
In this dialog, the user selects the point, line, or polygon theme for assigning values for accuracy, uncertainty, or actual feature dimensions. The user has the option of either assigning an overall value for the theme or for assigning different values based on the attributes for the theme.
When the user selects the option of assigning an overall value for the theme, a second dialog opens. Figure 10 shows this dialog.
This dialog is for entering an overall value of accuracy for the theme. The units for this value are the units of the View. The dialog provides help in estimating this value if the source scale of the GIS theme is known. The extension calculates two different estimates of accuracy based on a source scale. These estimates are reported in View units. In this example, the source scale of the GIS theme is 1:24,000 and the View units are meters. The first estimate is based on the former National Map Accuracy Standards. For a scale of 1:24,000, 90 percent of well defined points should be within 12 meters of their true location. The second estimate is based on a displacement or symbol size on the source map of 0.1 cm. In this example, a displacement or symbol width of 0.1 cm represents 24 meters on the ground. Based on this value, the user can measure the feature on the source map and calculate a value. For controlled GPS data or where measurements have been made in the field, source scale is not appropriate and the reported value is entered. For this particular theme, Atwater reported values of uncertainty in his source documents and this value (150 meters) is entered. Once a value has been entered, a report on spatial accuracy, uncertainty, or the dimensions of features is written to the comment section of the theme. The theme name and the value associated with the theme are stored in lists for access as the user changes the display scale of the View.
Assigning values for accuracy, uncertainty, or actual dimensions to individual theme features takes a few more steps. The extension uses the theme attribute table for this assignment and the Legend Editor. Only themes containing attributes for lines or points can use this option. Since the extension sizes the symbols to approximate the real world dimensions of the features in the View and polygons use fill symbols, polygon attributes are excluded. To represent different values of accuracy or uncertainty in polygon boundaries, the boundaries must be represented by a line theme. Accuracy or uncertainty in the identification of the polygon or the extent of the feature represented by the polygon can also be effectively modeled in raster format. The uncertainty in the position or identification of polygon boundaries for the Atwater data is described in Visualizing Uncertainty Captured from Source Documents in GRID (Hansen, 1998).
When accuracy values are based on theme features, the user selects a field from the theme attribute table. Figure 11 shows the dialog for selecting a field from the theme attribute table.
The dialog will display the number of unique values and for numeric fields the minimum and maximum range of values for the selected field. The user has the option of symbolizing the theme either by the unique value of a field or by classifying the values for the field. Once a field has been selected, the extension sets an initial legend for the field and opens the Legend Editor. With the Legend Editor, the user selects desired colors, the legend classification method, and the number of classes.
Figure 12 shows the Legend Editor for classifying and assigning symbology. This is the normal Legend Editor for ArcView. For the extension, two legend classification methods are permitted. The user may symbolize the theme based on unique values for the field or may classify and symbolize the field with graduated color. Other classification methods are not useful for this extension. The extension resets the symbol size based on the values assigned to the feature. The dialog that is open with the Legend Editor provides information for the user. Once the Apply and Done buttons are clicked, the extension evaluates the user assigned legend and proceeds to the final step of assigning values for accuracy, uncertainty, or dimensions. Before this final step, a legend file containing the parameters of the legend is saved.
In this final step, the user assigns accuracy based on the values in the theme's
legend. Figure 13 shows this step.
This step opens two dialogs. One dialog displays the assigned values for each class in the legend. The main dialog displays each class in the legend and a field for entering a value for accuracy, uncertainty, or dimensions. Once a value has been entered, this value is assigned to any legend classes on which the user clicks. The dialog also provides a field for entering the source scale of the theme. Entering a source scale provides the same information that entering a scale does for assigning an overall accuracy value.
When the user has finished entering values, a spatial accuracy report is written to the comment section for the theme. This report identifies the field used to assign spatial accuracy or uncertainty, the legend values, and the accuracy values. These values are also stored in lists that are used by the extension to adjust the symbol size for the theme features as the View display scale changes.
The extension relies on changes in the View display scale to set the symbol size for themes that have been assigned values. A list of themes that have been assigned values is presented for selection. The themes that are selected from this list will have their symbol size adjusted to approximate the accuracy value. The extension recalculates the symbol size in points based on units of the view, the assigned value, and the display scale. The View units determine the conversion value between point size and View units. One point is considered to be 0.035277 cm (1/72 inch). As the user zooms in or out of the View, the extension recalculates the point size for the selected themes or features of the themes and updates the legend. This calculation follows this formula:
Point Size = (Accuracy Value / Display Scale) / Point Conversion Value
The point size calculated for the symbol approximates the value of accuracy or uncertainty in the View display. The line symbol will be slightly larger than the assigned value. Marker symbols depend on the pattern cell used for the symbol. The display size of the marker symbol will be about 25 percent smaller than the value of accuracy or uncertainty. This size will vary depending on the particular marker symbol used.
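The calculation can be illustrated with a short Python sketch. This is not the extension's own code; the names `point_size` and `POINT_IN_VIEW_UNITS` are hypothetical, the display scale is taken as the scale denominator, and the feet entry in the conversion table is an added assumption.

```python
# Size of one point (1/72 inch) expressed in View units.
# The meters entry uses the 0.035277 cm figure quoted in the text;
# the feet entry is an added assumption for illustration.
POINT_IN_VIEW_UNITS = {
    "meters": 0.035277 / 100.0,
    "feet": 1.0 / 72.0 / 12.0,
}


def point_size(accuracy_value, display_scale, view_units="meters"):
    """Symbol size in points for a value given in View units.

    display_scale is the scale denominator (10000 for 1:10,000).
    """
    size_on_display = accuracy_value / display_scale
    return size_on_display / POINT_IN_VIEW_UNITS[view_units]


# 50 meters of uncertainty shown at a display scale of 1:10,000
print(round(point_size(50, 10000), 1))
```

At 1:10,000 a value of 50 meters comes to roughly 14 points. Zooming out reduces the symbol in proportion, so the same value at 1:100,000 is about 1.4 points.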
Themes selected from the display list will have their symbols adjusted as the display scale changes. The user can adjust the legend at any time for the themes with the Legend Editor. Changes must be consistent with the assignment of accuracy values to the theme. This is particularly true where the user has assigned accuracy values to a theme's features. If the number of classes is changed or a different field is selected for symbolizing that theme, the extension will reset the legend to the saved legend (.avl) file.
One of the main intentions of this extension is the reporting of accuracy, uncertainty, or dimensions assigned to the theme. When values have been assigned to a theme, an accuracy report is written to the comment section of the theme.
This report can be written to a simple text file at any time by the user. In addition, if a theme is deleted from the View that has an accuracy report, the user is prompted to save this accuracy report as a text file. This text file is an initial spatial accuracy report. It is up to the user to review, update, and add other information to this report to effectively describe the spatial accuracy of that theme.
For themes that have been assigned accuracy values at the feature level, the user can prepare a table. Figure 15 shows a table prepared by the extension.
This table contains the following fields:
ArcView is an excellent platform for viewing spatial data in a variety of formats from different sources. The data developer or evaluator can view separate sources and make an assessment of accuracy or uncertainty based on the best source for those features. The spatial accuracy extension is designed as an exploratory tool for displaying what is known or can be inferred about the spatial accuracy or uncertainty of a theme. It allows the user to assign a value for accuracy, uncertainty, or dimensions, evaluate that assignment in the View display, and report on that assessment. Any theme with point, line, or polygon topology can be used. In ArcView, multiple themes can be generated from the same GIS data source and assigned different accuracy values for visual display and evaluation. The spatial accuracy extension:
Salient spatial characteristics of the features can be used in ArcView for display of the feature. Qualitative as well as quantitative information on accuracy, uncertainty, and dimensions can assist in describing the quality of the digital model in representing the real world features. Displayed visually, qualitative information expressed as a value is useful in an exploratory environment. Uncertainties and ambiguities which are readily apparent in the source material can be represented as different accuracy values and visually displayed. Relationships inherent to the data and variation in characteristics such as source scale, location, size, or boundary conditions can be displayed and explored. Values assigned using this extension can be reported as part of a spatial accuracy report. This report needs to be reviewed and updated to include other relevant information about the spatial accuracy of the data.
Our GIS data model is expanding to include additional information about the features that we are representing. The geodatabase model includes characteristics of the data which can be used in display and analysis. Where information is available on the spatial quality of digital data, it becomes part of the properties of the data. Dutton describes a hierarchical model for controlling GIS analysis based on source scale (Dutton, 1996). Commonly accepted and adopted properties on spatial quality can determine data formats appropriate for analysis and control tolerances for geoprocessing. They can determine the precedence of data in GIS overlay or intersection operations. The information can determine how the data is combined with other digital objects.
The computer desktop environment is robust and supports GIS and geoprocessing. The development of graphical user interfaces (GUI) has brought GIS into active use not only at all levels of government but also into schools, community groups, and the public. All types of data are being brought into the GIS desktop to address a range of local, regional, and global issues. The accuracy, uncertainty, or dimensions associated with this data can assist in these applications. As a user community including software developers, we need to identify spatial characteristics that are important for our digital models of real world features. We need to discuss appropriate methods for reporting spatial properties of the digital features and displaying those properties. I am looking for collaborators from the user community in the development and review of a guide on assessing spatial accuracy (Hansen, draft guide 1998).
Atwater, Brian, Geologic Maps of the Sacramento - San Joaquin Delta, California. Miscellaneous Field Studies Map MF-1401. Denver CO: U. S. Geological Survey, 1982
Beard, Kate and William Mackaness, Visual Access to Data Quality in Geographic Information Systems. Cartographica, Vol. 30, No. 2-3, 1993.
Durgin, Paul M., Measurement Based Databases: One Approach to the Integration of Survey and GIS Cadastral Data, Surveying and Land Information Systems, Vol. 53, No. 1 Pg. 41-47, 1993.
Dutton, Geoffrey, Improving Locational Specificity of Map Data -- a Multi-Resolution, Metadata-Driven Approach and Notation, International Journal of Geographical Information Systems, Vol. 10, No. 3, Pg. 253-268, 1996.
Federal Geographic Data Committee, Content Standards for Digital Geospatial Metadata. Washington, D.C., June, 1994 (Revised 1998).
Federal Geographic Data Committee, Content Standard for Digital Geospatial Metadata Workbook. Version 1.0: March 24, 1995
Hansen, David T., Visualizing Uncertainty Captured from Source Documents with GRID, Proceedings of the Eighteenth Annual Esri International User Conference, San Diego, CA, July 1998.
Hansen, David T., Guide on Assessing Spatial Accuracy of Digital Geospatial Data, Draft Standard Guide for review by ASTM D18 Technical Committees; ASTM, 100 Barr Harbor Drive, West Conshohocken, PA 19428-2959, April, 1998.
McGranaghan, Matthew, A Cartographic View of Spatial Data Quality. Cartographica, Vol. 30, No. 2-3, 1993.