Abstract: One of the most important and challenging tasks of GIS implementation is performing quality assurance and quality control (QA/QC) of converted data. Pressure from schedules, a significant increase in workload, and the tediousness of the task all contribute to the difficulty of allotting QA/QC the attention it deserves. Fortunately, digital quality assurance routines can be developed to perform many routine checks and analyses. This paper will describe the process and illustrate the Arc Macro Language (AML) scripts that were used for quality assurance of coverage, SDE, and SQL Server data at the City of Franklin, WI.
The City of Franklin, Wisconsin, initially retained PlanGraphics, Inc., an independent geographic information systems (GIS) consulting firm, to assess existing geographic information utilized within several cooperating agencies which could be incorporated into a city-wide GIS. After inventorying, designing, and planning for the Enterprise GIS, the City began its implementation. As a first step, the City contracted for conversion of existing spatial data and hard copy maps into ArcInfo coverage and ArcSDE layer formats and for conversion of attribute data into Microsoft SQL Server 2000 format. Digital data consisted of geodetic control, planimetric/topographic, cadastral, street centerline addresses, water system, sanitary sewer system, and property attribute data. The spatial data was primarily stored in GenaMap file format, with some data in AutoCAD format. Property attribute data was stored in a Computer Aided Mass Appraisal (CAMA) system UNIVERS database maintained by the Assessor’s office. Hard copy maps detailed zoning, selected administrative boundaries, and neighborhood areas. The data converted in this project was obtained from several sources—the Milwaukee County Automated Mapping and Land Information System (MCAMLIS) program, the City of Franklin Public Works Department, the City of Franklin Assessor’s Office, and the Southeast Wisconsin Regional Planning Commission (SEWRPC).
PlanGraphics’ task during the data conversion process was to ensure that the converted data complied with the physical database design previously commissioned by the City and that the data itself was accurately converted when compared to the original source data.
The success and timely completion of the GIS data conversion project for the City of Franklin depended on the establishment of an efficient process for transmitting, reviewing, and accepting products from the contractor. The long-term value of the project also depended on comprehensive, thorough quality checking (i.e., quality assurance) performed uniformly on all components of the delivered data.
The contractor delivered the converted data in five delivery areas based on U.S. Public Land Survey System (PLSS) sections, along with the original source data and digital orthophotos (DOQs). PlanGraphics reviewed each delivery according to an established QA process, which included automated checks performed in ArcInfo and manual checks performed in ArcView. These tests examined edgematching of features and attributes between adjacent map sheets, topology (i.e., connectivity between graphic elements, contiguity between adjacent boundaries, and calculation of area or length), line quality (i.e., smooth transitions from straight line to curvilinear segments, etc.), feature coding, and feature types.
Acceptance of the converted data was based on one of four outcomes:
1. Rejected—The converted data had too many structural errors and required general rework.
2. Corrections to be made by the Contractor—The contractor was required to make specified corrections to the data before acceptance.
3. Accepted—The converted data was accepted as submitted.
4. Corrections to be made by the City of Franklin—The submittal was acceptable; however, minor corrections were noted that would be made by the City of Franklin.
The data conversion review was performed in two stages—Pilot Study Review and Final Digital Review (Figure 1). A pilot study area of one PLSS section with representative layers was selected for initial data conversion. After acceptance of the pilot study area, the contractor delivered the remaining digital data in four delivery areas, each of which also incorporated all previous delivery areas. The data in each delivery was verified using the QA process. The data in the most recent delivery was also checked against the previous delivery to verify that all required changes were made and that no problems were created with the conversion of the new data.
Figure 1. Data Conversion Review
The process of verifying the converted data fell into two categories—visual and automated. Both methods were used to fully examine the delivered data. A brief overview of the visual process is explained in the following section. The automated process will be described in detail in the remainder of this document.
A comprehensive visual examination of the converted data was performed in ArcView 3.2a to: 1) verify feature placement; 2) verify data elements; 3) verify feature coding; 4) compare the converted digital files against the original graphic and attribute files; and 5) assess the quality of the data provided by the contractor. Any errors or discrepancies were flagged in a new QA shapefile and reported to the contractor.
The visual QA process involved an examination of the features of each ArcInfo coverage, including annotation placement/attribution and edgematching between PLSS quarter sections, compared to the original source data. The visual assessment was performed systematically across the extent of the coverage via panning and zooming methods. As the coverage was assessed, the corresponding original source data was viewed in the ArcView map window or in hardcopy format. The SDE layer was then visually compared for feature content and placement versus its associated ArcInfo coverage. Checkplots were also created as needed.
The visual QA process also included an examination of the SQL Server attribute tables. The assessment was performed in MS SQL Server Enterprise Manager by opening each table to verify that the table contained valid data per column. Again, any errors or discrepancies were noted and reported to the contractor.
The automated QA process used in the data conversion project utilized batch files and Arc Macro Language (AML) routines to perform systematic checking of spatial data and attribute tables in ArcInfo 8.1 Workstation. The automated QA process was used to validate the attribute parameters established in the physical database design and to confirm that all major system files necessary to access, display, and manipulate the graphic data, such as feature class tables, were installed. Reports were generated for each of the checks performed during the process.
AML was chosen as the programming language for the QA process because existing routines were available that could be adapted to the Franklin project. These routines served as the base for developing the QA AMLs.
Table 1 lists the batch files and AMLs used in the automated QA process.
Table 1. Automated QA Process Routines
QA Check         | Routine         | Description
ArcInfo Coverage | covqa.bat       | Initializes the ArcInfo coverage AMLs
                 | covqa_setup.aml | Creates output CSV and TXT files for coverage results
                 | covqa.aml       | Checks the coverage against the design
SDE              | sdeqa.bat       | Initializes the SDE AMLs
                 | sde_setup.aml   | Creates an output CSV file for SDE results
                 | sde_extract.aml | Extracts information from the table schema
                 | sdeqa.aml       | Checks the SDE layer against the design
SQL Server       | sqlqa.bat       | Initializes the SQL Server table AMLs
                 | sqlqa_setup.aml | Creates an output CSV file for SQL Server table results
                 | sql_extract.aml | Extracts information from the table schema
                 | sqlqa.aml       | Checks the SQL Server attribute table against the design
The ArcInfo coverage parameters that were checked during the automated QA review included feature class, precision, projection, dangle distance, fuzzy tolerance, polygon topology errors, item definitions, annotation subclasses, labels, and dangle errors. In all, 58 coverages were checked during the process.
The COVQA.AML was used to verify the coverage specifications detailed in the physical database design by first describing the coverage, comparing the results to the design, and generating a report listing any errors found. The AML checked for the following information (a sketch of the describe step follows the list):
1. Double precision. The coverage had double precision.
2. Map projection. The coverage was in Wisconsin State Plane NAD27 coordinates.
3. Dangle errors. There were no dangle errors, which might indicate polygon closure problems.
4. Duplicate or missing point labels. Each polygon had a single label point.
5. Existence of inappropriate feature classes and feature attribute tables. There were no missing feature classes or attribute tables or any additional feature classes.
6. Feature attribute table. The feature attribute table was formatted correctly.
7. Annotation levels and subclasses. The annotation subclass was correct.
8. Tolerances. The specified tolerance was used in processing the coverage.
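Each of these checks keyed off the reserved variables that AML's &DESCRIBE directive populates. The following minimal sketch, which assumes %cov% holds the coverage name, shows how those properties become available to the script:

/* Populate the dsc$ and prj$ reserved variables for the coverage
/* (%cov% is assumed to hold the coverage name)
&describe %cov%
&type Precision:        %dsc$precision%
&type Projection:       %prj$name%, datum %prj$datum%, units %prj$units%
&type Fuzzy tolerance:  %dsc$fuzzy%
&type Dangle distance:  %dsc$dangle%
&type Polygon count:    %dsc$polygons%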
Two reports were output detailing the AML results. One report, output for each coverage checked, detailed the results in an easy-to-read text format. The other report provided a summary of all of the results in CSV format. An example of the detailed report is found in Table 2, while Table 3 presents an example of the summary results. A similar summary results table was also output during the SDE and SQL QA processes, described later in this document.
Table 2. Example of Detailed Report
Date: 03 Dec 01 14:40:11 Monday

Projection Information
    Coordinate System Test
Precision
    Precision Test
Tolerances
    Dangle Distance: 0
    Tolerance Test
Dangles
    Dangle Test
Label Errors
    Labels Test
Annotation
Topology
    Topology has not been edited
    POLYGON Topology Test
Attributes
    POLYGON Feature Table Test - Item Number
    POLYGON Feature Table Test - Item Format
    POLYGON Feature Table Test
Table 3. Summary Results
COV QA Results
Date: 03 Dec 01 14:40:11 Monday

Table      | Test                       | Status | Error | Layer Description | Design Description
ALDERMANIC | Coordinate System Test     | PASS   |       |                   |
ALDERMANIC | Precision Test             | PASS   |       |                   |
ALDERMANIC | Tolerance Test             | PASS   |       |                   |
ALDERMANIC | Dangle Test                | PASS   |       |                   |
ALDERMANIC | Labels Test                | PASS   |       |                   |
ALDERMANIC | Annotation Test            | N/A    |       |                   |
ALDERMANIC | POLYGON Topology Test      | PASS   |       |                   |
ALDERMANIC | POLYGON Feature Table Test | PASS   |       |                   |
The first step in the coverage QA process was to create a text file for each feature class detailing the expected attributes. The required name and format for each item within the feature class were obtained from the database design. Table 4 provides an example of the information contained in the text file. During the automated QA process, as each coverage in a delivery unit was checked, an output text file was created for each of its feature classes. This output file was then compared against the design text file created in the first step to verify that the delivered item formats matched the database design.
Table 4. Item Format
AREA,8,18,F,5
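The comma-delimited fields in Table 4 follow the standard INFO item definition: item name, item width, output width, item type, and number of decimals. The paper does not show how the delivered item definitions were captured; one possible approach (the watch file name is illustrative) is to route the dialog of the ITEMS command through &WATCH:

/* Capture the delivered feature table's item definitions to a text
/* file for later comparison against the design text file
&watch %cov%_items.txt
items %cov%.pat
&watch &off
/* The watch file also records command echoes, so it would be trimmed
/* to the item definition lines before comparison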
The next step during the process was to initiate the COVQA.BAT file. The COVQA.BAT file launched the COVQA_SETUP.AML, which created a run list from which to process each coverage with the COVQA.AML. The list took the form of ARC "&r covqa.aml <coverage>". The list was generated from the directory of coverages. The COVQA_SETUP.AML then created the output CSV and TXT files for the QA results.
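COVQA_SETUP.AML itself is not reproduced in this paper; a minimal sketch of its two jobs, with assumed file names and a CSV header matching Table 3, might read:

/* covqa_setup.aml - illustrative sketch, not the delivered script
/* 1. Build the run list of coverages for the batch file
&sv runfile = [open covqa_run.txt openstat -write]
&do cov &list [listfile * -cover]
  &sv outstat = [write %runfile% [quote ARC "&r covqa.aml %cov%"]]
&end
&sv closestat = [close %runfile%]
/* 2. Create the summary CSV and write its header rows
&sv csvfile = [open covqa_results.csv openstat -write]
&sv date = [date -vfull]
&sv outstat = [write %csvfile% [quote COV QA Results]]
&sv outstat = [write %csvfile% [quote Date: %date%]]
&sv outstat = [write %csvfile% [quote Table,Test,Status,Error,Layer Description,Design Description]]
&sv closestat = [close %csvfile%]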
The final step during the process was to initiate the run list for the COVQA.AML from within the batch file. Once launched, the COVQA.AML performed the checks illustrated in the examples below.
After the AML process was complete, the technician interactively verified the results by spot-checking the results against the coverage using a combination of viewing problem areas in ArcEdit and examining attributes in ArcInfo Tables. Any discrepancies found were sent to the contractor in charge of converting the data.
Examples of the processes that the coverage QA routines performed are listed below:
1. Wrote to output files
/* Write out header information
&sv date = [Date -vfull]
&sv outstat = [write %outfile% [quote Date: %date%]]
&sv ws = [Show workspace]
&sv outstat = [write %outfile% [quote Workspace: %ws%]]
&sv outstat = [write %outfile% [quote Coverage: %cov%]]
2. Checked coverage parameters using the DESCRIBE command (i.e., coordinate system, precision, tolerances, dangles, label errors, annotation, and topology)
/* Check coordinate system (PASS/FAIL writes shown for completeness,
/* following the reporting pattern used elsewhere in this paper)
&if %prj$name% = STATEPLANE and %prj$fipszone% = 4803 and ~
    %prj$datum% = NAD27 and %prj$units% = FEET and ~
    %prj$spheroid% = CLARKE1866 &then
  &sv outstat = [write %outfile% [quote %cov%,Coordinate System Test,PASS]]
&else
  &sv outstat = [write %outfile% [quote %cov%,Coordinate System Test,FAIL]]
/* Check precision
&if %dsc$precision% = DOUBLE &then
  &sv outstat = [write %outfile% [quote %cov%,Precision Test,PASS]]
&else
  &sv outstat = [write %outfile% [quote %cov%,Precision Test,FAIL]]
/* Check tolerances
&if %dsc$fuzzy% = 0.10000 and %dsc$dangle% = 0 and ~
    %dsc$qfuzzy_verified% and %dsc$qdangle_verified% &then
  &sv outstat = [write %outfile% [quote %cov%,Tolerance Test,PASS]]
&else
  &sv outstat = [write %outfile% [quote %cov%,Tolerance Test,FAIL]]
/* Check dangles (only meaningful when polygon topology is present)
&if %dsc$polygons% > 0 and %dsc$qtopology% &then
  &do
    /* dangle inspection and the Dangle Test PASS/FAIL write followed here
  &end
/* Check polygon labels (each polygon should have exactly one label point)
&if %dsc$polygons% > 0 and %dsc$qtopology% &then
  &do
    /* label-count inspection and the Labels Test PASS/FAIL write followed here
  &end
/* Check annotation
&if %dsc$annotationsubclasses% > 0 &then
  &do
    /* subclass names were compared against the design here
  &end
/* Check topology (dsc$qedit is true when the coverage was edited
/* after topology was last built)
&if %dsc$qedit% &then
  &sv outstat = [write %outfile% [quote Topology has been edited]]
&else
  &sv outstat = [write %outfile% [quote Topology has not been edited]]
3. Compared the contents of text files containing the expected record layout vs. the record layout as delivered.
/* Check items
/* Check to see if there is a feature table in the design
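The comparison logic that followed these comments is not reproduced here. A minimal sketch of a line-by-line comparison, assuming one item definition per line in both files and assuming %designfile%, %covfile%, and an already-open %outfile% unit:

/* Illustrative sketch: compare the design layout against the delivered
/* layout line by line; %designfile%, %covfile%, and %outfile% are assumed
&sv designunit = [open %designfile% openstat -read]
&sv covunit = [open %covfile% openstat2 -read]
&sv designrec = [read %designunit% readstat]
&sv covrec = [read %covunit% readstat2]
&do &while %readstat% = 0
  /* A mismatched line becomes a FAIL row in the summary CSV
  &if [quote %designrec%] ne [quote %covrec%] &then
    &sv outstat = [write %outfile% [quote %cov%,Feature Table Test,FAIL,,%covrec%,%designrec%]]
  &sv designrec = [read %designunit% readstat]
  &sv covrec = [read %covunit% readstat2]
&end
&sv closestat = [close %designunit%]
&sv closestat = [close %covunit%]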
The SDE layer parameters that were checked during the QA review consisted of attribute items, item format, primary keys, foreign keys, and feature classes. The SDEQA.AML was used to verify the SDE layer specifications detailed in the physical database design by comparing the layer to the design and generating a report listing any errors found (Table 5). In all, 96 SDE layers were checked for each delivery area during the process.
Table 5. SDE Summary Report
SDE QA Results
Date: 03 Dec 01 14:54:15 Monday

Table      | Item             | Status | Layer Description           | Design Description
ALDERMANIC | ALDERMANIC table | PASS   |                             |
NEWDEV     | NEWDEV_KEY       | FAIL   | NEWDEV_KEY PK Numeric (8) N | NEWDEV_KEY PK Numeric (4) N
As a first step in the SDE QA process, a text file was created for each SDE layer detailing required item names and formats from the database design. Table 6 depicts a typical text file created at this stage containing the following—column name, key type, data type, precision, scale, and column definition.
Table 6. Item Format
ALDRCT_SHP,,SE_INTEGER,N
DIST_ID,,Numeric (1),N
DIST_ID,PK,Numeric (1),N
Utilizing SQL Server Query Analyzer, a text file was generated for the SQL Server tables in the database, detailing the items found in Table 6. Due to the output format of the text file generated from the SQL statements, the text file required reformatting using MS Access before it could be used in the QA process. A new text file was then generated.
An additional text file was created containing the number of rows that each SDE layer contained. The text file was also generated using SQL statements in SQL Server Query Analyzer.
The next step in the process was to initiate the SDEQA.BAT file. The batch file launched the SDE_SETUP.AML, which created a run list from which to process the information for each SDE layer. The list took the form of ARC "&r sdeqa.aml <sde layer>". The list was generated from a directory of files that were imported into SQL Server. The SDE_SETUP.AML then created the output CSV file for the QA results.
The SDEQA.BAT file then launched the SDE_EXTRACT.AML, which extracted and reformatted information from the table schema text file generated from SQL Server. The extracted information was written to individual text files, one for each SDE layer.
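SDE_EXTRACT.AML is likewise not reproduced. A minimal sketch of the extraction idea, assuming a comma-delimited schema dump named schema.txt whose rows begin with the table name, and using two layer names from Table 5 for illustration:

/* sde_extract.aml - illustrative sketch, not the delivered script
/* Pull each layer's rows out of the full schema dump (schema.txt is assumed)
&do layer &list ALDERMANIC NEWDEV
  &sv inunit = [open schema.txt openstat -read]
  &sv outunit = [open %layer%.txt openstat2 -write]
  &sv rec = [read %inunit% readstat]
  &do &while %readstat% = 0
    /* Keep only rows that begin with this layer's name
    &if [search [quote %rec%] %layer%] = 1 &then
      &sv outstat = [write %outunit% [quote %rec%]]
    &sv rec = [read %inunit% readstat]
  &end
  &sv closestat = [close %inunit%]
  &sv closestat = [close %outunit%]
&end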
The final step in the SDE QA process was to initiate the run list for the SDEQA.AML from within the batch file. The AML performed the following steps (a sketch follows the list):
1. Set initial path variables
2. Opened output files
3. Opened comparison files
4. Began writing to output files
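A minimal sketch of these four steps, with assumed path and file names:

/* Illustrative sketch of the SDEQA.AML setup steps (paths and names assumed)
/* 1. Set initial path variables
&workspace e:\franklin\qa
/* 2. Open the summary output file
&sv outfile = [open sdeqa_results.csv openstat -append]
/* 3. Open the comparison files for this layer
&sv designunit = [open %sde%_design.txt openstat2 -read]
&sv sdeunit = [open %sde%.txt openstat3 -read]
/* 4. Begin writing to the output file
&sv outstat = [write %outfile% [quote Layer: %sde%]]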
After the AML process was complete, the technician interactively examined the results in SQL Server to verify that they were accurate. Any discrepancies found were sent to the contractor in charge of converting the data.
Examples of the processes that the SDE QA routines performed are listed below:
1. Wrote to output files
2. Checked the number of rows in the SDE layer against the number of records in the corresponding coverage (how the two counts could be gathered is sketched after this list)
/* Check to see if record numbers match between SDE layer and coverage
/* Start ArcEdit session
/* Select features
&if %recnum% = %selnum% &then
  &sv outstat = [write %outfile% [quote %sde%,%sde% table,PASS]]
&else
  &sv outstat = [write %outfile% [quote %sde%,Number of features do not match,FAIL]]
3. Compared the contents of text files containing the expected record layout vs. the record layout as delivered.
/* Check items
/* Initially open the SDE text file and design text file to count the number of items
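The fragment in item 2 tests %recnum% against %selnum% without showing how the counts were obtained. A minimal sketch, assuming a per-layer count file from the Query Analyzer step and a polygon feature class in the corresponding coverage:

/* Illustrative sketch of gathering the two counts; the per-layer count
/* file and the polygon feature class are assumptions
&sv countunit = [open %sde%_count.txt openstat -read]
&sv recnum = [read %countunit% readstat]
&sv closestat = [close %countunit%]
/* Count the corresponding coverage's features in ARCEDIT
arcedit
edit %cov%
editfeature polygon
select all
&sv selnum = [show number select]
quit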
The SQL Server attribute table parameters that were checked during the QA review consisted of attribute items, item format, primary keys, and foreign keys. The SQLQA.AML was used to verify the SQL attribute table specifications detailed in the physical database design by comparing the table to the design and generating a report listing any errors found (Table 7). In all, 30 attribute tables were checked during the process.
Table 7. SQL Summary Report
SQL QA Results
Date: 03 Dec 01 16:18:17 Monday

Table       | Item              | Status | Layer Description           | Design Description
BLDG_ADDRSS | BLDG_ADDRSS table | PASS   |                             |
CONDO       | UNIT_NUM          | FAIL   | UNIT_NUM PK Character (4) N | UNIT_NUM PK Character (4) N
As a first step in the SQL QA process, a text file was created for each SQL attribute table detailing the item format from the database design. Table 8 depicts a typical text file created at this stage containing the following—column name, key type, data type, precision, scale, and column definition.
Table 8. Item Format
BLDG_KEY,,LONG INT,N
BLDG_KEY,FK,LONG INT,N
BLDG_KEY,PK,LONG INT,N
PRCLADD_KEY,,Numeric (9),N
PRCLADD_KEY,FK,Numeric (9),N
PRCLADD_KEY,PK,Numeric (9),N
The next step in the process was to initiate the SQLQA.BAT file. The batch file launched the SQLQA_SETUP.AML, which created a run list from which to process the information for each SQL attribute table. The list took the form of ARC "&r sqlqa.aml <sql attribute table>". The list was generated from a directory of files that were imported into SQL Server. The SQLQA_SETUP.AML then created the output CSV file for the QA results.
The SQLQA.BAT file then launched the SQL_EXTRACT.AML, which extracted and reformatted information from the table schema text file generated from SQL Server as described in the SDE QA process. The extracted information was written to individual text files, one for each SQL attribute table.
The final step in the SQL QA process was to initiate the run list for the SQLQA.AML from within the batch file. The AML performed the following steps:
1. Set initial path variables
2. Opened output files
3. Opened comparison files
4. Began writing to output files
After the AML process was complete, the technician interactively examined the results in SQL Server to verify that they were accurate. Any discrepancies found were sent to the contractor in charge of converting the data.
Examples of the processes that the SQL QA routines performed are listed below:
1. Wrote to output files
2. Compared the contents of text files containing the expected record layout vs. the record layout as delivered.
/* Check items
/* Open the SQL text file and design text file to count the number of items
Results of the QA process were submitted to the contractor in digital form. Acceptance/Rejection of a delivery area was communicated to the contractor using the results found in the visual and automated QA processes. The QA process was finalized after all delivery areas were accepted and a composite of the City was delivered.
The automated and visual QA procedures utilized in the data conversion process for the City of Franklin, WI, assisted in the assessment of the converted data. The visual QA process compared the converted data against the original source data, while the automated QA process validated it against the established physical database design, using AMLs to check the ArcInfo coverages, SDE layers, and SQL Server attribute tables. Using the AMLs greatly decreased the time it would otherwise have taken to assess the data and increased confidence that the data was accurately and completely checked.
This effort also proved that AML can serve as a tool for a wide range of validation tests of data files, formats, and contents. Examples of the techniques used in these reviews include:
1. Writing to output files
2. Comparing the contents of text files containing the expected record layout vs. the record layout as delivered
3. Checking coverage parameters using the DESCRIBE command (i.e., coordinate system, precision, tolerances, dangles, label errors, annotation, and topology)
4. Checking the number of rows in the SDE layer against the number of records in the corresponding coverage.
AML proved to be a flexible tool that was well suited to this type of QA, both in its programmatic capabilities and in its ability to extract information from coverages and text files. The routines used in this project can be adapted to other projects that rely on ArcInfo coverages and text-based information.
PlanGraphics’ Project Team—Franki Jenkins, Annie Metcalf, Tami Rudnick, and Mike Walls