Stewart Dary, GIS Manager, Orange County Property Appraiser's Office, Orlando, Florida and David van Blankenstein, Safe Software, Inc., Surrey, BC, Canada

GIS Conversion Project from Vision to Esri

A major component of the Orange County Property Appraiser’s (OCPA’s) long-term re-engineering plan involved the conversion of its property appraisal land base from Vision to ArcInfo. This was performed in a joint venture between OCPA and Safe Software using the Feature Manipulation Engine (FME). The task was accomplished by sidestepping the Vision GIS structures and reading the underlying relational tables directly with the FME to convert the more than 4 million geometric objects cleanly into ArcInfo. Among the challenges of converting this data set were the sheer volume of data, the fact that the data set could not be easily partitioned spatially, and the need to clean and transfer the county’s critical parcel-related information. In addition, the Property Appraiser’s Office applied rigorous quality control and assurance procedures to ensure that all information was transferred correctly and to identify data problems on the Vision side for correction before final conversion. This paper discusses the life cycle of the conversion and the many challenges and successes met along the way.


 

 

 

 

 


June 2, 2000


Figure 1. Orange County, Florida


Table of Contents

 

1.0 Introduction

1.1 Business Case

1.2 Implementation Plan

1.3 Technical and Process Overview

1.4 System Environment

2.0 Translation from Vision to E00

2.1 Vision Database

2.2 Translation by FME

2.3 Polygon Construction

3.0 Quality Assurance and Error Resolution

5.0 Conclusion

1.0 Introduction

 

1.1 Business Case

The Orange County Property Appraiser's (OCPA) Office resolved to convert its Geographic Information System (GIS) from Vision to the Esri data model and software products largely for two reasons:

The OCPA staff required proper tools to rapidly develop software applications to access the knowledge held in the GIS. Meanwhile, the labor market offered few candidates with Vision programming experience. Moreover, Vision did not have an "out-of-the-box," user-friendly software package for the desktop PC, an item deemed critical by the Property Appraiser. The Property Appraiser's Office determined that an easy-to-use desktop package requiring minimal training was essential to the rapid development and distribution of sophisticated spatial analysis tools to non-programmers in the organization. Prior to the conversion, OCPA GIS users were restricted to a customized Vision application that displayed raster tax maps generated the previous evening. Access to "live" GIS data was not available. Changes to Vision map templates involved lengthy programming efforts, and the Vision software provided limited topological rules, leading to data integrity issues (for example, the database contained open parcel polygons, duplicate arcs, and road centerlines that at times failed to intersect).

OCPA needed a software package from a market leader that offered an advanced data model for use with a commercial relational database management system (RDBMS) and had a full range of GIS products that could serve the spectrum of "power," "moderate," and "casual" GIS users in the organization. This point was particularly important as OCPA prepared to re-engineer its entire GIS parcel database. The database that existed in 1998 was developed in the late 1980s through a process of scanning, digitizing, and rubber sheeting. Limited survey data was available with ground control. The plan for 1999 was twofold: coordinate with Orange County and cities in the county to densify the county’s rectangular survey network to a half-mile grid, and employ coordinate geometry (COGO) procedures to enter property boundaries into the GIS using original source documents (i.e., plats and deeds). Esri offered the best software solution to accomplish these goals.

To translate the information from Vision into the Esri GIS solution, OCPA required a software tool with the following properties:

The application selected was the Feature Manipulation Engine (FME®) developed by Safe Software Inc. The FME is capable of reading directly from the source Oracle database that contained the Vision data, assembling the extracted features into consistent geometry and topology, and writing the result directly into all of the Esri-supported formats. This provided OCPA with an application that not only supported the translation of the source data, but also provided a powerful toolbox for identifying and resolving problems in the source database. A translation process could be defined and the conversion automated. Further, the FME currently provides a versatile data translation portal, allowing information to flow from the new GIS database to the outside world in a range of industry-standard GIS formats, such as AutoCAD DWG/DXF, MicroStation DGN, and MapInfo, to name but a few.

 

1.2 Implementation Plan

The plan to re-engineer the parcel database required a four-step process:

Before the conversion began, OCPA staff analyzed the existing Vision data schema. Little documentation had been collected since the late 1980s. In addition, the Vision layers and "networks" had to be mapped into the desired ArcInfo coverage and feature class schema. This process was accomplished as a first step and involved both individual research and collaborative team meetings.

 

1.3 Technical and Process Overview

The process of translating the information out of the Vision database, into E00 files, and finally into SDE involved three data stores, three automated processes, and one manual process. This is shown in Figure 2.

 

 

Figure 2: Process Diagram


 

  1. Vision Database: This database consisted of the OCPA database for Orange County, stored in Oracle and maintained by Vision. The database contained point, line, polygon, node, and annotation data modeling lots, parcels, blocks, and subdivisions (all cadastral information), along with other general layers such as major roads, building outlines, section maps, and hydrology. The layers of the database also had associated business attributes indirectly joined to the geometries.
  2. Vision to E00: This collection of FME mapping files processed the Vision database into E00 files. During the translation the geometries, the associated business attributes, and the topology were output. In addition, any errors or geometric inconsistencies were detected and identified.
  3. E00 Results: The output of the FME translation process, consisting of the E00 form of the Vision database plus any topological or geometric inconsistencies found in the Vision database.
  4. QA Process: A semi-automated QA process applied to all of the translated E00 files to search for any inconsistencies in the result. ArcInfo 7.2 was used for both manual and automated (AML-driven) analysis of the translated result. Any problems detected were researched and then resolved, with the corrections being fed back into the Vision database.
  5. Cov2Sde: Once the Vision database had reached a target level of correctness it was archived, and a final translation to E00 was performed. These E00 files then formed the base for the population of the target SDE database. The SDE utility cov2sde was used to load the coverage files into SDE tables.
  6. SDE Database: The SDE database resulting from step 5.
  7. SDE to GIS/CAD Formats: A collection of FME mapping files satisfying requirements for extraction of data from the SDE database. The result of a query (by layer and spatial extent) is translated into the requested format. These formats include Shape, E00, MicroStation Design Files, MapInfo native format, and AutoCAD DWG/DXF files. If the format is capable of supporting attributes, they are included. In addition, a CSV table of cadastral information related to the output geometries can also be created.

The Vision database and GIS system ran in a Solaris environment, with translations by FME being performed on both Solaris and NT. The Vision GIS processing environment consisted of application software, proprietary stored data such as symbology and annotation placement information, and an application database in Oracle. The Oracle database, built on the Vision schema, contained the geometry, topology, and links to business data for the county. Vision uses a network-based topology, with data distinguished and grouped by layers and networks. This is discussed in greater detail in section 2.0.

 

1.4 System Environment

After evaluating different hardware options, OCPA elected to implement a three-tiered solution. This consisted of an enterprise UNIX server supporting SDE 3.0.2 running on top of Oracle 8.0.4 to centrally store the spatial and property attribute information. An NT server functioned as a workgroup ArcInfo license manager, while NT workstation clients had ArcInfo 7.2 for data maintenance. ArcView 3.2 was used for data display and analysis. The data conversion was accomplished using FME on both the NT and UNIX servers. This two-pronged approach enabled the Office to take advantage of multiple CPUs and disk resources, which was important given the magnitude of the database.

 

 

2.0 Translation from Vision to E00

 

The Vision database was translated using FME for Esri, version 2.3a in both Solaris and NT environments.

 

2.1 Vision Database

There were two options for recovering this data from Vision with the FME: via GINA files (the Vision export format), or by directly accessing the Oracle schema and reconstructing the topology from the geometry. The latter was chosen primarily because an approach was required that would remove the need for Vision in the future (the schema can be accessed even after the Vision software has expired), and for quality assurance productivity (more robust and interactive tools were available in the Esri suite of products).

A simplified version of the schema for a Vision database is shown in Figure 3. The g_master table contains a row for every feature in the database, uniquely identified by a combination of feature number, type, layer, and network. Collections of features with a common theme are grouped by their layer and network number from the g_master table. This query can be sub-grouped by geometry type. The types supported are point, line, polygon, and node. The rows returned from the g_master (master or primary) table are joined to the g_coord (coordinate) table by feature number. Depending on the geometric type, the join to the g_coord table is either one to one (points or nodes) or one to many (lines and polygons). The join from g_master to the business table is always one to one, and the join to the g_label (annotation or labeling) table is one to many.

The Vision Oracle schema uses a central feature table that is classified by layers and networks. Simple and complex geometries are retrieved by selecting all coordinates for a given feature (identified by a unique integer) for a given layer-network combination using an SQL select statement against a coordinate table. Given the type of geometry (point, line, polygon, or annotation), the correct FME features were constructed. Depending on the type of geometry being constructed, various topological constructions were performed by the FME. No construction was required on the point data. For linear features, intersections were detected and inserted. For polygonal and donut features, intersections were detected and inserted, and invalid polygons were identified and stored separately from the valid polygons. Finally, all point, line, or polygon annotation was reconstructed and placed into a text subclass. In some cases the annotation placement tools of the FME were used when the detailed placement information could not be recovered from Vision (in some cases annotation placement was stored by Vision in non-schema tables that were not accessible). In addition, business attributes for the given feature were joined to the geometry.
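
As an illustration of the retrieval described above, the following minimal sketch selects all coordinates for the features of one layer-network combination and groups them by feature number. It is not the FME mapping file used in the project. The table names g_master and g_coord come from the text, but the column names (feat_num, geom_type, layer, network, seq, x, y) are assumptions, and an in-memory SQLite database stands in for Oracle.

```python
import sqlite3


def load_features(conn, layer, network):
    """Return {feat_num: (geom_type, [(x, y), ...])} for one layer/network.

    Column names are hypothetical; only the table names come from the paper.
    """
    sql = """
        SELECT m.feat_num, m.geom_type, c.x, c.y
        FROM g_master AS m
        JOIN g_coord AS c ON c.feat_num = m.feat_num
        WHERE m.layer = ? AND m.network = ?
        ORDER BY m.feat_num, c.seq
    """
    features = {}
    for feat_num, geom_type, x, y in conn.execute(sql, (layer, network)):
        # One row per coordinate: points yield one row, lines and polygons many.
        features.setdefault(feat_num, (geom_type, []))[1].append((x, y))
    return features


# Tiny in-memory stand-in for the Oracle schema (assumed columns, made-up rows).
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE g_master (feat_num INTEGER, geom_type TEXT,
                           layer INTEGER, network INTEGER);
    CREATE TABLE g_coord (feat_num INTEGER, seq INTEGER, x REAL, y REAL);
    INSERT INTO g_master VALUES (1, 'point', 10, 1);
    INSERT INTO g_master VALUES (2, 'line', 10, 1);
    INSERT INTO g_master VALUES (3, 'line', 20, 1);
    INSERT INTO g_coord VALUES (1, 0, 5.0, 5.0);
    INSERT INTO g_coord VALUES (2, 0, 0.0, 0.0);
    INSERT INTO g_coord VALUES (2, 1, 1.0, 0.0);
    INSERT INTO g_coord VALUES (2, 2, 1.0, 1.0);
    INSERT INTO g_coord VALUES (3, 0, 9.0, 9.0);
    INSERT INTO g_coord VALUES (3, 1, 8.0, 8.0);
""")
features = load_features(conn, layer=10, network=1)
```

Here the one-to-one join for points and the one-to-many join for lines both fall out of the same query; the geometry type from g_master decides how the grouped coordinates are interpreted.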

 

Figure 3: Vision Data Model

 


Figure 4. List Showing Separate SDE Tables for Each Major Feature Class


 

2.2 Translation by FME

The translation exercise for FME became a query, interpret, validate, and write loop. The loop iterated over each layer, network, and feature type combination. All of the data was selected from Oracle using a multi-join SQL query. The returned data was converted into its geometric type and transformed into the correct coordinate system (a shift and scale was required to bring the Vision coordinate system into alignment with the state plane). If the geometric type had a linear component, then group and self-intersection tests were performed on the lines. If the geometry type was polygonal (including donuts), then tests were performed to detect open polygons, polygons without labels, and polygons with multiple labels. For the layers converted, the polygon sets formed either partitions or partitions with holes (in all cases overlapping areas were detected and resolved).
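
The validation part of this loop can be sketched as follows. This is a simplified illustration under stated assumptions, not the FME logic itself: the scale and offset values are placeholders (the actual Vision-to-state-plane parameters are not given here), the function names are hypothetical, and label containment uses a plain ray-casting test on a single ring without holes.

```python
def to_state_plane(coords, scale, dx, dy):
    """Apply a scale and shift to every coordinate (placeholder parameters)."""
    return [(x * scale + dx, y * scale + dy) for x, y in coords]


def is_closed(ring):
    """A ring is closed if it has at least 4 vertices and ends where it starts."""
    return len(ring) >= 4 and ring[0] == ring[-1]


def point_in_ring(pt, ring):
    """Ray-casting point-in-polygon test for a closed ring without holes."""
    x, y = pt
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:]):
        if (y1 > y) != (y2 > y):  # edge straddles the horizontal ray
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def classify_polygon(ring, labels):
    """Return 'open', 'no_label', 'multiple_labels', or 'ok' for one ring."""
    if not is_closed(ring):
        return "open"
    hits = sum(point_in_ring(p, ring) for p in labels)
    if hits == 0:
        return "no_label"
    if hits > 1:
        return "multiple_labels"
    return "ok"


square = [(0, 0), (10, 0), (10, 10), (0, 10), (0, 0)]
shifted = to_state_plane([(1, 2)], scale=2.0, dx=100, dy=200)
status = classify_polygon(square, [(5, 5)])
```

Polygons classified as anything other than "ok" correspond to the cases the text says were detected and set aside for resolution.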

The FME drives all translations with either system-generated or custom mapping files. Mapping files are ASCII text files that are interpreted by the FME at run time. For a custom-developed process, a good mapping file architecture can greatly help in development and in resolving data problems.

A mapping file contains five primary components:

  1. Reader: In a mapping file the reader is a call to a method within the FME that supports the reading of a given source format data set. In this case the source is an Oracle database accessed through either ODBC or SQL*Net.
  2. Reader Definition: The reader definition consists of the schema of the source data set. This component of the mapping file defines what the source data looks like. In this case the source data consisted of a collection of Oracle table definitions, similar in syntax to a standard data definition language (DDL).
  3. Geometric Processing: After the reader has input the source data, the features are held in the FME in a format-neutral form. The features can now be interpreted and transformed by the FME into the correct geometry, topology, and structures.
  4. Writer Definition: The writer definition consists of the schema of the target data set. This component of the mapping file defines what the resultant features will look like. In this case the target format was E00.
  5. Writer: In a mapping file the writer is a call to a method within the FME that supports writing to a given format data set. In this case the target is E00 files.

Architecturally both the reader and writer and their definitions can be replaced with other source and target formats. The real work is done in step 3, which is essentially format neutral. Figure 5 shows the flow of information through the FME as it is transformed from Vision into E00 features.
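
The separation between format-specific readers and writers and a format-neutral processing step can be illustrated with a small sketch. This is plain Python for illustration only; it does not use or reproduce FME mapping file syntax, and all names (Feature, read_rows, process, write_text, translate) are hypothetical. The writer emits a made-up text format, not E00.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """Format-neutral feature: geometry type, coordinates, attributes."""
    geom_type: str
    coords: list
    attrs: dict = field(default_factory=dict)


def read_rows(rows):
    """Reader: turn source rows (here plain dicts) into neutral features."""
    for row in rows:
        yield Feature(row["type"], list(row["coords"]),
                      {"feat_num": row["feat_num"]})


def process(features):
    """Processing: format-neutral step; here it only drops consecutive
    duplicate vertices as a stand-in for the real geometric work."""
    for f in features:
        cleaned = [p for i, p in enumerate(f.coords)
                   if i == 0 or p != f.coords[i - 1]]
        yield Feature(f.geom_type, cleaned, f.attrs)


def write_text(features):
    """Writer: serialize to a stand-in text format (not E00)."""
    return [f"{f.attrs['feat_num']} {f.geom_type} {len(f.coords)}"
            for f in features]


def translate(rows, reader=read_rows, writer=write_text):
    """Swap reader or writer without touching the processing step."""
    return writer(process(reader(rows)))


rows = [{"feat_num": 7, "type": "line", "coords": [(0, 0), (0, 0), (1, 1)]}]
output = translate(rows)
```

Because the processing function only sees neutral features, replacing the reader or writer argument changes the source or target format without affecting the geometric work in the middle.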

 

 

 

 

Figure 5: Data Flow through FME


For the OCPA implementation the reader definition for each table was formed from a template which applied a SQL query to the extraction of features from the database. This template was altered to fit each query for features and provided a clean and generic way of modeling the source data.
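
The idea of one query template reused for every layer, network, and feature type combination can be sketched as below. This is an assumption-laden illustration rather than the actual reader definition: the column names and the set of geometry type codes are hypothetical, and the template is ordinary Python string substitution, not FME syntax.

```python
from itertools import product
from string import Template

# Hypothetical column names; only g_master and g_coord come from the paper.
QUERY_TEMPLATE = Template(
    "SELECT m.feat_num, c.x, c.y FROM g_master m "
    "JOIN g_coord c ON c.feat_num = m.feat_num "
    "WHERE m.layer = $layer AND m.network = $network "
    "AND m.geom_type = '$geom_type' "
    "ORDER BY m.feat_num, c.seq"
)

ALLOWED_TYPES = {"point", "line", "polygon", "node"}


def build_queries(layers, networks, geom_types):
    """Fill the template once per (layer, network, geometry type) combination."""
    queries = {}
    for layer, network, geom_type in product(layers, networks, geom_types):
        if geom_type not in ALLOWED_TYPES:
            raise ValueError(f"unknown geometry type: {geom_type}")
        queries[(layer, network, geom_type)] = QUERY_TEMPLATE.substitute(
            layer=int(layer), network=int(network), geom_type=geom_type
        )
    return queries


queries = build_queries([10, 20], [1], ["line", "polygon"])
```

Each generated query differs only in its three parameters, which keeps the source model generic and makes a problem in one layer easy to isolate.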

 

2.3 Polygon Construction

The amount of erroneous data found in linear and polygonal features in Vision was on the order of 1 percent. Given the volume of data in the set (more than 2 million features), this amounts to over 20,000 features. Most of these errors were easily located and resolved. However, inconsistencies in the network-based topology of the source data meant that line intersection coupled with polygon formation was required.

The largest polygonal data set is the lot layer, which comprises over 1,000,000 arcs and 200,000 centroids. The processing that the FME performed to ensure topological integrity prior to writing to E00 was:

  1. Line intersection
  2. Linear Self Intersection
  3. Polygon Construction
  4. Centroid overlay

The first and third operations are global: to correctly perform line-to-line intersection and polygon construction, all information on the geometry is required, so neither problem can be solved by a simple divide-and-conquer approach. When the entire data set was loaded and processed, consumption of both real and virtual memory was high, and the result was a disk I/O bound process. To speed up the processing, the data had to be partitioned. The Vision database was seamless, so it provided no direct partition. However, there were business attributes that provided the foundation for a partition. The following steps were taken:

  1. Township/Range: Each centroid from the original lot data set had the township and range encoded as an attribute. Townships and ranges are used in the United States to partition and organize land and land ownership. A township/range block is nominally six miles on a side and contains 36 one-square-mile sections; township numbers change from north to south, and range numbers change from east to west.
  2. Maximum Lot Size: The maximum linear size of a lot in Orange County was determined. Equivalently, this can be posed as "find the maximum size of the minimum bounding box for all lots."
  3. Bounding Box for Township/Range: Select all centroids for a given township/range. Replace each centroid with a box whose dimension is the maximum lot size. From this set of boxes, create a polygon from the outer boundary.
  4. Select Lines: Select all lot lines for the County and clip out only those contained or partially contained in the outer boundary from step 3.
  5. Clean and Construct Polygons: Perform linear intersection and self-intersection, and construct polygons, on the clipped line set from step 4.
  6. Overlay Polygons with Centroids: Overlay the centroids for the township/range, preserving only those polygons from step 5 that contain a centroid. Output this data to E00.
  7. Seaming: Seam the collection of township/range E00 files in ArcInfo using the UNION operation.

This approach allowed for all polygons to be recovered from the Vision database. In addition this approach allows for multiple processors to be employed to solve the polygon formation problem in parallel. A side effect of this approach was that lots that were incorrectly assigned to the wrong township/range were easily identified. In conclusion this performance issue was resolved by breaking the County down into a collection of puzzle pieces which were reassembled into the greater County.
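
The selection part of this partitioning (steps 1 through 4) can be sketched in a few lines. This is a simplified illustration, not the FME implementation: instead of building an outer-boundary polygon and clipping, it selects every line whose bounding box overlaps any centroid box, which over-selects slightly but never misses a lot line. Padding each centroid by the full maximum lot size in every direction is a conservative assumption made here, and the township/range labels and coordinates are made up.

```python
from collections import defaultdict


def bbox_overlap(a, b):
    """True if two (xmin, ymin, xmax, ymax) boxes overlap or touch."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def partition_lines(centroids, lines, max_lot_size):
    """Assign line indexes to township/range partitions.

    centroids: [(x, y, township_range), ...]
    lines:     [[(x, y), ...], ...]
    Returns {township_range: [line_index, ...]}; a line near a partition
    edge may appear in more than one partition.
    """
    boxes = defaultdict(list)
    for x, y, tr in centroids:
        boxes[tr].append((x - max_lot_size, y - max_lot_size,
                          x + max_lot_size, y + max_lot_size))

    line_boxes = []
    for line in lines:
        xs = [p[0] for p in line]
        ys = [p[1] for p in line]
        line_boxes.append((min(xs), min(ys), max(xs), max(ys)))

    selected = {}
    for tr, tr_boxes in boxes.items():
        selected[tr] = [i for i, lb in enumerate(line_boxes)
                        if any(bbox_overlap(lb, b) for b in tr_boxes)]
    return selected


centroids = [(5, 5, "T22S-R29E"), (105, 5, "T22S-R30E")]
lines = [[(0, 0), (10, 0)], [(100, 0), (110, 0)], [(500, 500), (510, 500)]]
parts = partition_lines(centroids, lines, max_lot_size=20)
```

Each partition's line set is independent of the others, so intersection, polygon construction, and centroid overlay can then run on separate processors, one township/range at a time. A line that matches no partition, like the third one above, is a candidate for review.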

 

3.0 Quality Assurance and Error Resolution

The OCPA quality control consisted of a set of topological rules and review procedures designed to ensure proper data integrity for use with the Esri products. Staff decided to first convert the Vision data into ArcInfo export files before ultimately storing the features in SDE/Oracle. The data was converted into coverages so that the diagnostic capabilities of ArcInfo could be exploited during quality assurance. Once imported, ArcInfo coverages were created for commonly used themes (e.g., parcels, lots, and subdivisions). OCPA adhered to comprehensive quality assurance procedures using ArcInfo (ArcEdit, ArcPlot, and INFO) and ArcView software to assure the long-term viability of the spatial and attribute data in an Esri environment. ArcView was employed to qualitatively check the converted data (e.g., search for proper location of parcels and correct parcel attributes), while ArcInfo was used to check topological integrity (e.g., Do the polygons close? Do the arcs intersect? Are the internal IDs consistent and stable?), a critical test for later use. The quality assurance procedures were as follows:

  1. Visually examine the spatial data and attributes for proper location, missing polygons, arcs, points, anno, mapscale, etc. in ArcView and ArcEdit
  2. After importing the E00 file, issue an ArcInfo "describe" command for an overview and ensure that the number of label points (centroids) for a poly coverage is 1 less than the number of polygons
     Note: polygon area values should not change after a "build"
  3. In ArcEdit, test the LAB and PAT files (polygon coverages) to ensure correct topology by verifying that:
  4. In INFO (or Arc: list cover.pat), check the PAT file to ensure that:
  5. Perform a Clean & Build on the coverage to ensure that:
  6. Compare the feat_num for a specific ArcInfo poly to the same feat_num in Vision (is the feature number associated with that poly in the Vision database the same as in the FME output?)
  7. The following should create labels and correct topology for all polygons after an import:

Once these rules were satisfied, the data was loaded into SDE using SDE administration scripts. The parcels and lots coverages for the dense urban areas first had to be joined into township-level coverages before the loading could begin.
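
Two of the checks listed above, the label-count rule and the note that polygon areas should not change after a build, lend themselves to a small automated test. The sketch below is an illustration only and is not the AML used by OCPA; the function names and the tolerance are assumptions. The "one less" rule reflects the fact that a polygon coverage counts the outer (universe) polygon, which carries no label point.

```python
import math


def check_label_count(n_polygons, n_labels):
    """Label points should number one less than polygons, because the
    polygon count includes the outer (universe) polygon."""
    return n_labels == n_polygons - 1


def areas_stable(before, after, rel_tol=1e-9):
    """Compare {polygon_id: area} before and after a build.

    Returns False if the set of polygon ids changed or any area moved by
    more than the relative tolerance (an assumed value).
    """
    if before.keys() != after.keys():
        return False
    return all(math.isclose(before[k], after[k], rel_tol=rel_tol)
               for k in before)


labels_ok = check_label_count(n_polygons=101, n_labels=100)
areas_ok = areas_stable({1: 2.5, 2: 7.25}, {1: 2.5, 2: 7.25})
```

A coverage failing either check would be sent back for research and correction before being considered for loading.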

It is important to note that a number of iterative test conversions were performed over the course of the summer and early fall of 1999 in order to tune the conversion software, record the processing times, refine our quality assurance procedures, and review the Vision data for errors. This approach uncovered a number of problems with the Vision data. Data deficiencies ranged from duplicate arcs, disconnected street centerlines, and broken hydrography arcs to open parcel polygons and miscoded parcels and subdivisions. We thus were able to clean up much of the data before loading coverages into SDE.

The converted ArcInfo coverages were loaded into SDE after all QA was completed. There were several valuable observations/lessons learned during the data loads and subsequent tuning and use of the data. Above all, SDE offered rapid access to a seamless countywide database consisting of over 300,000 parcel records and 27 SDE tables. Spatial data and attributes could be retrieved within seconds anywhere in the county. Tuning the database and separating the data layers into separate feature class tables was necessary to optimize this performance. OCPA staff also developed custom AMLs to import and export the data for parcel maintenance. FME was extended to support mapping that would export the data out of SDE into several formats: E00, Shape, DXF, DGN, and MIF (for data requests and GIS partner data updates). ArcInfo had commands to export data out of SDE as shape and coverage, even DXF files. However, converting to DGN and MIF posed more of a challenge. FME provided a powerful tool to accomplish all of these tasks using one software product.

Among the lessons learned, we noted that:

Figure 6. Sample of OCPA Internet Mapping Application Using Data Converted with FME Software


Figure 7. Screen Showing a Converted GIS Data Using ProMap, OCPA’s Custom ArcView Application


 

5.0 Conclusion

Above all, sound project planning, a trained staff supplemented by consultants for specific tasks, a team effort, extensive communication and coordination among the vendors, client, and staff, and hard work enabled us to convert the entire county-wide parcel database and GIS applications within an eight-month period. By December 1999, the Orange County Property Appraiser's Office could boast a variety of successful GIS applications: a clean transfer of legacy data from Vision to a seamless parcel database in SDE using the FME; a custom ArcView application called "ProMap" for data display, plotting, and analysis; a custom Visual Basic/ArcODE parcel maintenance application; and an Internet Map Server using a VB/MapObjects 2.0 program (Figures 6 and 7). [Be sure to visit the Property Appraiser's Office web site at www.ocpafl.org to see some of the results, and www.safe.com to view the FME.] Of course, this is only the beginning. We plan to enhance the functionality of our web server and custom ArcView programs, and we expect to develop a more sophisticated Intranet map server for the Office and for wireless GIS access from the field appraisers’ laptop computers.

Our collaboration with Safe Software, Esri, and Panda Consulting, Inc. helped make this possible.

 

