Implementing the IBangHeadHere Interface: Developing an ArcGIS-based multi-user editing system

Paper #: 1079

Authors: Todd Crane and Brad Hibner, Geographic Data Technology, Inc.

Date: Wednesday, July 10, 2002

Time: 8:30am - 12:00pm, Room 22 (SDCC)

 

Abstract

This paper describes lessons learned during the process of developing a large scale, multi-user editing system with ArcGIS and the Geodatabase. Discussion will cover the following topics: identifying system requirements, prototyping components in the "new world" of ArcGIS, application and database design for the weary, system implementation, and comparisons to ArcInfo Workstation and ArcView 3.x platforms. We will then elaborate on successes, failures and resolutions encountered during development and implementation.

Introduction

In the winter of 2000/2001, Geographic Data Technology, Inc. purchased the geography assets of Compusearch, Inc. Prior to the acquisition, Compusearch had provided geographic databases as well as marketing segmentation and analysis solutions in Canada and had been GDT's primary data source for Canadian street, address and postal information. The purchase provided GDT with all products, resources, trademarks and other assets associated with Canadian street geography, postal and census boundaries, and points of interest. This transaction brought GDT one step closer in fulfilling the company’s strategic objective of providing data coverage across the Western Hemisphere.

Background

Prior to GDT’s acquisition, the Compusearch staff used ArcInfo and ArcView as their primary editing and maintenance environment with coverages as the primary data structure. Master coverages were stored on a UNIX server along with the ArcInfo software. The editors accessed the software and data through an X Windows emulator. To facilitate multiple editors working on the same coverage, local areas were checked out of the master coverage; after completion, each piece was checked back in to the master coverage.

Becoming part of the GDT family meant additional responsibilities, different production schedules and a shift in content requirements. Most notably:

Since the acquisition, GDT and GDT-Canada have been investigating a new editing and production environment that includes ArcGIS software. The integration team has focused on workflow refinement, data modeling, data storage and access (editing and analysis), and product creation. The long-term plan is to migrate completely to the new world of ArcGIS, but too many variables and unknowns prevented us from simply "throwing the switch" from ArcInfo to ArcGIS at once. Our analysis has revealed that, while powerful in many respects, ArcGIS still needs more time to mature before it is a completely robust and fully functional GIS environment.

The Status

GDT’s comfort level with, and utilization of, ArcGIS has increased significantly over the past year. We have successfully been able to:

That being said, there are benchmarks that need to be completed before GDT can continue with a migration towards ArcGIS. These include, but are not limited to:

ArcInfo Workstation and ArcView 3.x will remain in production for quite some time at GDT as there is a) significant customer demand for product in coverage and shapefile format and b) functional gaps in the current ArcGIS environment. As ArcGIS matures, functionality increases, and the user base grows, there will be a slow migration to the brave new world of ArcGIS for more aspects of the editing and production environment at GDT.

The Analysis

Over a six-month period, two engineers were allocated to evaluate ArcGIS and ArcObjects. The technical analysis did not happen in isolation from the company’s business processes. How would these new tools affect the workflow? The data model? While workflow analysis and data modeling are discrete activities, the overall development process was cyclical: build and test a tool; explore different data representations; assemble alternative processes; repeat.

First, a review of the ArcGIS tools, then data modeling and workflow:

ArcGIS Tools

Time was spent analyzing the three new applications (ArcMap, ArcCatalog and ArcToolbox) as well as the Geodatabase and the underlying ArcObjects. Generally speaking, the analysis of the tools fell into four broad categories: editing (ArcMap), conversion (ArcCatalog, ArcToolbox), data management (Geodatabase) and extensibility (ArcObjects). Performance, consistency and repeatability, and ease of use were evaluated for tools across all categories.

Editing

ArcMap serves as the primary end-user application for editing and mapping. Overall strengths range from a large number of editing tools (from reshaping features to feature creation), to the ability to edit multiple layers from different data structures (shapefiles and Geodatabase feature classes) at one time, to interoperability with other applications. In this regard, ArcMap is more than simply a union of "ArcEdit and ArcPlot".

With so many options, menus, tools and places to customize ArcMap, new users are likely to become overwhelmed quickly. While ArcMap is a powerful and flexible tool for experienced users, customization becomes necessary when entering a formalized production environment where tasks are clearly defined and generally repeatable. Simplifying the user interface speeds travel along the learning curve. Try locating tools for specific editing tasks on a common toolbar. If more than one toolbar is needed, place them close to each other on the screen. The goal is to minimize the amount of "air time" (time spent moving hands between keyboard and mouse) and "mousing" (time spent moving the mouse back and forth across the screen).

Utilizing the editor "extension" of ArcMap makes custom tool development easy. The editor object manages many details that enable the programmer to write a minimal amount of code to enhance the editing experience within ArcGIS.

Another positive attribute of ArcGIS is its support for data from a variety of sources: raster or vector, local or Internet-served, on-the-fly projections, etc. This flexibility promotes editor efficiency by enabling multiple resources to be utilized during the course of an edit session.

Conversion

ArcCatalog/ArcToolbox wrap a lot of familiar functionality but fall short in certain areas. The flexibility in connecting to a variety of data sources – local, locally networked or over the Internet – is quite powerful and serves as a useful management tool. The "batch" functionality of ArcToolbox is quite handy, especially when converting many layers at once. A caveat here is that more options for appending data to existing datasets, rather than simply converting to a new dataset, would be welcome.

The basics of importing between shapefiles, coverages and geodatabases are readily available, but we found scenarios where these import/export utilities broke down. Converting a coverage to a geodatabase feature class and back to a coverage truncates fields when the number of fields is large. Additionally, we encountered an instance where the simple data loader ignored a particular field when converting from a shapefile to a geodatabase feature class, resulting in NULL values for the all-important ID field.

Coordinate precision is another interesting topic. The coverage, shapefile and geodatabase feature class all handle geometry and coordinate precision differently. When importing to the geodatabase, care should be taken to ensure the destination feature class has its "xy scale" set to match the expected accuracy of the input data source. Successful transition from an ArcEdit (coverage) environment to a geodatabase environment requires an almost religious adherence to tolerances.
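The effect of the "xy scale" setting can be illustrated without any GIS software at all. The sketch below is plain Python, not geodatabase code: it mimics how an integer scale factor quantizes coordinates, so that any precision finer than 1/scale present in a double-precision coverage simply disappears on import.

```python
# Illustrative only: how an integer "xy scale" quantizes coordinates.
# A feature class stores coordinates as scaled integers; precision finer
# than 1/xy_scale in the source data is lost on import.

def snap_to_grid(x, y, xy_scale):
    """Round a coordinate pair onto the storage grid implied by xy_scale."""
    return (round(x * xy_scale) / xy_scale,
            round(y * xy_scale) / xy_scale)

# With a scale of 1000, sub-millimeter digits (for meter units) from a
# double-precision source vanish:
print(snap_to_grid(1234.5678901, 987.6543219, 1000))  # (1234.568, 987.654)

# A coarser scale loses more:
print(snap_to_grid(1234.5678901, 987.6543219, 100))   # (1234.57, 987.65)
```

If the destination scale is coarser than the source accuracy, the loss is silent, which is why the tolerance discipline mentioned above matters so much.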

Character Sets/Codepages. While documented for ArcInfo Workstation and ArcView, there is surprisingly little information available about how to handle codepage conversion issues. This can get tricky in cross-platform environments (UNIX/Windows machines), with database character sets, operating system codepages and application assumptions all in play. Care should be taken to keep all settings consistent, and be aware that processing (even importing!) can affect "extended" ASCII characters.
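The failure mode is easy to demonstrate outside of ArcGIS. The following plain-Python sketch (not ArcGIS-specific) is the kind of round-trip sanity check worth running before bulk-loading attribute data: an accented character survives only if every step in the chain decodes bytes with the same codepage that encoded them.

```python
# Illustrative only: a codepage round-trip check. The accented 'é' in a
# Canadian street name is byte 0xE9 in Latin-1 but 0x82 in the DOS-era
# cp437 codepage; decode with the wrong one and the text is mangled.

name = "Montréal"

latin1_bytes = name.encode("latin-1")   # 'é' -> 0xE9
cp437_bytes = name.encode("cp437")      # 'é' -> 0x82

# Correct round trip: the bytes come back as the original string.
assert latin1_bytes.decode("latin-1") == name

# Wrong decode: byte 0xE9 is a Greek capital Theta in cp437.
mangled = latin1_bytes.decode("cp437")
print(mangled)  # MontrΘal
```

A single mismatched step anywhere in the pipeline (database, OS, application) produces exactly this kind of corruption, and it often goes unnoticed until the data ships.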

While there are many conversion utilities (taken from ArcInfo Workstation), there are no "extras" in the Toolbox for mixing and matching data formats – direct e00-to-geodatabase conversion would have been useful. Hopefully ArcGIS 9 will address some of the inherent assumptions made behind the scenes in ArcToolbox.

Data Management

No analysis of the Geodatabase (the enterprise model) is complete without an understanding of the RDBMS concepts and technology underneath. As expected, those experienced with ArcSDE will find Geodatabase access quite similar in structure. While Oracle or SQL Server serve equally well as development databases, it is important to note the additional expenses (capital and personnel) and overhead for deploying and maintaining RDBMS technology.

The geodatabase, based on ArcSDE, is quite stable and is built for scalability and performance. It is a blessing and a curse to have the underlying database be responsible for data access. ArcSDE and the geodatabase can focus on what they are good at doing – asking the database to store and retrieve information. The blessing is that an optimized database works well; the curse is that when it’s not tuned, the entire application suffers. Experienced database administrators are critical at this stage. This may be an issue for organizations lacking DBA skillsets.

Quality Control. The ability to perform QC as close to the edit as possible is critical, especially as the number of editors grows and the scale of the geography increases. There are many encouraging aspects inherent in the use of the geodatabase that facilitate the integration of business rules within the production cycle. The ability to model relationships, both attribute and topological (at 8.3), allows the system to enforce integrity throughout the entire process. Custom feature classes and feature class extensions allow these important characteristics to be managed at a "lower" level (closer to the database), rather than relying on the applications ("above") to enforce the rules. This "server-side" control allows deployment of multiple editing tools without jeopardizing or sacrificing data integrity.
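The "enforce the rule close to the data" idea can be sketched in miniature. This is a hypothetical plain-Python illustration, not a feature class extension: the wrapper class below (and its attribute domain) is invented for the example, but it shows why pushing validation down a level means every editing tool built on top inherits the rule for free.

```python
# Hypothetical sketch: a "feature class" wrapper validates attributes before
# any row is stored, so the rule holds no matter which editing tool calls it.

class StreetClass:
    # Assumed directional-prefix domain, purely for illustration.
    VALID_PREFIXES = {"N", "S", "E", "W", ""}

    def __init__(self):
        self.rows = []

    def insert(self, name, prefix):
        """Reject the edit at the 'class' level if a business rule fails."""
        if not name:
            raise ValueError("street name is required")
        if prefix not in self.VALID_PREFIXES:
            raise ValueError(f"invalid prefix: {prefix!r}")
        self.rows.append({"name": name, "prefix": prefix})

streets = StreetClass()
streets.insert("Main St", "N")        # passes the rule
try:
    streets.insert("Main St", "Q")    # rejected before it ever hits storage
except ValueError as err:
    print(err)  # invalid prefix: 'Q'
```

The application layer never sees an invalid row, which is the point of server-side control: new tools can be deployed without re-implementing (or forgetting) the rules.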

Versioning. More information, please. Details, statistics, guidelines and recommendations with hard numbers are not readily available. Answers to questions like "How often should the database be compressed?" or "How many versions should be in use?" require a lot of testing. Scheduling downtime (non-editing time) may be another critical factor. It’s hard to simulate projected users, number of edits, and system demand without having benchmarks to review. Implementing a successful versioning model will make or break the workflow processes.

We would like permissions to be handled at a more granular level. Currently, users either have read-only access or full access to add/update/delete. It is common at GDT for editors to have only update privileges (not add or delete). Custom code is needed to control permissions beyond the "ON" or "OFF" scenario.
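One way custom code can narrow the all-or-nothing write privilege is to route every edit request through a gate that knows the editor's role. The sketch below is a hypothetical plain-Python illustration of that pattern, not the ArcSDE permission model; the role names and operation sets are invented for the example.

```python
# Hypothetical sketch: finer-grained permissions than read-only vs. full
# access. An "editor" role may update features but not add or delete them.

ROLE_OPS = {
    "viewer": {"read"},
    "editor": {"read", "update"},                   # update-only, no add/delete
    "supervisor": {"read", "add", "update", "delete"},
}

def check_permission(role, operation):
    """Return True if the role may perform the operation."""
    return operation in ROLE_OPS.get(role, set())

assert check_permission("editor", "update")
assert not check_permission("editor", "delete")     # blocked by custom code
assert check_permission("supervisor", "delete")
```

In practice such a gate would sit inside the custom editing tools (or a feature class extension), since the underlying database grant is still all-or-nothing.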

Extensibility

Customizations to the ArcGIS environment are necessary. With so many options to choose from, it is sometimes hard to figure out which path to take. For old-time ArcInfo/AML and ArcView/Avenue programmers, the ArcObjects learning curve can be quite intimidating at first. (AML → Avenue → COM → VBA → VB → C++ → ATL) Those familiar with COM programming will pick up on the syntax faster, but navigating through the comprehensive Object Model diagrams takes determination.

Some simple tasks, like performing an operation on the selected layer in the Table of Contents, require a bit more programming than in ArcView. The samples, while extremely well done, could be complemented by more functionally oriented examples that compare simple operations, to help transition programmers from the old world of ArcInfo and ArcView to ArcGIS.

ArcObjects is comprehensive. As usual, there is more than one way to accomplish a task. For the novice, there may be too many ways to do something, which contributes to the fear factor of transitioning to ArcGIS. Fortunately, the documentation is good and the samples are plentiful. (Re)Read the documentation and study the samples; sooner or later the Object Models start to make sense. The add-ins and help support for Visual Basic and Visual C++ development facilitate development activities.

The ability to program with different languages is another tremendous benefit. Allowing multiple entry points helps unify projects and promote communication. Users with different technical skillsets can share work regardless of the programming language used for coding. Using a scripting language such as VBA also makes application prototyping easy. Subsequently, engineering can rewrite the functionality in C++ for performance or other institutional reasons.

Note that the flexibility we’ve been asking for is here, but it comes at a cost. It’s one thing to get all the software and database installed and running, but it takes a bit more work to make it perform efficiently! Quality integration is the key to success. Also remember that as the number of software components from different vendors increases, the software maintenance and upgrade schedule becomes more complicated. While not surprising, it’s important to consider when developing a long-term transition plan.

Performance is mixed. While certain aspects need improvement (loading an application, navigating the tree view in ArcCatalog, etc.), others, like individual feature editing, are quite responsive, especially with ArcSDE-based implementations of the Geodatabase. No longer does the number of features stored in a layer directly affect response time – access speed can be tuned at the database level.

Data modeling

As mentioned before, migration to the geodatabase provides flexibility in specifying coordinate precision appropriate for each database. This control has advantages over the double-precision, floating-point numbers used by coverages. What do you do with those extra digits of precision that mean nothing in terms of accuracy? You can store just what you need in the geodatabase. We won’t go into selecting an appropriate scale here, but note that conversion of datasets with different accuracy may necessitate short-term data clean-up work for successful loading.

Topology. Here’s a big change: the concept of topology has moved from an inherent aspect of the coverage to an extension in the geodatabase. True, it is possible to compute the spatial relationships between features "on the fly" fast enough for certain application requirements, but there are some basic functions currently missing in the new world that have certainly slowed the acceptance of ArcGIS. Remember the "select path" function in ArcEdit? Well, it is not available in the Geodatabase without building a geometric network or writing custom code. For GDT, we’ve not been able to justify the overhead of building and maintaining a geometric network just for this. Also, the topology extension is in sight (due at 8.3) and will be followed by linear networks.

As an ArcGIS 8.3 Technology Preview site, GDT has been evaluating the topology extension and how it fits into the overall design of the database. At this point, the topology extension will allow users to define topology rules among and between feature classes. After an edit, the user can validate the changes and get a report of any instances that violate a particular topology rule. This functionality does not directly help the editor while performing an edit, but can serve as QC after the edit but prior to "posting" back to the parent version. To aid the editing experience, topology rules should be used in conjunction with other editor settings, such as the snapping environment.
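The validate-after-edit pattern described above can be sketched in a few lines. This is plain Python, not the 8.3 topology API: a toy "must not have dangles" check over street segments, reporting violations for QC review after the edits are made rather than preventing them during the edit.

```python
# Hypothetical sketch of post-edit topology validation: endpoints touched by
# exactly one segment are "dangles" (dead ends) to be flagged for review.

from collections import Counter

def find_dangles(segments):
    """Return the endpoints that only one segment touches."""
    counts = Counter()
    for start, end in segments:
        counts[start] += 1
        counts[end] += 1
    return sorted(pt for pt, n in counts.items() if n == 1)

streets = [
    ((0, 0), (1, 0)),
    ((1, 0), (1, 1)),
    ((1, 1), (0, 0)),   # the three segments above close a loop: no dangles
    ((1, 1), (2, 2)),   # a newly digitized stub: endpoint (2, 2) dangles
]
print(find_dangles(streets))  # [(2, 2)]
```

As with the real extension, the report arrives after the fact; legitimate dead-end streets would need to be marked as exceptions rather than fixed, which is why such rules complement, rather than replace, editor settings like snapping.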

There are a number of Esri presentations available (Morehouse, Jackson, Shaner) that discuss potential mapping of coverage subclasses and shapefile classes to Geodatabase feature class types. See the conference proceedings and technical workshops regarding the topology extension for more information.

Tile Management – part 1: Another obvious difference between the old world and new world is the shift from managing data in files to managing the data in a database management system. This move to a centralized, seamless database has its advantages: it removes potential file size limitations and related file I/O overhead; there is no need to check data in and out of intermediate files; it simplifies data access; and it offers finer granularity of data locking. Note that while the physical need for "tile management" (Librarian, ArcStorm) has been removed, there is still a need for "synchronized editing". Therefore, tiling becomes a logical artifact of workflow management, not an underlying database constraint.

Workflow

Tile management – part 2 (a.k.a. versioning): Now that we can store all the streets in North America in one layer, the burden of "tile management" shifts to the workflow. In other words, just because we have the entire database available to us for editing doesn’t mean we should edit it all at once. It does mean that we need a plan for how we are going to edit the database: who will be editing what (or where), and when. As mentioned earlier, selecting an appropriate versioning model is critical to success. Scheduling work to reduce the need for reconciliation is necessary in order to minimize the amount of downtime required for compressing the database. (Compressing requires an exclusive lock on the layer, which means no editing during this database event.) Allocating work to editors has become a planning exercise that takes into consideration thematic and geographic constraints. The more things change, the more they stay the same…
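The planning exercise above amounts to a scheduling problem: group edit jobs so that no two concurrent jobs touch the same logical tile, keeping versions conflict-free and reconciliation cheap. The sketch below is a hypothetical plain-Python illustration of that idea (the job names and tile IDs are invented), not any ArcSDE scheduling facility.

```python
# Hypothetical sketch of "tile management as workflow": greedily batch edit
# jobs so jobs within a batch touch disjoint tiles and can run concurrently.

def schedule_jobs(jobs):
    """jobs: list of (job_id, set_of_tile_ids). Returns batches of job ids;
    every batch is internally conflict-free."""
    batches = []
    for job_id, tiles in jobs:
        for batch in batches:
            if batch["tiles"].isdisjoint(tiles):   # no shared tiles -> no conflict
                batch["jobs"].append(job_id)
                batch["tiles"] |= tiles
                break
        else:
            batches.append({"jobs": [job_id], "tiles": set(tiles)})
    return [b["jobs"] for b in batches]

jobs = [
    ("ottawa-streets", {"tile-11", "tile-12"}),
    ("toronto-address", {"tile-30"}),
    ("ottawa-postal", {"tile-12"}),   # overlaps ottawa-streets, so it waits
]
print(schedule_jobs(jobs))
# [['ottawa-streets', 'toronto-address'], ['ottawa-postal']]
```

Each batch can then be edited and posted before the next begins, so reconcile conflicts (and the compress downtime they drive) stay minimal.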

Building and maintaining a database to support transportation and navigation applications requires some degree of network/routing functionality. While we wait for linear networks in the geodatabase, our workflow must consider how and when to perform such tests. Whether utilizing NetEngine, ArcView Network Analyst or other tools, GDT must accommodate functionality not currently supported in the ArcMap/Geodatabase environment.

Conclusion

The process of developing a large scale, multi-user editing system with ArcGIS involves a thorough analysis of both the available software and internal business processes. The key is mapping out a clear migration plan with reasonable expectations. Careful and targeted application of the technology will undoubtedly increase productivity and throughput, but recognize that there may be limitations and constraints. A successful deployment of ArcGIS takes time, patience and research. GDT has made great progress along this development path, and through our long-standing partnership with Esri, is working to ensure that we are in a strong position to help our customers enter into the brave new world of ArcGIS.