Michele R. Aurand
Dean Tyler
Dog catches its tail in GIS life-cycle at MMS
Abstract:
The GIS life-cycle at MMS began when ArcInfo was introduced as a mapping slave to an
established relational database. When SDE became available, our dog caught its own tail as
Oracle data, processed into coverages, converted to shapefiles, returned to Oracle tables. We had
reached the solution we originally wanted - or had we? Now the question is - can a relational
database designed to record legal documentation about spatial entities finally be integrated with a
database for mapping spatial features? Do our old concerns about data structures and geographic
features really go away or not?
Introduction:
The term GIS life-cycle is sometimes applied to the flow of spatial data from the real world, through the GIS computer, and back out to the real world as maps that decision-makers analyze and base policy upon. However, the term GIS life-cycle can also be used to describe a process in software design and development, for either a complete GIS system, or for a customized interface built on top of an off-the-shelf GIS package. This paper will use GIS life-cycle in the latter vein - it will discuss a journey in software development for a customized GIS mapping system built on top of Esri's commercial GIS software.
Software Development Life-cycles:
A software development life-cycle defines stages of activity that project members progress through in order to achieve a desired software product. The classic Waterfall Method distinguished stages of problem analysis, design, coding, and testing which ought to be advanced through sequentially in order to develop a product (Figure 1.a). At each step a decision might be made to stop, continue, or spend more time at the current stage. More recent thought in software life-cycles has emphasized prototyping, and a spiral flow to progress in systems development (Figure 1.b). Coding for one portion of a software system might progress parallel with problem analysis on a second portion, and software design on a third. At each finished step, feedback becomes available to influence decisions and work in other parallel stages.
Looking back over the history of our customized mapping system TIMS (Technical Information Management System) at the Minerals Management Service (MMS), we can summarize that our software development life-cycle followed a spiral flow pattern, with parallel stages of analysis, design, and coding on separate portions of the mapping system. The initial starting point, or state of the world, was a large Oracle database storing legal records and reports concerning all types of activities associated with offshore oil management: leasing, wells, platforms, reserves, pipelines, companies, etc. The goal was an automated mapping system. Problem analysis suggested the in-house development of a customized user interface over an ArcInfo GIS engine. GIS layers were created by writing PRO*C programs to extract records from Oracle tables and write formatted output to twin files - one file in ArcInfo "generate" command format including the spatial information, and a twin file with nonspatial information for ingest into an INFO attribute table. A basic menu system providing display options on a small set of map coverages was developed in the first prototype and released to users. Requests for new options and coverages came in. Spiral development ruled, and mapping system upgrades were released several times a year to end-users (Figure 2). The development of the TIMS system progressed, following the direction of its original vision, and always obeying its classic data formula of extracting feature data from Oracle with PRO*C programs, formatting flat files, and generating ArcInfo coverages.
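The twin-file step in this data formula can be sketched as follows. This is an illustration in Python, not the PRO*C code used at MMS, and it makes several assumptions: the record fields (`id`, `x`, `y`, `name`, `status`) are hypothetical, the attribute file is assumed to be comma-delimited, and the spatial file follows the commonly documented point layout for the ArcInfo "generate" command (one `id,x,y` line per point, closed by `END`), which should be checked against the ArcInfo documentation before use.

```python
import io


def write_twin_files(records, spatial_out, attribute_out):
    """Write one extracted record set to two parallel text streams.

    spatial_out   receives id,x,y lines followed by END (assumed point
                  layout for the ArcInfo "generate" command).
    attribute_out receives the nonspatial columns, keyed by the same id,
                  for later loading into an attribute table.
    The field names used here are hypothetical.
    """
    for rec in records:
        spatial_out.write(f"{rec['id']},{rec['x']},{rec['y']}\n")
        attribute_out.write(f"{rec['id']},{rec['name']},{rec['status']}\n")
    spatial_out.write("END\n")


if __name__ == "__main__":
    demo = [
        {"id": 1, "x": 10.5, "y": 20.25, "name": "A001", "status": "ACTIVE"},
        {"id": 2, "x": 11.0, "y": 21.0, "name": "A002", "status": "PLUGGED"},
    ]
    spatial, attributes = io.StringIO(), io.StringIO()
    write_twin_files(demo, spatial, attributes)
    print(spatial.getvalue())
    print(attributes.getvalue())
```

The shared id is what ties the two files together: every feature generated from the spatial file finds its attributes in the twin file under the same key.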
Not unexpectedly, the software development life-cycle of a customized user interface, like TIMS, does not have as much independence as the life-cycle of a stand-alone software product (not that we can any longer say that any software product is truly stand-alone). Because our TIMS mapping system was built with ArcInfo tools, its life-cycle must react to developments in the life-cycle of its parent software. And although ArcInfo has a good reputation for upward compatibility, new options in the parent GIS, which are part of the natural life-cycle development of that software, impact the development of any customized interface products.
The introduction of new data structures in ArcInfo version upgrades, including at one time both grids and regions, caused the TIMS product life-cycle to jog and veer unsmoothly in attempts to keep pace with the parent and utilize these new options and their techniques. Raw data types which had previously been too costly to transform into coverages were finally implemented with new parent-system options; and other costly work-arounds were retooled using processing steps made available after ArcInfo version upgrades. But perhaps the biggest curve that Esri threw out to its customers who develop customized GIS interfaces was ArcView. ArcView made us jog. And ArcView forced us to veer. ArcView was a whole new off-the-shelf software to build a customized GIS on top of: new GUI style, new programming language, new data structures and file types.
MMS and TIMS didn't jump on the ArcView wagon - we stepped aboard it cautiously. We opted for parallel development of the ArcInfo TIMS and a new ArcView TIMS. Basically, our in-house mapping system divided into two customized GIS products, each advancing through a separate but related cyclical life-cycle, each reacting to the evolution of its own parent GIS software. And each also reacting to upgrades in the MMS parent system - the Oracle database, which had been and still was the master of digital information (Figure 3). Not only did the mapping interfaces have to react to upgrades in ArcInfo and ArcView, they also had to react to Oracle database changes, such as new tables, dropped unique keys, and columns added to existing tables. When Oracle database changes are enacted, TIMS GIS life-cycle rotations have to be redirected from developing new modules to fixing old modules in order to make them compatible with a new Oracle reality.
Dog Catches its Tail:
The introduction of ArcView, while it did add another software package to develop in, and a whole new GIS life-cycle to maintain, did not alter the underlying GIS formula at MMS. Oracle was still the master. In-house GIS code extracted spatial and nonspatial data from Oracle tables and created ArcInfo coverages. ArcInfo coverages were simply transformed into shapefiles for the ArcView system. Not that much had really changed. And when MMS purchased the newest software in the Esri lineup - Spatial Database Engine (SDE) Version 2.1, our first prototype sprouted from the same old paradigm: extract data from Oracle, build coverage, create shapefile, now import shapefile into SDE. Converting records from a topological (ArcInfo coverage) or nontopological (ArcView shapefile) data structure into records in relational database tables may seem novel if the original data source is a map coverage, but when the original data source is a relational database, self-awareness begins to dawn with a realization that you are treading on familiar ground, or as the title of this paper has mused, your dog has finally caught its own tail.
Our GIS life-cycle was born of Oracle, and with SDE, it returns to Oracle. But, to do as our first prototype did, to leap from Oracle tables, to coverage, to shapefile, and back to Oracle tables, is to, in fact, build a winding artificial construct that merely serves to return us to our own beginning - the relational database, albeit in a slightly different format (Figure 4). And the path, this construct in GIS data processing, is littered with wasted time and wasted space. The work required to pull data out of Oracle, massage it through all of its intermediate forms, and put it back into Oracle ranges from a low of three programs and about one hundred lines of code to upwards of a dozen programs and ten thousand lines of code. No one ever said TIMS was small. Now add in the disk space required to store intermediate data forms, and the unique ids added at each extra processing step: the Oracle primary key, the ArcInfo internal id and user id, and the SDE feature id. The TIMS GIS systems and their duplicate data loads strain an institution already deluged by more data than it can load into its relational database.
Changing the GIS Paradigm:
Considering the tight time constraint management imposed for the release of an ArcView version which utilized the SDE software, a shapefile load into SDE was the most realistic solution available. However, as the SDE life-cycle rotates into its second passage through the requirements analysis and design stages, we not only have time to re-analyze the first SDE implementation, but we also have the opportunity to rethink the entire role of GIS Mapping at MMS. Instead of merely upgrading from SDE 2.1 to SDE 3.0, with all the changes that entails, we look out upon the possibility of deconstructing our entire GIS data life-cycle construct - of tearing apart the journey from Oracle, to ArcInfo coverage, to ArcView shapefile, and back to Oracle/SDE tables.
And this is not just an opportunity to excise programs which are no longer needed, to save disk space and eliminate redundant data, and to reduce processing time between the source data and the spatial data product, although all of these cool things can be accomplished. Instead, it is a grander moment. It is the best opportunity that the GIS Mapping task at MMS will ever have to pull itself up from servitude to the Oracle relational database and finally make GIS an equal partner in data management and information technology.
By-Product No Longer:
In order to equalize the balance of power between the relational database for legal documents and the Oracle/SDE database for mappable features, it is necessary to integrate the two world views. Oracle legal and Oracle mapping can be integrated if we explicitly define a crosswalk between the answers to two questions: (1) what are the legal features of interest, and (2) what are the spatial features of interest.
In a relational database, the definition of the features of interest is actually quite explicit. One could even say that a relational database "maps" out the definition of its features by listing the properties, or attributes, that a feature can possess, and by storing relationships between one feature and other features in the database through parent/child and composite/component links between tables.
Unfortunately, a GIS does not so clearly map out the definitions of its features, nor does it store information about the lineage of those features or the relationships between the features in one map layer and features in other layers. "What is the spatial feature of interest" is one of the primary questions in any application in cartography or GIS, but it is a question so basic that the cartographer or GIS technician often forgets to ask it, and instead relies on assumptions or previous explanations. What is a well? The answer seems simple, basic, maybe even trivial, at first. Yet, there is actually a very important decision which had to be made at the beginning of investigation in order to define the spatial feature that is a well, before wells could be drawn on a map, or stored in a GIS layer.
At TIMS, we are quite lucky that the definitions of our spatial features are preserved in the PRO*C code which extracts data from Oracle and formats it for ingest into ArcInfo coverages. These programs are our crosswalk between the legal feature of interest and the spatial feature of interest for numerous topics of investigation in the MMS database. For some topics, the crosswalk between legal feature and mappable feature is a simple one-to-one correspondence where spatial location and aspatial attributes are inherited from Oracle record to spatial feature (Figure 5).
For other topics, the definition of the spatial feature of interest requires a complicated series of mathematical manipulations upon the legal feature. For example, the legal definition of a well is preserved in the Oracle database as an entity that possesses a top location (x, y, and z), a bottom location (x, y, and z), a well name, a status, a mudline depth, etc., and may contain other legal entities such as well completions and directional surveys, to name only a few of the links explicitly defined between the relational tables for wells and other Oracle entities. For mapping purposes, we have historically decided that a legal well feature is either one or three spatial features (Figure 6). If the well is straight, meaning the angle of inclination between its surface location and its bottom location is vertical or less than three degrees off vertical, then that well is mapped as a single point feature. However, if the well is a slanted well, where the angle of inclination from its top to its bottom is greater than three degrees off vertical, then that well is mapped as three spatial features: a surface point feature, a bottom point feature, and a well line feature, which is the straight line segment drawn between the surface coordinate and the bottom coordinate. We can call this translation between the legal feature and the spatial feature or features of interest our function geographic.
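The well rule can be written down as a small function. The sketch below is in Python rather than the PRO*C used at MMS, and it rests on stated assumptions: the class and field names are hypothetical, x, y, and z are taken to share one linear unit, and a well at exactly three degrees is treated as slanted, a boundary case the rule as stated here does not settle.

```python
import math
from dataclasses import dataclass

# Threshold from the text: under three degrees off vertical is "straight".
STRAIGHT_LIMIT_DEG = 3.0


@dataclass
class LegalWell:
    """Hypothetical stand-in for the legal well record."""
    name: str
    top: tuple      # (x, y, z) of the surface location
    bottom: tuple   # (x, y, z) of the bottom location


def inclination_deg(top, bottom):
    """Angle between the top-to-bottom segment and vertical, in degrees."""
    horizontal = math.hypot(bottom[0] - top[0], bottom[1] - top[1])
    vertical = abs(bottom[2] - top[2])
    return math.degrees(math.atan2(horizontal, vertical))


def function_geographic(well):
    """Translate one legal well into one or three spatial features."""
    surface_xy = well.top[:2]
    bottom_xy = well.bottom[:2]
    if inclination_deg(well.top, well.bottom) < STRAIGHT_LIMIT_DEG:
        return [{"kind": "well_point", "well": well.name,
                 "coords": [surface_xy]}]
    # Assumption: exactly three degrees falls through to the slanted case.
    return [
        {"kind": "surface_point", "well": well.name, "coords": [surface_xy]},
        {"kind": "bottom_point", "well": well.name, "coords": [bottom_xy]},
        {"kind": "well_line", "well": well.name,
         "coords": [surface_xy, bottom_xy]},
    ]


if __name__ == "__main__":
    straight = LegalWell("A001", (100.0, 200.0, 0.0), (100.0, 200.0, -1000.0))
    slanted = LegalWell("A002", (100.0, 200.0, 0.0), (600.0, 200.0, -1000.0))
    for w in (straight, slanted):
        print(w.name, [f["kind"] for f in function_geographic(w)])
```

Keeping the rule in one function makes the decision (the threshold, and what each case produces) visible in a single place instead of being scattered through extraction code.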
Now at MMS, we have defined functions geographic between many of our Oracle legal features and our GIS mapping features. Yet, these functions geographic have always been orphans in the wider scope of information technology - important to the GIS task in its need to create spatial features from Oracle legal features, but irrelevant to the master Oracle system which has evolved according to its own requirements and has normally been indifferent to the struggles of its GIS step-child. Now, however, with the introduction of SDE, and GIS turning to the relational database as its data storage structure, GIS mapping has its excuse to ask for integration. GIS will be using Oracle table space; it will be using Oracle processing memory. We can either continue wasting time and space, or we can integrate.
With the next turn of the software life-cycle for SDE, we can create one of two possible realities for our Oracle/SDE data. If we continue in the current GIS data paradigm, the result will be two sets of Oracle tables for each mappable feature - the Oracle legal table design, with all of its records, and the SDE/Oracle tables, duplicating the same nonspatial information and storing coordinate information in the SDE format in an associated Spatial Table (Figure 7.a). If we integrate, the Oracle legal table will be spatially enabled, and its associated SDE Spatial Table will be populated by applying the rules of the function geographic for that feature type by means of an Oracle Forms trigger that impels inserts and updates to the parent business table to cascade down into inserts and updates against the associated spatial table (Figure 7.b).
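The cascade in the integrated design can be illustrated with a small stand-in. The sketch below uses Python's built-in sqlite3 module and SQLite database triggers, not Oracle Forms triggers or SDE. The table and column names are hypothetical, and the spatial table holds plain x and y columns rather than SDE's own geometry storage. It shows only the mechanism: a write to the business table propagates to its associated spatial table.

```python
import sqlite3

# Hypothetical schema: one legal (business) table and its spatial table.
SCHEMA = """
CREATE TABLE well_legal (
    well_id   INTEGER PRIMARY KEY,
    well_name TEXT,
    top_x     REAL,
    top_y     REAL
);
CREATE TABLE well_spatial (
    well_id INTEGER PRIMARY KEY,
    x       REAL,
    y       REAL
);
-- An insert into the legal table creates the matching spatial record.
CREATE TRIGGER well_legal_after_insert AFTER INSERT ON well_legal
BEGIN
    INSERT INTO well_spatial (well_id, x, y)
    VALUES (NEW.well_id, NEW.top_x, NEW.top_y);
END;
-- A coordinate update on the legal table cascades to the spatial record.
CREATE TRIGGER well_legal_after_update AFTER UPDATE OF top_x, top_y ON well_legal
BEGIN
    UPDATE well_spatial
    SET x = NEW.top_x, y = NEW.top_y
    WHERE well_id = NEW.well_id;
END;
"""


def build_demo():
    """Return an in-memory database with both tables and both triggers."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    return conn


if __name__ == "__main__":
    conn = build_demo()
    conn.execute("INSERT INTO well_legal VALUES (1, 'A001', 10.0, 20.0)")
    conn.execute("UPDATE well_legal SET top_x = 12.5 WHERE well_id = 1")
    print(conn.execute("SELECT * FROM well_spatial").fetchall())
```

In this simplified form the trigger copies a single point. A fuller version would apply the function geographic for the feature type, so one legal record could yield several spatial records.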
Looking ahead to the Next GIS Life-cycle:
Integration of Oracle and Mapping is not guaranteed for the TIMS system at MMS. Currently we are developing a prototype in SDE 3.0, while we continue maintenance life-cycles on the ArcInfo mapping system, development life-cycles on the ArcView mapping system, and maintenance and development life-cycles on the SDE 2.1 version. Unlike other mapping decisions which only had to react to developments in the Oracle system life-cycle, an integrated GIS will need to coordinate development with Oracle decision-makers who have yet to conceive of mapping as an equal partner in information technology.
The positive benefits that wait behind integration are exciting. Currently we update GIS map layers once a week. It actually takes so long to compare a complete Oracle table to its ArcInfo coverage and determine which features need to be added, changed, or deleted, that we normally delete the entire coverage and completely recreate it from Oracle tables instead. Now, for the first time, we have the potential for instantaneous update of spatial features. When an Oracle legal record is inserted, a new Forms trigger will be able to create its SDE spatial records. In addition to gaining instantaneous update, maintaining spatial data in SDE will grant us use of Oracle's data management and backup tools, which is less costly than maintaining in-house AML programs and unix scripts.
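The comparison that the weekly update avoids can be sketched as a keyed difference between two record sets. This Python illustration assumes both sides can be loaded as dictionaries keyed by a shared id; the function name and record layout are hypothetical. It says nothing about how long the real comparison takes against a full Oracle table and an ArcInfo coverage.

```python
def diff_features(source, target):
    """Compare source records (e.g. Oracle rows) with target features
    (e.g. coverage features), both keyed by a shared id.

    Returns three sorted id lists: to add, to change, to delete.
    """
    to_add = sorted(k for k in source if k not in target)
    to_delete = sorted(k for k in target if k not in source)
    to_change = sorted(
        k for k in source if k in target and source[k] != target[k]
    )
    return to_add, to_change, to_delete


if __name__ == "__main__":
    oracle_rows = {1: ("A001", 10.0, 20.0), 2: ("A002", 11.0, 21.0),
                   3: ("A003", 12.0, 22.0)}
    coverage = {2: ("A002", 11.0, 99.0), 3: ("A003", 12.0, 22.0),
                4: ("A004", 13.0, 23.0)}
    print(diff_features(oracle_rows, coverage))
```

With trigger-driven updates this bulk comparison is not needed, because each change reaches the spatial table at the moment the legal record changes.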
The benefits are exciting, and of course there are problems to solve as well, which are even more exciting to us programmer-types. That basic question of defining the geographic feature of interest remains with us. For currently implemented modules, feature definition requires life-cycles of code adaption between programs which read from Oracle tables and output flat files in formats ArcInfo can ingest, to new programs which will read from Oracle, perform the function geographic, and write to Oracle. For new modules which have not yet been implemented in the customized GIS, there will be design cycles in translation between the legal feature of interest and the spatial feature of interest. Maybe the designers of the Oracle legal tables will actually think about geographic feature definition, but the GIS department won't cross its fingers on that one. However, at the least, the function geographic will be brought to center stage and take on a more visible role as the crucial step in GIS development that it is.
Acknowledgements:
We would like to acknowledge Information Technology Division/Systems Application Branch Mapping Unit Supervisor Paul Rasmus and GIS Specialist Leonard Coats for their leadership on the mapping task at MMS. And we would like to thank Senior Systems Analyst for Quality Assurance Robert J. Whitaker for his timely assist in making this paper web-worthy.
Author Information:
email: Michele_Aurand@mms.gov
attn: Michele R. Aurand
Senior Information Systems Specialist
U.S. Dept. of Interior
Minerals Management Service
Mail Stop 4061
1201 Elmwood Park Blvd
New Orleans, LA 70123
phone: 504-731-3043
fax: 504-731-3004
email: Dean_Tyler@mms.gov
attn: Dean Tyler
Senior Information Systems Specialist
U.S. Dept. of Interior
Minerals Management Service
Mail Stop 4061
1201 Elmwood Park Blvd
New Orleans, LA 70123
phone: 504-731-3073
fax: 504-731-3004