Techniques for Integrating Oracle Spatial with ArcSDE and the Geodatabase

Robert L. Gage

In the course of implementing an integrated system for management of facilities information at NASA Langley Research Center, many issues were encountered in integrating the various Esri and Oracle technologies. This paper will summarize the solutions and techniques that were discovered. Topics addressed include the use of database views to simplify map generation and web development, using multi-versioned data in non-ArcSDE clients, and options for implementing custom object behavior using both client side COM/ATL and server-side PL/SQL and Java.

Introduction

In the process of designing and implementing a system for management of facilities data at NASA Langley Research Center, many design issues were addressed that may be of interest to others. In particular, the choice to implement our geodatabase on ArcSDE using the Oracle Spatial SDO_GEOMETRY data type was made early in the process based on the desire to be able to leverage as much as possible from existing expertise in Oracle. In particular we wanted the capability to perform queries involving the spatial data from within Oracle for report generation and non-graphical web applications. There have been significant difficulties in following this approach and it is recommended that others use the binary geometry format unless they have the technical resources and time to dedicate to solving these problems. With that said, there are many advantages to using the SDO_GEOMETRY format and some of the knowledge gained from this experience is presented in this paper.

Database Design

It is highly recommended that you normalize your data just as you would for non-spatial data. In order to take advantage of the nature of a relational database, normalization is an important process that should be followed before making any compromises to the design for performance reasons. This will ensure that data editing is reduced to the minimum logical set of attributes for each entity and will help to guarantee the quality of the data. To review the normal forms:

First:	Remove repeated attributes or groups of attributes.
Second:	Remove attributes dependent on only part of the unique identifier.
Third:	Remove attributes dependent on attributes that are not part of the unique identifier

In the first form, there should not be any repeated field names. For example, if a parcel can have more that one listed owner, the parcel feature class should not have fields OWNER_NAME_1 and OWNER_NAME_2, instead there should be a separate owner class and a relationship class should be used to associate them to the parcel.

In the second form, all attributes in the class should describe the one and only object identified by the whole unique identifier. For example, if a building is uniquely identified by NUMBER and STREET_NAME, the attribute for TRASH_PICKUP_DATE is probably associated only with the STREET_NAME rather than the full address.

In the third form, there should not be any attributes that are dependent on other attributes that are not part of the unique identifier. For example, if the building class contained the fields OWNER_NAME and OWNER_PHONE, these should become a new class describing the owner.

Note that strict normalization is not always possible. In particular, ArcSDE requires an OBJECTID field that must act as the unique identifier regardless of whether a natural unique identifier exists or not.

In the process of designing and normalizing your schema, the following is suggested:

Design using UML modeling tools.
Don’t design each class in a vacuum, understand the big picture first.
Use attribute names that best describe the properties of the class.
Eliminate derivative or redundant information.

The process of UML modeling tends to lead naturally to a normalized design, provides an excellent visual reference to explain the design to others, helps to understand the impact of future design changes, and by using the schema generation wizard, accelerates implementation.

Try to gather as much information as possible about the entire system before beginning the design. Understand what information is needed, what is currently available, and what are the overall objectives of the business or organization. Try to anticipate future needs. You may find that integration of your spatial and business data leads to new possibilities for both sides to leverage the additional information.

Choose attribute and relationship names that make the most sense to someone editing each feature class. The tree view in the object inspector provides an intuitive means to edit relationships. Keep relationship names simple. Preferably just the name of the related class (in the plural form if appropriate).

Watch for redundant attribute names in your classes and use a sample set of your data to look for redundancies in the actual attribute values. Try editing some sample data and note if changing one field always leads to changes in another field. This is a sign that there is a dependency that needs to be removed. Derivative information can be calculated as needed for maps and reports or can be computed on the fly using a database view. Other sources of redundant information are attributes that describe properties inherent to the geometry or properties of a spatial relationship between geometries. For instance if a building exists within a parcel, an attribute (or foreign key) in the building class identifying the parcel would be redundant because that information can be determined uniquely from the spatial relationship.

Finally, after normalizing your schema, consider the average use of this data and make compromises in the design to accommodate performance issues. In particular, if a spatial relationship like the one described earlier is used often, it may make sense to include the redundant information in the form of a relationship (i.e. a foreign key in the building class will identify the associated parcel feature). To avoid conflicting data, be sure to use database constraints, triggers, or custom object behavior to either limit the editor’s chances of making a mistake or preferably eliminate the need to edit the redundant information entirely.

Database Views

While there are many advantages in using a normalized schema, the usefulness of the data in this form is somewhat limited because generalized SQL queries are not fully supported in either ArcMap or ArcIMS. These clients usually restrict the query form to the select clause and the where clause (and possibly the from clause). Because ArcSDE allows read access to any database tables or views containing an SDO_GEOMETRY column, database views can overcome these limitations by allowing these clients to access the data through a view that provides the information in the desired form. There are, however, some limitations in using views.

Non-versioned Views

In the case of non-versioned data, views work exceptionally well. The sdetable command can be used to create views:

sdetable -o create_view -T wetland_view -t wetland -c "objectid,shape,classification, CASE WHEN classification LIKE '%FO%' THEN 'FORESTED' ELSE CASE WHEN classification LIKE '%SS%' THEN 'SCRUB SHRUB' ELSE CASE WHEN classification LIKE '%EM%' THEN 'EMERGENT' ELSE CASE WHEN classification = 'U' THEN 'NONE' ELSE 'WATER' END END END END" -a "objectid,shape,classification,general_class" -u sde_dba

You’ll need to alter the registration of the view and tell ArcSDE what column contains the unique identifier (OBJECTID):

sdetable -o alter_reg -t WETLAND_NEW_VIEW -c OBJECTID -C USER -u sde_dba

You can also create or update the view manually in the database:

CREATE OR REPLACE VIEW wetland_view AS SELECT /*+ FIRST_ROWS */ objectid, shape, classification, CASE WHEN classification LIKE '%FO%' THEN 'FORESTED' ELSE CASE WHEN classification LIKE '%SS%' THEN 'SCRUB SHRUB' ELSE CASE WHEN classification LIKE '%EM%' THEN 'EMERGENT' ELSE CASE WHEN classification = 'U' THEN 'NONE' ELSE 'WATER' END END END END GENERAL_CLASS, OBJECTID SE_SDO_ROWID, SE_ANNO_CAD_DATA FROM wetland;

Note that we must include the SE_SDO_ROWID and SE_ANNO_CAD_DATA columns when manually replacing the view. Also, note that if you manually create the view without using the sdetable –o create_view command line tool, you will also need to populate Oracle Spatial’s metadata table and use the sdetable command to register the layer.

The syntax for sdetable -o create_view is fairly flexible and allows for joins, calculations in the select clause, aliases, etc. If you need to perform an operation not supported by this interface, then use the manual method to replace the view with the more complex query.

Note that you can also use Oracle Spatial operators to perform spatial joins, compute new geometry, etc:

CREATE OR REPLACE VIEW room_exterior AS SELECT SDO_GEOM.SDO_DIFFERENCE(r.shape, mr.diminfo,f.shape, mf.diminfo) shape, r.objectid room_oid, f.objectid floor_oid FROM room r, floor f, user_sdo_geom_metadata mr, user_sdo_geom_metadata mf WHERE mr.table_name = 'ROOM' AND mr.column_name = 'SHAPE' AND mf.table_name = 'FLOOR' AND mf.column_name = 'SHAPE' AND SDO_RELATE(r.shape, f.shape, 'mask=OVERLAPBDYDISJOINT+OVERLAPBDYINTERSECT querytype = JOIN') = 'TRUE'

This view returns only the parts of the room polygon geometry that exist outside the floor polygon with which it overlaps.

All ArcSDE clients can use these views. An advantage to this approach is that once a database view is constructed, it can be used by an ArcMap user to develop a production quality map and by a web developer using ArcIMS to publish the data on the web. In both cases the job was made significantly easier because the data was presented in the desired output format rather than the format in which it is edited and maintained.

Multi-versioned Views

Multi-versioned data presents some challenges to using views because the data is spread across the business table and its associated add and delete tables. The ArcSDE sdetable –o create_mv_view command can be used to create a view which presents a particular state of the associated layer:

sdetable -o create_mv_view -T BUILDING_MV -t BUILDING -u sde_dba

The resulting view selects the records from the business table which have not been deleted in the current state lineage and unions the records from the add table which were added (and not subsequently deleted) in the current lineage. The current state and associated lineage are based on a SQL session variable that defaults to the state identifier of the default version at the time the session was initiated.

The syntax for this command does not allow for more complex queries such as joins, so it is necessary to build a more sophisticated view using this view as a base. For example, here we would like to create a view based on the multi-versioned view we created earlier and join the GIS_DBA.BUILDING table to pick up the FIXED_ASSET_CODE column:

sdetable -o create_view -T BUILDING_VIEW -t SDE_DBA.BUILDING_MV,GIS_DBA.BUILDING -c "SDE_DBA.BUILDING_MV.objectid,SDE_DBA.BUILDING_MV.bldg_num, SDE_DBA.BUILDING_MV.lafb_num,SDE_DBA.BUILDING_MV.shape, GIS_DBA.BUILDING.FIXED_ASSET_CODE" -a "objectid,bldg_num,lafb_num,shape,fixed_asset_code" -w "GIS_DBA.BUILDING.BLDG_NUM = SDE_DBA.BUILDING_MV.BLDG_NUM" -u sde_dba

Unfortunately the command line tools do not recognize the multi-versioned view, so it is necessary to create a view using the business table and then replace this view with a new one which references the multi-versioned base view.

sdetable -o create_view -T BUILDING_VIEW -t SDE_DBA.BUILDING -c "OBJECTID, BLDG_NUM,LAFB_NUM,SHAPE" -u sde_dba sdetable -o alter_reg -t BUILDING_VIEW -c OBJECTID -C USER -u sde_dba CREATE OR REPLACE VIEW BUILDING_VIEW AS SELECT /*+ FIRST_ROWS */ SB.OBJECTID, SB.BLDG_NUM, SB.LAFB_NUM, GB.FIXED_ASSET_CODE, DECODE (GB.FIXED_ASSET_CODE, 'AFB', 'Building', 'BDG', 'Building', 'EXT', 'Building', 'OST', 'Facility', 'SST', 'Facility', 'TRA', 'Building') STRUCTURE_TYPE, SB.SHAPE, SB.OBJECTID SE_SDO_ROWID, SB.SE_ANNO_CAD_DATA FROM SDE_DBA.BUILDING_MV SB, GIS_DBA.BUILDING GB WHERE SB.BLDG_NUM = GB.BLDG_NUM(+);

Note that the last three columns should always be SHAPE, SE_SDO_ROWID and SE_ANNO_CAD_DATA. Additionally we added a dependent column called STRUCTURE_TYPE that is determined by decoding the FIXED_ASSET_CODE.

Use the SQL grant statement to give select privileges on the multi-versioned view and grant select access to the feature class and the final view using ArcCatalog.

This approach does have some drawbacks. The multi-versioned view contains a union that appears to make it difficult for some clients to properly determine the unique identifier. This manifests itself in different forms in each of the clients. In ArcMap, if you open the table on the view it appears to be empty, although you can classify the features and produce maps. In ArcView, the selection of features appears to be non-functional. In ArcIMS you cannot load the layer using Author, but you can manually add it to the XML.

To overcome these problems, we can implement a database trigger to maintain a simple base table that always reflects the default version of the specified layer. The trigger responds to changes in the STATE_ID field of the VERISONS table for the DEFAULT record, determines the changes based on the lineage and applies these changes to a simplified incarnation of the layer’s table. This simplified table contains only the OBJECTID and STATE_ID fields. A view can then be constructed by joining this table to the business and add tables (along with any additional tables) to get the desired attributes. This eliminates the ambiguity caused by the union, allowing all the clients to function properly. Additionally this means that clients such as ArcIMS will always see the current default version rather than the state of default at connect time.

COM versus Database Triggers

The use of triggers in the database has wider implications than solving the database view dilemma. In fact they can be used as an alternative to COM programming in many circumstances, particularly when using the Oracle Spatial SDO_GEOMETRY data type. In the example provided earlier of a building existing within a parcel, it was indicated that it may be advantageous to maintain a foreign key within the building class that will identify the associated parcel. This leads to redundant information being stored, but it is desirable because queries of this relationship will occur often and performing a spatial join for each query would be performance limiting. The problems associated with storing redundant information include the additional work required to maintain the information and the possibility of ambiguity resulting from inconsistent values. To eliminate these problems we would like to place this burden on the software rather than the human operator. To deal with these types of relationships, we developed a custom feature class extension called a container. As edits are made, containers automatically evaluate their position relative to their parent and child classes and update the relationships (foreign keys) as needed to keep this information synchronized with the spatial relationship. This feature class extension was developed in C++ using the COM/ATL template provided by the code generation wizard.

An alternative approach would be to develop a database trigger to perform the same functionality. Using Oracle Spatial’s SDO_RELATE operator, the containment can be determined within the trigger and the appropriate changes can be made to a foreign key field. There are several alternatives to implementing this approach. One alternative is to implement the trigger on the business and add tables’ geometry column and have it update the appropriate foreign keys as changes occur. Updating the foreign keys in this manner would bypass the multi-versioning and could therefore lead to incongruities if the user should undo any edits. The correct way to implement this alternative would be to apply the change just as ArcSDE would, by allocating a new state id and applying the changes to the add and delete tables. While this would work, it is dependent on a detailed knowledge of the workings of Esri’s multi-versioning system and could easily become broken in future releases. Another alternative would be to combine this problem with the multi-versioned view problem and use the same approach to solve both. By defining the trigger on the VERSIONS table, we can maintain an independent foreign key that exists in the simplified base table only. Therefore the data being edited by the operator does not contain any reference to this redundant information. This should not present a problem since the operations being performed by the editor do not typically require knowledge of this foreign key’s value. The operator’s task is simply to ensure that the contained feature (a building) is drawn inside the container feature (a parcel). When the operator saves or posts his changes to the default version, the base table for the view will be updated and all clients using the view will be able to take advantage of the foreign key. This alternative has the advantage that the core data tables (the business, add and delete tables) are not modified by the trigger and therefore coding errors cannot lead to corruption of the data.

Experience indicates that the database trigger approach outlined is easier to implement. Additionally, Esri has made changes to the behavior of their COM classes that have proven fragile to the operation of the COM/ATL code over time. This has required a significant amount of testing with each new release. The trigger approach would appear to be less sensitive to product evolution, although changes to the multi-versioning system would definitely still require modifications to the trigger code. One final consideration is that the trigger approach requires ArcSDE, Oracle, and Oracle Spatial, whereas the COM/ATL approach can be implemented on anything from a personal geodatabase up to any ArcSDE implementation. Therefore choice of implementation approach depends heavily on the particular nature and goals of the project and organization.

Conclusion

While there are significant technical hurdles to implementing a geodatabase in ArcSDE using the Oracle Spatial data type, there are also significant advantages including the ability to perform spatial analysis within a SQL statement to enhance non-graphical applications and to supplement the native capabilities of ArcSDE and ArcGIS. The ability to use database views allows users of ArcMap and developers of ArcIMS to quickly develop maps and websites without having to deal with the complexities of having to join, relate, or reclassify the raw data sources. The fact that everyone is using the same view helps minimize the possibility of error that could result from each having to reinvent the query. The use of database triggers can also solve many technical problems including the demonstrated use for multi-versioned views. Additionally triggers can be used to implement behaviors as an alternative to COM development, resulting in code which is potentially less fragile to chances in the core products. Choice of implementation method is dependent on the available resources of the organization and the requirements of the project. Some types of behavior may require immediate interaction that can only be implemented using COM, but for many applications, the use of triggers should be considered as an alternative.

Robert L. Gage
Principal Software Engineer, ConITS
ITSS, Raytheon Technical Services Company
NASA Langley Research Center GIS Team
Email: r.l.gage@larc.nasa.gov