Nick Schultz, Tom Mittiga

InfoShop.SA: Integrating Legacy and Spatial Online Systems with SDE, CORBA, Java and WWW



 

Introduction

While current GIS products are useful tools for analysing geographic and spatial data, they do not necessarily provide a sound platform for scalable integration of spatial and legacy textual systems. Nor do they provide an application distribution model suited to the very fast, high-transaction online delivery required by both internal applications and external (Internet) applications.

Online information delivery should be based on service delivery.

GIS products generally tend to be data centric, requiring access to low level data, spatial geometries as well as attribute data. They tend to rely on data being accurate and representative of the current legislation and business logic. In practice, legislation continually changes with time, hence changing the business logic or data interpretation. The data, however, does not change, and cannot be re-collected or modified to meet the business logic. New business logic has to be implemented to reinterpret existing data together with any new data to meet the new legislation.
Integration with other applications is usually done at the database level via ODBC and SQL. This low level data access necessitates (re-)implementing business logic within the GIS package. For large scale systems, such reimplementing of business logic should be avoided as it leads to errors, ambiguities, and high maintenance overheads.

An information custodian (system owner) should ideally provide access to published services, and not a direct connection to the database, that is, to raw data. An information custodian should not require its customers to implement the custodian's own business logic.

Legacy systems can be wrapped as objects (e.g. via an API, Messaging, socket, RPC or screen scraping) to provide this business object interface. Legacy systems are generally not SQL compliant. Those that are SQL compliant, and are high transaction rate online systems, generally use transaction processing monitors (e.g. Tuxedo, Top End). Consequently, there is no direct access to the database.
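As a minimal sketch of such a wrapper, the following Java fragment hides a screen-scraped legacy enquiry behind a business-level interface. Every name here (`TitleEnquiry`, `LegacyScreenSession`, `ScreenScrapingTitleEnquiry`, the `ENQ TITLE` command and the `OWNER:` screen line) is a hypothetical stand-in chosen for illustration; none is taken from LOTS or any other system described in this paper.

```java
import java.util.HashMap;
import java.util.Map;

/** Business-level service published to clients (hypothetical interface). */
interface TitleEnquiry {
    /** Returns the registered owner for a title reference, or null if unknown. */
    String ownerOf(String titleReference);
}

/** Stand-in for a terminal session to a legacy system (hypothetical). */
interface LegacyScreenSession {
    /** Sends one command and returns the raw screen text. */
    String send(String command);
}

/** Wrapper that hides the screen format behind the business interface. */
class ScreenScrapingTitleEnquiry implements TitleEnquiry {
    private final LegacyScreenSession session;

    ScreenScrapingTitleEnquiry(LegacyScreenSession session) {
        this.session = session;
    }

    @Override
    public String ownerOf(String titleReference) {
        String screen = session.send("ENQ TITLE " + titleReference);
        // Assumed layout: a line starting with "OWNER:" carries the owner name.
        for (String line : screen.split("\n")) {
            if (line.startsWith("OWNER:")) {
                return line.substring("OWNER:".length()).trim();
            }
        }
        return null;
    }
}

/** In-memory fake of the legacy session, used only to exercise the wrapper. */
class FakeLegacySession implements LegacyScreenSession {
    private final Map<String, String> owners = new HashMap<>();

    FakeLegacySession() {
        owners.put("CT 5001/100", "A. CITIZEN"); // made-up sample record
    }

    @Override
    public String send(String command) {
        String key = command.substring("ENQ TITLE ".length());
        String owner = owners.get(key);
        return owner == null
            ? "STATUS: NOT FOUND\n"
            : "STATUS: OK\nOWNER:   " + owner + "   \n";
    }
}
```

A client written against `TitleEnquiry` never sees the screen layout, so the legacy system could later be replaced without any change to the client.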

An architectural model is required that establishes a service delivery interface to business objects, and thus separates, and makes absolutely independent, the back end data access and the front-end presentation.

The proposed architectural model is suitable whether developing new or utilising existing systems, and is applicable across spatial, textual or image based systems.
 

Introduction to DEHAA and South Australian Government

The Department for Environment, Heritage and Aboriginal Affairs (DEHAA) is the major spatial information provider in South Australian Government to other agencies, external companies and the public.
Major external customer segments include the real estate industry, banks and financial institutions, surveyors, planners, plumbers, sub-dividers and speculators.
 
DEHAA is the lead agency in the joint S.A. Government and private sector Spatial Alliance.

Existing Information Systems

A number of major spatial or land related systems are in use across the South Australian Government agencies. Many have up to now only been available for access within a particular agency or part thereof. Others have a substantial history of being publicly accessible as a chargeable service to account customers.
Some of the major systems currently integrated within InfoShop.SA or under investigation for future integration are the following:

LOTS (Land Ownership and Tenure System) is a large system which is used for maintaining and providing access to land title and valuation textual information including:

The system also provides a facility to place orders for the following:

Customers can access LOTS via dial-up modem. LOTS is currently being rehosted from a Unisys A-Series Mainframe to a Sun platform.

DCDB (Digital Cadastral DataBase) is the fundamental spatial base reference for the State. It contains over 800,000 land parcels and associated information for the whole of the state including:

It is maintained in an Oracle database with in-house developed topological schema and extraction, locking and upload server software. Data is extracted and edited in ArcInfo.

PIERS (Plan Index, Enquiry and Retrieval System) provides for the maintenance, enquiry and retrieval of scanned plan images (File plans, Deposited plans, Strata plans, GRO etc.).

SDB (Survey DataBase) is a system for the maintenance of and enquiry on permanent survey marks.

Aerial Orthophotography consists of a series of geocoded aerial orthophotographs, currently of only a portion of the Adelaide metropolitan area.

TOPIS (Topographic Information System) contains topographic information for most of the developed area of the State. Topographic features include contours, rivers, lakes, streams, coastlines, roads and railways.

DFIS (Distributed Facilities Information System) is a system belonging to the S.A. Water Corporation. It contains water and sewer reticulation system spatial information, as well as associated scanned images of field book pages for as-constructed diagrams and internal drains (diagrams of property and house drainage).

Bore Holes and Mining Tenement information is maintained by Department of Primary Industries and Resources.

Licences for contaminated sites are maintained by the Environment Protection Agency (a Division of DEHAA).

Water License System is maintained by the Environment Protection Agency. It manages licences for water usage (dams, bore water, catchment areas).

Departmental Revenue System. Details of customers' payments for information products flow into the Accounts Receivable system.
 
All of the above systems provide specific information and navigation in their own right; when integrated, however, they form a powerful decision making tool.

Current Problems in these Systems

The following are some of the problems in the current individual implementation of these systems or in the integration of the systems as a whole:

Market Not Being Met.

A directions paper identified three customer groups for the department's spatial systems:
  1. Operators: internal users responsible for maintenance and update of the data asset
  2. Analysts: internal and external users who work with large data volumes for complex analysis, often of a project nature.
  3. Browsers: internal and external users who access smaller volumes of data, perform simple enquiries, with online responses.
The third group, potentially by far the largest, is currently not catered for by the existing systems and facilities. It is a prime candidate for Electronic Commerce. This group requires the visualization of spatial information as an adjunct to non-spatial information from legacy systems. It requires online response, so services and databases should be scalable to large numbers of users and highly optimized for fast delivery times.

The sharing of databases between Analysts and Browsers may lead to unpredictable delays in online response times. A design for the delivery of online spatial information services therefore requires a combination of pragmatism with production-grade thoroughness and robustness.

The third group, Browsers, both internal to Government and external (general public), needs the capability to access the systems from various locations. Consequently, the deployment of any client application needs to take into account the various desktop computers and operating systems that customers may want to use, as well as the fact that a spatial client application will typically be very graphical in nature. It becomes obvious that the best chance of satisfying these requirements is through the use of Internet and Web technology.

High Level Business Requirements

The high level business requirements for government information integration fall into a number of areas:

As part of the joint S.A. Government and private sector Spatial Alliance, the Spatial Information Integration Services (SIIS) Project seeks to develop a spatial information infrastructure which will enable spatial information from any number of disparate data sets from all state organizations to be integrated and delivered to users and business systems anywhere across the state. The Project's goals are:

Although begun by DEHAA prior to the establishment of the Spatial Information Integration Services Project, an initial implementation of InfoShop.SA will be the first deliverable of this project, and will provide a substantial foundation for future stages and further deliverables of the project.

Within DEHAA, the Resource Information Division plays a major role in the custodianship of the State's major spatial information assets which form the reference base for many other spatial data sets. It is also charged with the responsibility for the efficient delivery of these to Government and private sector users. Other Divisions of DEHAA, including Heritage and Biodiversity, and the Environment Protection Authority, are also custodians of significant spatial information assets for which they have the requirement, in some cases legislated, that the information be made readily available to external public users.

As the primary custodian of the State's spatial information, DEHAA must ensure the maintenance procedures of the data assets are sustained to the appropriate level of quality.

As the primary provider of the State's spatial information, DEHAA must balance the requirements of

The Integrity of Delivered Information requires that raw data be converted to information at an appropriate level of abstraction by applying business logic before delivery. When raw data is provided to customers, they must implement the provider's own business logic themselves, which raises the risk that they will misinterpret the data. This undesirable situation is avoided if the business logic is applied before any information is provided.

The currency of the information is another important part of its integrity. While copies of data and databases can be created and maintained, for many applications the most up-to-date, authoritative information must be provided (e.g. for legal or commercial reasons). Some legacy systems store the day's transactions in transient files, only applying these to the database during nightly batch runs. Legacy business logic references these transient files, so a copy of the database will never be as current as information returned via the legacy online system.
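The effect of transient files on currency can be illustrated with a toy model. The sketch below is not drawn from any of the systems described here; `LegacyStore`, `PendingChange` and their methods are hypothetical names, and a real legacy system would hold far richer transaction records. It shows only why a read against the database alone lags a read through the legacy business logic until the nightly batch has run.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One change recorded during the day (hypothetical shape). */
class PendingChange {
    final String key;
    final String newValue;

    PendingChange(String key, String newValue) {
        this.key = key;
        this.newValue = newValue;
    }
}

/** Toy model of a legacy system that updates its database only at night. */
class LegacyStore {
    private final Map<String, String> database = new HashMap<>();
    private final List<PendingChange> transientFile = new ArrayList<>();

    void load(String key, String value) {
        database.put(key, value);
    }

    /** Online update: recorded in the transient file, database untouched. */
    void recordTransaction(String key, String newValue) {
        transientFile.add(new PendingChange(key, newValue));
    }

    /** What a copy of, or direct connection to, the database would return. */
    String readDatabaseOnly(String key) {
        return database.get(key);
    }

    /** What the legacy business logic returns: database plus pending changes. */
    String readOnline(String key) {
        String value = database.get(key);
        for (PendingChange change : transientFile) {
            if (change.key.equals(key)) {
                value = change.newValue; // later entries override earlier ones
            }
        }
        return value;
    }

    /** Nightly batch run: apply the day's changes and empty the file. */
    void nightlyBatch() {
        for (PendingChange change : transientFile) {
            database.put(change.key, change.newValue);
        }
        transientFile.clear();
    }
}
```

Between `recordTransaction` and `nightlyBatch`, `readOnline` and `readDatabaseOnly` disagree for the affected key, which is the gap a database copy cannot close.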

The Delivery System Management includes issues of

The differing Delivery Functionality requirements of Analysts and Browsers need to be accommodated. The system needs to be accessible by the customers. The system needs to be easy and intuitive to use with the required functionality to meet the customers' business requirements.  For Browsers, there is an increasing demand for integration of various on-line information sources.

The Delivery Value for Money depends on the price charged compared to how well the information matches the customer's requirement, and how easy it is for the customer to access, navigate, query, and interpret the results. It is a measure of the customer's confidence in the integrity and correctness of the information, whether the system is available when the customer needs it, and whether it is an enjoyable shopping experience.

Architecting to Reduce Complexity

Early attempts by organizations at integration of systems are characterized by heavily customized interfaces between individual systems. This leads to a solution of Order (N x N) complexity where each system has a specialized interface to every other system. Business logic for an individual system is spread across to the other systems, possibly mutating in the process. This complexity arises particularly when the integration is attempted at the raw data or database level (i.e. by allowing access to the database schema), but it is just as likely whenever custom interfaces are developed on an ad-hoc basis. The design tends to become very monolithic in nature with few, if any, reusable components.


By elevating the access to a system to the services level, where business logic is encapsulated behind a well defined interface, the complexity is reduced to Order (N). Each system manages its own business logic, and reimplementation of business logic by other systems is no longer required. Access via the interface becomes the canonical information retrieval method for the system.
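The difference in growth can be made concrete with a small calculation. The sketch below assumes that "a specialized interface to every other system" means one interface per ordered pair of systems, giving N x (N - 1); the class and method names are illustrative only.

```java
/** Counts the interfaces needed to integrate a number of systems. */
class IntegrationComplexity {
    /** Every system has a custom interface to every other system. */
    static long pointToPointInterfaces(int systems) {
        return (long) systems * (systems - 1);
    }

    /** Every system publishes a single service interface. */
    static long publishedInterfaces(int systems) {
        return systems;
    }
}
```

For ten systems this gives 90 point-to-point interfaces against 10 published ones, and adding an eleventh system adds 20 interfaces in the first case but only one in the second.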

The Interface becomes the service contract between provider and customer.

The collection of individual systems now coalesces into a super-system, so that from a client application's point of view, the services are seen as one complete integrated service rather than as separate systems.

An information bus is created.

Integration of information on different systems by clients becomes relatively simple. However, it may not always be problem free, as there may be issues resulting from different data currency in the systems, or mismatches in basic query keys (e.g. name, address). These differences between systems are not easy to rectify and may often require recapturing data and/or reengineering. They need to be identified  and become part of the documentation for the interface. New business logic may be implemented in a server, and an associated interface provided, to aid in information integration tasks. Reengineering for unique identifiers, while generally expensive and difficult in the short term, is a solution which will usually reap significant benefits in the longer term.

As a service contract, not only are many possible types of clients permitted, but there may be many different implementations of servers. These can range from servers that access the database directly, to servers that are a wrapper for a legacy system, to servers that access real time data acquisition and monitoring devices. The client application developers are shielded from the complexities of the server implementation, and are able to more rapidly implement information systems satisfying business needs.

With this model, interfaces are distinct from server implementations as there may be many instantiations of servers implementing the same interface. This can be a powerful tool to make the overall system more generic. To aid genericity, meta-information services (e.g. Naming, Trading & Directory services) can be used so that clients are able to locate and access an appropriate server for their needs.
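The idea of several servers behind one interface, located through a directory, can be sketched in plain Java. This is an in-process illustration only, not the CORBA Naming or Trading service API; `ParcelLookup`, `ServiceOffer` and `ServiceDirectory` are hypothetical names, and the single `currency` property stands in for whatever properties a real trader would advertise.

```java
import java.util.ArrayList;
import java.util.List;

/** Published contract that every server implements (hypothetical). */
interface ParcelLookup {
    String describe(String parcelId);
}

/** One advertised implementation together with a descriptive property. */
class ServiceOffer {
    final String currency; // e.g. "live" or "nightly-copy" (assumed values)
    final ParcelLookup service;

    ServiceOffer(String currency, ParcelLookup service) {
        this.currency = currency;
        this.service = service;
    }
}

/** Minimal directory: servers register offers, clients look them up. */
class ServiceDirectory {
    private final List<ServiceOffer> offers = new ArrayList<>();

    void register(String currency, ParcelLookup service) {
        offers.add(new ServiceOffer(currency, service));
    }

    /** Returns the first offer with the requested property, or null. */
    ParcelLookup find(String currency) {
        for (ServiceOffer offer : offers) {
            if (offer.currency.equals(currency)) {
                return offer.service;
            }
        }
        return null;
    }
}
```

A client asks the directory for a `ParcelLookup` with the property it needs and then calls `describe` without knowing whether the server wraps a legacy system or reads a database.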

GIS Architecture

Spatial data holds some challenges in implementing a multi-tier architecture. GIS products have traditionally had a monolithic architecture, with presentation, business logic, and data storage all in the one program. A number of different proprietary data storage formats have arisen. Special spatial indexing is required for the storage and efficient  retrieval of spatial data. GIS and database companies are now delivering products that allow for the storage and retrieval of spatial data to be separated from the business logic and presentation tiers. Given this, separating the business logic from the presentation of spatial data is then the next logical architectural step.

Most GIS models tend to implement an abstraction commonly known as layers, themes, or coverages which contain geometric entities and their textual (or numeric) attributes. Integration with other textual information sources is achieved through ODBC and SQL based connections to relational databases. However, the majority of information systems throughout the world are not in relational databases and are not SQL compliant.

For the Internet deployment of spatial information systems, a map image server which takes vector data and delivers raster images to a web browser is sometimes used. The map image server may be a full GIS product and respond to mouse clicks on the image. With this architecture, the presentation is effectively split between the web browser and the GIS product, similar to X-Windows. The GIS product still contains much of the presentation logic as well as all the business logic for the spatial data.
 
The InfoShop.SA architecture is specifically designed to allow for the separation of the data access, business, and presentation logic layers, with well defined and open interfaces incorporating spatial, textual and image information.

InfoShop.SA: A Framework for an Online Integrated Enquiry System

The InfoShop.SA mission is to define an application framework for the enquiry and delivery of integrated information from various custodians in a scalable, secure, and accountable manner:

Fundamental use is made of the Object Management Group's (OMG) Common Object Request Broker Architecture (CORBA). The information services are provided through CORBA distributed objects. These objects encapsulate business logic. They may retrieve data in various ways:

Whatever the source of data, the goal is to create publishable interfaces to business level objects; hence, CORBA IDL (Interface Definition Language) is used to define the interfaces to these objects. Other defined CORBA standard services can be used for meta-information services, to help in selecting an appropriate object.
 
The clients bind to objects and execute their methods by means of an Object Request Broker (ORB), which is effectively distributed object middleware. Many ORB products are available from numerous vendors. Communication with objects is achieved via the Internet Inter-ORB Protocol (IIOP) which constitutes the low level wire protocol. Interoperation between products and vendors is achieved through use of this standard.

The differing requirements of maintenance (update) and delivery of information mean that sometimes it is preferable to split these into two different systems. The business requirements for DCDB maintenance require that the full spatial topology (relationships between constituent polygons, lines, and nodes) be stored within a relational database. For delivery of spatial information, a non-topological storage schema (SDE) has been used in order to optimise for query and retrieval speed. Regular transactional updates (currently nightly) from the maintenance database to the delivery database ensure that the information remains current and complete.
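One way to picture a transactional update from a maintenance database to a delivery database is a batch of changes that is applied entirely or not at all. The sketch below is an assumption-laden illustration, not the DCDB or SDE update mechanism: `ParcelChange` and `DeliveryStore` are hypothetical, geometries are reduced to opaque strings, and the "transaction" is simulated by building a working copy and swapping it in only when every change has been applied.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** One change exported from the maintenance database (hypothetical shape). */
class ParcelChange {
    enum Kind { UPSERT, DELETE }

    final Kind kind;
    final String parcelId;
    final String geometry; // opaque geometry text; null for DELETE

    ParcelChange(Kind kind, String parcelId, String geometry) {
        this.kind = kind;
        this.parcelId = parcelId;
        this.geometry = geometry;
    }
}

/** Toy delivery store holding non-topological geometries keyed by parcel. */
class DeliveryStore {
    private Map<String, String> parcels = new HashMap<>();

    /** Applies the whole batch, or leaves the store unchanged on error. */
    void applyBatch(List<ParcelChange> batch) {
        Map<String, String> working = new HashMap<>(parcels);
        for (ParcelChange change : batch) {
            if (change.kind == ParcelChange.Kind.UPSERT) {
                if (change.geometry == null) {
                    throw new IllegalArgumentException(
                        "UPSERT without geometry: " + change.parcelId);
                }
                working.put(change.parcelId, change.geometry);
            } else {
                working.remove(change.parcelId);
            }
        }
        parcels = working; // swap only after every change has been applied
    }

    String geometryOf(String parcelId) {
        return parcels.get(parcelId);
    }

    int size() {
        return parcels.size();
    }
}
```

Because readers only ever see the old map or the new one, an enquiry never observes a half-applied batch.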

Security is an issue for all systems and Internet deployment only heightens the requirements.  Security must be instituted at a number of levels. Network firewalls are used to protect the production host network from the rest of the government network as well as from the Internet. As part of the firewall, the bastion host is the only access point. It runs the Web Servers and Application Proxies to which the clients connect. The clients cannot connect directly to the back-end production hosts.

Additional utility servers are used for user authentication and authorisation, and to log and process usage accounting information. Information on chargeable activities is required to be fed into the Departmental Revenue system.
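The relationship between an information service and the authorisation and accounting utilities can be sketched as a wrapper around the service. This is an illustrative assumption about structure, not a description of the InfoShop.SA utility servers: `TitleSearch`, `UsageRecord` and `AccountedTitleSearch` are hypothetical, the authorised-user set and the usage log are simple in-memory collections, and the fee is an arbitrary parameter.

```java
import java.util.List;
import java.util.Set;

/** A chargeable information service (hypothetical interface). */
interface TitleSearch {
    String search(String user, String titleReference);
}

/** One accounting entry destined for a revenue system (hypothetical). */
class UsageRecord {
    final String user;
    final String service;
    final int feeCents;

    UsageRecord(String user, String service, int feeCents) {
        this.user = user;
        this.service = service;
        this.feeCents = feeCents;
    }
}

/** Checks authorisation, delegates, then logs the chargeable activity. */
class AccountedTitleSearch implements TitleSearch {
    private final TitleSearch delegate;
    private final Set<String> authorisedUsers;
    private final List<UsageRecord> usageLog;
    private final int feeCents;

    AccountedTitleSearch(TitleSearch delegate, Set<String> authorisedUsers,
                         List<UsageRecord> usageLog, int feeCents) {
        this.delegate = delegate;
        this.authorisedUsers = authorisedUsers;
        this.usageLog = usageLog;
        this.feeCents = feeCents;
    }

    @Override
    public String search(String user, String titleReference) {
        if (!authorisedUsers.contains(user)) {
            throw new SecurityException("not authorised: " + user);
        }
        String result = delegate.search(user, titleReference);
        // Logged only after the delegate succeeds, so failures are not charged.
        usageLog.add(new UsageRecord(user, "TitleSearch", feeCents));
        return result;
    }
}
```

Keeping these checks in a wrapper leaves the business logic of the underlying service free of security and charging concerns.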

The security features are expected to benefit from the incorporation of the CORBA security service standard, as products become available. Standards for credit card and other electronic payments systems are expected also to be incorporated.

Client Architecture

Many client applications can be implemented to make use of the backend information services. A client application writer will make use of the published IDL-defined interfaces. The interfaces are not language specific. The applications may be written in any language for which an ORB vendor supplies a mapping. Additionally CORBA/COM bridges may be used to extend access to Microsoft COM applications.

The main target platform for applications today is the web browser. This platform provides for easy deployment of static and dynamic pages of information. However, HTML and webserver based applications (e.g. a map image server using CGI, NSAPI, or ISAPI) are limited in their ability to provide the responsive dynamic graphic user interfaces (GUI) required for spatial information. An object oriented GUI is needed to provide the dynamic ability to select and highlight individual spatial objects (points, lines, polygons) and then query them for additional information. In contrast, a map image server cannot achieve the required responsiveness and interactivity, as user interaction with the map image (e.g. a mouse click to highlight a polygon, or to pan or zoom) requires a new map image to be generated by the map image server and delivered by the web server.
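The contrast with a map image server can be seen in how selection works when geometry is held on the client. The sketch below is a generic even-odd ray-casting hit test, offered as an illustration rather than as the method used in the InfoShop.SA applet; `PolygonFeature` and `ClientSideSelection` are hypothetical names. A mouse click is resolved against locally held polygons with no request to any server.

```java
import java.util.List;

/** A polygon held on the client, as vertex arrays (hypothetical shape). */
class PolygonFeature {
    final String id;
    final double[] xs;
    final double[] ys;

    PolygonFeature(String id, double[] xs, double[] ys) {
        this.id = id;
        this.xs = xs;
        this.ys = ys;
    }

    /** Even-odd ray casting: counts edge crossings to the right of the point. */
    boolean contains(double px, double py) {
        boolean inside = false;
        for (int i = 0, j = xs.length - 1; i < xs.length; j = i++) {
            boolean crosses = (ys[i] > py) != (ys[j] > py);
            if (crosses) {
                // x coordinate where edge (j -> i) meets the horizontal line y = py
                double xAtPy = xs[j] + (py - ys[j]) * (xs[i] - xs[j]) / (ys[i] - ys[j]);
                if (px < xAtPy) {
                    inside = !inside;
                }
            }
        }
        return inside;
    }
}

/** Resolves a click against locally held features. */
class ClientSideSelection {
    /** Returns the id of the first feature containing the point, or null. */
    static String select(List<PolygonFeature> features, double x, double y) {
        for (PolygonFeature feature : features) {
            if (feature.contains(x, y)) {
                return feature.id;
            }
        }
        return null;
    }
}
```

Highlighting the returned feature is then a local redraw, whereas a map image server would have to render and deliver a new image for the same click.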

The Java Virtual Machine (JVM) has been incorporated into the two major browsers from Netscape and Microsoft. This provides an excellent platform for the easy deployment of a highly interactive graphic presentation coupled with the implementation of communications protocols such as CORBA/IIOP.

For InfoShop.SA deployment, a Java Applet has been developed to provide the presentation layer across the range of backend services. Text-based queries can be made which lead to textual, spatial or document image responses. Easy navigation across textual and spatial information is achieved. Confirmation is requested before any chargeable queries are made to a server.
A major feature is the dynamic map which provides the focus of activity for spatial applications. Multiple layers, or themes, can be viewed, and can be toggled on and off. Point, line and polygon geometry structures are returned from the spatial server. These are cached locally within the applet, along with simple attribute information. Pan and zoom capability allows the user to spatially browse an area with extra data being retrieved as needed for display. The spatial entities (points, lines, polygons) can be selected (by single clicking), and queried upon (by double clicking). Selecting an entity displays its associated attributes. On querying an entity, and after confirmation, the applet may launch a query on one of the backend servers to retrieve information. For instance:

With this form of integration, the whole is greater than the sum of the parts. The added benefits flow from the visualisation achieved from spatial browsing with multiple spatial themes, and from the assured links to the authoritative and most up-to-date information sources.
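The local caching behind the dynamic map, where panning retrieves extra data only as needed, can be sketched with a tile-keyed cache. The tiling scheme is an assumption made for illustration; the paper does not say how the applet partitions its cache. `SpatialSource` stands in for the remote spatial server, features are reduced to strings, and `MapFeatureCache` is a hypothetical name.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Stand-in for the remote spatial server (hypothetical interface). */
interface SpatialSource {
    List<String> featuresInTile(int tileX, int tileY);
}

/** Client-side cache that fetches only tiles not already held. */
class MapFeatureCache {
    private final SpatialSource source;
    private final double tileSize;
    private final Map<String, List<String>> tiles = new HashMap<>();
    private int remoteCalls = 0;

    MapFeatureCache(SpatialSource source, double tileSize) {
        this.source = source;
        this.tileSize = tileSize;
    }

    /** Returns features for every tile touched by the view rectangle. */
    List<String> featuresInView(double minX, double minY, double maxX, double maxY) {
        int x0 = (int) Math.floor(minX / tileSize);
        int x1 = (int) Math.floor(maxX / tileSize);
        int y0 = (int) Math.floor(minY / tileSize);
        int y1 = (int) Math.floor(maxY / tileSize);
        List<String> result = new ArrayList<>();
        for (int tx = x0; tx <= x1; tx++) {
            for (int ty = y0; ty <= y1; ty++) {
                String key = tx + ":" + ty;
                List<String> cached = tiles.get(key);
                if (cached == null) {
                    cached = source.featuresInTile(tx, ty); // remote round trip
                    tiles.put(key, cached);
                    remoteCalls++;
                }
                result.addAll(cached);
            }
        }
        return result;
    }

    /** Number of requests made to the spatial source so far. */
    int remoteCalls() {
        return remoteCalls;
    }
}
```

Panning the view by one tile width re-uses the tiles still in view and requests only the newly exposed ones, so the number of remote calls grows with the area explored rather than with the number of redraws.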

Technical Design Issues & Decisions

The following products and technologies were chosen to initially support the application framework. Further products may be included at a later date. Similar, even competing, products may be used where it makes business sense or because integration of existing legacy systems brings along a different product set. The architecture is designed as a long lasting one, and some of these products may wane and others take their place. This is the strength of systems based on architecture rather than on specific products: the system is evolutionary and may outlive particular products, platforms and implementations.

Java

The main technical decision after choosing CORBA has been the choice of Java as the main development language. Additional features are being put into future releases of the JDK (Java Development Kit) by Sun, but JDK1.1 has proved sufficiently stable for our developments. There has been a substantial increase in productivity of programmers using Java compared to when they were using C/C++, mainly due to Java's inherent memory management.

Esri SDE

The Spatial Database Engine (SDE) product from Esri has been used to build and deploy an optimised delivery spatial database. A non-topological storage schema has been used in order to optimise for query and retrieval speed. Regular transactional updates from the maintenance database to the delivery database ensure that the information remains current and complete.
Lack of a Java (or CORBA) interface has hampered development. SDE currently provides only a 'C' language interface. A Java interface has been built on top of this. A Java CORBA spatial server has been created. Differences in threading models have meant that the CORBA server has had to be split, with an RMI server providing the interface to SDE. It is hoped a future version of SDE will provide either a direct Java interface or a CORBA interface.
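The paper does not say what the threading difference was, so the following is only a generic illustration of one common concern: a native library that must not be called concurrently from the many threads of a server. It is not the RMI split described above but a simpler in-process technique, confining all native calls to a single worker thread. `NativeSpatialApi` is a hypothetical stand-in for a Java binding over a C interface, and `SingleThreadSpatialGateway` is likewise an invented name.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

/** Stand-in for a binding over a non-thread-safe native API (hypothetical). */
interface NativeSpatialApi {
    int countFeatures(String layer);
}

/** Funnels every native call through one dedicated worker thread. */
class SingleThreadSpatialGateway {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final NativeSpatialApi api;

    SingleThreadSpatialGateway(NativeSpatialApi api) {
        this.api = api;
    }

    /** Safe to call from any thread; the native call runs on the worker. */
    int countFeatures(final String layer) {
        Callable<Integer> task = () -> api.countFeatures(layer);
        try {
            return worker.submit(task).get();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("native call failed", e.getCause());
        }
    }

    /** Stops the worker thread so that the JVM can exit. */
    void shutdown() {
        worker.shutdown();
    }
}
```

Moving the native interface into a separate server process, as was done here with RMI, achieves a similar isolation across a process boundary.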

Visigenic Visibroker for Java

Visigenic Visibroker for Java was chosen as the CORBA ORB. At the time, it was the only ORB supporting server side Java.
It allows Internet deployment via the Gatekeeper application proxy and HTTP/IIOP tunnel.
Visigenic have a COM/CORBA Bridge which can be used to provide access to CORBA objects from a Microsoft Windows COM application environment and vice-versa. Visigenic is yet to release implementations of the CORBA Security Service, Transaction Service and Trader Service.
Visigenic is now owned by Borland.

Oracle7

Oracle is the South Australian Government's preferred relational database.
The Oracle JDBC driver provides a method for Java to connect to an Oracle Database.
Oracle8 and the Spatial Data Cartridge are under investigation for possible future use.

WebServer

The main webserver in use is Apache. Others in use are Java Web Server and Netscape Enterprise Server.
Use is made of Java Servlets, which can be supported in all three webservers.

Web Browsers

The decision was taken to target Java 1.1 compatible browsers, which currently are:

Differences between browsers have sometimes restricted development progress. There has especially been a problem enabling printing from applets in a consistent way.
 

InfoShop.SA - Current Project Status

InfoShop.SA is to be the first deliverable of the Spatial Information Integration Services Project. It is scheduled for production release in June 1998. The initial release will feature integration of the following systems:

Security, accounting and charging will be included. The initial target customer base will be internal government users along with existing external LOTS users. Internet access is scheduled to occur later in the year.

In March 1998, DEHAA received an Australian Government Technology Productivity Gold Award for InfoShop.SA.

Business Benefits to be Realized

The business benefits that are anticipated to be realized by the Department from the use of InfoShop.SA include:

Futures

Proposed future developments include:

Conclusion

A general framework for distributed systems has been described for the online delivery of integrated information. The information may be in various forms including textual, document images and spatial. The delivery of spatial information to a client application can be encompassed by a scalable enterprise application framework to provide a key visualization and navigation aid within the application. A standard information service interface for each information system contributes to the ease of integration between the systems and the overall integrity of information. To meet customer business requirements, the integration in many cases needs to be with a legacy system and not with a database (whether a copy or the master database).

An instantiation of this framework is currently being implemented within a State Government context across multiple agencies. Streamlined workflows are expected to yield efficiencies within Government agencies. The Department will be able to provide information to customers with increased usability as well as integrity. This increases the value of the information to the customers as it better satisfies their business requirements.
 



Author Information

Nick Schultz
Technical Consultant
Department for Environment, Heritage and Aboriginal Affairs (South Australia)

GPO Box 1047
Adelaide S.A. 5001
AUSTRALIA

Email:  nick@dehaa.sa.gov.au
Telephone: +61 8 82049215
Facsimile: +61 8 82049017

Tom Mittiga
Manager IT Planning
Department for Environment, Heritage and Aboriginal Affairs (South Australia)

GPO Box 1047
Adelaide S.A. 5001
AUSTRALIA

Email:  tmittiga@dehaa.sa.gov.au
Telephone:  +61 8 82049010
Facsimile: +61 8 82049017