Accessing spatial data and metadata using ArcIMS and the open source language PHP

Andri Baltensweiler

Abstract

This paper presents the integration of, and access to, spatial data and non-spatial metadata using ArcIMS and PHP, an open source server-side scripting language. The project "Data Center for Nature and Landscape" of the Swiss Federal Research Institute for Forest, Snow and Landscape WSL, Switzerland, aims to develop a comprehensive storage and retrieval system for environmental data based on a process-oriented database model. This model reflects the lineage, or history, of data sets. Combining ArcIMS and PHP makes it possible to access and visualize metadata such as PDF documents and images via the internet in a user-friendly and effective manner.

INTRODUCTION

The Swiss Federal Institute WSL and the Swiss Agency for the Environment, Forest and Landscape agreed on a research arrangement concerning the storage and management of data on environmental protection areas such as mires, riverine forests and semi-meadows (Grünig, 1994). This project, called "Data Center for Nature and Landscape" (DNL), focuses on the design and implementation of a database containing geometric objects which represent the spatial extent of these protected areas. In addition to these spatial data, base documents are stored as well: images, documents containing the criteria for selecting protection areas, and governmental documents which form the legal basis for the protected zones.

The database design of the long-term project DNL has to consider these different types of information. The selection of environmental protection areas consists of several steps, starting for instance with a governmental decision which initiates the selection process. In the next step, experts determine criteria for choosing appropriate areas. Subsequently, base documents are used for digitizing potential protection areas. During the succeeding field work, the potential areas and their spatial extents are verified, followed by a refinement of the digital objects. Finally, a legal document published by the government determines the exact position and extent of the protection areas. Based on this stepwise workflow, a process-oriented database model was introduced. Every step in delineating protection areas is considered an individual process which is related to a certain type of data and the corresponding metadata (Brändli, 2000a).

Besides the design and implementation of a database model, the project agreement includes the implementation of the access software that enables the exploration of the data via the internet. The client application has to be capable of visualizing both the spatial data and the corresponding metadata such as images and documents. The data have to be accessible through an application interface that is easy and intuitive to use.

THE DATABASE DESIGN

As already described, the database scheme of the DNL project focuses on the processing steps of the data. This approach aims to preserve the history, or lineage, of the data so that the data are comprehensible and reproducible. Lineage is explained as "The recounting of the life cycle of a data set, from its collection or acquisition, through the many stages of compilations, corrections, conversions, and transformations to the generation of new interpreted products" (Clarke and Clark, 1995). The database model represents each work step as a process, which is in turn considered a transformation in which new information or data are generated. For each process the necessary metadata (e.g. observer, time, software and hardware used) and the process data (e.g. image data, spatial extent, documents) are stored (for an example see Figure 1). Similar processes are combined into process classes, each of which is characterized by similar metadata and process data types (e.g. spatial geometry, image file).

To describe the whole lineage of a data set, the process classes alone are not sufficient; their interrelationships must be defined as well (Brändli, 2000b). This means that references to the predecessors as well as to the successors of a specific process have to be stored explicitly as part of the metadata. A detailed description of the design and implementation of the DNL database approach is presented in Baumberger and Hägeli (2000).
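The explicit predecessor and successor references can be illustrated with a minimal sketch in Python. It is not the DNL schema: the class name `Process`, the helper `link`, the process classes and the metadata values are all hypothetical and serve only to show how lineage links might be stored alongside the metadata of each process.

```python
from dataclasses import dataclass, field


@dataclass
class Process:
    """One processing step with its metadata and explicit lineage links.

    All field names are illustrative assumptions, not the DNL schema.
    """
    process_id: int
    process_class: str                                  # e.g. "digitizing"
    metadata: dict = field(default_factory=dict)        # observer, time, software ...
    predecessors: list = field(default_factory=list)    # ids of input processes
    successors: list = field(default_factory=list)      # ids of derived processes


def link(processes, earlier_id, later_id):
    """Record in both directions that `later_id` was derived from `earlier_id`."""
    processes[earlier_id].successors.append(later_id)
    processes[later_id].predecessors.append(earlier_id)


# Hypothetical example data following the selection steps described above.
processes = {
    1: Process(1, "governmental decision", {"observer": "agency"}),
    2: Process(2, "criteria definition", {"observer": "expert group"}),
    3: Process(3, "digitizing", {"software": "GIS"}),
}
link(processes, 1, 2)
link(processes, 2, 3)
```

Storing the links in both directions means that the lineage of a data set can be followed backwards to its origin as well as forwards to the products derived from it.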

Figure 1: Metadata as a part of process steps and linking of processes

In this paper the term metadata is used not only for data which describe processes themselves but also for the data which are related to spatial features such as legal documents, images etc.

The database model and the storage of the data are implemented in an Oracle database using relational modelling techniques. For the storage, management and access of the spatial data we use the Spatial Database Engine (SDE) from Esri (Esri, 1998).

VISUALIZING METADATA WITH ArcIMS

Requirements

The client software for database access in our project DNL has to fulfill various requirements. The application has to provide all the common GIS functions such as zooming, panning and buffering. Many specialists with various backgrounds are involved in the long-term project DNL. Since most of them are neither GIS specialists nor familiar with a database query language such as SQL (Structured Query Language), the development of an easy and intuitive application interface was a key issue. In addition, the data have to be accessible to a wide user community spread all over Switzerland, so they have to be served via the internet. An internet solution, however, raises issues of security and access control, which have to be addressed by defining different levels of privileges.

Another key issue concerns the design of the client tool, which has to be capable of visualizing both the spatial data and the corresponding metadata in one application. The intention is to let the user make a spatial selection on geometric features so that the relevant non-spatial metadata appear automatically. In some cases the metadata are related to a multitude of geometric features. This kind of information can be, for example, a political resolution, a scientific project or a text overview of the scientific inventories. In this case the desired data can only be retrieved by recursive programming in combination with the relevant metadata (Steinmeier, 2000). Recursive programming makes it possible to track the lineage of processes and thus to determine the specific process number, which in turn identifies the wanted records, i.e. the information searched for.
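The recursive tracking of lineage can be sketched as follows. This is a minimal illustration in Python under stated assumptions, not the DNL implementation: the table `LINEAGE`, its process numbers and process classes, and the function `find_ancestors` are hypothetical. Starting from the process of a selected feature, the function follows the predecessor references until it reaches processes of the wanted class.

```python
# Hypothetical lineage table: process number -> (process class, predecessor numbers)
LINEAGE = {
    10: ("political resolution", []),
    20: ("criteria definition", [10]),
    31: ("digitizing", [20]),
    32: ("digitizing", [20]),
    41: ("field verification", [31]),
}


def find_ancestors(process_no, wanted_class, lineage=LINEAGE, seen=None):
    """Recursively walk predecessor links and collect processes of a given class."""
    if seen is None:
        seen = set()
    if process_no in seen:           # guard against accidental cycles
        return []
    seen.add(process_no)
    process_class, predecessors = lineage[process_no]
    found = [process_no] if process_class == wanted_class else []
    for pred in predecessors:
        found.extend(find_ancestors(pred, wanted_class, lineage, seen))
    return found


# Both features derived from different digitizing steps share one resolution.
resolution_for_41 = find_ancestors(41, "political resolution")
resolution_for_32 = find_ancestors(32, "political resolution")
```

In this sketch the features belonging to processes 41 and 32 both resolve to process number 10, which mirrors the case in which one document is related to a multitude of geometric features.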

Finally, the application has to be able to deal with any metadata format, for example ASCII text files, PDF documents, or images such as TIFF and JPEG.

Implementation

Esri’s internet mapping software ArcIMS satisfies most of the requirements regarding the GIS part of the application. ArcIMS provides a highly scalable architecture and a good integration with other Esri products.

In the first phase of the project the ArcIMS standard HTML client was chosen. This client provides all basic GIS functions and runs on Internet Explorer or Netscape, version 4.x or higher. ArcIMS 3.1 provides the required security control in combination with the standard Servlet Connector, so that only people with the proper credentials have access.

In our project DNL, ArcIMS is installed on a UNIX server running Solaris 8 with an Apache web server. Esri’s HTML client was adopted for the DNL project with minor changes.

However, in order to visualize the metadata, new functionality was added to the client by means of the open source language PHP. PHP is a server-side scripting language for creating dynamic web pages (Castagnetto et al., 1999). Unlike other scripting languages for web page development, PHP offers excellent connectivity to the most common databases. PHP is installed as a module in the Apache web server.

The mechanism for retrieving the desired metadata is implemented as follows (Figure 2). The standard HTML client is extended by a button which allows the user to query metadata for selected spatial features. The user clicks on this button and performs a spatial selection on the active layer. This action triggers client-side JavaScript code which parses the incoming ArcXML stream from the ArcIMS application server.
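The parsing step can be illustrated with a short sketch. In the DNL client this work is done by JavaScript in the browser; the same extraction logic is shown here in Python for brevity. The sample response is a simplified assumption about the shape of an ArcXML feature response, and the field name `OBJ_ID`, the feature names and the function `selected_keys` are hypothetical.

```python
import xml.etree.ElementTree as ET

# Simplified, assumed shape of an ArcXML response to a feature query.
# Field names and values are hypothetical.
SAMPLE_RESPONSE = """<ARCXML version="1.1">
  <RESPONSE>
    <FEATURES>
      <FEATURE><FIELDS OBJ_ID="101" NAME="Mire A" /></FEATURE>
      <FEATURE><FIELDS OBJ_ID="205" NAME="Mire B" /></FEATURE>
      <FEATURECOUNT count="2" hasmore="false" />
    </FEATURES>
  </RESPONSE>
</ARCXML>"""


def selected_keys(arcxml_text, key_field="OBJ_ID"):
    """Return the primary key values of all features in the response."""
    root = ET.fromstring(arcxml_text)
    return [
        fields.attrib[key_field]
        for fields in root.iter("FIELDS")
        if key_field in fields.attrib
    ]


keys = selected_keys(SAMPLE_RESPONSE)
```

The resulting list of key values is what would be handed on to the server-side script that retrieves the metadata.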

Figure 2: Scheme of ArcIMS application as designed in the project DNL

The primary keys of the selected spatial features are passed to the server-side PHP script. PHP establishes a connection to the Oracle database via the Oracle Call Interface (OCI). OCI makes it possible to open connections to an Oracle database, execute SQL statements and process the results. Depending on the requested information, PHP processes the metadata according to their format. If, for example, the metadata are a PDF document, PHP reads the binary data as a binary large object (BLOB) from the Oracle database and adds the appropriate HTTP header information to the data stream before the document is sent to the client. The requested metadata are displayed in a new browser window (Figure 3). The whole procedure is easy to implement and performs efficiently.
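The retrieval step can be sketched as follows. This is a hedged illustration in Python, not the PHP/OCI code used in the project: an in-memory SQLite database stands in for Oracle, and the table `metadata_doc`, its columns, the mapping `CONTENT_TYPES` and the function `fetch_document` are all assumptions. The sketch reads a document as a BLOB by primary key and derives the HTTP headers from its stored format.

```python
import sqlite3

# Assumed mapping from stored format to HTTP content type.
CONTENT_TYPES = {
    "pdf": "application/pdf",
    "jpeg": "image/jpeg",
    "tiff": "image/tiff",
    "txt": "text/plain",
}


def fetch_document(conn, feature_id):
    """Read one document BLOB and build the HTTP headers for the response."""
    row = conn.execute(
        "SELECT format, content FROM metadata_doc WHERE feature_id = ?",
        (feature_id,),
    ).fetchone()
    if row is None:
        return None                          # no metadata for this feature
    doc_format, content = row
    headers = {
        "Content-Type": CONTENT_TYPES.get(doc_format, "application/octet-stream"),
        "Content-Length": str(len(content)),
    }
    return headers, bytes(content)


# SQLite stands in for the Oracle database; table and data are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE metadata_doc "
    "(feature_id INTEGER PRIMARY KEY, format TEXT, content BLOB)"
)
conn.execute(
    "INSERT INTO metadata_doc VALUES (?, ?, ?)",
    (101, "pdf", b"%PDF-1.3 demo"),
)
headers, body = fetch_document(conn, 101)
```

Sending the correct content type is what allows the browser to open a PDF, an image or a text file with the matching viewer in the new window.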

Figure 3: Visualizing the metadata: (1) metadata button, (2) selecting spatial features of the active layer, (3) the corresponding metadata appear in a new browser window

CONCLUSIONS AND OUTLOOK

The work presented here, conducted within the framework of DNL, is still at an early stage. The database design has therefore not yet been tested extensively. However, first experiences show the promising potential of our database approach, i.e. the integration of heterogeneous data into a process-oriented model. This is a successful strategy for managing data of different types under a common heading. Furthermore, the database design is a useful tool for the people who collect the data, since they have to structure the data according to the database scheme from the beginning of collection. The database also checks the data for integrity and completeness, so that potential gaps in the data collection are recognized at an early stage.

The ArcIMS-PHP application developed here, with its dynamic data retrieval, strongly supports the user in exploring the metadata. The user is able to investigate a complex database without knowing any SQL statements. In addition, the performance of the metadata visualization is excellent.

However, Esri's HTML viewer for the analysis of the spatial data is very slow. The HTML viewer consists of almost 10,000 lines of JavaScript code, which are necessary for constructing and parsing ArcXML requests and responses. The result is a fat client which causes long download times via the internet. In an intranet environment the HTML viewer satisfies the performance requirements.

A further drawback of the HTML viewer is the lack of any feedback to the user when the client is parsing large responses. Frequently the browser simply appears to be locked.

To avoid these disadvantages, data processing needs to be moved from the client side to the server side. This is made possible by the new software release ArcIMS 4 in combination with Java Server Pages (JSP) or custom Java servlets (Esri, 2002). In the course of the further development of our project DNL we intend to use this new ArcIMS architecture. Although the PHP data retrieval application has proven to be fast and reliable, we shall replace it in the future with an implementation based on the Java application programming interface JDBC. JDBC provides basic functionality for accessing SQL databases. In combination with JSP and custom Java servlets, this replacement allows us to arrive at a homogeneous Java development environment.

REFERENCES

Baumberger, N. and M. Hägeli 2000. Using metadata in multistep preprocessing and longterm monitoring. Working Paper No. 15, GIS Work Session UN/ECE, Conference of European Statisticians, Neuchatel, Switzerland. http://www.unece.org/stats/documents/2000.04.gis.htm

Brändli, M. 2000a. Effective compilation and browsing of geographical metadata using common interoperability tools. Working Paper No. 20, GIS Work Session UN/ECE, Conference of European Statisticians, Neuchatel, Switzerland. http://www.unece.org/stats/documents/2000.04.gis.htm

Brändli, M. 2000b. A process-oriented approach for representing lineage information of spatial data. 3rd AGILE conference, Helsinki/Espoo, Finland. http://www.fgi.fi/agile2000

Castagnetto, J., H. Rawat, S. Schumann, C. Scollo and D. Veliath 1999. Professional PHP Programming. Birmingham: Wrox Press Ltd.

Clarke, D. G. and D. M. Clark 1995. Lineage. In: Guptill, S. C. and J. L. Morrison (eds). Elements of spatial data quality. Oxford: Elsevier Science Ltd.

Esri 1998. Spatial Database Engine. Esri White Paper, Environmental Systems Research Institute, Inc. Redlands.

Esri 2002. ArcIMS 4 Architecture and Functionality. Esri White Paper, Environmental Systems Research Institute, Inc. Redlands. http://support.Esri.com/index.cfm?fa=knowledgebase.whitepapers.viewPaper&PID=43&MetaID=311

Grünig, A. 1994. Mires and Man. Mire conservation in a densely populated country - the Swiss experience. Excursion guide and symposium proceedings of the 5th field symposium of the International Mire Conservation Group (IMCG) to Switzerland 1992. Birmensdorf: Swiss Federal Institute for Forest, Snow and Landscape Research.

Steinmeier, C. 2000. Operational GIS User-Interface for hybrid Geo-Data based on dynamic Data Retrieval. Proceedings ISPRS 2000 - Geoinformation for all; XIXth Congress. Amsterdam.


Andri Baltensweiler
Swiss Federal Research Institute for Forest, Snow and Landscape
Section Landscape Inventories
Zürcherstrasse 111
8903 Birmensdorf
SWITZERLAND

Phone: +41-1-739 24 96
Fax: +41-1-739 22 15
Email: andri.baltensweiler@wsl.ch
Web: http://www.wsl.ch