Donald R. Block

USING ArcInfo AS A GIS SERVER ON THE WORLD WIDE WEB





INTRODUCTION

The World Wide Web (WWW) is becoming widely known as the most "user friendly" way of accessing the Internet. Its use has grown exponentially over the last year and should continue to grow (Figure 1). The best-known and most often used features of this client/server environment are the ability to deliver integrated text, graphics, and sound files and the ability to link them together in a hypertext fashion. A less frequently used feature is the ability of clients to use WWW servers to run applications and receive the results of those applications.

Several WWW sites have been offering GIS applications over the net with an approach that attempts to provide interactive GIS functionality over the WWW. Although this functionality is useful and opens new avenues for GIS application development, none of the existing applications offers a truly interactive GIS environment. The major problems are network bandwidth limitations and WWW server resource limitations. Although network capacity is increasing, so is the demand for that capacity. The "available" bandwidth needed to support a truly interactive GIS application that serves many users and meets their response time expectations is at least a year off.

This paper outlines a different approach to providing GIS functionality over the WWW. It makes use of the WWW features for providing a "friendly" user interface, but takes a more "batch" oriented approach than other currently available WWW GIS applications. With this approach the users' Internet clients are not tied up waiting to connect to a busy WWW server machine or waiting for long GIS processes to complete. The users are therefore free to submit more GIS processing requests or to go on "surfing" the Internet, and GIS WWW application developers are free to develop applications that meet users' analytical needs without being driven by unrealistic response time requirements.

BACKGROUND

The WWW is a network of computers that communicate using the Hypertext Transfer Protocol (HTTP) in a client/server relationship. Documents transferred between the client and the server are formatted using the HyperText Markup Language (HTML) and carried within HTTP messages. HTML documents contain ASCII text and "embedded links" to images, audio, video, and other HTML documents. Embedded links contain Uniform Resource Locators (URLs) that define the protocol, host machine, and location of linked files. These links make it possible to build worldwide, distributed, hypertext-linked virtual documents containing text, images, audio, and video.
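As a minimal illustration of the three URL parts named above, the following sketch splits a URL with Python's standard library. The function name `describe_url` and the dictionary keys are chosen here for illustration only; the URL is the one cited in Endnote 1.

```python
from urllib.parse import urlsplit


def describe_url(url):
    """Split a URL into the protocol, host machine, and file location."""
    parts = urlsplit(url)
    return {
        "protocol": parts.scheme,   # e.g. "http"
        "host": parts.hostname,     # machine that serves the file
        "location": parts.path,     # where the file lives on that machine
    }


info = describe_url("http://hoohoo.ncsa.uiuc.edu/cgi/overview.html")
for name, value in info.items():
    print(f"{name}: {value}")
```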

The National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign developed both WWW client and server software. Although other WWW software exists, the WWW software discussed in this article is the client and server software developed and distributed by NCSA. More specifically, references to Mosaic and the WWW server in this article pertain to Mosaic version 2.4 and httpd version 1.3 respectively.

APPROACH

WWW clients typically send one of two types of HTTP requests to WWW servers: either a request to "GET" a file or a request to "POST" information to the WWW server. In both cases the WWW client will open a connection to the server, send the request, and wait for the WWW server to return a response before closing the connection and completing the transaction. During this transaction the user is unable to use the WWW client.
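To make the difference between the two request types concrete, the sketch below builds the text of a simple "GET" request and a simple "POST" request in HTTP/1.0 form. The paths and form fields used in the usage lines are hypothetical and do not come from the application described in this paper.

```python
from urllib.parse import urlencode


def build_get(path):
    """A GET request names the file to return and carries no body."""
    return f"GET {path} HTTP/1.0\r\n\r\n"


def build_post(path, fields):
    """A POST request carries form information in its body."""
    body = urlencode(fields)  # e.g. "radius=5"
    return (
        f"POST {path} HTTP/1.0\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"
        "\r\n"
        f"{body}"
    )


# Hypothetical paths and field names, for illustration only.
print(build_get("/maps/index.html"))
print(build_post("/cgi-bin/gis-server", {"radius": "5"}))
```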

WWW servers usually handle "GET" and "POST" requests differently. For "GET" requests the WWW server will typically use the location information in the URL associated with the request to find a file and return it to the user (Figure 2). These transactions are usually short since finding a file identified by a URL doesn't take much time.

For "POST" requests the transactions can take much longer, since the WWW server typically acts only as an intermediary and must wait for a gateway (Endnote 1) process to complete before responding to the WWW client (Figure 3). A long transaction renders a WWW client useless for a long period of time and also "ties up" one of a limited number of connections to a WWW server. GIS gateway processes often result in long transactions since many GIS functions are very CPU and I/O intensive. The goal in developing the GIS server gateway process discussed in this article was to break the long GIS transaction into two separate transactions: one short transaction to submit the request and one longer transaction to process the request (Figure 4).

The short transaction is actually very similar to the "POST" transaction outlined previously except it is designed for a quick response. It includes the submission of information by the WWW client through the WWW server to the GIS server and a quick response by the GIS server when it receives the information. The GIS server can respond quickly because its only tasks are to create "in progress" application product files, queue up the requested GIS application, and respond that the queuing has occurred. The "quick response" includes links to the "in progress" application products which hold processing status information while the application is running and will ultimately hold the application products when processing is complete. The "quick response", and thus the completion of this short transaction, allows the user to go on using the WWW client.
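The short transaction can be sketched roughly as follows. This is not the author's implementation: the directory-based queue, the JSON job file, the file naming, and the function name `submit_request` are all assumptions made for illustration. The sketch only shows the three tasks described above: create an "in progress" product file, queue the request, and respond with a link.

```python
import json
import uuid
from pathlib import Path


def submit_request(params, product_dir, queue_dir, base_url):
    """Queue a GIS request and return (job id, quick HTML response)."""
    job_id = uuid.uuid4().hex  # unique name for this request's files

    # 1. Create the "in progress" product file at its final location.
    product = Path(product_dir) / f"{job_id}.html"
    product.write_text("<p>Request received; processing is in progress.</p>\n")

    # 2. Queue the request for the batch GIS application
    #    (a JSON file in a queue directory is an assumed mechanism).
    job = {"id": job_id, "params": params, "product": str(product)}
    (Path(queue_dir) / f"{job_id}.json").write_text(json.dumps(job))

    # 3. Respond right away with a link to the eventual product.
    link = f"{base_url}/{job_id}.html"
    response = (
        "<p>Your request has been queued. "
        f'<a href="{link}">Retrieve the product</a></p>'
    )
    return job_id, response
```

Because nothing in this function waits on GIS processing, the WWW server can return the response and close the connection almost immediately.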

The long transaction in this two-transaction process requires the user to return and retrieve the information once processing is complete. The user is actively notified via Email when processing completes and can also check a status log at any time (passive notification). If the user attempts to retrieve the information before processing is completed, the "in progress" files will be returned with processing status information.
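A batch worker for the long transaction might look like the sketch below. The job file format (a JSON file with "id", "params", and "product" keys), the status log naming, and the `run_gis` and `notify` callables are hypothetical stand-ins; the paper does not describe how the actual GIS server is implemented.

```python
import json
import os
from pathlib import Path


def run_job(job_file, run_gis, notify=None):
    """Process one queued request and publish its product.

    job_file -- JSON file with "id", "params", "product" keys (assumed format)
    run_gis  -- callable taking the params dict and returning the product text
    notify   -- optional callable(email, job_id) for active notification
    """
    job_path = Path(job_file)
    job = json.loads(job_path.read_text())
    product = Path(job["product"])
    status_log = product.with_suffix(".log")

    def log(message):
        with status_log.open("a") as fh:
            fh.write(message + "\n")

    log("processing started")
    result = run_gis(job["params"])  # the long-running GIS work happens here

    # Write under a temporary name, then swap it in, so a user who follows
    # the link never retrieves a half-written product.
    tmp = product.with_suffix(".tmp")
    tmp.write_text(result)
    os.replace(tmp, product)
    log("processing complete")

    email = job["params"].get("email")
    if email and notify is not None:
        notify(email, job["id"])  # active notification
    job_path.unlink()  # remove the request from the queue
```

Until the swap happens, the product location still holds the "in progress" file, which matches the behavior described above for early retrieval attempts.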

APPLICATION EXAMPLE

A GIS application has been developed to show the feasibility of this two step approach. The application generates a population density map and an estimated total population for a circular area surrounding a user identified location in Washington, DC. The user is also allowed to enter a map title and a radius for the circular area. If the user wants active notification when the process is complete, an Email address can be entered. The application involves an HTML input form requesting user information and a GIS server application that utilizes the user input to produce the map product (Figure 5).

To produce the desired results this GIS server application converts the pixel location identified by the user to geographic coordinates and then uses GIS functions to aggregate the information to produce the map product. For more information on this particular GIS server application please contact the author since the details of this application are beyond the scope of this paper.
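Since the details of the conversion are not given here, the following is only a generic sketch of one common way to map a clicked pixel to map coordinates: a linear interpolation across the map extent shown in the image. The function name, the extent tuple, and the sample numbers are hypothetical and are not the values used by the application.

```python
def pixel_to_geo(px, py, width, height, extent):
    """Convert an image pixel position to map coordinates.

    px, py        -- clicked pixel, measured from the top-left corner
    width, height -- image size in pixels
    extent        -- (min_x, min_y, max_x, max_y) of the area the image covers
    """
    min_x, min_y, max_x, max_y = extent
    x = min_x + (px / width) * (max_x - min_x)
    # Pixel rows increase downward, while map y values increase upward.
    y = max_y - (py / height) * (max_y - min_y)
    return x, y


# Hypothetical 200 x 100 pixel image covering a 100 x 50 unit extent.
print(pixel_to_geo(100, 50, 200, 100, (0.0, 0.0, 100.0, 50.0)))
```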

The click to locate the center of the area of interest also submits the information from the WWW client to the WWW server. Upon receiving the information the WWW server reads the submitted information and determines it is a "POST" to the GIS server process. The WWW server then spawns the GIS server process and forwards the information to the newly spawned process. The GIS server process reads the incoming information, creates uniquely named temporary output files to hold the status log and temporary "in processing" files, starts the "batch" GIS application, and sends a response back to the WWW server stating that the request has been received and is being processed. Also included in the returned message are HTML links to the output product files, which include the "in processing" messages until the GIS application is finished making the products. The WWW server forwards this message back to the WWW client (Figure 6) and control of the client is returned to the user.
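The step in which the GIS server process reads the incoming information can be sketched as below, assuming the form is submitted as URL-encoded fields. The field names ("map.x", "map.y", "title", "radius", "email") are assumptions; the paper does not list the actual names used by the input form. A clickable image in an HTML form conventionally submits the click position as two fields ending in ".x" and ".y".

```python
from urllib.parse import parse_qs


def parse_submission(body):
    """Decode a URL-encoded form body into the values the GIS job needs."""
    fields = parse_qs(body, keep_blank_values=True)

    def first(name, default=""):
        # parse_qs returns a list of values per field; take the first one.
        return fields.get(name, [default])[0]

    return {
        "x": int(first("map.x")),          # clicked pixel column
        "y": int(first("map.y")),          # clicked pixel row
        "title": first("title"),
        "radius": float(first("radius")),
        "email": first("email") or None,   # optional: blank means no Email
    }


request = parse_submission("title=Downtown+DC&radius=2.5&email=&map.x=120&map.y=45")
print(request)
```

In this sketch a missing click position or radius raises a `ValueError`, which a gateway process would need to catch and report back to the user.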

Once the WWW client receives the message that the process "has been submitted" the user can go on to use the client for other tasks or try to retrieve the map product output using the hypertext link provided. If the GIS server has not completed the processing when the hypertext link to the product is selected the user will receive an "in processing" message (Figure 7).

If the user supplied an Email address when submitting the processing request, he/she will be notified via Email when the output product is available. The Email message will include the URL for making the retrieval via the WWW. When the processing is complete the output product (Figure 8) is placed into the "linked" location reported to the user and is then available for retrieval for a period of 2 days.
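The notification and the 2 day retention period could be handled along the lines of the sketch below. The subject line, the message wording, and the function names are hypothetical; only the presence of the retrieval URL in the message and the 2 day availability period come from the description above.

```python
from datetime import datetime, timedelta
from email.message import EmailMessage

RETENTION = timedelta(days=2)  # products stay available for 2 days


def build_notification(to_addr, product_url, completed_at):
    """Compose the Email telling the user where to retrieve the product."""
    expires = completed_at + RETENTION
    msg = EmailMessage()
    msg["To"] = to_addr
    msg["Subject"] = "Your GIS map product is ready"  # assumed wording
    msg.set_content(
        f"Your map product is available at:\n{product_url}\n"
        f"It can be retrieved until {expires:%Y-%m-%d %H:%M}.\n"
    )
    return msg


def is_expired(completed_at, now):
    """True once a product has been available longer than the retention period."""
    return now - completed_at > RETENTION


msg = build_notification(
    "user@example.org",
    "http://example.org/products/abc.html",
    datetime(1995, 5, 1, 12, 0),
)
print(msg.get_content())
```

A periodic cleanup job could call `is_expired` on each finished product and delete those that return `True`.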

CONCLUSION

Activity on the WWW will continue to increase at a rate equal to or greater than the rate at which available bandwidth grows for the next few years. Therefore, networks will likely remain congested and network traffic will remain a major consideration when designing GIS applications for the Internet. GIS applications are typically CPU intensive, and the graphic nature of these applications requires large amounts of information to be transferred between the user and the GIS process. Both of these factors indicate that many GIS applications will not be feasible using an interactive WWW design, because users would face unbearable response times while waiting for these long transactions to complete.

Providing GIS capabilities over the WWW is still possible using an interactive Mosaic "front end" and performing the GIS tasks using a batch oriented GIS server. Although not completely interactive, this design provides a "user friendly" interface and allows GIS applications to be developed in today's Internet environment.

ACKNOWLEDGEMENTS

I thank Thomas Fowler for his work on converting the prototype application to allow batch processing, for developing some of the UNIX programs used in the GIS server, and for his constructive comments throughout the development of this tool.




DISCLAIMER

Although the research described in this article has been funded wholly or in part by the United States Environmental Protection Agency through contract 68-W2-0025 to Martin Marietta Technical Services Inc., it has not been subjected to Agency review and therefore does not necessarily reflect the views of the Agency and no official endorsement should be inferred.

ENDNOTES

1. Gateway processes are often written using the Common Gateway Interface (CGI) method and are often referred to as CGI processes. More information on this methodology can be found on the Internet at URL: http://hoohoo.ncsa.uiuc.edu/cgi/overview.html.


Donald R. Block
Martin Marietta Technical Services Inc. (a Lockheed Martin Company)
79 Alexander Dr., MD-4501-1B
Research Triangle Park, NC 27709
Telephone: (919) 541-4758
Fax: (919) 541-1948
E-mail: block_don@unixmail.rtpnc.epa.gov