Johnny Marshall

Developing Internet-Based GIS Applications

Abstract

Developing applications for the Internet is one of the newest challenges developers face today. The growing capability of the Internet has created a demand for applications that use geographic information systems. When building GIS applications for the Internet, a developer can choose any of a number of architectures, technologies, and methods. This paper examines several of the techniques that can be used to develop Internet-based GIS applications using MapObjects with both MapObjects Internet Map Server (IMS) and Active Server Pages (ASP). The findings and techniques discussed are based on three GIS applications developed by INDUS Corporation.

Introduction

INDUS Corporation was recently contracted by a government agency to develop an Internet-based GIS application that would integrate data from the various organizations within a community and provide an analytical and support system to participants across the community. This project was part of a pilot program and was to be deployed in several cities that met certain criteria. This eliminated the typical restrictions of a software design and development project, and allowed the freedom to make recommendations for the system's implementation based on best industry practices.

Our goal was to develop a system that was open and could be ported to the various cities with minor modifications. The general requirements for the system were to provide basic mapping, querying, and reporting capabilities, and to be deployed via an Intranet/Internet mechanism. In addition, the system was to use a collective data repository consisting of geocoded spatial data and related tabular data from various organizations within the community. To meet these general requirements, we developed an Internet-enabled GIS application built on Esri MapObjects v1.2, MoIMS v2.0, Microsoft (MS) VisualBasic 5.0, and MS IIS 4.0. After completing the first pilot project, we continued to enhance the application while maintaining its overall architecture and design, and developed a modified version for the second pilot. As we learned more about the capabilities and limitations of the Internet, we developed a third version based entirely on ASP and MapObjects v2.0. Each system could use any ODBC-compliant database as its back end. This paper discusses the design considerations, architecture, development, deployment, and related issues of these Internet-based GIS systems developed by INDUS.

Gathering and Analyzing Requirements

Gathering requirements and understanding the users' needs are always the first step in a software development project. The primary goals of the requirements gathering phase were to:

Determine users' functional requirements
Identify the available data sets
Understand the organization's information technology infrastructure
Determine the system's audience

To accomplish this, a team was assembled to interview members of the different organizations and build a list of features based on the users' desires. Each feature was itemized with a brief general description of the functionality it was to provide and the data sets it would require. At the completion of the requirements gathering phase, the lists of functional requirements from each city were combined with the lists of required data sets into a comprehensive matrix (Figure 1).

The function/data matrix was composed of four fields: a unique identifier, a function name, required datasets, and a description. The description field provided a description of the function's input and output. The function/data matrix served several purposes: it enabled the identification of similar functions based on the required datasets, enabled the identification of similar datasets, provided a means of performing a function point analysis for the project, and provided a guide for acceptance testing. The identification of similar functions and datasets was simplified by exporting the matrix to MS Access (any database application would do) and generating queries that grouped the functions by name, by datasets, and by description.
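The grouping step described above does not depend on any particular database product. As a minimal sketch (in Python rather than the MS Access queries the project used), the rows below are hypothetical placeholders, not actual project requirements; only the four field names follow the matrix described in the text.

```python
from collections import defaultdict

# Hypothetical matrix rows; the four fields mirror the function/data matrix.
MATRIX = [
    {"id": "F1", "name": "Map incidents by area",
     "datasets": ("incidents", "streets"), "description": "Input: area; output: map"},
    {"id": "F2", "name": "List incidents by street",
     "datasets": ("streets", "incidents"), "description": "Input: street; output: report"},
    {"id": "F3", "name": "Show facility locations",
     "datasets": ("facilities",), "description": "Input: none; output: map"},
]


def group_by_datasets(rows):
    """Group function identifiers by the (order-independent) set of datasets they need."""
    groups = defaultdict(list)
    for row in rows:
        groups[frozenset(row["datasets"])].append(row["id"])
    return dict(groups)


groups = group_by_datasets(MATRIX)
```

Functions that land in the same group are candidates for a shared implementation, which is the kind of similarity the matrix was meant to expose.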

Two types of requirements emerged from the requirements analysis phase: functional requirements and non-functional requirements. The functional requirements were those that carried out the specific tasks that met the users' requirements, and the non-functional requirements were the functions that made up the core application (e.g., security, zoom, pan, and identify). We also observed that certain functionality was required by all the pilot cities. By designing a base application that provided standard GIS functionality, we could meet the requirements of all the cities with minor modifications to the base application. We then developed requirements for a base application that could provide standard GIS functionality in an Internet environment. The base application was to have an Esri ArcView (tm) look and feel and would be customizable on the client side through an easily modifiable pulldown menu that launched dialogs used for building custom queries.

Choosing the Right Architecture

There are basically two types of architectures for developing Internet-based GIS applications: client-side and server-side. In a client-side Internet GIS application, the client (Web browser) is enhanced to support GIS functionality, while in a server-side GIS application, a Web browser is used only to generate server requests and display the results. Client-side GIS applications are typically implemented by enhancing the Web browser with a Java applet, an ActiveX control, or a plug-in. Some client-side applications even require users to install a complete client application. In either case, client-side applications require software of some kind (other than a browser) to be transferred to the user (Figure 2). An example of a client-side Internet GIS application is one that runs as a Java applet. The code for the applet is transferred to the Web browser as binary instructions that provide a graphical user interface (GUI) for the GIS application. Vector-based data is then transferred to the client, enabling the complex GIS functions to run on the client. This architecture should not be confused with a similar server-side GIS architecture that uses a Java applet to create a GUI for the GIS application. In that case, the applet is simply an interface for an image; the complex GIS calculations and data remain on the server.

Client-Side Architecture

An example of a server-side Internet GIS application is one typical of the mapping applications found on the major Internet portals. In these applications, users send a request (e.g., an address) to a server, and the server processes the request and sends the results back as an image embedded in an HTML page via standard HTTP. The response is a standard Web page that a generic browser can view. In server-side Internet GIS applications, all the complex and proprietary software, in addition to the spatial and tabular data, remains on the server (Figure 3). This architecture has several advantages because the application and data are centralized on a server. These advantages include simplified development, deployment, and maintenance.

Server-Side Architecture

A comparison of the advantages and disadvantages of server-side and client-side GIS applications is tabulated below.

Advantages to Server-Side Internet GIS Applications	Disadvantages to Server-Side Internet GIS Applications
Simpler to develop	Primitive graphical user interface
Easier to deploy	Low graphics quality
Easier to maintain	One-click functionality from a browser
Adheres to Internet standards	
Requires standard Web browser	
Low bandwidth required	

Advantages to Client-Side Internet GIS Applications	Disadvantages to Client-Side Internet GIS Applications
Vector data can be used	Difficult to develop
Better image quality	Requires additional software
Enhanced GUI	Longer download times
	No adherence to standards
	Platform/browser incompatibility

The optimal architecture depends on the system's requirements. Our goal was to develop an application that could easily be modified and ported to various cities regardless of their platforms or network capacities. We also had to develop the applications on a tight deadline and budget. These two factors, standardization and ease of development, are both advantages of a server-side architecture and were the major influences on our decision to choose it.

The Web is a stateless environment. A Web server receives a request from a client, processes the request, and sends a response with no knowledge of the client's state unless state maintenance is used. This is similar to a common software architecture known as pipe-filter architecture. Once the architecture of the system was selected, we realized that there were two additional architectural choices pertaining to state maintenance for a server-side application. We could choose a pipe-filter approach and maintain state on the client side, or we could choose an object-oriented approach and maintain state on the server side.

Using a pipe-filter approach, state is maintained by storing all state variables (extents, layers, command, input variables, etc.) on the client side. This is done with browser cookies, hidden input tags (or parameter tags) in the HTML page, or a combination of both. When a user sends a request to the server, all the state variables and commands are extracted from the HTTP request. The command is then executed on the server, a map is generated reflecting the user's current state, the new map image is wrapped in HTML (including the current state variables), and the Web page is returned to the client.

State maintenance on the server side is accomplished by maintaining map and database objects on the server for the life of a user session. The state variables are accessed directly as properties of the objects that are maintained on the server. The only state variables that must be maintained on the client are those that describe the interface (e.g., last command and active layer). A comparison of the advantages and disadvantages of server-side state maintenance is listed below.

Advantages to Server-Side State Maintenance

Easier to develop
Less server processing required
Allows for complex applications

Disadvantages to Server-Side State Maintenance

Not highly scalable
Requires implementation of Session Management

Server-side state maintenance is not a highly scalable solution, but it was a necessary choice because of the requirements for all the cities. The project had a non-functional requirement of allowing further analysis of query results. Had we chosen the highly scalable pipe-filter approach, we would have had not only to regenerate each map, but also to re-execute all the prior commands performed on the map for each request a user sent. Scalability was not a critical factor because the application was to be used by a small group and would be deployed in an Intranet environment. For these reasons, we chose the object-oriented approach of maintaining state on the server with a server-side application.
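For contrast with the approach we chose, the client-side (pipe-filter) round trip can be sketched in a few lines. This is a minimal Python illustration, not code from the project: the `/map` URL, the parameter names (`cmd`, `extent`, `layers`), and the zoom factor are all assumptions made for the example.

```python
from html import escape
from urllib.parse import parse_qs

ZOOM_FACTOR = 0.5  # hypothetical: a zoom-in halves the extent's width and height


def apply_command(state):
    """Return the extent (xmin, ymin, xmax, ymax) after applying the requested command."""
    xmin, ymin, xmax, ymax = (float(v) for v in state["extent"].split(","))
    if state.get("cmd") == "zoomin":
        cx, cy = (xmin + xmax) / 2, (ymin + ymax) / 2
        half_w = (xmax - xmin) * ZOOM_FACTOR / 2
        half_h = (ymax - ymin) * ZOOM_FACTOR / 2
        return (cx - half_w, cy - half_h, cx + half_w, cy + half_h)
    return (xmin, ymin, xmax, ymax)


def handle_request(query_string):
    """Stateless handler: read all state from the request, return HTML carrying the new state."""
    params = parse_qs(query_string)
    state = {key: values[0] for key, values in params.items()}
    state["extent"] = ",".join(f"{v:g}" for v in apply_command(state))
    hidden = "\n".join(
        f'<input type="hidden" name="{escape(k)}" value="{escape(v)}">'
        for k, v in sorted(state.items())
    )
    return f'<form action="/map" method="get">\n{hidden}\n</form>'


page = handle_request("cmd=zoomin&extent=0,0,100,100&layers=streets")
```

Because the server keeps nothing between requests, any query result a user wants to analyze further would have to be recomputed from the state carried in the page, which is the cost that led us to server-side state.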

System Design

The system was designed with three distinct components: a server application, a client interface, and a data repository. It conformed to a multi-tiered layered architecture typical of server-side Internet applications (Figure 4).

System Diagram

The client layer consists of a personal computer running a Web browser. This layer provides the user interface and operates by generating requests to the application server via HTTP and displays the resulting HTML file in a Web browser. The middle layer is itself a layered system consisting of a Web server layered on an application server. The Web server receives requests from the client which are processed by the application server's Web administration module (MoIMS WebLink), then passed to the application server. The application server makes requests to the data layer via TCP/IP and ODBC. The data layer is a data repository consisting of a relational, SQL-compliant database and one or more directories of flat files in Esri shapefile format. The data repository is built and maintained through an off-line data migration process that involves updating the data tables with new data, and geocoding new shapefiles. Although an off-line process, data migration is an integral part of the system and is included in the overall system design (Figure 5).

System Diagram

In this design, data in the repository are updated via a migration process. The data repository is then accessed by the Esri map application by means of an ODBC-TCP/IP connection. The map application processes data and generates HTML files which are in turn served to a client PC running a Web browser.

Hardware

There are several hardware configurations that can support this system design. The configurations are: single computer configuration, two computer configuration, and multiple computer configuration. In the single computer configuration, the Web server, application server, and database server are installed on a single computer. In a two computer configuration the Web server is installed on one machine, and the application server and database server are installed on a separate machine. In the multiple computer configuration, each component is installed on a separate computer. The ideal configuration for a particular deployment depends on the anticipated number of users visiting the site each day, and number of maps served. Esri makes the following recommendations based on the number of anticipated daily users.

Configuration	Anticipated Number of Daily Users
Single computer	100 - 1,000
Two computer	1,000 - 1,500
Multiple computer	1,500+
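The table can be read as a simple lookup. The sketch below is only an illustration in Python: the table's ranges share their endpoints, so assigning 1,000 and 1,500 to the smaller configuration, and treating fewer than 100 users as a single-computer case, are assumptions of this example rather than Esri guidance.

```python
def recommend_configuration(daily_users):
    """Map anticipated daily users to a configuration name from the table.

    Assumption: shared boundary values (1,000 and 1,500) fall in the smaller
    configuration, and anything below 100 is treated as single computer.
    """
    if daily_users <= 1000:
        return "Single computer"
    if daily_users <= 1500:
        return "Two computer"
    return "Multiple computer"
```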

For our system we anticipated fewer than 1,000 users/day, so the single computer configuration was used. The following hardware was recommended as the minimum requirement for the system.

Server

Dual processor - 500 MHz
512 MB RAM
Dual hard drive storage @ 13 GB each
T1 (or better) Internet connection

Software

The following software was recommended for the system's development and deployment.

OS - Windows NT server
RDBMS - optional; any SQL-compliant RDBMS
Microsoft Internet Information Server or Netscape Server
Esri Internet Map Server - Map/Application server
Windows 95/NT clients
Internet Explorer v4.01 or Netscape Communicator v4.05
ArcView v3.1 - Geocoding engine
Visual Basic v5.0 - Application development
Esri MapObjects - Application development
MS Visual Interdev - Client development

System Components

A server-side Internet GIS application is composed of four distinct components: Web browser/client interface, Web application/server, GIS application/Map server, and a relational database management system. Although the components are integrated into a single system, each component is distinct and should be considered separately.

Client Interface

The client interface for an Internet GIS application is typically a Web browser implementing HTML form elements or implementing a Java applet. The client component can consist of a series of static and dynamic HTML pages that may or may not be implemented using HTML frames. We designed the interface for the application using frames because of the following advantages:

the entire interface does not have to be transmitted for every request
frames could be resized and scrolled individually
provides a look and feel similar to Esri ArcView (tm)
provides functionality similar to a stand-alone desktop application

The application's interface is divided into four functional areas (Figure 6). Frame 1 is a non-resizeable, non-scrollable frame used to display a static HTML page consisting of the HTML form elements that compose the pulldown menu, the images that compose the image link button bar, and the JavaScript functions that process and submit users' actions. Frame 2 is a resizeable, scrollable frame used to display a dynamically generated HTML page consisting of the map's legend, image links that provide functionality to adjust the legend, and JavaScript functions used to process and submit users' actions. Frame 3, also a resizeable, scrollable frame, is used to display a dynamically generated HTML page containing the map image and JavaScript functions used to process and submit users' actions. Finally, Frame 4 (not shown) is a non-resizeable, non-scrollable hidden frame (sized at 0% of the browser's width) used to store JavaScript functions and text data that create the select lists for the various dialogs. We made use of the hidden frame to store string data as a means of improving performance and reducing network traffic, but later discovered that this approach had its deficiencies. There is no way to guarantee the order in which the frames are received by the browser. If the hidden frame is not completely loaded, a user request that depends on data stored in the hidden frame will generate a client-side error. Refer to Appendix A, Application Screen Shots, to view the application's GUI.

Client Interface Layout

The HTML code to generate the frame pages follows:

<HTML><HEAD><TITLE>Internet GIS Application</TITLE></HEAD>
<!-- Top row: static controls. Bottom row: legend, map, and a zero-width hidden frame. -->
<FRAMESET ROWS="20%, 80%">
    <FRAME NAME="Controls" SRC="static.html" SCROLLING="no" MARGINWIDTH="0" MARGINHEIGHT="0" NORESIZE>
    <FRAMESET COLS="32%, 68%, *" FRAMEBORDER="0">
        <!-- Legend and Map omit NORESIZE, so they remain resizeable. -->
        <FRAME NAME="Legend" SRC="dynamicLegend.html" SCROLLING="auto" MARGINWIDTH="0" MARGINHEIGHT="0">
        <FRAME NAME="Map" SRC="dynamicMap.html" SCROLLING="auto" MARGINWIDTH="0" MARGINHEIGHT="0">
        <FRAME NAME="Hidden" SRC="dynamicText.html" SCROLLING="no" MARGINWIDTH="0" MARGINHEIGHT="0" NORESIZE>
    </FRAMESET>
</FRAMESET></HTML>

Initializing the frames is simple but not obvious, because HTML frames typically contain URLs to static HTML pages. We initialized the frames by setting the controls frame's source to the URL of a static HTML page designated for the controls, and setting the legend, map, and hidden frames' sources to the dynamic ISAPI-based URLs that generate each frame page. Using this approach, each frame can be loaded individually in response to a user's request by using JavaScript to change the HREF property of the frame's LOCATION object.

Designing the control frame was an important consideration in meeting our goal of developing an application with an Esri ArcView (tm) look and feel. The controls had to be aesthetically pleasing and function in a browser environment. To accomplish this, we used HTML select form objects to mimic a pulldown menu, and a set of image links with mouseover events that change the image when the mouse pointer is over it, which mimics a button/tool bar. The image links were designed by creating two blank image buttons (one for the onmouseover event, and one for the onmouseout event) that were used as templates. Each image button was then created by placing an icon on the templates and saving each as an individual image file. This ensured that all the image buttons would have a consistent size and color.

Map Application

The first two map applications were designed and developed around Esri MapObjects v1.2 technology using MS VisualBasic v5.0. In addition, Esri MapObjects Internet Map Server (MoIMS) was used for the Web component of the project. This was accomplished by creating a new VisualBasic project as a standard executable, and adding the MapObjects and WebLink components. Placing the map and WebLink controls on a form exposes all of the properties and methods of the components to the rest of the project. The approach here was to create two separate forms: one with a WebLink control, a timer, and a Winsock control, and one with two Map controls, one for the map image and one for the legend image (Figure 7). The rationale for this approach was based on the design architecture, which was server-side and object-oriented. Each user was to "own" a map and legend control that would remain persistent for the life of their session. The timer, Winsock, and WebLink controls could be shared. We also created two basic modules and two class modules for the project. The basic modules contained the functionality that changed the state of the map, and the class modules were used as containers to assign and access properties of layers and labels.

VisualBasic Development Environment

We designed the application to read its required setup variables from a text file. This use of a separate configuration file is the key to developing an open application that can easily be ported to differently configured systems. The initialization (ini) file was structured as follows:

[Network]
ServerIP =
ServerPort = 8067
AppName = City1
ControlName = ctrls_1

[Paths]
ShapeFilePath = d:\indus\nh_spatialDB\
TmpPath = d:\Scratch1\

[Database]
dataSourceName = City1_DSN

[Base Layers]
city_boundary = City Boundary;210,210,210;120,120,120
streets_fin = All Roads;0,0,0
*
*
*

[Layers]
arrest = Arrests;0,0,200
cadaddr = Calls for Service;0,255,0
*
*
*

[WebIdleTime]
IdleTime = 9

In the [Network] section, if the value for ServerIP is left blank, the application uses the Winsock control to determine the server's IP address, which is embedded in the Web pages. If a value for ServerIP is given, the application uses the given IP address. This technique offers two advantages: it makes the application easily portable from computer to computer, and it allows the application to work behind a firewall. The ServerPort and AppName values are used to direct a user's request to the appropriate application when multiple applications are launched. This is accomplished by embedding the AppName in the Web pages so that it can be extracted from the next HTTP request. The ControlName value, also embedded in the Web pages, is used to assign a static control file with the appropriate AppName to a user's session.

The [Paths] section contains two values: the shapefile path, and a path to a temporary scratch directory used to write the image files. The [Database] section contains the ODBC data source name (DSN) that was configured on the server for the application to access. Although the lack of a configurable database password limits the portability of an application, we decided to hard-code the database password. The advantage of hard-coding a password in a compiled application is that the password is not exposed in a plain-text file. The [Base Layers] and [Layers] sections can contain one or more layers. They were set up so that each shapefile the application was to access was assigned a friendly name, an optional color, and an optional outline color. This allows the generation of maps in which layers are consistent in color, a very important feature to most users. The last section of the initialization file, [WebIdleTime], sets the time-out value (in minutes) for each user session. This value is needed to determine when to delete a user's map and legend objects and release system resources, and it was used to set the timer in the application. When the timer event fires, the user's objects are set equal to Nothing, and all user-created files are deleted from disk. In this way, the system cleans up after itself.
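To make the file format concrete, the following sketch reads a shortened copy of the initialization file and splits each layer entry into its friendly name, color, and outline color. It uses Python's standard `configparser` as a stand-in for the VisualBasic code in the application, and the helper names `parse_rgb` and `parse_layer` are hypothetical.

```python
import configparser

SAMPLE_INI = """
[Network]
ServerIP =
ServerPort = 8067
AppName = City1

[Base Layers]
city_boundary = City Boundary;210,210,210;120,120,120
streets_fin = All Roads;0,0,0

[WebIdleTime]
IdleTime = 9
"""


def parse_rgb(text):
    """Convert 'r,g,b' into a tuple of three integers."""
    red, green, blue = (int(part) for part in text.split(","))
    return (red, green, blue)


def parse_layer(value):
    """Split 'friendly name;color;outline', where both colors are optional."""
    fields = [field.strip() for field in value.split(";")]
    color = parse_rgb(fields[1]) if len(fields) > 1 and fields[1] else None
    outline = parse_rgb(fields[2]) if len(fields) > 2 and fields[2] else None
    return {"name": fields[0], "color": color, "outline": outline}


config = configparser.ConfigParser()
config.read_string(SAMPLE_INI)

base_layers = {key: parse_layer(value) for key, value in config["Base Layers"].items()}
server_ip = config["Network"]["ServerIP"] or None  # blank means "detect at startup"
idle_minutes = config["WebIdleTime"].getint("IdleTime")
```

Keeping every site-specific value in this one file is what allows the same compiled application to be moved between cities without code changes.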

The general operation of the application is as follows. When the application is started, it first reads the initialization file, establishes a database connection, builds VB collections that contain the global select list items, and then waits for user input. When a user submits a login form, the application queries the database to determine and set the user's access rights. The security for the application grants two levels of access: database access, or no database access. All users are granted access to the shapefiles since they are designed to contain minimal attribute data. Access needs to be restricted only for the back-end database, which contains the shapefiles' sensitive attribute data.

The map application first verifies that each request is received from a valid session. This is done by searching a VB collection for the unique session identifier. If the session is not valid and the command is a login request, the application queries the database and validates the user. If the session is not valid and the command is not a login request, the command is rejected and the user is prompted to log in. Once the user is validated, new map and legend objects are instantiated on the server and assigned a unique identifier for that user's session. The unique identifier is added to the VB collection of session identifiers, and a timer is set to count down "[WebIdleTime]" minutes. The application then generates and sends an HTML frames page to the client with embedded URLs containing the command strings that load the controls HTML file and generate the map, legend, and hidden frames. The frames page results in an immediate response to the client, which is the completed Internet GIS application interface.
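The session check and idle time-out described above can be sketched as a small store keyed by session identifier. This is a Python illustration under stated assumptions, not the application's VisualBasic code: the class and method names are hypothetical, plain `object()` instances stand in for the map and legend controls, and expiry is checked on each request rather than by a timer event.

```python
import time
import uuid


class SessionStore:
    """Tracks per-user server-side objects and expires them after an idle period."""

    def __init__(self, idle_minutes, clock=time.monotonic):
        self.idle_seconds = idle_minutes * 60
        self.clock = clock
        self.sessions = {}  # session id -> {"map", "legend", "last_seen"}

    def login(self, credentials_ok):
        """Create a session for a validated user; return None if validation failed."""
        if not credentials_ok:
            return None
        session_id = uuid.uuid4().hex
        self.sessions[session_id] = {
            "map": object(),      # placeholder for the user's map object
            "legend": object(),   # placeholder for the user's legend object
            "last_seen": self.clock(),
        }
        return session_id

    def touch(self, session_id):
        """Return True and reset the idle timer if the session is still valid."""
        self.expire_idle()
        session = self.sessions.get(session_id)
        if session is None:
            return False
        session["last_seen"] = self.clock()
        return True

    def expire_idle(self):
        """Drop sessions idle longer than the limit, releasing their objects."""
        now = self.clock()
        expired = [sid for sid, data in self.sessions.items()
                   if now - data["last_seen"] > self.idle_seconds]
        for sid in expired:
            del self.sessions[sid]
```

A request whose identifier fails `touch` would be rejected unless it is a login request, mirroring the flow in the paragraph above.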

Session Management

Session management is a critical part of any multi-user application. It is the key to performing the correct task for the right user. A MapObjects application is designed to run as a single-user application (it uses a single-threaded ActiveX control), so when it is accessed by multiple users in a Web environment, the requests are queued in a single thread. This means that a user's request must wait for previous requests to complete before it begins execution. If a user submits a simple request, such as a zoom-in, after another user's more complex request (e.g., search the map), the zoom-in request is processed after the search request, causing the second user an unnecessary wait. Esri's solution to this situation is to launch multiple instances of the application through MoIMS. MoIMS handles the load balancing between the concurrently running applications. This solution works well for pipe-filter, server-side GIS applications but is not suitable for object-oriented, server-side GIS applications. Object-oriented, server-side GIS applications tie a user's session to a particular CPU thread, so a request must always return to its originating application. To overcome this problem, we designed the application to embed its name and controls file name (read from the initialization file) into each page returned to the client. By doing this, we could ensure that each request would return to the correct application (Figure 8).

Application/System Thread Model

Although sessions can still get caught in a particular CPU thread, multiple applications can be run with each handling only a few sessions. This allows for acceptable performance in a multiple user environment.
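The idea of embedding the application name in every returned page, so that the next request finds its way back to the instance holding the session, can be sketched as follows. This Python fragment is only an illustration: the registry, the `/gis` path, the parameter names, and the second instance (`City1b` on port 8068) are hypothetical, and in the deployed system the dispatching was handled through MoIMS rather than by code like this.

```python
from urllib.parse import parse_qs, urlencode, urlsplit

# Hypothetical registry: application name -> port of the instance that owns its sessions.
APPLICATIONS = {"City1": 8067, "City1b": 8068}


def build_link(app_name, command, session_id):
    """Build a URL for a returned page with the application name embedded."""
    query = urlencode({"app": app_name, "cmd": command, "sid": session_id})
    return f"/gis?{query}"


def route(url):
    """Return the port of the application instance named in the request."""
    params = parse_qs(urlsplit(url).query)
    return APPLICATIONS[params["app"][0]]


link = build_link("City1", "zoomin", "abc123")
```

Every link and form in a generated page carries the same application name, so a session never migrates to an instance that lacks its map and legend objects.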

Database Design

The database for the application consists of two types of data: tabular data stored in a relational database, and Esri shapefiles stored on disk. For security reasons, our goal in designing the database was to remove as much attribute data as possible from the shapefiles and place it in the relational database. We designed the relational database in the highest possible normal form but later discovered that a highly normalized database is not suitable for a MapObjects GIS application that has high performance requirements. Queries take longer to execute because of multiple table joins. In addition, MapObjects allows only one table join (AddRelate) per shapefile. We realized that a shapefile containing all of its required attribute data would perform better in an Internet environment, but we had committed our application's security design to "thin" shapefiles that relied on the relational database. To improve the application's performance and maintain our security design, we de-normalized the database so that, at most, one table join was required for all functionality.
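The effect of de-normalizing can be illustrated with a small in-memory database. The sketch below uses Python's `sqlite3` module purely for demonstration; the table and column names are hypothetical and do not reflect the project's schema. It flattens three normalized tables into one, so that relating the result to a shapefile record needs only a single join on the feature key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical normalized tables (names are illustrative only).
    CREATE TABLE feature (feature_id INTEGER PRIMARY KEY, category_id INTEGER, agency_id INTEGER);
    CREATE TABLE category (category_id INTEGER PRIMARY KEY, category_name TEXT);
    CREATE TABLE agency (agency_id INTEGER PRIMARY KEY, agency_name TEXT);
    INSERT INTO feature VALUES (1, 10, 100), (2, 11, 100);
    INSERT INTO category VALUES (10, 'Category A'), (11, 'Category B');
    INSERT INTO agency VALUES (100, 'Agency X');

    -- De-normalized copy: all attributes a map feature needs sit in one table,
    -- so a shapefile record can be related to it with one join on feature_id.
    CREATE TABLE feature_flat AS
        SELECT f.feature_id, c.category_name, a.agency_name
        FROM feature AS f
        JOIN category AS c ON c.category_id = f.category_id
        JOIN agency AS a ON a.agency_id = f.agency_id;
""")

rows = conn.execute(
    "SELECT feature_id, category_name, agency_name FROM feature_flat ORDER BY feature_id"
).fetchall()
```

The joins are paid once, when the flat table is built during data migration, instead of on every map request.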

Enhancing the Application

The development of the first city's prototype was a five-month effort. Immediately after the first application was completed, development of a second Internet GIS application began with a refined approach. This refined approach and the lessons we had learned allowed us to develop the second prototype in two months. Although the second prototype was modeled after the first, enhancements to the application's query module and shapefile-generating module resulted in significant performance improvements.

The first two prototypes we developed were server-side, object-oriented applications based on MoIMS, and were not highly scalable. To overcome the scalability limitation, we designed a third application based on Active Server Pages (ASP) technology. We were able to reuse much of the fundamental code from the first two prototypes, but the third application was significantly different from them. It was built with MapObjects v2.0 as an ActiveX dll that could be run from an ASP script. Because MapObjects is a single-threaded control, the map control cannot be accessed directly from an ASP script. To overcome this, we created a new VisualBasic project as an ActiveX dll project. In addition to the standard form used to contain the map control, we added a multi-use, persistent class module to the project. This module could instantiate the map control and provide a means of accessing the map control's properties and methods. This design model provides the ideal architecture for a server-side, object-oriented application. Built on ASP technology, it is a highly scalable solution that allows for the development of complex Internet GIS applications.
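As a loose analogy only, the wrapper pattern can be shown in Python: a class owns a component that is not safe to call from several threads and exposes its properties through methods that serialize access. This is not how COM threading works in VisualBasic or ASP, and `MapControl` here is a hypothetical stand-in, not the MapObjects API.

```python
import threading


class MapControl:
    """Stand-in for a single-threaded component (hypothetical, not MapObjects)."""

    def __init__(self):
        self.extent = (0.0, 0.0, 100.0, 100.0)


class MapWrapper:
    """Owns one MapControl and funnels every access through a lock."""

    def __init__(self):
        self._control = MapControl()
        self._lock = threading.Lock()

    def get_extent(self):
        """Read the wrapped control's extent while holding the lock."""
        with self._lock:
            return self._control.extent

    def set_extent(self, extent):
        """Replace the wrapped control's extent while holding the lock."""
        with self._lock:
            self._control.extent = tuple(float(v) for v in extent)


wrapper = MapWrapper()
wrapper.set_extent((10, 20, 30, 40))
```

Callers never touch the wrapped control directly, which is the property the class module provided for the ASP script.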

Conclusion

With most software projects, there is more than one solution, but the best solution depends on the circumstances, and the software requirements. Developing GIS applications for the Internet is a situation where the best solution depends on the application requirements. By carefully analyzing requirements and planning an Internet GIS application, a software developer can greatly simplify the development process. The first step a developer must accomplish is to gain a thorough understanding of the application requirements. An understanding of the application requirements will allow the developer to make the right architectural choice for the application. Typically a server-side application is a good choice for developing an Internet GIS application because of advantages that include ease of development and standardization.

The structure of the spatial data, as well as the structure of the relational database, is a critical factor that influences a GIS application's performance. In an Internet environment, performance is usually the most important factor, so a developer should make performance the number one priority when designing the database. A de-normalized database will provide the best performance for an Internet GIS application. Threading and session management are also very important considerations that affect performance and scalability. Complex Internet GIS applications developed with Esri MapObjects and Internet Map Server require the developer to devise a session management strategy. Although MapObjects is a single-threaded ActiveX control, an in-process dll "wrapped" around the MapObjects control enables the control to be run under MS IIS Active Server Pages technology. By using this approach, a developer can build Internet GIS applications that perform well and present a highly scalable solution for the Internet.

Acknowledgements

Special thanks to the INDUS team that developed the core Internet GIS application and made it possible to write this paper: Robert Stropky, Phil Cotter, Hanwen Chen, and Amanda Wingo. Special thanks also to all the other members of INDUS's staff who reviewed and provided feedback on this paper.

Appendix A - Application Screen Shots

Screen Shot

References

Internet GIS Architecture - Which Side Is Right for You?, Fred Gifford, GeoWorld, May 1999

MapObjects Internet Map Server Reference (issued with software), Esri, Inc., 1998

Author Information

Johnny Marshall
Software Engineer
INDUS Corporation
1953 Gallows Road, Suite 300
Vienna, VA 22182
Phone: (703) 506-6700
email: john.marshall@induscorp.com