CGI and Land Records System - Data Publishing and Access

CGI and Data Publishing

Before the development of the World Wide Web (the Web, WWW), the Internet was used only as a tool to transfer data. Since the inception of the Web, GIS professionals have realized that GIS can be extended onto the Web, where it can evolve into a new GIS technology. Three types of Internet GIS techniques have been developed: the Common Gateway Interface (CGI), plug-ins, and Internet Programming Languages (IPL). Plug-ins are GIS helpers that run in a browser; they are a client-side implementation of Internet GIS and are commonly considered a fat-client approach. IPL includes Java, ActiveX, etc. It is the future direction for Internet GIS, but it is not yet mature. The County of Chester has chosen to use CGI to provide an interface for users to access its Land Records System (LRS).

What is CGI?

CGI stands for Common Gateway Interface: in simpler terms, a standard ("common") way of communicating ("interface") between processes on different machines ("gateway"). CGI is a standardized way of writing scripts that the server will run when a request for the relevant URL is received. A gateway is typically a program that transforms information from one form to another, and one use for CGI scripts is to implement gateways. For example, you may have all your data in a relational database and want to make this information available to the Web. To do this you would write a gateway script to transform HTTP requests into accesses to your database, and translate the replies into HTML (Handley, Mark and Jon Crowcroft, 1994).
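
The gateway idea described above can be sketched in a few lines. This is a minimal illustration, not the County's implementation: it assumes Python, an in-memory SQLite table named parcels standing in for the relational database, and a hypothetical owner query parameter. It turns the query string of an HTTP request into a database lookup and formats the reply as HTML.

```python
import html
import sqlite3
from urllib.parse import parse_qs


def make_demo_db():
    """Create an in-memory table standing in for the relational database (assumed schema)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE parcels (parcel_id TEXT, owner TEXT)")
    conn.executemany(
        "INSERT INTO parcels VALUES (?, ?)",
        [("1-1-1", "Smith"), ("1-1-2", "Jones"), ("1-1-3", "Smith")],
    )
    return conn


def gateway(query_string, conn):
    """Translate a CGI query string into a database query and return a CGI response."""
    params = parse_qs(query_string)
    owner = params.get("owner", [""])[0]
    # Parameter binding keeps user input out of the SQL text.
    rows = conn.execute(
        "SELECT parcel_id, owner FROM parcels WHERE owner = ? ORDER BY parcel_id",
        (owner,),
    ).fetchall()
    items = "".join(
        f"<li>{html.escape(pid)}: {html.escape(name)}</li>" for pid, name in rows
    )
    body = f"<html><body><ul>{items}</ul></body></html>"
    # A CGI script writes a header block, a blank line, then the document.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    print(gateway("owner=Smith", make_demo_db()))
```

Under CGI the web server would supply the query string through the QUERY_STRING environment variable; passing it as an argument here simply keeps the sketch easy to run on its own.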

Web browsers can communicate directly with Web servers by using the Hypertext Transfer Protocol (HTTP), but not with GIS servers. The connection between GIS programs and Web servers is established using the CGI or gateway script. The CGI script allows Web servers to execute GIS programs and relay their output to Web browsers. The server and the CGI program work in conjunction to enhance and customize the Web's ability to link with GIS data and functions.

Most people have seen the results of CGI, but probably don't realize how "Static" HTML documents differ from "Responsive" HTML documents. For a "Static" HTML document, the browser requests a URL from the HTTP server. The server checks whether that document exists; if it does, the server sends the document to the browser. "Responsive" HTML documents are built to fulfill a specific request from the browser. The HTTP server looks at the data the browser sent to it and runs the appropriate script or program. The script or program processes the request with the information supplied, generates an HTML document, and passes the completed HTML document back to the HTTP server. The HTTP server then sends the finished document back to the browser. Most "Responsive" CGI HTML documents have user-input forms of some sort, usually containing buttons, check boxes, selections and/or text fields.
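
The difference between the two kinds of document can be illustrated with a small dispatcher. This is only a sketch under stated assumptions: the URL paths, the STATIC_DOCS and SCRIPTS tables, and the greet_script program are all hypothetical and written in Python for illustration.

```python
import html
from urllib.parse import parse_qs

# Hypothetical documents that already exist on the server.
STATIC_DOCS = {"/index.html": "<html><body>Welcome</body></html>"}


def greet_script(query_string):
    """Hypothetical CGI program: builds a page from the submitted form fields."""
    fields = parse_qs(query_string)
    name = fields.get("name", ["guest"])[0]
    return f"<html><body>Hello, {html.escape(name)}</body></html>"


# Hypothetical mapping of script URLs to the programs the server runs.
SCRIPTS = {"/cgi-bin/greet": greet_script}


def handle(url):
    """Return (status, document) the way a simple HTTP server might."""
    path, _, query = url.partition("?")
    if path in SCRIPTS:
        # "Responsive": run the program and return the page it generates.
        return 200, SCRIPTS[path](query)
    if path in STATIC_DOCS:
        # "Static": the document exists, so send it unchanged.
        return 200, STATIC_DOCS[path]
    return 404, "<html><body>Not found</body></html>"
```

Requesting /index.html always yields the same stored page, while /cgi-bin/greet?name=Ann yields a page generated from the form data sent with that request.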

How does a CGI-based Internet GIS work? In a nutshell, CGI scripts are called by the Web server based on the user request submitted via the browser. The scripts launch the GIS server and translate the request to a format that a GIS server can interpret. The GIS server then performs the analysis and sends the output back to the CGI script. The CGI script sends the output and associated information to the Web server and browser for display.
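
The request cycle just described might look roughly like the following. This is a sketch, not a real GIS interface: the DRAW command syntax, the fake_gis_server stand-in, and the layer and extent parameters are all invented for illustration, and a real script would launch the actual GIS software instead.

```python
from urllib.parse import parse_qs


def translate_request(query_string):
    """Turn browser parameters into a command string for the GIS server (invented syntax)."""
    params = parse_qs(query_string)
    layer = params.get("layer", ["parcels"])[0]
    if not layer.isalnum():
        raise ValueError("layer names must be alphanumeric")
    extent = params.get("extent", ["0,0,100,100"])[0]
    xmin, ymin, xmax, ymax = (float(v) for v in extent.split(","))
    return f"DRAW {layer} {xmin} {ymin} {xmax} {ymax}"


def fake_gis_server(command):
    """Stand-in for the GIS server: returns the name of the map image it 'rendered'."""
    verb, layer, *coords = command.split()
    if verb != "DRAW" or len(coords) != 4:
        raise ValueError(f"unsupported command: {command}")
    return f"{layer}_map.gif"


def cgi_script(query_string, gis=fake_gis_server):
    """Run the full cycle: translate, call the GIS server, wrap the output in HTML."""
    command = translate_request(query_string)
    image = gis(command)
    return (
        "Content-Type: text/html\r\n\r\n"
        f'<html><body><img src="/maps/{image}"></body></html>'
    )
```

Passing the GIS call in as a parameter keeps the translation step separate from the back end, so the stand-in could be swapped for a function that starts the real GIS process.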

Pros and Cons of CGI-based Internet GIS

CGI-based Internet GIS focuses solely on the server-side operation. The GIS server does all the work, and the Web browser is a user-friendly front-end interface. The CGI scripts act as the translator between the browser client and the GIS server.

The processing workload on the client side is minimal. Since all processing is conducted by the GIS server, the CGI-based Internet GIS can take advantage of the functionality of existing GIS server software such as ArcInfo.

The CGI-based Internet GIS, however, is restricted by the limitations inherent in the Web browser and static HTML. Server-side Internet GIS is based on stateless HTTP and CGI scripts. The user cannot work directly with spatial objects as with stand-alone GIS software. In addition, an HTTP Web server doesn't remember anything between requests. The whole routine, from browser to Web server, invoking the CGI script, and initializing the GIS server, must be repeated if a user wants to pan in the map delivered by the Web server. The end result is an increase in traffic on the Internet.
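
Because the server keeps no memory between requests, every pan request has to carry the current map extent with it. The sketch below illustrates that consequence; the extent and pan parameter names, the /cgi-bin/map URL, and the half-window pan step are assumptions made for the example.

```python
from urllib.parse import parse_qs, urlencode


def pan(query_string, fraction=0.5):
    """Compute the new extent for a pan request that carries its own state."""
    params = parse_qs(query_string)
    xmin, ymin, xmax, ymax = (float(v) for v in params["extent"][0].split(","))
    direction = params.get("pan", [""])[0]
    dx = (xmax - xmin) * fraction
    dy = (ymax - ymin) * fraction
    shifts = {"east": (dx, 0), "west": (-dx, 0), "north": (0, dy), "south": (0, -dy)}
    sx, sy = shifts.get(direction, (0, 0))  # unknown direction: extent unchanged
    return (xmin + sx, ymin + sy, xmax + sx, ymax + sy)


def next_link(extent, direction):
    """Build the URL for the next request; the current extent must travel with it."""
    query = urlencode({"extent": ",".join(str(v) for v in extent), "pan": direction})
    return "/cgi-bin/map?" + query
```

Each link embedded in the returned page would be built with next_link, so the browser hands the extent back on the next click and the full request cycle runs again.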

In addition, every operation must be conducted by the GIS server, creating a bottleneck during high usage periods. This slows the transmission of information between the CGI script, the GIS server, and the browser user. Since the CGI script single-handedly handles all requests from the Web browser and then interprets all output from the GIS server, it becomes very difficult for the CGI script to handle large numbers of requests from users, especially concurrent requests. A considerable load is placed on the server of a frequently accessed site. The CGI script can also be a vulnerable point. When the CGI script or the GIS server fails to work properly, the whole system will fail.

Finally, the product of all the GIS server's work is more static images. The Web browser passively displays these static map images. The only interaction with HTML documents is by selecting hyperlinks. The limitations inherent to the Web prohibit the direct manipulation of maps in the browser. For example, the user cannot select a feature by dragging a rectangle, circle, or irregular polygon on a map image. Likewise, the user cannot select a linear or a point feature on the map.

There are many reasons why it is worthwhile to integrate a governmental Land Records System or GIS with the WWW environment using CGI:

An economical solution
Provides a single venue for sharing data and applications
GIS computing power performed on the server side can be brought to the user without requiring the user to purchase expensive GIS software and hardware
A platform- and device-independent standard user interface environment already exists (e.g. the Netscape browser is free and works on UNIX, Windows and Mac platforms)
The WWW is a hypermedia, multimedia environment, allowing easy integration of many types of information
Can be a source of revenue generation
A GIS project integrated with the WWW has unlimited potential

Performance of Internet GIS

Although instructions required to initiate GIS processing can be transmitted across the Internet in a relatively compact manner, a major drawback of current Internet GIS technology is its slow performance. It takes a long time to transfer vector and image data. This will become more evident as additional analysis functions are added. A significant amount of time is spent waiting for data. Slow response times usually cause the user to lose interest.

Slow performance can be addressed in two ways: increasing the speed of the Internet connection and developing more efficient Internet GIS programs. The speed of the Internet connection can be improved by using faster modems and faster communication connections. Alternatively, an efficient design of the Internet GIS program will make it feasible to run even over a slower connection. A modularized design of GIS analysis tools and data, together with a just-in-time delivery mechanism, can allow the Internet GIS user to download only the minimum GIS functions and data initially. Additional data and analysis tools can be delivered to the user as needed.

Security

Web servers offer two levels of access control. The first is "domain-level" control, in which the web server accepts or rejects a connection based on the IP address of the requesting browser. This level of access control is ideal for an internal company web server, since access can be limited to a whole class C network or an even larger one. The second level of control is "user authentication", in which the client must enter a user ID and password to validate the right to access the requested document. This technique can be used to implement subscriptions in your HTML directory. The downside of user authentication is that not all browsers support it. The best solution is to use a combination of domain-level and user authentication access controls. All access controls work with your HTML directory tree, not with individual HTML documents. If you need to control access to one individual document, that document must be put into a separate directory.

The access control you set in your web server affects which documents it will send to a browser, NOT what an already-logged-in user can do. Having access to web document directories does not require an actual account on your server. Security is less of an issue for server-side applications, because no program code is executed on the user's local machine.
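
The combination of domain-level control and user authentication can be sketched as follows. The network address, the user table, and the function names are assumptions chosen for illustration; in practice these checks are normally configured in the web server itself rather than written by hand.

```python
import ipaddress

# Assumed values for illustration only.
ALLOWED_NETWORK = ipaddress.ip_network("192.168.10.0/24")  # one class C-sized network
USERS = {"clerk": "s3cret"}  # a real server would store password hashes, not plain text


def domain_allowed(client_ip):
    """Domain-level control: accept only addresses inside the allowed network."""
    return ipaddress.ip_address(client_ip) in ALLOWED_NETWORK


def user_authenticated(user_id, password):
    """User authentication: the id must be known and the password must match."""
    return user_id in USERS and USERS[user_id] == password


def may_access(client_ip, user_id, password):
    """Combine both levels: the address and the credentials must both pass."""
    return domain_allowed(client_ip) and user_authenticated(user_id, password)
```

With these assumed values, a request from 192.168.10.42 with the right credentials is accepted, while the same credentials presented from an address outside the network are rejected.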

Security issues are not unique to Internet GIS; they affect all Internet applications. Internet GIS can take advantage of fail-safe measures as they are developed for the Internet as a whole. Regardless, by implementing access control you can create many entrepreneurial opportunities for your web site.

Institutional Issues

In addition to the issues already discussed, other issues affect the development of Internet GIS, including institutional and legal concerns and cost recovery for the development of a Web-based solution. Policies and procedures must be adopted that determine which information can be published on the Internet and made accessible to the general public. The liability consequences of false or inaccurate information, especially GIS information published by government, must be minimized at all costs.

Since Internet GIS can be accessed by anyone who has Internet access, who should pay for cost recovery or profit? Should taxpayers or for-profit companies be charged to access Internet GIS data? How about analysis tools? If so, what is the fee schedule and how should they be charged?

Data Publishing

CHESCO-LRS has 20 ArcStorm data layers and over 100 Oracle tables. To let end users access the database both internally and externally, two types of data publishing are anticipated. The first is to deploy ArcStorm data layers and Oracle tables as shapefiles and dbf files to ArcView users in the county departments. CHESCOView, a customized ArcView project, has been written to provide a streamlined internal interface for accessing the land records database. This is a temporary solution necessitated by the county's outdated network. After the county completes its upgrade of the Intranet to a 10/100 Mbps network (mid-summer), Esri's map server will be used to serve internal users.

For the general public, only the data structure and basic statistical information will be published on the Web. An AML interface was written to generate web pages for different data types such as ArcStorm data layers, Librarian libraries, etc. The following is one of the AML menus.

Figure 3 AML Data Publishing Interface

The program generates web pages stored in multi-level directories and provides the description of each data layer, the item names and definitions in each INFO file, etc. Here are some examples of such web pages: ASDB, profile of ASDB, etc. You can find out more about this program by clicking here.
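
The County's program is written in AML and is not reproduced here. The Python sketch below only illustrates the general idea of writing one descriptive page per data layer into a multi-level directory tree; the CATALOGUE contents, the layer and item names, and the directory layout are all hypothetical.

```python
import html
import tempfile
from pathlib import Path

# Hypothetical catalogue: data type -> layer -> (description, {item name: definition}).
CATALOGUE = {
    "arcstorm": {
        "parcels": ("Tax parcel polygons", {"PARCEL_ID": "Parcel identifier"}),
    },
    "librarian": {
        "roads": ("Street centerlines", {"NAME": "Street name"}),
    },
}


def publish(catalogue, root):
    """Write one HTML page per layer under root/<data type>/<layer>/index.html."""
    written = []
    for data_type, layers in catalogue.items():
        for layer, (description, items) in layers.items():
            rows = "".join(
                f"<tr><td>{html.escape(n)}</td><td>{html.escape(d)}</td></tr>"
                for n, d in items.items()
            )
            page = (
                f"<html><body><h1>{html.escape(layer)}</h1>"
                f"<p>{html.escape(description)}</p><table>{rows}</table></body></html>"
            )
            target = Path(root) / data_type / layer / "index.html"
            target.parent.mkdir(parents=True, exist_ok=True)
            target.write_text(page, encoding="utf-8")
            written.append(target)
    return written


if __name__ == "__main__":
    # Publish into a throwaway directory and list the pages that were written.
    with tempfile.TemporaryDirectory() as tmp:
        for path in publish(CATALOGUE, tmp):
            print(path.relative_to(tmp))
```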

A web-driven interface will be developed to run this program so that a dedicated person can publish such information periodically through a web browser.

Accessing Data

The casual user application, CHESCOView, uses ArcView windows as the standard user interface with customized buttons and functions. When the application is initiated, the County's tax grid and a key map with major streets and highways will be the default display in the navigation window.

CHESCOView resides on the County's Banyan server. To access the application and data, the user's PC workstation must be connected to the County's LAN. The user is given "read only" access to the application. Any views, projects, thematic maps, map plots, etc. generated by the user can be saved to the user's local drive. The user is responsible for managing the files saved to his/her local machine.

CHESCOView includes three primary functions: navigation, view selection, and plotting. View selection will include options for predefined views, user-defined views, and thematic mapping. These functions will be supported by adding functionality to the ArcView 3.0 interface via menus, buttons, tools, and dialogue boxes.

CHESCOView provides four menus which have been added to the ArcView interface: 1) navigation: provides criteria for setting view extents (display window); 2) select themes: provides for selection of themes to include in a view; 3) query: provides menu items for ArcView query functions; and 4) plot: provides selection of customized layouts and scale.

The public access interface, CHESCOGateway, provides two types of access to users through the Web. One is for general users; the other is for registered users. General users will be able to browse information and conduct some simple queries. Registered users will be able not only to browse and query the database, but also to download data, generate reports, and design simple maps. The reports and maps can be delivered as electronic files or printed documents as requested by the user. A password or some other kind of access control is needed for registered users to get into restricted areas where protected databases reside.
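
The two tiers of access described above can be expressed as a simple permission table. This is a sketch only: the action names and the is_allowed helper are hypothetical and are not taken from CHESCOGateway.

```python
# Assumed mapping of user types to permitted actions, following the description above.
PERMISSIONS = {
    "general": {"browse", "query"},
    "registered": {"browse", "query", "download", "report", "map"},
}


def is_allowed(user_type, action):
    """Return True if the given type of user may perform the action."""
    # Unknown user types get an empty permission set, so everything is denied.
    return action in PERMISSIONS.get(user_type, set())
```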

The objective in designing these interfaces is to provide access that is consistent, has a standard look and feel, is intuitive and easy to use, and can be navigated quickly.