Gregory L. Easson and David J. Barr

Integration of GIS and Artificial Neural Networks for Natural Resource Applications

Abstract.

The integration of Artificial Neural Networks (ANN) and GIS can be used to interpret natural resource information. The feasibility of integrating these two technologies has been demonstrated in recent research and has potential applications to other forms of geological and hydrological interpretation. A major impediment to the use of this integration is the cumbersome flow of information from the PC-based ANN to UNIX workstation ArcInfo.

This paper will discuss the information flow requirements to more seamlessly integrate workstation ArcInfo and a PC-based ANN. To streamline the flow of information from a GIS to an ANN and back again, a series of programs is being designed and written. These programs will accept data from ArcInfo and reformat the information into training and data files for an ANN. Another series of programs will then take the resultant files from the ANN and convert them into an ArcInfo coverage for display and map production. The programs will consist of AML routines and C programs for both UNIX and PC environments.


Introduction.

The integration of Geographic Information Systems (GIS) and Artificial Neural Networks (ANN) offers a potential mechanism to lower the cost of analysis of geologic and hydrologic information by reducing the amount of time spent interpreting data. This integration allows the interpretive result from a small area to be transferred to a larger, geologically and hydrologically similar area, without the additional time and expense of placing the geologist in the field for a time sufficient to cover an entire project area.

One of the most cumbersome aspects of this type of application is the transfer of data from the GIS to the ANN and back. The geologic and hydrologic data used to produce an interpretive result is most efficiently managed in vector form. However, one of the most widely used ANNs, a Back-Propagation Neural Network, requires data in raster form that is arranged in one-dimensional column vectors. Therefore, an efficient methodology to transfer information between the GIS and the ANN is needed. An efficient translation methodology and user interface will allow for a trained ANN to become a more useful tool.

This paper will discuss the concepts of Artificial Neural Networks and the data requirements of a Back-Propagation Neural Network. The paper will also outline the steps required to translate data from workstation ArcInfo to a PC-based ANN. These steps form the basis of the design of a series of programs, AMLs and other programs, that will automate these steps and increase the ease of integration of these two technologies.

Artificial Neural Networks.

An Artificial Neural Network, sometimes referred to as simply a Neural Network, is a computer program designed to model the human brain and its ability to learn tasks (Haykin, 1994). An artificial neural network differs from other forms of computer intelligence in that it is not rule-based, as in an expert system. An ANN is trained to recognize and generalize the relationship between a set of inputs and outputs.

Early artificial neural networks were inspired by perceptions of how the human brain operates. In recent years the developments in ANN technology have made it more of an applied mathematical technique that has some similarities to the human brain. Artificial neural networks retain as primary features two characteristics of the brain: the ability to 'learn' and to generalize from limited information (Hewitson and Crane, 1994).

Neural Networks, both biological and artificial, employ large numbers of densely interconnected simple processing elements, or neurons. In artificial neural networks, knowledge is stored as the strengths of the interconnection weights (numeric parameters), which are modified through a process called learning, using a learning algorithm. This algorithmic function, in conjunction with a learning rule (e.g., back-propagation), is used to modify the weights in the network in an orderly fashion.

Unlike most computer applications, an ANN is not 'programmed'; rather, it is 'taught' to give an acceptable answer to a particular problem. Input and output values are sent to the ANN, initial weights are assigned to the connections in the architecture of the ANN, and the ANN repeatedly adjusts those interconnection weights until it can successfully produce output values that match the original values. This weighted matrix of interconnections allows the neural network to learn and remember (Obermeier and Barron, 1989).

The first step in utilizing an ANN to solve a problem is to train the ANN to 'learn' the relationship between the inputs and outputs. This is accomplished by presenting the network with examples of known inputs and outputs, in conjunction with a learning rule. The ANN maps the relationship between the inputs and outputs and then modifies its internal functions to determine the best relationship that can be represented by the ANN.

The inner workings and processing of an ANN are often thought of as a 'black box' with inputs and outputs. One useful analogy that helps in understanding the mechanism occurring inside the 'black box' is to consider the neural network as a super-form of multiple regression (Hewitson and Crane, 1994). Just as linear regression finds the relationship y = f(x), the neural network finds some function f(x) when trained. However, the neural network is not limited to linear functions. It finds the best function it can, given the complexity used in the network, and without the constraint of linearity (Hewitson and Crane, 1994).

Back-Propagation Artificial Neural Networks.

The basic structure of an ANN, including the back-propagation ANN, consists of layers of neurons or processing elements. (See Figure 1.) These layers are the input layer, output layer, and hidden layer. Hidden layers are so named because they have no connections external to the network. Generally, for most applications, one hidden layer is sufficient. More than one hidden layer greatly increases the amount of time required for training and testing without noticeable improvement in performance. While Figure 1 shows a relatively simple neural network, increasingly complex networks can be developed by employing more hidden layers and more intra-layer connections.

Figure 1. Basic Structure of an Artificial Neural Network (after Eberhart and Dobbins, 1990, p. 37).

The input layer of a neural network presents the input data to the processing neurons of the network. Data patterns, which are created by the translation of data from vector to raster form and then to a one-dimensional column vector (see Figure 2), are simultaneously passed forward from the input layer to a processing layer. A pattern consists of the value for each input (plus the output value, if training) for a given location. The number of inputs depends on the type of problem to be solved. In the initial proof-of-concept work from which this investigation extends (Easson, 1996), the number of inputs was equal to the number of parameters leading to an engineering geological map of an area -- four. The input data can be either binary or continuous.

Figure 2. Conversion of Mapped Data to ANN Patterns (from Easson, 1996).

The hidden layers receive the data from the input layer. Each connection in the hidden layer has a weight, or strength, of connection associated with it. Each neuron of the input layer is connected to each neuron in the hidden layer. In the same fashion, each neuron of the hidden layer is connected to each neuron in the next layer. The next layer may be another hidden layer or the output layer. In a feed-forward type of network the data flow is from the input layer to the output layer, through the hidden layer or layers. In a back-propagation ANN the feed-forward pass is followed by a backward pass during which the interconnection weights between neurons are modified based on error values.

The output layer produces the final results of processing by the ANN. During the training phase, these output results are compared with the known output, the error is calculated, and the interconnection weights are adjusted. After training is completed, the output layer produces the values that are returned to the GIS for production of the preliminary engineering geological map.

Data Requirements.

As with all GIS projects, the first step is the conversion of all information into digital form. For a natural resource application, the information tends to consist of maps of geologic and hydrologic information. This information is usually polygonal and includes themes such as bedrock geology and soil type. The other common type of input for natural resource applications is surfaces. This type of data can be stored as arcs with the elevation or thickness as an arc attribute. However, for analytic purposes, surfaces are best stored as grids or lattices.

For an ANN to be used as a tool to interpret geologic and hydrologic information, the map information must be converted into patterns (as shown in Figure 2, above). These patterns consist of a value for each input theme at a given location. During the training of the ANN, the patterns must also contain the accepted output value for each location. Once the trained ANN has produced an interpretive result, the result must be converted back into ArcInfo (generally as a polygon coverage) for the production of an output map. This map, showing the interpretive result, can be evaluated to see if further training is needed.

Training and Testing the ANN.

Once the geologic and hydrologic data and the engineering geological map have been digitized and coded in the GIS, these data can then be reformatted and presented to the ANN. The information required to train the ANN is the inputs with an accepted output for a subset of the information from the training area. The 'knowledge' gained by the network during the training phase is stored in the interconnection weights. The relationship learned by the ANN must be tested to determine the quality of the knowledge.

Testing of the trained ANN requires that the inputs, but not the outputs, for another portion of the training area be presented to the ANN. The accepted output is compared to the output produced by the ANN to see if the ANN's output is acceptable. If the output produced by the trained ANN is correct within accepted error ranges, the ANN can then be used in other geologically and hydrologically similar areas.

Translation Steps.

The steps required to translate information for each input and output from workstation ArcInfo to a PC-based ANN are outlined below. These steps occur on both workstation and PC platforms and are listed in the order in which they need to occur. They begin after the data have been automated and are in the form of ArcInfo coverages. After the trained ANN has produced an output, these steps are reversed to put the information back into ArcInfo for final map production.

Step 1:

Convert each polygon coverage to a grid with the POLYGRID command. The size of the grid cell depends on the resolution needs of the final product.

Step 2:

Convert the grid information to ASCII using the GRIDASCII command and remove the header information written to the file in GRID. This file must be reformatted into a one-dimensional column vector. The length of this vector is equal to the number of cells in the training or application area.
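The header-stripping and reformatting in this step might be sketched in C as follows. For brevity the function works on an in-memory string rather than a file, and it assumes the standard six-line header that GRIDASCII writes (ncols, nrows, xllcorner, yllcorner, cellsize, NODATA_value); the function name is ours.

```c
#include <stdlib.h>
#include <string.h>

/* Skip the six GRIDASCII header lines and copy the remaining cell values
   into `out`, one value per element, forming the one-dimensional column
   vector the ANN expects.  Returns the number of values stored. */
static int gridascii_to_column(const char *text, double *out, int max) {
    const char *p = text;
    for (int i = 0; i < 6; i++) {   /* skip the six header lines */
        p = strchr(p, '\n');
        if (p == NULL) return 0;
        p++;
    }
    int n = 0;
    char *end;
    while (n < max) {
        double v = strtod(p, &end); /* parse the next cell value */
        if (end == p) break;        /* no numbers left           */
        out[n++] = v;
        p = end;
    }
    return n;
}
```

Because strtod skips all intervening whitespace, the row structure of the ASCII grid is flattened automatically, which is exactly the reformatting this step requires.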

Step 3:

Combine the individual column vectors into input files with the number of columns equal to the number of inputs. During the training of the ANN the number of columns will be equal to the number of inputs plus the number of outputs. The column vectors are combined using the paste command in the UNIX operating system. The data are now in the form required for the ANN; each row is a pattern of the conditions present in a given cell.
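The effect of the paste step can be illustrated with a small C sketch (a hypothetical helper for in-memory arrays, not one of the programs described here):

```c
/* Combine k one-dimensional column vectors (one per input theme, each of
   length n cells) into pattern rows, mirroring what the UNIX paste
   command does to the ASCII files: patterns[i * k + j] is the value of
   theme j at cell i, so each row of k values is one ANN pattern. */
static void columns_to_patterns(const double *const cols[], int k, int n,
                                double *patterns) {
    for (int i = 0; i < n; i++)     /* one pattern per cell */
        for (int j = 0; j < k; j++) /* one column per theme */
            patterns[i * k + j] = cols[j][i];
}
```

During training, the accepted output is simply pasted on as one more column, so k becomes the number of inputs plus the number of outputs.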

Step 4:

After the information has been processed by the trained ANN, the output vector must be extracted from the pattern file. This one-dimensional column vector is then converted back into a format compatible with GRID and read back into ArcInfo.
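One possible C sketch of this reverse conversion writes the six-line header followed by the cell values in row order; it is buffer-based for brevity, the function name is ours, and a NODATA value of -9999 is assumed.

```c
#include <stdio.h>

/* Rebuild an ASCII grid file from the ANN's one-dimensional output
   vector so that it can be read back into GRID.  Writes the six header
   lines, then nrows rows of ncols values, into `buf`; returns the
   number of characters written. */
static int column_to_gridascii(const double *col, int nrows, int ncols,
                               double xll, double yll, double cellsize,
                               char *buf, int bufsize) {
    int len = snprintf(buf, bufsize,
        "ncols %d\nnrows %d\nxllcorner %g\nyllcorner %g\n"
        "cellsize %g\nNODATA_value -9999\n",
        ncols, nrows, xll, yll, cellsize);
    for (int r = 0; r < nrows; r++)             /* one line per grid row  */
        for (int c = 0; c < ncols; c++)         /* space- or newline-     */
            len += snprintf(buf + len, bufsize - len, "%g%c",
                            col[r * ncols + c], /* separated cell values  */
                            (c == ncols - 1) ? '\n' : ' ');
    return len;
}
```

The header values (origin, cell size, dimensions) must match those of the input grids from Step 1 so that the interpreted cells register with the original coverages.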

Step 5:

This step is optional. Convert the grid information into polygons that represent the desired interpretive result. This information is then evaluated, combined with other coverage information and used to produce a final output map.

Conclusions.

Artificial Neural Networks have proven to be useful in the interpretation of natural resource information. Back-Propagation Neural Networks are one of the most common and widely used architectures. Many architectures and types of ANNs have been developed, and many of them are PC-based. To increase the usability of ANNs for map-based applications, a more efficient methodology is necessary for communication between ArcInfo and a trained ANN.

The above methodology was tested in research designed to use a trained ANN to interpret geologic and hydrologic information to produce an engineering geological map (Easson, 1996). At that time, the steps were performed manually. The methodology and steps presented above will be automated using AML and C programs on a networked workstation (with ArcInfo) and PC (with a Windows-based Back-Propagation Neural Network). The project is scheduled to be completed in August 1996. Once completed, the translator will not be limited to engineering geological mapping, but will be applicable to many other kinds of analyses.

References.

Easson, Gregory L., 1996. "Integration of Artificial Neural Networks and Geographic Information Systems for Engineering Geological Mapping", Unpublished Doctoral Dissertation. University of Missouri--Rolla, 154p.

Eberhart, Russell C. and Roy W. Dobbins, 1990. Neural Network PC Tools: A Practical Guide. San Diego, California: Academic Press, Inc., 414p.

Haykin, Simon, 1994. Neural Networks, A Comprehensive Foundation. New York: Macmillan College Publishing Company, 696p.

Hewitson, Bruce C. and Robert G. Crane, 1994. "Looks and Uses" in Hewitson, Bruce C. and Robert G. Crane, eds. Neural Nets: Applications in Geography. Boston: Kluwer Academic Publishers, pp. 1-9.

Obermeier, Klaus K. and James J. Barron, 1989. "Time to Get Fired Up" in Byte, Aug. 1989, pp. 217-233.


Author Information.

Dr. Gregory L. Easson, Assistant Professor
Department of Geology and Geological Engineering
University of Mississippi
University, Mississippi 38677
Telephone: (601) 232 5995
Fax: (601) 232 7796
E-mail: geasson@sunset.backbone.olemiss.edu or geasson@teclink.net

Dr. David J. Barr, Professor
Department of Geological and Petroleum Engineering
University of Missouri--Rolla
Rolla, Missouri 65401
Telephone: (573) 341 4867
Fax: (573) 341 6935
E-mail: davebarr@umr.edu