Carsten Bjornsson & Niels Strange

Heuristic Allocation of Wetlands in GIS

The problem of finding the optimal allocation of wetlands is approached from an evolutionary viewpoint using a genetic algorithm. Two different and spatially dependent wetlands are allocated along a network of polluted streams in the Skjern Å delta, which is located in the western part of Denmark. The objective is to minimize the operations costs of such wetlands and minimize deviations from stated [Fe2+], which are suitable for salmon habitat. The stream-order and the wetland-order influence the feasibility space and restrict the traditional transition operators such as crossover and mutation. It is shown that the algorithm finds near-optimal solutions in a reasonable number of iterations. A GIS-modeling environment is supplied with these static solutions and the dynamic effects are modeled.

Keywords: Genetic algorithms, geographical information systems, optimization.


1. Introduction

Landscape planners are increasingly confronted with legislative demands to plan and design solutions for complex spatial and temporal problems in the landscape through different types of impact assessments. Solutions to any given location problem has an infinite number of solutions where the challenge is to find the most optimal solution based on a set of criteria (Steinitz, 2000).

As an example mitigation and restoration of wetlands represents a complex and long-term spatial problem. Location as well as sequences of wetlands along riparian zones in the watershed is of importance (Mitch, 1992) where wetlands poorly located might act as sources instead of sinks thus polluting further downstream. Also mitigation and restoration is performed with constrained budgets where optimal. While feasible solutions in large scale based on expert judgment might easily be sought some excellent solutions in smaller scale may be missed. Based on this argument Chiang et.al. (1977) implements a heuristic algorithm to optimize the location of wastewater treatment plants.

Genetic algorithms (GA) belong to a class of algorithms termed evolutionary computation, which also includes evolutionary programming, evolutionary strategies, and genetic programming. A review of the history is presented in, e.g. Bäck (1996). They are population oriented and use selection, mutation, and recombination operators to generate new solutions in a search space.

Since wetlands represent complex systems with high level of spatial and temporal interactions where objectives often conflict the allocation should be based on multiple criteria evaluation. This paper uses a genetic algorithm embedded in a GIS environment to solve this allocation problem.

Section 2 contains the allocation problem. The dynamic modeling system is described in Section 3. An introduction to the genetic algorithm is provided in Section 4. The results of the experiments on using the genetic algorithm to solve the problem including a pre-optimization of the parameter settings, and the dynamic modeling are presented in Section 5. A discussion and conclusions are given in Sections 6.


2. The Allocation Problem

Due to agricultural drainage in extensive farming areas, open pit mining and geological structures of the Western part of Jutland, Denmark many streams are polluted with precipitated iron hydroxides primarily ochre (Fe(OH)3). In recent years wetlands has been reintroduced as a long-term low cost effort to facilitate and re-mediate precipitated ochre. Reintroducing wetlands in riparian areas concentration of ferrous iron (Fe2++) and pH are among the key determinants for ochre precipitation processes and retention time for the remediation processes. Two different kinds of wetlands may be located along the streams to increase the water quality of the system.

The planning objective is to minimize the costs of operating such wetlands and minimize deviations from stated [Fe2+], which are found suitable for salmon habitat. The problem is based on real water quality data as well as realistic objectives. The watershed is modeled as grid cells, and each cell is connected to at least one up stream and down stream cell. Altitude, stream flow and water content are modeled in the system.

Let c denote a specific cell in the stream and let C represent the stream configuration of the whole Skjern Å delta. Each cell p2381.gif - 891 Bytes may be used in different ways. Considering the choice of wetlands we let D denote the possible decision alternatives in each cell, and p2382.gif - 950 Bytes denotes the specific alternative picked for cell c. In the problem each cell may be seeded with three different wetlands, and therefore D={no wetland, wetland-A and -B}. A specific allocation configuration of wetlands along the stream can be denoted as . Additionally, let denote the resulting water quality of ferrous iron in each cell configuration.


2.1 Problem formulation

The planning decision, that the local community faces, is the one of obtaining a certain level of water quality without violating a budget constraint. The construction and maintenance costs of each type of wetland vary depending on size. Usually, more wetlands increase water quality. However, some upstream configuration may in fact reduce the effects of some downstream configurations towards zero. Hence, some logic rules to test this assumption could be implemented.

The stated objective is to minimize the overall value of the wetland configuration represented by costs subject to water quality represented by maximum [Fe2+] for all p2381.gif - 891 Bytes. The mathematical formulation of the problem for the whole stream is expressed as maximizing the number cells with improved deviations from p2386.gif - 874 Bytes:

p2387.gif - 1372 Bytes

subject to the stated, the financial constraint is formulated as:

p2388.gif - 1166 Bytes

where p2386.gif - 874 Bytes represent threshold values of [Fe2+] for each cell in the stream. In the following we will give some real content to (1) and (2).


3. Modeling wetlands

The wetland transformation model presented here consists of a hydrological model (HYM) and a chemical transformation model (CTM) for the oxidation, hydrolysis and precipitation of ochre.


3.1 The hydrological model (HYM)

Retention time and dilution in a wetland represents some of the key parameters for the transformation and sedimentation of ferrous iron and ochre. Let Qi + dQ = Qo be hydrologic inflow and outflow rate from the cell. Using Manning's equation for open channel flow can be used to calculate stream velocity V (m/sec) in the wetland (Bjørnsson, 1999) when dimensions are changed:

p2389.gif - 1462 Bytes

From sample stations along rivers Q is interpolated for each 25 m's by combining surface- and subsurface flow accumulation grids as key index for interpolation. Using Newton iterative approach combing (1) and (2) and knowing width, slope (S) and estimating Manning's quotient for kinematics' friction (n) V is calculated for each 25 m of the stream.


3.2 The transformation models

Three key processes determine the transformation of ferrous iron into ochre: Oxidation, hydrolysis and precipitation of ochre altogether being net acidic for the wetland. Oxidation happens either biotic or abiotic following the same stocheometric process:

Fe2+ + 1/4O + H+ Þ Fe3+ + 1/2H2O

The rate of oxidation can be expressed as:

p23810.gif - 1396 Bytes

where k is the oxidation constant, [Fe++] the concentration (mg/l) of ferrous iron, pO2 partial pressure of oxygen and pH. Oxidation rate below pH 4.5 is increased by a factor105 due to biotic oxidation. Hydrolysis is referred to as being only abiotic following the equation.

Fe+++ + 2H2O Þ FeOOH + 3H+

Hydrolysis and precipitation of ochre from ferric iron is defined from the relation between solutability product constant (KFe(OH)) and the product between the ion constants (QFe(OH)). For each moles of ochre three moles of H+ is released why transformation without HCO3- buffering creates net acidic conditions in the wetland. The main buffering comes from bicarbonate:

H+ + HCO3- Þ H2CO3


3.3 Construction and maintenance costs

From 7 Danish wetlands the average construction cost for sedimentation basins range from 63 - 295 dkk/m2 and yearly maintenance from 1 - 23 dkk/m2. For vegetated wetlands an average construction costs is 30 - 100 dkk/m2 and the yearly maintenance range from 0 - 20 dkk/m2.


4. Optimization Using Genetic Algorithms

The algorithm operates on a population p23811.gif - 1129 Bytes of l individuals for each generation g. From the initial population subsequent populations P(1),..,P(g) will be computed by employing the three genetic operators of selection, crossover, and mutation. After calculating the fitness value p23812.gif - 950 Bytes , a mating pool is formed by selecting the more fit individuals.

A general formulation of the procedure involves the following steps:

  1. Select an initial population, P, containing l individuals' solutions. p23813.gif - 1129 Bytes where p23820.gif - 927 Bytes represents an individual and the number of generations is denoted g.
  2. Calculate the fitness F of all p23814.gif - 901 Bytes in P, and find the maximum fitness value p23815.gif - 851 Bytes and its solution p23816.gif - 853 Bytes .
  3. As long as the maximum number of generation's gmax is not reached, the following loop should be performed. Set g = g + 1.
  4. Calculate the fitness for all p23817.gif - 959 Bytes and P, find the maximum fitness value p23815.gif - 851 Bytes and its solution p23816.gif - 853 Bytes, calculate p23818.gif - 852 Bytes of P.
  5. Select the more fit p23814.gif - 901 Bytes for a mating pool.
  6. Perform transformation of the mating pool applying mutation and crossover.
  7. When the final generation is reached the maximum fitness value p23815.gif - 851 Bytes and its solution p23819.gif - 845 Bytes are returned.

The selection is based on tournament selection (Baker 1985, Goldberg 1989, Goldberg and Deb 1991). During the mating some individuals are transformed by mutations, which create new individuals by a small change in the individuals' binary structure or by crossover where parts of two or more individuals are combined. A feasible initial solution for the first generation is obtained by randomly assigning one of the alternatives in D to a randomly defined stream segment without violating the budget constraint. In the following, the fitness function, genetic operators and the selection method applied by the GA are presented.


4.1 The fitness function and selection mechanisms

The fitness formulation (eq.1 and eq.1.2) is not used directly as a fitness measure for each stream string. Instead each configuration/solution is given a new fitness value according to its order in the population based on the objective function value.

From eq.1.1 FFe(F(c,d( C))) can be stated as follows:

p23821.gif - 1337 Bytes

where FFe is the rate of cells c below p2386.gif - 874 Bytes . The constraint on construction and maintenance costs is expressed as:

p23822.gif - 1555 Bytes

where fmax represents the threshold cost value. The closer f(c,d(C )) is to fmax the smaller the value. Combining eq.4.1 and eq.4.2 the fitness values for each solution i are given as:

p23823.gif - 1682 Bytes

Each member l of the population P is ranked ascending to Fi and assigned an accumulated probability value determined by its contribution to the total sum of the fitness values of l in the population Pg.


4.2 Genetic operators

The parents, i.e. individuals, which survive the selection process, undergoes the genetic operations, mutation and crossover. Bäck (1996) and Goldberg and Deb (1991) conclude that in terms of selective pressure, tournament selection seems to be superior to ranking and proportional selection. Booker (1987) and Eshelman et al. (1989) find that an increasing number of crossover points may improve the performance of GA.

p23824.gif - 4546 Bytes

From a random generator Y(1,P) two members ln and ln+1 from Pg are selected from a mating pool where F(c,d(C )) 0. Two random numbers Y(1,C) and Y(1,6) are drawn to determine the crossing site and size of the crossing string. A crossover between the two individuals is then performed. From the mating pool the procedure is repeated until the number of individuals l equals the size of the population P.

Mutation occurs if a random value reaches a certain threshold. The random generator selects an individual p23820.gif - 927 Bytes and randomly locates d(C ).

5. Results

Using Arc Info version 7.2.1 (NT and UNIX) a genetic algorithm structure were implemented with an ochre transformation model using AML. To test the optimization algorithm two types of analysis were performed. In the first analysis an expert opinion were given to locate type, size and sequence of wetlands in the riparian zones of the catchments area. The 'rule of thumb' locates the plants as near as possible to the known source of pollution. In the second analysis the same wetland parameters were incorporated into the genetic algorithm.


5.1 The data set

The data was prepared for a small sub-catchments area of the Skjern Å river delta with high rates of point and non-point pollution from ferrous iron. A 25 m grid of the streams was created and water quality parameters generated through interpolation based on a distribution from cell accumulation from surface and subsurface hydrologic flows.

See test data Table 5.1.

p23826.gif - 2383 Bytes

Geographic location of the individual stream cells. The arrow indicates the main flow direction with cell 13 as the main outlet from the sub basin.

The purpose with the data set was to test the behavior of the genetic algorithm why the data set was calibrated to fulfill this purpose. The calibration was performed using the ochre transformation model (section 4) to stabilize each cell based on in- and output rates. Table 5.1 illustrates the calibrated stabilized data. Both pH and HCO3- were kept as constants since the ochre model showed to be very sensitive to small changes in the pH level. Measurements from 10 Danish wetland systems for ochre remediation showed no noticeable drop in levels of pH and HCO3- when entering and leaving the two types of wetlands why this assumption presently seems valid.


5.2 The simulation criteria's

The maximum construction and maintenance cost for the first year is set to 220.000 dkk. (~27.500 $US). Two types of wetlands were introduced: sedimentation basins and vegetated basins. The input parameters for both wetlands were (cell length = 25m):

Sedimentation basin Vegetated wetland
Depth (m) 0.70 0.25
Wideness (m) 30 30
Construction (dkk/m2) 131 13
Maintenance (dkk/m2) 100 20
Total cost (dkk) 108.000 90.000

Threshold value was set to 1.55 mg/l and maximum cost fmax to 220.000 dkk.

5.3 The expert opinion

When source of pollution is known, experience locates remediation plants closest to the source. From Table 5.1 cells 1 and 7 are the most likely candidates for locating a wetland due to the high concentrations of ferrous iron.

p23829.gif - 5751 Bytes

The numbers in the cells represents the concentration of ferrous iron with shaded cells showing values below the threshold. The 1/1 notation to the left indicates 1 individual with 1 generation. Current conditions, shows 6 cells out of 13 are below threshold value. Locating a vegetated wetland in cell 1 raises the number below threshold value to 8, where the wetland cell and its neighbor are improved. Locating the same type of wetland in cell 7 maintain the 6 cells below threshold. Cells 7, 8 and 9 are improved and cells 10, 11 and 13 are raised to values above threshold. Both solutions represent minimized costs.


5.4 The GA solutions

Two runs were performed with the first run having 50 individuals in a population and 60 individuals in the second. Both runs were set to 5 generations since tests showed near optimal solutions after 5 generations for this subset of the watershed. In the first generation each individual were only allowed to have one wetland. Through mutation, over cross and selection the number of wetlands can increase.

p23830.gif - 5761 Bytes

The best guess from the first run shows a vegetated wetland located in cell 3 and a sedimentation basin in cell 12. The number of cells below average is 7 with one of the wetlands located in a 'good' cell (3). Cost is relative high but within the given maximum prize. The second best guess locates a vegetated wetland in a 'good' cell not changing this solution from the current conditions.

p23831.gif - 5865 Bytes

The second optimization run with 60 individuals has its best guess with two wetlands: a vegetated in cell 7 and a sedimentation basin in cell 12 giving 9 cells below threshold value. Costs are within the limits. The second best guess locates a vegetated wetland in cell 7 parallel to the best expert guess.

From the fitness values in the lower right corners the best solution also has the highest fitness value suggesting that the GA converges towards a near optimum solution. Plotting all accumulated cost values against number of clean cells indicates that the GA in the second run has visited the expert solution 1 and 2.

p23832.gif - 3372 Bytes


Fig5.1. The fitness values for all suggested individuals in the last generations plotted against the number of clean cells reveals that the GA second run has visited the two expert suggestions.


6 Discussion and Conclusions

From the initial two tests the genetic algorithm showed capable of performing an optimization but only during the second run. In the first run it missed the near optimal solutions. Initial test runs revealed that the GA quickly stabilizes itself around optimums in the visited search space. If the search space hasn't been visited near optimum during the generation of a population selection and over crossing will not be able to reach that part of the search space. Only through mutation can the GA reach this part of the search space. The number of initial individuals therefore seems quite important and should be set sufficiently high. Similar high rates of mutations will improve the search. In these simulations the chance of one individual mutating were 10% in each generation with no mutations occurring during the test runs suggesting that this value should be increased with low numbers of generations.

Locating one wetland in the data set gives 13 possible sites. Locating any number of wetlands under a given financial constrain with the chosen wetland dimensions and costs presents 495 possible combinations rapidly adding to the complexity of the problem.

As seen in the second best expert guess these systems are dynamic non-linear systems where one improvement upstream caused downstream problems (a pattern revealed through other simulations). Through optimization the over all best GA guess indicated a possible solution to this problem. By doing so the GA suggests a near optimal solution in the solution space and thus aiding the decision making process on where, in what sequence and with what feasible consequences wetlands could be located.


6.1 Some considerations for future developments

The enemy number one to this approach is processing time. The time used for this simulation was approximately 35 minutes for each generation with 60 individuals, running on NT40, 400MHz Intel xeon processor. Increasing the size of the watershed will cause the processing time to increase >> n2, why guided search might be an option. One way to improve the search in the search space is to visit only those cells where problems occur. In the given case only visiting cells above threshold values. Another option is to reuse some of the previous calculations since the GA during its optimization approaches a near optimal solution consisting of relative identical strings in the stream network.

References

Bäck, T. 1996. Evolutionary Algorithms in Theory and Practice. Evolution Strategies, Evolutionary Programming, Genetic Algorithms. Oxford University Press, Oxford, 314 p.

Baker, J.E. 1985. Adaptive selection methods for genetic algorithms. In J.J. Grefenstette (ed.). Genetic algorithms and their applications. Proceedings of the second international conference on genetic algorithms. Lawrence Erlbaum Associates, Hillsdale, New Jersey, pp. 101-111.

Bjornsson, C. 1999. Approach to Dynamic Wetland Modelling in GIS, in Geographic Infrmation Research Trans-Atlantic Perspectives ed. Craglia M and Onsrud H, Taylor and Francis, pp. 203-217

Booker, L.B. 1987. Improving search in genetic algorithms. In L. Davis (ed.). Genetic Algorithms and Simulated Annealing. Morgan Kauffmann, Los Altos, California, pp. 61-73.

Chiang, H C and Lauria D T, 1977, Heuristic Algorithm for Wastewater Planning, Journal of The Environmental Engineering Division, Oct. 1977.

Eshelman, L.J., R.A. Caruana and J.D. Schaffer 1989. Biases in the crossover landscape. In J.D. Schaffer (ed.). Proceedings of the third international conference on genetic algorithms. Morgan Kauffmann, San Mateo, California, pp. 10-19.

Goldberg, D. 1989. Genetic algorithms in search, optimization, and machine learning. Addison-Wesley, Reading, Massachusets.

Goldberg, D.E. and K. Deb 1991. A comparative analysis of selection schemes used in genetic algorithms. In: Rawlins, G.J.E. (ed.). Foundations of genetic algorithms. Morgan Kaufmann Publishers, San Mateo, California, pp. 69-93.

Mitch, W.B. and Gosselink J.G. 1993. Wetlands, Von Nostrand Reinhold, p. 272

Steinitz C, 2000, Introduction to the framework model, Presentation at the workshop Applying informations sciences to theories and methods in landscape planning, RVAU Copenhagen.