The Automatic Generalization of the Multiscale Geographic Information

Marco Moreno, Miguel Torres, Serguei Levachkine, Ivan Fajardo
Geoprocessing Laboratory
Centre for Computing Research, IPN
Av. Juan de Dios Bátiz s/n Unidad Profesional "Adolfo López Mateos"
C.P. 07738, México, D.F., México
Tel. (5255)5729-6000 Ext. 56558 & 56590; Fax (5255)5586-2936
marcomoreno@cic.ipn.mx, mtorres@cic.ipn.mx, palych@cic.ipn.mx, ifajardo@ipn.mx
Abstract

Traditionally, changing the scale of a map is a manual process. Geographical Information Systems (GIS) provide tools for this task; however, as in the manual process, the result depends on the experience of the users and on how the tools are used. A method that works automatically yields consistent results according to established rules. Cartographic generalization establishes rules that preserve the geometric and semantic characteristics of the objects. In this work we consider the scale change from 1:50,000 to 1:250,000. The scale change requires the following stages: Analysis, Automatic Correction of Arc Directions, Classification, Selection/Elimination, Simplification and Enhancement. The Automatic Correction of Arc Directions is a very useful function that requires only an altitude layer. The classification is called CLAJER; it orders the arcs of the network by their length. The process has been evaluated on real maps in order to detect inconsistencies in the change of scale. The system has been implemented in Arc/Info by means of AML code.


1. Introduction

Traditionally, spatial objects and phenomena are represented in maps at different scales and for different purposes (geological maps, topographic maps, road maps, etc.). In México, topographic maps are produced at the scales 1:25,000, 1:50,000, 1:200,000, 1:250,000 and 1:1,000,000. In cartography, the process of modifying the scale is called map generalization. Map generalization, or simply generalization, establishes rules for the representation methodology and preserves the geometric and semantic characteristics of the objects. It reduces the complexity of a map, emphasizes its essence, suppresses unimportant details, maintains logical and unambiguous relations between objects, and preserves aesthetic quality. In a certain sense, it is possible to solve the generalization problem in a general form, but such a solution would not account for the topological, logical and geometrical particularities of specific systems of objects. Therefore, the main idea of the present work is to develop solutions that are applicable to specific cases. At present, it is not possible to automatically derive a larger scale from a smaller one without inferring data.

In the context of digital cartography, generalization has acquired an even wider meaning. It includes the possibility of flexibly selecting and composing feature classes. The features are obtained from queries, interactive zooming and inspection of the data at any desired scale (magnification and reduction). Using a GIS to implement a generalization system allows us to obtain consistent results according to established rules. In the context of GIS, generalization implies the application of attribute (database) and spatial (geometric) transformations. Database modification includes reducing the number of entities, eliminating attributes, and making attribute values more robust. Spatial transformations imply reducing the density of objects (or of their parts) and simplifying their shape. All these aspects are considered in this work.
 

2. Analysis

To modify the scale it is necessary to analyze the geometrical and topological characteristics of the hydrological networks at different scales (1:50,000 and 1:250,000), obtained from topographical maps. The objective of the analysis is to find the particularities that the objects show in several networks and at different scales, and to clearly identify the objects that compose them. Hydrological networks are composed of many objects: rivers, drainage, wells, and natural and artificial bodies of water. The results of the analysis must show the characteristics that define the behavior of the hydrological networks. They will be used to implement later procedures. These characteristics are the following. Some objects (e.g. rivers) can be represented by lines or areas depending on the scale, and they are always connected with other elements of the network. Other objects (e.g. lakes) are represented by areas. This is why some configurations require both arcs and polygons to represent the whole network. Hydrological networks present different configurations, depending on the scale, composed of arcs, points, and areas. Some networks are composed solely of arcs. They are called arc-arc configurations and represent flows. Other configurations are called arc-area-arc, point-arc-area, etc., depending on the elements that compose them. See Figure 1.

Figure 1. Hydrological networks configuration.

Due to the complexity of the problem, the analysis focuses on modifying the scale of networks with the arc-arc configuration. In future work it should be possible to consider any network configuration by using the flow direction. Modifying other configurations requires manipulating several vector layers, since lakes and rivers are stored in different layers. When the scale changes, the representation of some objects (principally lakes and rivers) may change too. The map projection of the vector layers used in this work must be UTM or any other projection whose units are meters.
 

3. Correction of arc direction

Vector data are susceptible to errors. For example, wrong arc directions can lead to a wrong classification of the arcs. For this reason, it is important to develop a process that corrects the arc directions automatically and considerably reduces the correction time. The process can use the altitude layers. Its effectiveness depends on the number of altitude intervals used: the more intervals, the greater the number of arcs that can be corrected. The altitudes are extracted using a buffer around the nodes of the flow layer, and the arcs are corrected using an attribute table. We assume that, for every arc, the altitude of the initial node must be greater than the altitude of the final node. The correction is automatic and its use is optional; however, it helps to improve the quality of the data.
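The correction rule can be illustrated with a minimal Python sketch. It is not the AML implementation of the system; the function name and the dictionary layout (arcs as from-node/to-node pairs, plus one altitude per node as it would be obtained from the node buffer) are assumptions made for illustration. An arc is reversed only when its initial node is strictly lower than its final node; arcs whose two nodes fall in the same altitude interval, or have no altitude, are left unchanged, which is one way to read why more intervals allow more arcs to be corrected.

```python
def correct_arc_directions(arcs, node_altitude):
    """Reverse arcs that run uphill.

    arcs: {arc_id: (from_node, to_node)}           (hypothetical layout)
    node_altitude: {node: altitude or None}        (hypothetical layout)
    Returns the corrected arcs and the list of reversed arc ids.
    """
    corrected = {}
    flipped = []
    for arc_id, (fnode, tnode) in arcs.items():
        z_from = node_altitude.get(fnode)
        z_to = node_altitude.get(tnode)
        # Rule from the text: the initial node must be higher than the final node.
        if z_from is not None and z_to is not None and z_from < z_to:
            corrected[arc_id] = (tnode, fnode)
            flipped.append(arc_id)
        else:
            # Correct direction, equal interval, or unknown altitude: keep as is.
            corrected[arc_id] = (fnode, tnode)
    return corrected, flipped


arcs = {1: ("a", "b"), 2: ("b", "c"), 3: ("b", "d")}
altitude = {"a": 300, "b": 200, "c": 400, "d": 100}
corrected, flipped = correct_arc_directions(arcs, altitude)
print(flipped)
```

In this example only arc 2 is reversed, because it is stored as running from the 200 m node to the 400 m node.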
 

4. Classification

Classification serves a specific purpose and usually involves clustering data values into categories. In this work three classification processes are used. Two of them facilitate the implementation of the generalization system, and the third quantifies the components of the Hydrological Network in order to identify its most important elements.
 

4.1 Preliminary Classifications

In order to classify the arcs, they are first clustered into subsystems. The clustering is used to define the connections between arcs. A subsystem consists of elements with possible connections among them; in this case, connections between arcs. Two alternatives exist for the clustering. The first is the use of watersheds (previously elaborated); the second is the use of a buffer, which is a totally automatic process (nothing is elaborated previously). A unique identifier is assigned to each subsystem. Furthermore, the arcs are classified by their location in the Hydrological Network as input (I), output (O) or half-way (P) arcs. The corresponding value is stored in an attribute.
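The two preliminary classifications can be sketched in Python as follows. This is an illustration under stated assumptions, not the procedure of the system: subsystems are derived here from shared nodes (connected components) instead of the watershed or buffer approaches described above, and the function names and data layout are hypothetical. An arc is labeled O when no arc starts at its final node, I when no arc ends at its initial node, and P otherwise; an isolated arc is labeled O.

```python
def assign_subsystems(arcs):
    """Give every arc the identifier of its connected component.

    arcs: {arc_id: (from_node, to_node)}   (hypothetical layout)
    Assumption: connectivity through shared nodes stands in for the
    watershed or buffer clustering of the original system.
    """
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for fnode, tnode in arcs.values():
        parent[find(fnode)] = find(tnode)

    ids = {}
    result = {}
    for arc_id, (fnode, _) in arcs.items():
        root = find(fnode)
        if root not in ids:
            ids[root] = len(ids) + 1  # identifiers start at 1
        result[arc_id] = ids[root]
    return result


def classify_positions(arcs):
    """Label each arc as input (I), output (O) or half-way (P)."""
    from_nodes = {fnode for fnode, _ in arcs.values()}
    to_nodes = {tnode for _, tnode in arcs.values()}
    labels = {}
    for arc_id, (fnode, tnode) in arcs.items():
        if tnode not in from_nodes:
            labels[arc_id] = "O"   # nothing continues downstream
        elif fnode not in to_nodes:
            labels[arc_id] = "I"   # nothing arrives from upstream
        else:
            labels[arc_id] = "P"
    return labels


arcs = {1: ("a", "c"), 2: ("b", "c"), 3: ("c", "d"), 4: ("d", "e"), 5: ("x", "y")}
print(assign_subsystems(arcs))
print(classify_positions(arcs))
```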
 

4.2 Hierarchical Length Classification

The classification is called "Hierarchical Length Classification" (CLAJER) [10]. The process classifies the arcs that form the longest route from an input to the output (called a path). This longest route is of the first order. The second-order arcs form the longest branches of the first-order route. The third-order arcs form the longest branches of a second-order route, and so on. The classification finishes with the fourth-order arcs, because further detail is not required for this particular application. The routes are traced using the attributes FNODE and TNODE. The process is applied to the classified arcs of the same subsystem. The classification yields the length and the order, which are stored in attributes. See Figure 2.

Figure 2. Hierarchical Length Classification (CLAJER).

The classification evaluates the possible routes that the flow can follow from all the inputs to the output. It can be applied iteratively, since the same principle is used to find every order. This classification preserves the characteristics of the Hydrological Network.
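One possible reading of the hierarchical length classification is sketched below in Python. It is an interpretation, not the CLAJER implementation of [10]: it assumes a tree-shaped subsystem with a single output arc and correct arc directions, and the function name and tuple layout are hypothetical. For each arc it computes the length of the longest route that reaches it from any input; starting from the output, the longest upstream branch at every junction keeps the current order and every other branch starts the next order.

```python
from collections import defaultdict


def clajer_orders(arcs, max_order=4):
    """Assign a length-based order to the arcs of one subsystem.

    arcs: {arc_id: (fnode, tnode, length)}   (hypothetical layout)
    Returns {arc_id: order}; arcs beyond max_order are left out.
    """
    into = defaultdict(list)   # node -> arcs that end at that node
    from_nodes = set()
    for arc_id, (fnode, tnode, _) in arcs.items():
        into[tnode].append(arc_id)
        from_nodes.add(fnode)

    # Output arcs: no arc starts at their final node.
    outlets = [a for a, (_, tnode, _) in arcs.items() if tnode not in from_nodes]

    route_len = {}

    def longest(arc_id):
        # Length of the longest route from any input down to and including arc_id.
        if arc_id not in route_len:
            fnode, _, length = arcs[arc_id]
            upstream = [longest(u) for u in into[fnode]]
            route_len[arc_id] = length + max(upstream, default=0.0)
        return route_len[arc_id]

    orders = {}
    stack = [(a, 1) for a in outlets]
    while stack:
        arc_id, order = stack.pop()
        if order > max_order:
            continue  # further detail is not classified
        orders[arc_id] = order
        upstream = sorted(into[arcs[arc_id][0]], key=longest, reverse=True)
        for rank, u in enumerate(upstream):
            # The longest branch continues the route; the others open a new order.
            stack.append((u, order if rank == 0 else order + 1))
    return orders


network = {
    1: ("A", "C", 5.0), 2: ("B", "C", 3.0), 3: ("C", "E", 4.0),
    4: ("D", "E", 2.0), 5: ("E", "F", 6.0), 6: ("G", "B", 1.0),
    7: ("H", "B", 0.5),
}
print(clajer_orders(network))
```

In this small network the first-order route is made of arcs 1, 3 and 5; arcs 2, 4 and 6 are of the second order and arc 7 is of the third order. The recursive helper is adequate for a sketch; a very deep network would need an iterative traversal.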
 

5. Selection/Elimination, Simplification and Enhancement (SESE)

Once the network is classified, the objects are selected or eliminated according to their order and length, eliminating those that do not fulfill the established rules. Later on, the arcs are simplified and enhanced. The selection criteria are shown in Figure 3. The National Institute of Statistics, Geography and Informatics (INEGI) provides representation dimensions for arcs and areas according to the scale, in order to represent the Smallest Visible Object (SVO) [4]. Some of the INEGI characteristics have been used to elaborate Figure 3. The scale variation from 1:50,000 to 1:250,000 is large. For this reason an intermediate stage is required. The intermediate scale is approximately 1:100,000. It is used to preserve details such as intersections and to maintain the relations with other layers.

Figure 3. Selection criteria.
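The form of such a selection rule can be sketched in Python. The actual thresholds of Figure 3 are not reproduced here: the map length of 2 mm and the maximum order used in the example are purely hypothetical values, as are the function names. The only fixed relation is the conversion from a length on the map to a length on the ground, which is the map length multiplied by the scale denominator.

```python
def min_ground_length(map_mm, scale_denominator):
    """Ground length in meters of a line that measures map_mm on the map."""
    return map_mm * scale_denominator / 1000.0


def select_arcs(arcs, max_order, min_length):
    """Keep arcs whose order and length satisfy the selection rule.

    arcs: {arc_id: (order, length_in_meters)}   (hypothetical layout)
    """
    return {
        arc_id
        for arc_id, (order, length) in arcs.items()
        if order <= max_order and length >= min_length
    }


# Hypothetical rule: keep orders 1 to 3 that are at least 2 mm long at 1:250,000.
threshold = min_ground_length(2.0, 250000)   # 500 m on the ground
classified = {1: (1, 1200.0), 2: (2, 300.0), 3: (4, 900.0)}
print(select_arcs(classified, max_order=3, min_length=threshold))
```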

The Douglas-Peucker algorithm is used to simplify the arcs. This step eliminates the vertices that are considered unnecessary after the scale is modified. The algorithm produces a good map simplification. In addition, it preserves the directional trends of a line by means of a tolerance factor, which may be varied according to the required degree of simplification. To obtain a better visual quality, the arcs are smoothed using splines, which produces a more aesthetic shape and eliminates some effects of the simplification. Both functions are generalization operators and cause displacement of coordinate pairs. The displacement acts solely on each individual arc, not on the path (which is composed of many arcs). This is why the coordinates of the nodes are preserved. The nodes must then be eliminated to simplify the database as much as possible, since each pair of nodes represents a record. The nodes are not eliminated before the Simplification and Enhancement operators are executed.
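For reference, a minimal Python sketch of the Douglas-Peucker simplification is given below. The system itself relies on the line simplification provided by ARC/INFO, so this code only illustrates the principle; the function names are illustrative. The first and last points of an arc are always kept, which is consistent with the preservation of the node coordinates mentioned above.

```python
import math


def _distance_to_segment(p, a, b):
    """Distance from point p to the segment a-b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def douglas_peucker(points, tolerance):
    """Return the vertices of the arc that survive the simplification."""
    if len(points) < 3:
        return list(points)
    keep = [False] * len(points)
    keep[0] = keep[-1] = True          # the end nodes are never removed
    stack = [(0, len(points) - 1)]
    while stack:
        first, last = stack.pop()
        max_dist, index = 0.0, None
        for i in range(first + 1, last):
            d = _distance_to_segment(points[i], points[first], points[last])
            if d > max_dist:
                max_dist, index = d, i
        # Keep the farthest vertex only if it deviates more than the tolerance.
        if index is not None and max_dist > tolerance:
            keep[index] = True
            stack.append((first, index))
            stack.append((index, last))
    return [p for p, kept in zip(points, keep) if kept]


arc = [(0, 0), (1, 0.05), (2, 0), (3, 2), (4, 0)]
print(douglas_peucker(arc, 0.1))
```

With a tolerance of 0.1 the vertex (1, 0.05) is dropped because it deviates only 0.05 from the line between its neighbors, while the other vertices are kept. A larger tolerance removes more vertices, which is how the degree of simplification is adapted to the target scale.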

This method is implemented in ARC/INFO, using AML. ARC/INFO provides functions that can be used to implement generalization systems, such as BUFFER, CLIP and line simplification. It also has a good capacity for operations on the database. The method can be implemented in a wide class of GIS. Figure 4 shows the general process of the generalization step by step.

Figure 4. General process of the generalization.

6. Results

The process has been evaluated on real maps in order to detect inconsistencies in the change of scale. The resulting digital layers preserve the particularities of the structure. Some results are presented in Figure 6(a): the original layer has 350 arcs and the resulting layer has 161 arcs before the elimination of nodes. We use the Radical law [11] to compare the results. The equation is presented in Figure 5:

Figure 5. Equation of the radical law.


Where:
nS - Number of objects in source map;
SS - Source scale;
ST - Scale after transformation;
nT - Number of objects in map after transformation.
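
As a worked check, the following Python sketch evaluates the law for the figures reported above. It assumes the standard form of the radical law, nT = nS * sqrt(SS / ST), with SS and ST taken as the scale denominators (50,000 and 250,000); the function name is illustrative.

```python
import math


def radical_law(n_source, source_scale, target_scale):
    """Expected number of objects after the scale change.

    Assumes the standard form nT = nS * sqrt(SS / ST), where the scales
    are given as denominators (e.g. 50000 for 1:50,000).
    """
    return n_source * math.sqrt(source_scale / target_scale)


expected = radical_law(350, 50000, 250000)
print(round(expected, 1))
```

Under this assumption the law predicts roughly 157 arcs for a source layer of 350 arcs, which is close to the 161 arcs obtained by the process.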

These results have been analyzed visually and compared with manually produced INEGI data. It is observed that the main flows are preserved in the resulting map (Figure 6(b)). Based on the test results, these criteria can be applied to other scale changes simply by modifying the selection criteria and the tolerances of simplification and enhancement. The significant differences are due to the extent of the network and the number of objects in it. Better results would be obtained if larger areas and more objects were considered.
 

Figure 6. Some results of the generalization.



Figure 7 shows the Graphical User Interface (GUI) of the system. The GUI provides the main functions needed to generalize spatial data automatically.

Figure 7. The GUI of the system.

7. Conclusions

In this work we have presented the automatic change of scale in vector maps, using the established principles of generalization. The methodology developed here allows us to obtain acceptable results of automatic generalization in specific cases. This methodology can be applied in a wide class of GIS.

The process can be executed without any information beyond the spatial data. It requires only that the connections between arcs and the flow directions be correct. The arc directions can be corrected automatically using only altitude layers. Furthermore, additional information is generated during the classification processes. CLAJER allows us to modify the topology and graphics of the objects while the semantic value of the network is preserved.

At present, the scale can be changed automatically for the arc-arc configuration. In future work it should be possible to process other configurations, using the flow representation to quantify the network. In addition, it is necessary to develop a process that automatically collapses areas and represents the flow using information from several layers. Information from other themes, such as populated places and roads, must also be considered.
 

References


[1] Environmental Systems Research Institute Inc., "ARC/INFO User Guide: Map Display and Theory", Redlands, California, Esri, Inc. 1993.

[2] Environmental Systems Research Institute Inc., "ARC/INFO User Guide: ARC/INFO Data Model, Concepts and Key Terms", Redlands, California, Esri, Inc., 1997.

[3] Frank Andrew U., Kuhn Werner, "Spatial Information Theory (A Theoretical Basis for GIS)", International Conference COSIT, Helsinki, 1995.

[4] National Institute of Statistics, Geography and Informatics (INEGI), "Diccionario de Datos Topográficos (Vectorial) Escala 1:250,000", México, 1998.

[5] National Institute of Statistics, Geography and Informatics (INEGI), "Diccionario de Datos Topográficos (vectorial) escala 1:50,000", México, 1998.

[6] Kilpeläinen Tiina, "Map Generalisation in the Nordic Countries", Geodetiska Institutet, Helsinki, 1999.

[7] Kilpeläinen Tiina, "Multiple Representation and Generalization of Geo-Databases for Topographic Maps", Geodetiska Institutet, Helsinki, 1997.

[8] Kilpeläinen Tiina, "Multiple Representations and Knowledge-based Generalization of Topographical Data", Geodetiska Institutet, Helsinki, 1992.

[9] Molenaar Martin, "An Introduction to the Theory of Spatial Objects Modeling for GIS". Taylor & Francis, 1998.

[10] Moreno M., "La Generalización Automática de la Información Geográfica Multiescala", Centre for Computing Research, National Polytechnic Institute, Mexico, D.F., 2001, (M.S. Thesis in Spanish).

[11] Müller Jean-Claude, "GIS and Generalization", Taylor & Francis, 1995.