
Model-Oriented Generalization Rules

Hae-Kyong Kang¹, Soon-Hee Do¹, Ki-Joune Li¹, and Byung-Nam Choi²

¹ Department of Geographic Information Systems,
Pusan National University.
{hkkang,shdo}@quantos.cs.pusan.ac.kr
lik@hyowon.cc.pusan.ac.kr

² Korea Research Institute for Human Settlements.
bnchoe@krihs-web.krihs.re.kr

Abstract

This paper presents a set of rules for deriving a new database from a preexisting one in GIS. Applying map generalization operators, such as those in ArcInfo, to a database causes its data model to evolve. These evolutions must be controlled if a new database is to be derived successfully. To control them, we first investigate the effects of generalization operators on a data model. Specifically, we classify the effects of six generalization operators (preselection, elimination, collapse, simplification, geo-aggregation and classification) on a data model. We then propose a set of model-oriented generalization rules that address these effects.

1. Introduction

Database derivation is a useful way of building a new database in GIS. It is one of the four application fields mentioned in [Lag94] that need generalization facilities. Generalization in GIS transforms data into a form suitable for representation at a smaller scale [McMaster92, Esri96, Dettori98, John99]. That is, generalization derives new data by transforming the spatial and non-spatial properties of a feature.

A feature contains four elements: geometry, non-spatial properties, topological relations and non-topological relations [Tang96]. These elements are changed through generalization. First, generalization operators change the geometry of a feature, which generates scale-dependent data. This kind of generalization has been called map generalization or cartographic generalization, since it concerns only geometric changes. Second, when a geometry is changed, topological relations, non-topological relations or both can change as well. The generalization must preserve the consistency of topological relationships [Dettori96, Dettori98], and it may create, delete or derive features and non-topological relationships [Peng96, Richard94, John96]. Likewise, when a non-spatial property is changed, the geometry, topological relationships or non-topological relationships can change as a consequence. Generalization of this kind is called model-oriented generalization, since it concerns changes to the data model of a GIS database.

In this paper, we propose a set of rules for changing topological relations, non-topological relations, spatial properties and non-spatial properties in model-oriented generalization. We first list ten components of a data model that must be considered during generalization [Tang96, Rumbaugh91]. They form the framework of a data model for model-oriented generalization and serve as the basic concepts for the rest of this paper.

Then we select six generalization operators by comparing previous studies [Beard91, Ruas93, Lee93, Lag94, Dettori96, Esri96, John96, Peng96]. We investigate the effects of these operators on a data model, and we propose a set of model-oriented generalization rules for deriving a consistent database.

This paper is organized as follows. In section 2, we review related studies. In section 3, we describe the framework of data model change for model-oriented generalization. In section 4, we propose a set of model-oriented generalization rules using object-oriented concepts. We present an example in section 5. Finally, section 6 concludes the paper.

2. Related works

2.1 Generalization

The need for generalization in GIS environments has been widely recognized. [Lag94] cites four application fields of generalization: data visualization, data integration, derivation of databases and data analysis. In data visualization, generalization selects the data adequate for a given scale. This has been called map generalization, and many studies have been proposed since the mid-1970s [McMaster92a]. In data integration, data analysis and derivation of databases, generalization integrates multi-scale databases into a new database. It requires more complicated procedures than map generalization and raises a number of issues, listed in Table 1.

Table 1. Issues in generalization

Issue

- data selection
- creation of new information (e.g. class, relation)
- data derivation from an implicit one of a source database
- updating derived data
- changing topology
- semantic/schematic conflict
- maintenance of consistency
- maintenance of propagation

Related study

- Model-oriented generalization operators [Beard91, John96, Peng96, Dettori96]
- Multi-scale data model [Rigaux94, Rigaux95, Barnett96, Barbara95]
- Rule to derive data [Richard92, Sester98]
- Object abstraction [John96, Richard94]
- Linking or translation among objects [Miller93, Atzeni95, Devog98, Sester98]
- Integration data model [Atzeni97, Bennett97]
- Schema analysis methodology [Nyerges89, Castano99]

2.2 Model-oriented generalization

From a database point of view, model-oriented generalization includes a process that derives a new data model from a source data model. If a data model consists of classes, relations and constraints, generalization includes a process that derives classes, relations and constraints from the source ones.

Previous studies have concentrated on the derivation of new classes [Beard91, John96, Peng96, Dettori96, Sester98]. To create a new class, they used object abstraction and generalization operators. Table 2 summarizes the effects of these operators on a class.

Table 2. Categories of effects

Type of effect	Beard91	Dettori96	John96	Peng96
Spatial dimension	Coarsen, Collapse	Simplification, Symbolisation	(none)	Collapse
Number of objects (or instances)	Omit, Select	Selection	Selection	Deletion, Aggregation, Simplification, Homogenization
Number of attributes and features	Classify, Combine	(none)	(none)	Universalization
Number of classes	(none)	Aggregation	Generalization, Aggregation	Universalization, Reclassification

2.3 Motivation

However, these studies have concentrated on changes to a class itself, without considering its relationships with other classes or its constraints. To derive a data model with rich semantics, model-oriented generalization must include the derivation of relationships and constraints in addition to the class itself. In this paper, we investigate derivation rules for them.

3. Framework for model-oriented generalization

A feature is an object that represents a phenomenon in the real world. The feature has been defined as the primitive unit of GIS databases such as DLG-F (USGS), DNF (Ordnance Survey, UK) and the core data of the national framework GIS database (Republic of Korea).

According to [Tang96], a feature contains positional information (geometry), non-spatial properties, and topological and non-topological relations. In feature-based GIS, relations among features include abstractions such as generalization and aggregation.

3.1 Components of feature-based data model(GIS data model)

Based on [Rumbaugh91, Tang96], we describe the ten elements and notations of a feature-based data model, as shown in Figure 2. To completely define the schema and semantics of a class C, classes related to C may be needed, such as its super class C_sp, component class C_cl and domain class C_spatial. If a class C is migrated to a new database, those classes and relations must be migrated as well. Table 3 defines each component.

Figure 2. A framework of feature-based data model

Table 3. Ten components of feature-based data model

Component	Description
Feature	A feature is an object in a database. It has attributes and operations, which represent the spatial and non-spatial properties of the feature.
Feature class	A feature class is a group of similar features. A feature class has its own topological relationships. For example, a feature class Building has face-edge, face-node and edge-node relations. A feature class Road has an edge-node relation.
Spatial object class	It represents a spatial property of a feature, such as location.
Domain of feature attribute (Attribute:C_spatial)	It is a class or data type, which may be a built-in type or a user-defined type. In this paper, we do not treat built-in types as domains of feature attributes.
Generalization	Generalization is a TYPE-OF relationship between a class C and its generalized class C_sp, the super class of C.
Specialization	Specialization is an IS-A relationship between a feature class C and its sub class C_sb.
Classification	Classification is a CONSISTS-OF relationship between a class C and its component class C_cl.
Aggregation	Aggregation is a PART-OF relationship between a class C and its combined class C_ag.
Association	Association is a REFERENCE relationship between feature classes other than generalization, specialization, aggregation and classification.
Topological relation	Topological relations between spatial objects are those that are preserved under "rubber-sheet" transformations, which continuously deform the underlying space [Worboys97].
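
To make the components of Table 3 concrete, the following is a minimal Python sketch of how such a data model could be held in memory. It is an illustration only: the class names `FeatureClass`, `Relation` and `DataModel`, the attribute `shape` and the super class `Facility` are hypothetical and are not taken from the cited data models or from any GIS product.

```python
from dataclasses import dataclass, field
from enum import Enum


class RelationKind(Enum):
    """Relationship types of Table 3 (labels as given in the table)."""
    GENERALIZATION = "TYPE-OF"
    SPECIALIZATION = "IS-A"
    CLASSIFICATION = "CONSISTS-OF"
    AGGREGATION = "PART-OF"
    ASSOCIATION = "REFERENCE"


@dataclass
class FeatureClass:
    name: str
    # attribute name -> domain (name of a class or data type); hypothetical layout
    attributes: dict = field(default_factory=dict)
    # topological relations owned by the class, e.g. {"edge-node"}
    topology: set = field(default_factory=set)


@dataclass(frozen=True)
class Relation:
    kind: RelationKind
    source: str
    target: str


@dataclass
class DataModel:
    classes: dict = field(default_factory=dict)
    relations: list = field(default_factory=list)

    def add_class(self, cls):
        self.classes[cls.name] = cls

    def relate(self, kind, source, target):
        self.relations.append(Relation(kind, source, target))

    def related(self, name, kind):
        """Names of the classes reached from `name` by relations of `kind`."""
        return [r.target for r in self.relations
                if r.source == name and r.kind == kind]


model = DataModel()
model.add_class(FeatureClass("Building", {"shape": "Polygon"},
                             {"face-edge", "face-node", "edge-node"}))
model.add_class(FeatureClass("Road", {"shape": "Line"}, {"edge-node"}))
model.add_class(FeatureClass("Facility"))  # hypothetical super class
model.relate(RelationKind.GENERALIZATION, "Building", "Facility")
print(model.related("Building", RelationKind.GENERALIZATION))
```

Keeping relations as separate records, rather than as fields of a class, makes it straightforward to ask which classes and relations must accompany a class when it is migrated to a new database.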

3.2 Topology model

We use the topology model of ArcInfo, which supports three topological relationships: polygon-edge, polygon-point and edge-point, as illustrated in Figure 4.

Figure 4. Topology model of feature-based data model

4. Generalization Operator and Rules

4.1 Generalization Operators

We select six generalization operators from [Esri96]: preselection, elimination, simplification, collapse, geo-aggregation and classification. These operators are defined in similar ways in [Beard91, Lee93, John96, Peng96, Dettori96, Ruas93, Lag94] (Table 2). For example, preselection [Esri96], select [Beard91] and selection [Dettori96, John96, Peng96] denote the same operator.

*Preselection

It selects a subset of feature classes from a source database for inclusion in a new database. The selection depends on the purpose of the given application.

*Elimination

It selectively deletes features that do not satisfy the conditions given by the application.

*Simplification

It removes unnecessary details from features without destroying their essential shape.

*Collapse

It reduces the spatial dimension of a feature. For example, when the scale is reduced, polygon features must be symbolized as points or lines.

*Geo-Aggregation

It combines features that lie in close proximity or are adjacent into a new polygon feature.

*Classification

It merges adjacent features which have the same attribute into a new higher-level feature.

4.2 Model-oriented generalization rules

When the six generalization operators are applied to a feature-based database, they change geometry, topological and non-topological relationships, and non-spatial data. These changes can be divided into three groups (Table 4).

Table 4. Effects of generalization operators

Group	Type of effect	Generalization operator
1	Determine the existence of a feature	Preselection, Elimination
2	Transform spatial properties of a feature (spatial transformation)	Simplification, Collapse
3	Re-classify features into a new feature (attribute transformation)	Geo-Aggregation, Classification

*Rules for determining existence of a feature

Object-oriented concepts serve as the basis for completely defining the semantics and schema of a feature. Using relations such as classification and generalization, we can decide whether the features related to a source feature selected by the generalization operators in group 1 must also exist in the target database. To make this decision, we propose the following rules.

[Rule 1] If a class C is derived into a class C_gen of the target database, its super class C_sp must be derived into the target database as well.

a. a source model b. a target model(the result of rule 1)

Figure5. Model-oriented generalization rule1

[Rule 2] If a class C is derived into a class C_gen of the target database, its component class C_cl must be derived as well.

a. a source model b. a target model(the result of rule 2)

Figure6. Model-oriented generalization rule2

[Rule 3] If a class C is derived into a class C_gen of the target database, its domain class C_domain must be derived as well.

a. a source model b. a target model(the result of rule 3)

Figure7. Model-oriented generalization rule3

[Rule 4] If a feature f of a class C is eliminated, its referencing feature must be removed as well.

* Rules for spatial transformation

To derive new data successfully using the generalization operators in group 2, we must preserve the topological relationships of features. The following rules support this preservation.

[Rule 5] When the geometry of a feature is changed, topological relationships and topological consistency between spatial objects must be preserved.

[Rule 6] If the spatial dimension of a feature is changed, new topological relationships for the new dimension must be created.

*Rules for attribute transformation

The operators in group 3 create a new feature class. In order to derive new feature classes, we propose a set of rules.

[Rule 7] If a new feature class is created, new topological relationships must be created for it.

[Rule 8] The numeric attributes of a new feature class can be derived from the original ones by set functions such as sum(), max(), min() or average().

[Rule 9] If a feature class C_gen is created from C, an aggregation relationship must be created between C_gen and C.

a. a source model b. a target model (the result of rule 9)

Figure 8. Model-oriented generalization rule 9

4.3 Generalization Operators

We proposed nine rules for model-oriented generalization in the previous section. These rules are used together with generalization operators. This section describes how the rules are applied to a data model with the six generalization operators described in section 4.1.

*Preselection and Rules

If a feature class C is selected in a source data model, rule 1 derives the generalization hierarchy of C, which contains C_sp, (C_sp)_sp and the generalization relations between them. Rule 2 derives the classification hierarchy of C, which contains the classes C_cl and (C_cl)_cl and the classification relations between them. Rule 3 derives the domain of C, C_domain. As a result, six classes and the relations among them are derived by preselection and rules 1, 2 and 3 (Figure 9b).

a. a source model b. a target model (the result of rules 1, 2 and 3)

Figure 9. Preselection and rule1, 2 and 3.
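
The derivation of the six classes can be sketched as a reachability computation over the generalization, classification and domain relations of the source model. The Python below is a minimal sketch under assumptions: the dictionaries `superclass`, `components` and `domains`, the flattened names `C_sp_sp` and `C_cl_cl` (standing for (C_sp)_sp and (C_cl)_cl) and the unrelated class `D` are hypothetical, and applying the rules transitively to every derived class is our reading of rules 1 to 3.

```python
def derive_closure(selected, superclass, components, domains):
    """Return the classes that must be migrated together with `selected`."""
    derived = set()
    pending = [selected]
    while pending:
        cls = pending.pop()
        if cls in derived:
            continue
        derived.add(cls)
        if cls in superclass:                     # rule 1: super class
            pending.append(superclass[cls])
        pending.extend(components.get(cls, []))   # rule 2: component classes
        pending.extend(domains.get(cls, []))      # rule 3: domain classes
    return derived


# Hypothetical source model: C with a two-level generalization hierarchy,
# a two-level classification hierarchy and one domain class.
superclass = {"C": "C_sp", "C_sp": "C_sp_sp", "D": "D_sp"}
components = {"C": ["C_cl"], "C_cl": ["C_cl_cl"]}
domains = {"C": ["C_domain"]}

target = derive_closure("C", superclass, components, domains)
print(sorted(target))
```

Classes that are not reachable from the selected class, such as `D` here, are left out of the target model.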

*Elimination and Rules

Elimination reduces the number of features. In this case, the referencing feature of an eliminated feature must be removed to preserve the consistency of the database. The operator is therefore applied together with rule 4.
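
One possible reading of rule 4 is a cascading delete: when a feature is eliminated, every feature that references it is removed too, and the remaining references are cleaned up. The sketch below assumes that reading; the function `eliminate`, the feature identifiers and the `(referrer, referenced)` pair layout are hypothetical.

```python
def eliminate(features, references, doomed):
    """Remove `doomed` features and, transitively, the features referencing them.

    features:   dict mapping feature id -> attribute dict
    references: list of (referrer_id, referenced_id) pairs
    doomed:     ids of the features selected for elimination
    """
    removed = set(doomed)
    changed = True
    while changed:  # repeat until no further referrer has to be removed
        changed = False
        for referrer, referenced in references:
            if referenced in removed and referrer not in removed:
                removed.add(referrer)
                changed = True
    remaining = {fid: attrs for fid, attrs in features.items()
                 if fid not in removed}
    remaining_refs = [(a, b) for a, b in references
                      if a not in removed and b not in removed]
    return remaining, remaining_refs


features = {"f1": {}, "f2": {}, "g1": {}, "g2": {}}
references = [("g1", "f1"), ("g2", "f2")]   # g1 references f1, g2 references f2
remaining, remaining_refs = eliminate(features, references, {"f2"})
print(sorted(remaining), remaining_refs)
```

If rule 4 is instead meant to delete only the dangling reference and keep the referencing feature, the loop can be dropped and only the final filter on `references` is needed.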

*Simplification and Rules

Simplification changes the geometry of a feature, and consequently the topology model may change as well. Rule 5 protects the topological relationships among spatial object classes from the propagation of these changes.
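
As an illustration, the sketch below simplifies an edge with the Douglas-Peucker algorithm, which is our choice for the example and is not named in this paper. The algorithm never moves or removes the two end points of an edge, so the edge-node relation is kept. This covers only one part of rule 5: the sketch does not check for self-intersections or for crossings with neighbouring features. The coordinates are hypothetical.

```python
import math


def point_segment_distance(p, a, b):
    """Distance from point p to the segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    if dx == 0 and dy == 0:
        return math.hypot(px - ax, py - ay)
    # Parameter of the projection of p on the segment, clamped to [0, 1].
    t = ((px - ax) * dx + (py - ay) * dy) / (dx * dx + dy * dy)
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def simplify(points, tolerance):
    """Douglas-Peucker simplification; the first and last points are always kept."""
    if len(points) < 3:
        return list(points)
    first, last = points[0], points[-1]
    index, dmax = 0, 0.0
    for i in range(1, len(points) - 1):
        d = point_segment_distance(points[i], first, last)
        if d > dmax:
            index, dmax = i, d
    if dmax <= tolerance:
        return [first, last]
    left = simplify(points[:index + 1], tolerance)
    right = simplify(points[index:], tolerance)
    return left[:-1] + right  # the split point appears once


edge = [(0, 0), (100, 20), (200, -10), (300, 400), (400, 0)]
simplified = simplify(edge, tolerance=100)
print(simplified)
```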

*Collapse and Rules

Collapse reduces the spatial dimension of a feature. As a result, a new feature class with a lower spatial dimension is created. The new feature class inherits the non-spatial attributes of the source feature. Rule 3 loads the domain of the source feature into the target data model, and rule 6 creates new topological relationships for the new feature class.
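
A collapse from polygon to point could be sketched as follows. Representing the collapsed feature by the area-weighted centroid of the polygon is an assumption made for the example, since the paper does not say how the point is chosen; the dictionary layout of a feature and the attribute `name` are hypothetical. The sketch carries over the non-spatial attributes and leaves the creation of new topological relationships (rule 6) aside.

```python
def polygon_centroid(ring):
    """Area-weighted centroid of a simple polygon given as a list of (x, y) vertices."""
    area2 = 0.0          # twice the signed area
    cx = cy = 0.0
    n = len(ring)
    for i in range(n):
        x0, y0 = ring[i]
        x1, y1 = ring[(i + 1) % n]
        cross = x0 * y1 - x1 * y0
        area2 += cross
        cx += (x0 + x1) * cross
        cy += (y0 + y1) * cross
    if area2 == 0:
        raise ValueError("degenerate polygon")
    # Centroid = sum / (6 * area) = sum / (3 * area2)
    return (cx / (3.0 * area2), cy / (3.0 * area2))


def collapse(feature):
    """Return a point feature that keeps the non-spatial attributes of `feature`."""
    return {
        "attributes": dict(feature["attributes"]),   # copied unchanged
        "dimension": 0,                              # polygon (2) -> point (0)
        "geometry": polygon_centroid(feature["geometry"]),
    }


building = {
    "attributes": {"name": "hall"},
    "dimension": 2,
    "geometry": [(0, 0), (4, 0), (4, 4), (0, 4)],
}
point = collapse(building)
print(point)
```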

*Classification and Rules

Classification creates a new feature class C_gen from a class C in a source model (Figure 10a). A new aggregation relationship must then be created between C_gen and C. Classification is therefore related to rules 3, 7, 8 and 9.

Rule 3 derives the domain of C. Rule 7 creates new topological relationships for C_gen. Rule 8 creates a new attribute SetAtt of C_gen from the attribute Att_num of C. Rule 9 creates a new aggregation relationship between C and C_gen.

a. a source model b. a target model (the result of rules 3, 7, 8 and 9)

Figure 10. Classification and rules
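
The attribute side of classification (rules 8 and 9) can be sketched in a few lines of Python. The sketch groups the features of C by a shared attribute value, derives SetAtt from Att_num with a set function, and records a part-of link from each source feature to the new feature. It ignores adjacency and geometry, and the attribute `kind`, the feature identifiers and the function `classify` are hypothetical.

```python
from collections import defaultdict


def classify(features, key, numeric_attr, set_function=sum):
    """Merge features sharing the same `key` value into new higher-level features.

    Returns the new features of C_gen and the aggregation (part-of) links
    from each source feature of C to the new feature it belongs to.
    """
    groups = defaultdict(list)
    for fid, attrs in features.items():
        groups[attrs[key]].append(fid)

    new_features = {}
    part_of = []
    for value, members in groups.items():
        new_id = f"{key}:{value}"
        new_features[new_id] = {
            key: value,
            # rule 8: derive the numeric attribute with a set function
            "SetAtt": set_function(features[m][numeric_attr] for m in members),
        }
        # rule 9: aggregation relationship between C_gen and C
        part_of.extend((m, new_id) for m in members)
    return new_features, part_of


features = {
    "f1": {"kind": "forest", "Att_num": 10},
    "f2": {"kind": "forest", "Att_num": 5},
    "f3": {"kind": "lake", "Att_num": 7},
}
new_features, part_of = classify(features, "kind", "Att_num")
print(new_features)
print(part_of)
```

Passing `max` or `min` as `set_function` gives the other set functions named in rule 8.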

*Geo-Aggregation and rules

Geo-aggregation combines features that lie in close proximity or are adjacent into a new feature. Although its geometric derivation differs from that of classification, its effect on the classes of the data model is identical. The operator creates a new feature class. Rule 7 creates new topological relationships for the new feature class, rule 8 creates a new attribute, and rule 9 creates a new aggregation relationship.
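
The correspondence between operators and rules described in this section can be summarized as a small lookup table. The Python below is only a restatement of the text; the names `GROUP_OF_OPERATOR`, `RULES_FOR_OPERATOR` and `rules_for` are hypothetical, and the assignment of preselection and elimination to group 1 is inferred from the rules they use.

```python
# Group of each operator (1: existence, 2: spatial, 3: attribute transformation).
# Group 1 membership is inferred from the text, not stated in one place.
GROUP_OF_OPERATOR = {
    "preselection": 1, "elimination": 1,
    "simplification": 2, "collapse": 2,
    "geo-aggregation": 3, "classification": 3,
}

# Rules applied together with each operator, as described in this section.
RULES_FOR_OPERATOR = {
    "preselection": (1, 2, 3),
    "elimination": (4,),
    "simplification": (5,),
    "collapse": (3, 6),
    "classification": (3, 7, 8, 9),
    "geo-aggregation": (7, 8, 9),
}


def rules_for(operators):
    """Rules triggered by a sequence of operators, without duplicates, in first-use order."""
    seen = []
    for op in operators:
        for rule in RULES_FOR_OPERATOR[op]:
            if rule not in seen:
                seen.append(rule)
    return seen


plan = ["preselection", "collapse", "classification"]
print(rules_for(plan))
```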

5. Example

We developed a model-oriented generalization tool that supports the proposed set of rules and the six generalization operators on ArcInfo Desktop 8.0.2, using ArcObjects and Visual Basic for Applications. We show the important functions of our system with an example.

This example describes the process of deriving a new, smaller-scale database from a 1:1000 database for an application. To derive the new database, our application first provides a user interface for analyzing the data model of the source database. It then requests the set of conditions that the new database must satisfy. Finally, it applies generalization operators and the set of rules to the data model.

5.1 A source database and conditions

We use a part of the land-use database built by the Korea Research Institute for Human Settlements. Figure 11 shows the source data model for this example.

Figure 11. A source data model

A new database must satisfy the following conditions.

Condition 1. The target database must include County, State and School. To create State, we use District and the state_name attribute of District.

Condition 2. If the value of state_name of a District feature is suwon, the District feature is loaded into the target database.

Condition 3. The scale of the target database is set to 1:5000. The resolution of a District feature is simplified so that it can be represented adequately at the 1:5000 scale. A tolerance of 100 is used for the transformation from 1:1000 to 1:5000.

We use the preselection and classification operators for condition 1, the elimination operator for condition 2 and the simplification operator for condition 3. The result of generalization can depend on the order in which operators are applied, but we do not consider operator order in this paper. We apply the operators in the order preselection, elimination, classification, simplification.
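
The attribute part of conditions 1 and 2 can be illustrated with a few lines of Python. This is not the ArcObjects tool described in this section: the District records, their identifiers and the second state name `yongin` are hypothetical sample data, and geometry is left out.

```python
# Hypothetical District features of the source database.
districts = [
    {"id": 1, "state_name": "suwon"},
    {"id": 2, "state_name": "suwon"},
    {"id": 3, "state_name": "yongin"},
]

# Condition 2 (elimination): keep only the districts whose state_name is suwon.
kept = [d for d in districts if d["state_name"] == "suwon"]

# Condition 1 (classification): create one State per distinct state_name and
# remember which District features it aggregates.
states = {}
for d in kept:
    states.setdefault(d["state_name"], []).append(d["id"])

print(kept)
print(states)
```

Because elimination runs before classification in the chosen order, only one State is created in this sample; running classification first would create a State for every state_name in the source data.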

First, the application displays a user interface for analyzing the source data model. Figure 12 shows this interface, which lets a user set relations between classes.

Figure 12. Analysis of source data model