In contrast to many computerized mapping systems, a geographical information system (GIS) supports many possible applications such as routing, allocation models, address matching and various kinds of geographical analysis, as well as cartographic representation. Data is updated at one point and may immediately be reflected in all applications.
The question is how to manage and maintain both the true geometric properties and the different cartographic representations of the same data. Often agencies resort to keeping multiple databases, one for each scale range and product. This is an expensive solution, especially since database revision must be performed individually for each database. There is also a great risk of inconsistency between the databases.
In our production environment we use a data model based on the concept of Master and Product databases (MDBs and PDBs). Through the implementation of this Master-Product model, the problems related to maintaining multiple representations of the same dataset have been reduced. The basic concept is to maintain one large-scale Master Database (MDB), from which several smaller-scale Product Databases (PDBs) can be created and maintained. Using the Master-Product model it is possible to keep geometrically accurate base data and cartographically processed product data separate yet related.
The system is constantly evolving. Today's version of the MDB-PDB model is implemented in ArcInfo and ArcStorm (Esri Inc.), requiring a UNIX server and NT clients. Development efforts are currently focussed on implementing the MDB-PDB model in the Esri products ArcView and the Spatial Database Engine (SDE).
Introduction
Data stored in a geographical information system (GIS) can support many different applications such as routing, allocation models, address matching and various kinds of geographical analysis, as well as cartographic production.
An increasing number of agencies around the world are discovering how geographic data can be used in many new applications. One example is road and street network databases, which traditionally have been used for the production of road maps but are now being considered to provide the geographical base in systems for vehicle navigation.
The quality, flexibility and usability of geographic databases are today becoming more important than the need to digitally "mimic" a paper map product. Moreover, in the near future we will truly be "on-line" with the geographic database and will require methods for presenting data from geographic databases in a cartographically pleasing manner "on-the-fly" via the Internet.
Creating high-quality maps at different scales, including the ability to handle and present complex cartographic situations, while keeping the geometric integrity of the database in use, has been a real challenge in our production environment.
Cartographic production requires a database which has been adapted to suit the scale of the map product. The data has undergone various steps of generalisation, such as simplification, deletion, amalgamation and displacement. Furthermore, different map products require different types of generalisation depending on the purpose as well as the scale of the product. Geographic analysis, on the other hand, requires highly accurate base data.
The data model
The demands on the data model are high when data is intended for use in both geographic analysis and cartographic production.
The geographic database is in itself a generalisation of the real world. Any geographic database will always be an abstraction of the real world, and it will be designed with certain applications in mind. This has been called geographic generalisation by some authors (Muller, 1991; Kilpeläinen, 1992). The scope and purpose of the database will determine the data model, i.e. the object classes and the attributes of the object classes in the database. This is one of the reasons why it is so difficult to define and provide a general-purpose "base" database for many different applications.
Road and street network databases are a good example. The attribute definitions will look quite different depending on the intended purpose of the database. Some applications, such as address matching, require relatively few attributes: the name or identification number of the street or road, and the addresses. Road databases for vehicle navigation, on the other hand, require quite an extensive list of attributes for access restrictions, speed limits, turning restrictions and other factors of importance to a driver. The selection of which streets or roads to include will also differ between applications. Thus, even though the geometry might be the same, the underlying data model might still be entirely different.
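As a minimal illustration, the sketch below contrasts two hypothetical attribute sets for the same street geometry; the class and attribute names are our own inventions and are not taken from any particular database:

from dataclasses import dataclass, field
from typing import List, Tuple

Coordinate = Tuple[float, float]

@dataclass
class AddressMatchingStreet:
    # Attributes sufficient for address matching.
    geometry: List[Coordinate]
    street_name: str
    street_id: str
    address_range: Tuple[int, int]      # lowest and highest house number

@dataclass
class NavigationStreet:
    # The much richer attribute set needed for vehicle navigation.
    geometry: List[Coordinate]
    street_name: str
    street_id: str
    speed_limit_kmh: int
    one_way: bool
    access_restrictions: List[str] = field(default_factory=list)
    turning_restrictions: List[str] = field(default_factory=list)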
In addition, cartographic applications require cartographic generalisation to produce a visually pleasing and readable map product. The cartographic generalisation will include deletion of features, which is relatively easy to handle in any data model, but it will also include amalgamation and displacement of objects, which are much more difficult to handle. Furthermore, cartographic editing involves object symbolisation and exaggeration of important objects, both of which must be supported in the object model by specific cartographic attributes which control the visual presentation of the data.
The question is how to manage and maintain both the true geometric properties and the different cartographic representations of the same data, in an efficient manner. With currently available technology it is necessary to store multiple representations of the same "real world" feature; the primary object, representing the feature with the best possible geometry, and derived objects, representing generalised versions of the feature.
In the literature it is possible to find many different models for how to implement multiple representations of the same real world feature (Kilpeläinen, 1992). We have chosen to build and maintain separate, yet linked, databases (figure 1). The geometrically correct base data is stored at the highest possible accuracy in a master database (MDB). Generalised, or derived, data is stored in scale-dependent product databases (PDBs).
Figure 1. Highly accurate base data is stored in a master database (MDB). Generalised, scale-dependent and product-specific data is stored in separate, yet linked, product databases (PDBs).
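To fix ideas, the following is a minimal sketch of the relationship, using simple in-memory structures of our own invention; the field names (master_id, parent_id) are illustrative and the actual storage is described in the implementation section below:

from dataclasses import dataclass
from typing import Dict, List, Tuple

Coordinate = Tuple[float, float]

@dataclass
class MasterFeature:
    # Primary object in the MDB: best available geometry and attributes.
    master_id: int
    geometry: List[Coordinate]
    attributes: Dict[str, object]

@dataclass
class ProductFeature:
    # Derived object in a PDB: generalised geometry, linked to its parent.
    parent_id: int                  # link back to MasterFeature.master_id
    geometry: List[Coordinate]      # simplified or displaced for the product scale
    attributes: Dict[str, object]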
There are reasons to clearly separate the generalised geometry of a feature from cartographic edits to that feature (such as road buffers, or product-specific type selection or symbolisation). The generalised geometry makes the data suitable for use in products of a certain scale range, but it is not strictly product dependent. Cartographic edits are truly product dependent. The cartographic representation of lines, points and polygons is controlled by feature attributes (type values) and product-specific look-up tables.
In the MDB-PDB data model, we aim to keep the derived (generalised) data in the PDB separate from the truly product-unique data, by keeping them in separate layers in the database (figure 2). Thus it is possible to use the derived data for many different products in the same scale range. All derived data is linked back to the master database at the feature level. In other words, each derived feature in the Product Database "knows" which feature it was derived from in the master database.
Figure 2. Generalised base data and product-specific cartographic data are, to the extent possible, stored in separate layers in the Product Database.
Database Consistency
The main problem with multiple representations of the same object is to maintain data consistency. In the MDB-PDB model, functionality to check consistency between the product databases and the master database has been implemented through the use of the feature-based links and the history functionality of the databases. It is possible to query the master database for features which have been changed (added, deleted, modified) since the last update of the product database. The changed features can be extracted and automatically transferred to the product database.
The ability to automatically check that a specific product database is up to date is extremely valuable in a production environment. Resources can be focussed on the maintenance and management of one database (the master database), and the product databases can be updated as needed, and for limited areas, by requesting from the master database a list of the changes relevant to a certain product.
In addition, the ability to automatically transfer changes from one database to another will save a lot of time. This is especially valuable for attribute changes, which can be done fully automatically. We estimate that at least 60 % of the revisions to our databases are attribute changes.
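As an illustration of such a fully automatic attribute transfer, the sketch below (reusing the illustrative in-memory feature classes shown earlier, and therefore only an assumption about how the routine might look, not our production code) copies the current MDB attributes onto the linked PDB features while leaving the PDB geometry untouched:

def transfer_attribute_changes(mdb_features, pdb_features, changed_ids):
    # mdb_features and pdb_features are dictionaries keyed by the master
    # feature id; changed_ids lists the MDB features with modified attributes.
    for master_id in changed_ids:
        if master_id in pdb_features:          # feature is part of this product
            # Copy the attributes; the generalised PDB geometry is preserved.
            pdb_features[master_id].attributes = dict(
                mdb_features[master_id].attributes
            )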
Generalisation
"On-the-fly" generalisation, as defined by van Oosterom (1995), is the process where generalisation processes are temporary applied to the geographic database and produces data to be visualised "on the screen" or to produce hardcopy output. The advantages of "on-the-fly" generalisation are obvious since no or limited interactive cartographic editing or generalisation will be necessary.Still, it is difficult to see, at least in the near future, on-the-fly methods, with commercial acceptable productivity, which can solve traditional cartographic problems and yield an acceptable visual output in cases such as:
object symbolisation conflicts solved by object displacement
amalgamation of minor objects
exaggeration of important objects
These cartographic problems must be solved by geometric editing in the database, and this should be done in the Product Databases. The MDB is the geometrically correct database, which is used for analysis and where all database updates must be executed. In the PDBs, all necessary cartographic editing and generalisation is performed.
The discipline of cartographic generalisation has concentrated on how generalisation tasks should be performed in terms of (cartographic) database editing. Equally important is to ensure that future updates of both the generalised and the ungeneralised data can be performed and controlled (i.e. a link between the generalised and the ungeneralised data must exist, be used, and be preserved during the updating process). This is currently not solved in a satisfactory manner by the MDB-PDB model, since the automated transfer of geometrically edited features from the master database to the product databases will overwrite any generalisation or cartographic editing previously applied to the feature in the PDB.
One possible approach to alleviate the problem is to avoid these specific types of cartographic generalisation, to the extent possible, and to find new ways of visualising complex situations. For example, a complex junction can be visualised not by displacement of its individual components, but by a symbol indicating the nature of the junction (Oxenstierna, 1997).
Another useful (and complementary) approach is to store as much of the cartographic information as possible as attributes in the databases, and to use look-up tables or cartographic presentation rules to translate the attribute values into cartographic properties.
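A minimal sketch of such a look-up table follows; the type codes and drawing parameters are invented purely for illustration:

# Hypothetical product-specific look-up table: feature type codes stored as
# attributes are translated into drawing parameters at presentation time,
# instead of being edited into the geometry itself.
ROAD_SYMBOLOGY_1_TO_50000 = {
    # type code: (line width in mm, colour, cased line?)
    "motorway":   (1.2, "red",    True),
    "main_road":  (0.8, "orange", True),
    "local_road": (0.4, "white",  False),
}

def symbolise(feature_type, lookup=ROAD_SYMBOLOGY_1_TO_50000):
    # Return the drawing parameters for a feature type, or None if undefined.
    return lookup.get(feature_type)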
Both of the above methods are applied in our production environment to minimise the need for repeated manual edits of the same feature for different cartographic applications, when the base data has changed.
Implementation of the MDB-PDB model
A new PDB is created by copying data from the MDB. Each feature in a PDB relates back to its parent feature in the MDB through unique identifiers. It is possible to update the PDBs when changes have occurred in the MDB. The update process can be automated. New features copied from an MDB to a PDB will automatically receive the cartographic properties defined for the PDB.
The MDB and the PDBs are stored as libraries in an ArcStorm database (Esri Inc.). ArcStorm (short for Arc Storage Manager) is a geographic data storage facility and transaction manager for ArcInfo data. An MDB is also called a source library, and a PDB is called a target library.
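In outline, and again using the illustrative in-memory classes rather than the actual ArcStorm libraries, PDB creation might be sketched as follows (the function and parameter names are ours):

def create_pdb(mdb_features, selection, default_carto_attributes):
    # Copy the selected MDB features into a new PDB, keeping the parent link
    # so that every derived feature "knows" its master feature.
    pdb_features = {}
    for master_id in selection:
        source = mdb_features[master_id]
        pdb_features[master_id] = ProductFeature(
            parent_id=master_id,                    # link back to the MDB
            geometry=list(source.geometry),         # starts as an exact copy
            attributes={**source.attributes, **default_carto_attributes},
        )
    return pdb_features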
Updating a PDB from an MDB
There are two critical points in the MDB-PDB approach:
a) to keep track of changes introduced in the master database, and
b) to update the product database with respect to the relevant changes.
In its current implementation, the MDB-PDB model utilises the ArcStorm History function to keep track of changes in the MDB. Given from/to dates, the system will check:
What features have been deleted?
What features have been added?
What features have been changed, either with respect to the geometry or with respect to the attributes?
Based on the unique relationships between corresponding features in the MDB and the PDB(s) it is possible to automatically update the PDB(s) with respect to changes in the MDB.
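The sketch below illustrates the check; it assumes, for the sake of the example, that each MDB feature carries simple created/modified/deleted timestamps in place of the ArcStorm history tables:

def extract_changes(mdb_features, t_from, t_to):
    # Classify the MDB features changed between t_from and t_to.
    added, deleted, changed = [], [], []
    for master_id, feature in mdb_features.items():
        if feature.created is not None and t_from < feature.created <= t_to:
            added.append(master_id)
        elif feature.deleted is not None and t_from < feature.deleted <= t_to:
            deleted.append(master_id)
        elif feature.modified is not None and t_from < feature.modified <= t_to:
            changed.append(master_id)       # geometry or attribute change
    return added, deleted, changed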
The concept can be divided into the following steps:
1. Define MDB
2. Create PDB
3. Update MDB
4. Extract from MDB
5. Update PDB
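Put together, steps 4 and 5 might look roughly as below, reusing the illustrative functions sketched earlier. Note that only attribute changes are transferred automatically here, since transferring edited geometry raises the overwriting problem discussed under Generalisation:

def refresh_pdb(mdb_features, pdb_features, last_update, now, carto_defaults):
    # Step 4: extract the changes made in the MDB since the last PDB update.
    added, deleted, changed = extract_changes(mdb_features, last_update, now)
    # Step 5: apply the changes to the PDB.
    for master_id in deleted:
        pdb_features.pop(master_id, None)            # remove deleted features
    pdb_features.update(create_pdb(mdb_features, added, carto_defaults))
    transfer_attribute_changes(mdb_features, pdb_features, changed)
    return pdb_features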
The MDB-PDB model is currently being implemented in the Spatial Database Engine (SDE), a new product from Esri Inc., which stores geographical data in a relational database management system, currently Oracle.
Product examples
This production model can be used in different kinds of organisations. One example is the Swedish National Road Administration (SNRA), which is using this model in all its geographically related applications.
The SNRA has all its information in a road database called KARDA (the MDB). From this base, a large number of diverse products are generated.
References
Kilpeläinen, T.: 1992. Generalization. Not in the Domain of Maps but in the Domain of Geographical Databases. Surveying Science in Finland, Vol. 10, No. 2, pp. 11-33.
Muller, J.-C.: 1991. Generalization of Spatial Databases. In: Geographic Information Systems: Principles and Applications, Vol. 1, pp. 457-475. London: Longman.
Oxenstierna, A.: 1997. Generalisation rules for database-driven cartography. In: Proceedings of ICC 1997. In print.
van Oosterom, P.: 1995. The GAP-tree, an approach to 'on-the-fly' map generalisation of an area partitioning. In: GIS and Generalisation, pp. 120-132. London: Taylor and Francis.