Datawarehouse; where to locate GIS
We cannot calculate how much data is electronically available today, nor can we imagine how much it has cost to gather this data and feed it into systems. Almost all data is captured and used in special-purpose systems, the so-called operational systems. The data is modelled in a format specific to the transactional system it serves. Now that the data exists, we want to use it not only for operational tasks but, increasingly, for decision support. If you want to stay ahead of your competitors, you must gain a better understanding of customer needs, market trends, the correlation between events, and so on. Good information matters beyond competition as well: it makes the whole company perform better. Good information is information that is available at the right moment, in the right place and in the right (usable) format.
For people who are not used to working with GIS, and even for those who are, it is hard to imagine that over eighty percent of business data has some spatial context. This means that if you want to use data for decision support, you have to take this spatial factor into account.
To get all spatial and non-spatial data into a usable form for decision support, and to keep it that way, the concepts of Datawarehousing and GIS can be applied in an integrated fashion.
Until now, information technology has mainly focused on automating manual processes. The goal is to get the work done at lower operating costs. It is fairly easy to calculate the return on investment when you buy a bookkeeping program that saves you, for example, 20 man-hours a week. Seeing the benefits of investing in a dedicated Datawarehouse system requires another way of thinking. This investment is based on the premise that the intellectual process becomes better informed, so that better strategic decisions can be made.
The users of the corporate computers understand that the key to identifying competitive threats and opportunities lies locked in the corporate data, which is often embedded in legacy systems built on different technologies. Customers today have more and more individual needs. Companies have to react to these needs and have to know what the trends are or, better still, what the trends will be. They need tools that can analyse data, the so-called decision support systems. These systems are fed by data that is managed in a Datawarehouse.
In many cases the introduction of a Datawarehouse leads to fundamental changes in the way the market, processes, or correlations between events are looked at. This is driven by the policy of micro-segmentation on the basis of data patterns, which allows the enterprise to observe, over time, the behaviour of data and the corresponding behaviour of customers, processes or events. It is no longer sufficient to satisfy a customer or to monitor processes and events. The aim must be to delight the customer and to predict the outcomes of processes and events. It is not enough to keep up with the competition; the competition has to be beaten, and to do that you have to surprise them. The information needed to do so must be available and accessible.
There are three basic assumptions that justify a Datawarehouse. The first is that, locked inside the corporate data, there are valuable patterns of information which are very important in guiding the business. The second is that this information will form the basis of unique services to customers, of discovering new trends, and of predicting outcomes, in a manner that will transform the understanding the company has of the market. The third is that the distance between the identification of strategy and the execution of strategy will be shortened. This will progressively transform the understanding the company has of its own organisational structure. Thanks to developments in hardware and software, it is now possible to create the IT architecture (the Datawarehouse) that can handle this huge amount of data.
The Datawarehouse may be defined in terms of nine characteristics which differentiate the Datawarehouse from the legacy systems in the company. To make these differences clear, let us first have a look at the main characteristics of an operational system.
The operational or 'legacy' systems are optimised for their operational tasks. Therefore they have:
The Datawarehouse is designed for business analysis. Its character differs from that of the operational systems.
Typical of Datawarehouse systems is:
The Datawarehouse is in essence a response to problems and constraints that exist in information technology. Datawarehousing is an answer to the following problems: the difficulty of integrating operational applications and of modelling data to corporate standards, the failure to meet the reporting requirements of decision makers, and the inability to ensure that the data in corporate databases is clean and consistent.
The Datawarehouse makes it possible to do on-line analytical processing (OLAP). OLAP systems are used by decision makers to query and analyse the data in the Datawarehouse. The data for OLAP analysis is accessed through metadata, which documents the data source, the frequency of update and the location of the data.
The outcome of queries is presented multidimensionally. A multidimensional database is a database in which the data is structured as measures and dimensions. Measures are numerical data such as sales. Dimensions are the categories by which measures can be summarised, such as store, region, or state. The user can specify high-level or detailed views of the data, and can navigate through drill-downs in reports to finer levels of detail and to analysis by product, location, and time.
The data returned from the queries can be used to drill down. This allows the user to ask more detailed questions. For example, after identifying the road with the most accidents, the user can search the Datawarehouse for information about weather conditions, number of vehicles per day, road surface, etc.
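The measure/dimension structure and the drill-down on the road example can be illustrated with a minimal Python sketch. It is not taken from any particular OLAP product: the fact rows, the road names and the helper `roll_up` are all hypothetical, invented only to show the idea of summing a measure along chosen dimensions and then narrowing the question.

```python
from collections import defaultdict

# Hypothetical fact rows: dimensions (road, weather, surface) plus one
# measure (accidents). All values are invented for illustration.
FACTS = [
    {"road": "A13", "weather": "rain", "surface": "asphalt", "accidents": 7},
    {"road": "A13", "weather": "dry", "surface": "asphalt", "accidents": 3},
    {"road": "A20", "weather": "rain", "surface": "concrete", "accidents": 2},
    {"road": "A20", "weather": "dry", "surface": "concrete", "accidents": 4},
    {"road": "A13", "weather": "fog", "surface": "asphalt", "accidents": 5},
]


def roll_up(facts, dimensions, measure="accidents", where=None):
    """Sum `measure` per combination of `dimensions`.

    `where` is an optional dict of dimension values used to restrict
    the rows, which is what a drill-down does.
    """
    where = where or {}
    totals = defaultdict(int)
    for row in facts:
        if all(row[name] == value for name, value in where.items()):
            key = tuple(row[d] for d in dimensions)
            totals[key] += row[measure]
    return dict(totals)


# High-level view: accidents per road.
by_road = roll_up(FACTS, ["road"])

# Pick the road with the most accidents, then drill down by weather.
worst_road = max(by_road, key=by_road.get)[0]
by_weather = roll_up(FACTS, ["weather"], where={"road": worst_road})
```

Here `by_road` is the summarised view, and `by_weather` is the more detailed question asked about the single road that stood out in that view.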
To be able to use spatial data, and to take full advantage of the spatial dimension, the locational element of the data has to be integrated into the Datawarehouse. The following GIS concepts are used and, with GIS technology, implemented in the organisation. There are four main items to distinguish:
Figure 1: Spatial Datawarehouse
Making your Datawarehouse spatially enabled provides you with four distinct capabilities:
Spatial enabling adds a new dimension to your database. This is the geographical dimension, which does not need to be defined explicitly. By storing geographical co-ordinates in the database, you can query the interrelationships between objects based on geography. When, for example, the locations of stores or information about competitors are stored in the Datawarehouse, the following questions can be asked:
An unlimited number of questions can be formulated when you geographically relate subjects in the Datawarehouse (customers, competitors, stores, stations, generators etc.) to subjects of geographical interest (towns, gas pipes, power lines, bus routes, streets).
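As a rough illustration of querying on stored co-ordinates, the following Python sketch relates customers to stores by distance. It is only a sketch under stated assumptions: the store and customer records, their co-ordinates, and the helpers `distance_km`, `customers_within` and `nearest_store` are hypothetical, and the great-circle (haversine) formula on a spherical earth stands in for whatever distance function a real GIS would provide.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean earth radius, spherical approximation

# Hypothetical records with (latitude, longitude) in decimal degrees.
STORES = {"Rotterdam": (51.92, 4.48), "Utrecht": (52.09, 5.12)}
CUSTOMERS = {"c1": (51.93, 4.47), "c2": (52.08, 5.10), "c3": (52.37, 4.90)}


def distance_km(a, b):
    """Great-circle distance between two (lat, lon) points (haversine)."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def customers_within(store, radius_km):
    """Which customers are located within radius_km of the given store?"""
    origin = STORES[store]
    return sorted(name for name, pos in CUSTOMERS.items()
                  if distance_km(origin, pos) <= radius_km)


def nearest_store(customer):
    """Which store is closest to the given customer?"""
    pos = CUSTOMERS[customer]
    return min(STORES, key=lambda s: distance_km(STORES[s], pos))


near_rotterdam = customers_within("Rotterdam", 10)
closest_to_c3 = nearest_store("c3")
```

Note that neither question requires a predefined "distance" column in the Datawarehouse; both are derived on demand from the stored co-ordinates.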
Building an architecture like a Datawarehouse means investing, so it has to be made clear what the benefits are. In most cases the real benefits of the Datawarehouse are not known, or even anticipated, at the moment of construction. This is because the Datawarehouse is used in an entirely different way from the operational systems: it is used in a trial-and-error way of working. The decision support analyst cannot say what the possibilities and potential of the Datawarehouse are until its first version is ready for use. The normal way of calculating the return on investment therefore cannot be used.
Fortunately, the Datawarehouse is built in small steps. The first step (iteration) can be done quickly and for a relatively small amount of money. Once the first portion of the Datawarehouse is built and filled, the user can start to explore the possibilities. At that point it is possible to make a justification of the development costs of a Datawarehouse.
The benefits of the Datawarehouse come from the ability to make effective decisions with it. The possibility of discovering trends and correlations as they happen is what benefits the business. It is not easy to quantify these benefits in order to justify the Datawarehouse. How much is saved by giving decision makers an effective process for making critical decisions? Experience with the Datawarehouses implemented so far teaches us that organisations which have such an information technology architecture can no longer do without it.
The advantages of using a Datawarehouse lie in a better understanding of the business and of its risks, better service to the customer, improvement of the business processes, and the ability to make more tailor-made products and services.
The most important attraction of spatially enabling your Datawarehouse is the ability to make dynamic geographic queries on your data, to aggregate your data to geographic areas, to analyse the data, to (re)organise it spatially and, last but not least, to present and visualise it.
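Aggregating data to geographic areas can be sketched in a few lines of Python. This is an illustration only, not a description of how any GIS implements it: the two rectangular areas, the sales points and the helpers `point_in_polygon` and `aggregate_by_area` are hypothetical, co-ordinates are treated as planar, and the standard ray-casting test is used to decide in which area a point falls.

```python
# Hypothetical areas as polygons (lists of planar x, y vertices).
AREAS = {
    "north": [(0, 5), (10, 5), (10, 10), (0, 10)],
    "south": [(0, 0), (10, 0), (10, 5), (0, 5)],
}

# Hypothetical point data: (x, y, sales amount).
SALES = [(2, 7, 100), (8, 9, 50), (3, 2, 70), (12, 3, 30)]


def point_in_polygon(x, y, polygon):
    """Ray-casting test: count edge crossings of a ray going right from (x, y)."""
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the horizontal line through y can cross.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def aggregate_by_area(points, areas):
    """Sum the amount of each point into the first area that contains it."""
    totals = {name: 0 for name in areas}
    totals["unassigned"] = 0
    for x, y, amount in points:
        for name, polygon in areas.items():
            if point_in_polygon(x, y, polygon):
                totals[name] += amount
                break
        else:
            totals["unassigned"] += amount
    return totals


sales_per_area = aggregate_by_area(SALES, AREAS)
```

Points exactly on a shared boundary are a design decision in any such scheme; this sketch simply assigns a point to the first matching area and reports points outside every area as unassigned.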
Logisterion has executed projects for a telecommunications company, public housing and the ministry of public works. Although these experiences are in some cases at an early stage, it is clear that there are several ways to look at a Datawarehouse in combination with a GIS. In all cases, we see many benefits of Datawarehousing with a spatial component, and its additional value to the business.
Drs. Jan van Berkel
Consultant, Logisterion Automatisering
Stationsplein 45
3013 AK Rotterdam
Netherlands
Telephone: 00 31 10 217 07 00
Fax: 00 31 10 413 96 93
Email: berkel@logisterion.nl