BC Environment (BCE) is bringing data from its administrative computer systems to the desktops of all of its decision makers. This is encouraging communication and cooperation between program areas, and facilitating joint resource management for environmental planning and regulation.
Most of BCE's administrative systems have a spatial component. Each water license, waste permit, wildlife habitat zone or fish spawning channel has a geographical location. These spatial components are being combined for display and analysis with a Geographic Information System (GIS).
BCE's goal is to enable all BCE staff to combine attribute and spatial data from any of these systems with easy-to-learn tools on their desktop computer. However, the complexities of data models, database languages, operating systems, and GIS are too much for operational staff to learn. At each of BCE's fourteen GIS installations, there is a GIS coordinator who serves as a "data interpreter" to make business data and software accessible for day-to-day use by operational staff from all departments.
BCE's GIS coordinators were hired for their expertise with GIS, databases, and computer systems. They form the GIS Working Group (GWG), which meets in person at least four times a year and communicates constantly through phone, E-mail, the World Wide Web, and interoffice visits. This level of personal contact has helped to make the GWG one of the most successful technical groups in the BC government.
The coordinators informally share their expertise on GIS processing and analysis techniques, data sources, data standards, application of GIS to BCE business processes, and knowledge of other Ministries and contractors. When appropriate, they work together on developing applications, and capturing and translating spatial data.
The GIS Working Group is developing a common database of several dozen layers or themes of spatial information. For example:
Wherever possible, these data sets have been compiled for the entire province--just under 1,000,000 square kilometres.
These data sets contain all the common GIS feature types: points, lines, polygons, regions, dynamic segmentation, and rasters. Many are linked to attribute data in an RDBMS (Relational Database Management System). Data sets that are too large to manage as one provincial coverage have been cut into tiles and stored in a large data set storage manager.
The amount of spatial data at each GIS site varies from one to over eight gigabytes. Across all the sites, there are over one hundred layers of data from over twenty different sources.
There are four system components that make working together in this way possible:
BCE has several administrative systems which support ongoing licensing, regulation, and reporting activities. These systems run on a variety of computers and operating systems, and are written in a variety of languages and file structures. They may be running on stand-alone hardware, or their data models may be optimized for administration, or they may not be documented at all. It is therefore difficult to extract ad hoc information directly from many of these systems.
BCE's response to this problem is its data warehouse. A data warehouse is a set of disk files and database tables organized to facilitate distribution of data to a diverse group of users. It contains read-only copies of data from these systems, restructured for query and analysis, and stored in a standard format in a single database. Automated processes copy the information to regional servers, where it can be used by operational staff.
BC Environment's data warehouse has these logical components:
The concept of publishing data is central to a data warehouse. In writing a book or article, an author may gather notes from many sources, go through revisions, and get editing help before publishing the final work. Once published, it is ready for public use as an information source. However, its readers cannot make changes to it without involving the author in publishing a new version.
In BCE's case, copying a data set into the warehouse is analogous to publishing. Once in the warehouse, it is in a stable, known state, ready for distribution to various users. The users are not able to modify it, but may send revision requests to its creator for inclusion in the next version.
Each data supplier has one or more directories in the data warehouse. Publishing a new or modified data set involves putting it into one of these directories. Every morning, a small (under 100 lines) Unix script checks for changes in the data warehouse directories. Files which have changed are automatically replicated to the regional sites. This mechanism acts as a direct 'pipe' from each data supplier to each data user, without human intervention.
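The change check and replication step described above can be sketched as follows. This is a minimal illustration in Python, not BCE's actual Unix script: the function names are hypothetical, regional sites are assumed to be reachable as ordinary directories, and "changed" is assumed to mean "modified since the previous run".

```python
import shutil
from pathlib import Path


def find_changed_files(warehouse: Path, last_run: float) -> list[Path]:
    """Return files under `warehouse` modified after `last_run` (epoch seconds)."""
    return sorted(
        p for p in warehouse.rglob("*")
        if p.is_file() and p.stat().st_mtime > last_run
    )


def replicate(warehouse: Path, regional_sites: list[Path], last_run: float) -> list[Path]:
    """Copy each changed file to the same relative path at every regional site.

    Assumption: each regional site is a plain directory (for example a
    mounted file system); a real deployment would copy over the network.
    """
    changed = find_changed_files(warehouse, last_run)
    for src in changed:
        rel = src.relative_to(warehouse)
        for site in regional_sites:
            dest = site / rel
            dest.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(src, dest)  # copy2 also preserves the modification time
    return changed
```

A scheduled job could call `replicate` once each morning, passing the time of the previous run as `last_run`. Because files are only ever copied outward from the warehouse, the regional copies stay read-only mirrors of the published data.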
The data warehouse simplifies sharing of data because:
The GIS Working Group has developed comprehensive standards for hardware, software, applications, and published data at each site.
The need for standards was seen very early in the GIS implementation at BCE. The first test data sets suffered from the "box of floppies" syndrome: "Here are some diskettes that contain your . . . data." The data set could not be used until someone found out which GIS software it came from and what projection, attributes, symbology, etc. it used. Only then could the translation, cleaning, reprojection, and rebuilding of topology and symbology begin.
The GIS Working Group quickly developed standards so that a data set would go through this process only once when it was first acquired. It would then be in a known state, ready for use anywhere in BCE. Even if a data layer is captured independently in different regions, it is to the same standard so that common applications can be used at all sites.
The standards do not attempt to cover personal data or software, or data in development prior to publishing. Some sites find it easier to create all their data to conform to the standard, but this is an individual choice. A standard that attempts to cover all situations becomes too complex and is a hindrance to its users.
There was never an attempt to write a complete set of standards documents all at once. They are in a gradual evolution, so that new sections are added when the need arises. In most cases, the simplest possible standard is adopted, although sometimes after an energetic discussion of alternatives. Each change to the standards is then approved by consensus of the GIS Working Group.
The primary copies of the standards documents are located on the "GIS at BC Environment" page on BCE's WWW site. This is the main system for distribution, as paper copies are not maintained. The standards are also communicated by the same methods as the GWG itself (mailing list, E-mail, personal meetings, etc.).
The spatial data standards include:
The new Database Administrator's Working Group will be charged with developing and using an equivalent set of standards for attribute data.
A warehouse of published, standardized information is not useful unless there is a way to find out what data sets are in it, and what information they each contain. This is the role of the Data Registry. It is a single catalogue listing of a variety of published and unpublished data sets, including those in the data warehouse.
The information in the Data Registry is called metadata (data about data). Each data set's metadata shows when it was created, who is responsible for it, how accurate it is, how large it is, what attributes it has, how it can be obtained, etc.
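As an illustration only, a metadata record with the fields listed above might be modelled like this. The class name, field names, and units are assumptions made for the sketch; they are not the Data Registry's actual schema.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class DatasetMetadata:
    """One hypothetical registry entry: data about a single data set."""
    name: str
    created: date            # when the data set was created
    custodian: str           # who is responsible for it
    accuracy_m: float        # positional accuracy in metres (assumed unit)
    size_bytes: int          # how large it is
    attributes: list[str] = field(default_factory=list)  # attribute names
    how_to_obtain: str = ""  # free-text access instructions

    def summary(self) -> str:
        """Return a one-line description suitable for a catalogue listing."""
        return (
            f"{self.name} ({self.custodian}, created "
            f"{self.created.isoformat()}, +/-{self.accuracy_m} m)"
        )
```

Keeping the record as a plain structure like this makes it easy to store the same fields both in a central catalogue and alongside the data set itself.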
The Registry is stored in an RDBMS. BC Environment staff and the general public can query or search it with RDBMS utilities, through its custom interface, or from the WWW. If a data set is available on-line over the Internet, its metadata contains a link to its ftp directory, from which it can be downloaded immediately.
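A query against such a registry could look roughly like the sketch below. It uses an in-memory SQLite database purely as a stand-in for BCE's RDBMS, and the table name, column names, and helper functions are all hypothetical.

```python
import sqlite3


def build_registry(records):
    """Create an in-memory registry table and load (name, custodian, published, ftp_dir) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE registry ("
        "name TEXT PRIMARY KEY, custodian TEXT, published INTEGER, ftp_dir TEXT)"
    )
    conn.executemany("INSERT INTO registry VALUES (?, ?, ?, ?)", records)
    return conn


def downloadable(conn):
    """List (name, ftp_dir) for every data set whose metadata carries an ftp link."""
    rows = conn.execute(
        "SELECT name, ftp_dir FROM registry "
        "WHERE ftp_dir IS NOT NULL ORDER BY name"
    )
    return rows.fetchall()
```

Data sets without an ftp link still appear in the catalogue; they are simply left out of the list of immediately downloadable ones.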
It is advantageous to store metadata in a data set itself, as well as in a central catalogue like the Data Registry. When the data set is copied for use by another agency, the metadata is carried along with it. This way, a conscientious user can temper analysis based on the data set with knowledge of its source, accuracy, currency, completeness, etc.
At the time of writing, BCE is working to integrate metadata in GIS coverages, the Data Registry, and its WWW site. The United States Federal Geographic Data Committee (FGDC) has produced a metadata standard which seems to be well suited to this. The GIS Working Group has agreed to proceed with a pilot study based on FGDC metadata support in the GIS.
BC Environment is finding that sharing data, software, and applications between all its GIS sites is straightforward using comprehensive standards and proven, stable technologies. This has freed the GIS Coordinators from much of the mundane detail of data loading, translation, storage, and distribution, and given them more time to do interesting, useful analysis to support the government's mandate.
BC Environment System Services Branch
737 Courtney Street, 3rd Floor
Victoria, British Columbia
Canada V8V 1X1
Phone: (604) 387-9614
E-mail: bmackenzie@galaxy.gov.bc.ca