Laurens Robinson
The Metadata Catalog used for the City of Oakland's GIS effort was
developed at the University of California, Berkeley. It can be downloaded
from their anonymous ftp site.
At the login prompt, type anonymous; at the password prompt, type your
email address. When you have logged on,
change to the pub/klamath directory. To access the Metadata Catalog
software go to the /kmdd subdirectory. If you are unable to use ftp, you
may send for a copy of the software. The Metadata Catalog is formally
referred to as the Klamath Meta Data Dictionary (KMDD).
Send your requests to:
Yvonne Everett
Attn: KMDD
Dept of Landscape Architecture
202 Wurster Hall
University of California
Berkeley, CA 94720
Voice (510) 642 8641
Or send email to: yvonne@ced.berkeley.edu
The Beta Release of the KMDD uses a Visual Basic program and file structure. It is currently being rewritten to use a Microsoft Access database, although I'm not sure when that release will be available. The developers have indicated that future development of the KMDD may include a Thesaurus. For now, the KMDD is simple but stable, and we experienced no loss of data from its use. GIS Analysts, as well as persons with knowledge in a specific discipline or domain (referred to as "Domain Experts"), can easily key information into the KMDD. That information can then be saved and organized in a number of ways to provide information to those individuals or organizations that need to determine its suitability for their purposes. It is by no means a final solution, yet we feel it is a good place to begin. In fact, the Data Resource Management Division at the City of Oakland has used it extensively for documenting the City's coverages, and we have found it a cost-effective solution. We look forward to the new Release of the KMDD, which will use the Microsoft Access database engine. If you do not currently have a strategy for managing the Metadata you have collected about your coverages, grab this program and get started.
Once we have generated files from the KMDD about the GIS coverages, this information is linked to a Project Repository that contains information about the entire Measure "I" Emergency Response System (MIERS). The MIERS Project Repository stores all information relevant to any aspect of the Project. Included are spatial and non-spatial Conceptual, Logical, and Physical models, information about the contributing Domain Expert, relevant Business Rule guidelines from the California State Emergency Management System (SEMS), Functional Specifications and, in short, anything we felt someone might want to know about the project.
Although there are other Metadata components in our overall strategy, such as CASE tool Encyclopedias and Microsoft Excel, Word, and Project documents (artifacts) electronically linked to the Project Repository, the focus of this paper is the KMDD (our GIS Metadata Catalog component) as one component of that strategy. For more detailed information regarding the MIERS Project Repository, send email to Laurens Robinson at Laurens@ix.netcom.com
The tendency to avoid defining the Metadata for a coverage is tremendous, just as programmers tend to put off writing documentation until after a program has been completed. Metadata is the card catalog for the GIS Library. Without it, a potential user of the Library may take a long time to locate the GIS information they need, and may never succeed at all.
As GIS technology becomes more popular, the task of managing the information becomes more and more difficult. Esri addressed the immediate needs of Federal agencies by delivering DOCUMENT.AML, an ATOOL written and currently used by the Water Resources Division of the USGS and the Environmental Agency. Another approach has been taken by the Klamath Meta Data Dictionary (KMDD) Project, a joint effort of the Klamath Province GIS Project of the University of California-Berkeley and numerous GIS users in the Klamath Province. The KMDD, a Visual Basic Metadata program available free over the Internet, has combined input from users, project staff, the Federal Meta Data Standards, the Spatial Unified Data Dictionary, and the Sierra Meta Data Dictionary of the California Department of Conservation. The City of Oakland has been using the KMDD to document features, layers, coverages and GIS Metadata for the Measure I Emergency Response System (MIERS). The KMDD includes two categories of data: (1) Basic and (2) Detailed.
Basic identification information includes data set identification, currency, data description and theme, location, data set structure, source, and resolution, data owner and custodian contact information, and (very important) Metadata custodian contact information.
Detailed information includes availability of the data set, acquisition information, operating system software and data processing, summary table of types of attributes stored, measures of positional and thematic accuracy, specifications about the source, history, comments from users of the data set, and listings of further resources.
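The two categories can be pictured as a pair of record types. The Python sketch below is only an illustration of the Basic and Detailed groupings listed above. It is not the KMDD's actual .MDD file layout, and every class, field, and function name in it is an assumption made for this example.

```python
from dataclasses import dataclass, field, fields


@dataclass
class BasicMetadata:
    """Basic identification information (field names are illustrative)."""
    dataset_id: str = ""
    currency: str = ""                    # how current the data set is
    description_and_theme: str = ""
    location: str = ""
    structure: str = ""
    source: str = ""
    resolution: str = ""
    owner_contact: str = ""
    custodian_contact: str = ""
    metadata_custodian_contact: str = ""  # flagged as very important in the text


@dataclass
class DetailedMetadata:
    """Detailed information (field names are illustrative)."""
    availability: str = ""
    acquisition: str = ""
    software_and_processing: str = ""
    attribute_summary: str = ""
    positional_accuracy: str = ""
    thematic_accuracy: str = ""
    source_specifications: str = ""
    history: str = ""
    user_comments: list = field(default_factory=list)
    further_resources: list = field(default_factory=list)


@dataclass
class CoverageMetadata:
    """One coverage's metadata: a Basic part and a Detailed part."""
    basic: BasicMetadata = field(default_factory=BasicMetadata)
    detailed: DetailedMetadata = field(default_factory=DetailedMetadata)


def missing_basic_fields(record):
    """Return the names of Basic fields that are still empty."""
    return [f.name for f in fields(record.basic)
            if not getattr(record.basic, f.name).strip()]
```

A record could then be checked for gaps before it is released, for example with `missing_basic_fields(CoverageMetadata())`, which lists every Basic field because nothing has been keyed in yet.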
The purpose of this paper is to lay out a "strategy" for the GIS professionals within an organization to: (1) identify the components of an effective Metadata strategy, (2) implement the strategy (using the KMDD), and (3) set up a mechanism for updating the Metadata catalog. Further development of the KMDD is expected. We have been working with a stable Beta Release, yet we have found cases where additional data needed to be associated with a coverage. It will help to use naming conventions that link coverage data together. Ideally, all this information should be stored in the same document. Basic and Detailed information is stored in .MDD files. Any graphics associated with a coverage must be imported into the KMDD as a bitmap image, which is written to a .BMP file. In some cases, MIERS GIS Analysts created more detailed information in a separate .DOC document to supplement the Metadata stored in the KMDD.
A typical coverage set would include:
PARCEL.MDD, PARCEL.BMP
while a more detailed set might include:
PARCEL.MDD, PARCEL.BMP, & PARCEL.DOC.
These are in addition to any other GIS files for that coverage. Together the entire set should be kept in a collection. We use a simple subdirectory structure for each coverage collection.
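A naming convention like this is easy to check automatically. The Python sketch below reports which metadata files are present in one coverage collection. It assumes, purely for illustration, that each subdirectory is named after its coverage (for example PARCEL/) and that the metadata files share that base name. The function name and the shape of the report are likewise hypothetical.

```python
from pathlib import Path

# Files expected in a typical coverage set, and the optional supplement.
REQUIRED_SUFFIXES = (".MDD", ".BMP")
OPTIONAL_SUFFIXES = (".DOC",)


def check_collection(collection_dir):
    """Report which metadata files exist for one coverage collection.

    Assumes the subdirectory is named after the coverage and that the
    metadata files use the same base name (an assumption of this sketch).
    """
    directory = Path(collection_dir)
    coverage = directory.name.upper()
    # Compare names in upper case so parcel.mdd and PARCEL.MDD both match.
    present = {p.name.upper() for p in directory.iterdir() if p.is_file()}

    report = {"coverage": coverage, "missing": [], "optional_present": []}
    for suffix in REQUIRED_SUFFIXES:
        if coverage + suffix not in present:
            report["missing"].append(coverage + suffix)
    for suffix in OPTIONAL_SUFFIXES:
        if coverage + suffix in present:
            report["optional_present"].append(coverage + suffix)
    return report
```

Running `check_collection` over every subdirectory gives a quick list of coverages whose .MDD or .BMP file has not yet been created.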
A little common sense goes a long way in this exercise. First, determine what you need to collect, based on the purpose to which the GIS data will be put. If you are providing data to constituents and the public, one set of data might be more appropriate; if you are serving the needs of marketers in a commercial business, another set would be needed. After you have done some basic assessments of the types of uses to which the data might be put, collect the Basic or Required information that is mandated by law. Pay particular attention, those of you in the US, to "Executive Order 12906, Coordinating Geographic Data Acquisition and Access: The National Spatial Data Infrastructure," which requires federal agencies to document their data. By far the most prudent approach is to conform to the Content Standard for Digital Geospatial Metadata, developed by the Federal Geographic Data Committee (FGDC). If you do, you will help enable the consistent naming and definition of Metadata by all US organizations, and ensure that you have not committed a fault of omission.
Clearly the topic of what to collect is an enormous one. Cherie Barton, Software Product Specialist at Esri, puts it this way: "Metadata is information about a database, or a portion of it, such as a layer, an attribute, or specific features. Metadata tells what the database contains, how accurate the data is, and even how you use it. Anytime you write down where data came from, who worked on it, and what was done to it, you are recording Metadata." I take a slightly broader view. Metadata is anything that anyone could conceivably want (or need) to know about the data. The problem is deciding what to include and what to exclude. Standards are a good place to start. As the GIS annotation, notes, anecdotes and other information are collected and added to the Metadata set for a given feature, coverage, theme, etc., value accumulates in your data. Your data actually becomes more valuable. Why? The less time it takes to discover if the GIS data is what you want, the more time you have available to actually use the GIS data set to solve a real-world problem. The KMDD takes a lot of the guesswork out of the equation for you. If you can collect the information it asks for, you will be pretty much in compliance with the FGDC. You may want to go beyond this basic data. In some cases it is absolutely necessary. We found that the majority of coverages did not need to have .DOC files attached to them.
Keep the problem in perspective. What is the intent of the coverage? Are your intended users very specialized in their work? General purpose? Who will be using it the most? An excellent rule of thumb is to simply ask the people who will use the data what else they would like to know about the coverage. If you have not stored the most important things about that coverage, value is lost to your data. It gathers no interest, and it does not appreciate over time.
Let's be honest about it: collecting Metadata is a somewhat boring
task, especially if you aren't that familiar with the data, its usage, and its
accuracy from direct experience. That's a good reason to get the Domain
Expert to do it. After all, they work with the data on a regular basis and
are more apt to come into contact with instances where it is wrong, or
questionable. This is important information that needs to be corrected. If
neglected, or ignored, pretty soon your data sets will be called into serious
question. It has then become a data integrity issue, and may contribute to
errors of judgement on the part of the user of the data. In extreme cases, it
could lead to lawsuits.
An automated tool helps tremendously. ArcInfo Version 7.0.3 delivered an ATOOL named DOCUMENT.AML that automates this task. Briefly, it looks at system files, combines what it finds with input variables from the Domain Expert's entries, and generates some of the Metadata for you. The KMDD was our obvious choice for getting users and Domain Experts to key data into a Microsoft Windows-based Visual Basic program. It gets the job done. We have not evaluated the DOCUMENT.AML product, but Esri may have some reference sites that have used it successfully.
There is a growing market for GIS data, and any discussion of it in the Public Sector is rife with controversy. I do not propose to enter into it. Executive Order 12906 effectively mandates that Metadata be "accessible" to the National Geospatial Data Clearinghouse. It's not just a good idea; it's the law for federal agencies. It also just makes good (common) sense to standardize the distribution of GIS data. All it requires is that federal data providers such as the United States Geological Survey (USGS) and Bureau of Land Management (BLM) publish the data they have available, including how to obtain it. The idea is similar to standardizing the Content Standards.
They may yet call it the Distribution Standard. The idea is to provide a "market" for locating, linking, depositing, obtaining, identifying, and accessing GIS data. The Geographical Information Technology (GIT) development project in Sweden is doing the same thing. It makes a lot of sense, doesn't it? Most of the data that is required to "market" the data can be stored in the Beta Version of the KMDD. Thus an effective distribution strategy must begin at the collection stage. If you fail to collect certain critical (compliance-oriented) data, you may not be able to distribute it effectively, or at all.
Set up a procedure for updating the Metadata catalog. In the final analysis, Metadata simply helps someone make better decisions, provided it is accurate. If it is inaccurate, it might be worse than if it didn't exist at all. There is danger here. If the Metadata and the GIS data are not kept in sync, you might have a real problem on your hands. In a worst case scenario, you might contribute to the cause of someone being seriously injured, or worse.
On the MIERS Project, Domain Experts have the responsibility for updating the data, and they are in an excellent position to notify the GIS Analyst of any changes that are necessary. The GIS Analyst implements the change and updates the KMDD. If additional .DOC files are created, these are then registered to the Project Repository. When there are a lot of changes to a coverage, some procedure to periodically update the Metadata must be enforced. It would be good if changes to the coverage automatically triggered a process to update the Metadata Catalog. A message could be sent to the appropriate person. A file could be placed in a locked state until it was updated and released by the Metadata Catalog Administrator.
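One simple way to approximate such a trigger is to compare file modification times. The Python sketch below lists coverage files that have changed more recently than the .MDD file in the same collection. It is not part of the KMDD, and it assumes a hypothetical layout in which each collection directory holds one .MDD file named after the directory, with the .BMP and .DOC files treated as Metadata and everything else as coverage data.

```python
from pathlib import Path

# Suffixes treated as Metadata rather than coverage data (an assumption).
METADATA_SUFFIXES = {".MDD", ".BMP", ".DOC"}


def stale_metadata(collection_dir):
    """Return coverage files modified more recently than the .MDD file.

    An empty list means the Metadata is at least as new as every
    coverage file in the collection directory.
    """
    directory = Path(collection_dir)
    mdd = directory / (directory.name + ".MDD")
    if not mdd.exists():
        raise FileNotFoundError(f"no metadata file: {mdd}")
    mdd_time = mdd.stat().st_mtime

    return sorted(
        p.name for p in directory.iterdir()
        if p.is_file()
        and p.suffix.upper() not in METADATA_SUFFIXES
        and p.stat().st_mtime > mdd_time
    )
```

A scheduled job could run this check over every collection and send the resulting list to the GIS Analyst or the Metadata Catalog Administrator. Locking a file until its Metadata is released would need additional support from the operating system or the repository, which this sketch does not attempt.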
Currently, we are still smoothing out the procedures. It helps if there is a Data or Repository Administrator whose primary function is to manage Metadata for the Project. The MIERS Repository Administrator monitors the Metadata and is responsible for backing it up and restoring it if necessary. A complete data management strategy will necessarily include the management of all the enterprise data and information assets. It doesn't happen overnight. It is important to begin someplace. The KMDD could be that place for you. It was for us.
Acknowledgements.
I wish to acknowledge the work of the Staff of the Klamath Province GIS
Project, the UC Berkeley Cooperative Extension, and the painstaking
efforts of innumerable GIS users in the Klamath Province and the City of
Oakland for actually using the system. I wish to thank the US Forest
Service, The US Bureau of Reclamation and the US Fish and Wildlife
Service for funding the Klamath Project and the California Department of
Forestry Strategic Planning Program for administering the funding. I
would like to thank Professor John Radke, College of Environmental
Design, for guiding the development of the KMDD, and Dr. Yvonne
Everett for coordinating the project. Special thanks go out to James
Ganong and Douglas Allen who produced the software, and to all those
others too numerous to mention, who contributed to this effort.
I wish to thank my GIS team members at the City of Oakland, F. Michael
Smith, the MIERS Project Manager, Johnathan Lowe, Consultant, who
created a lot of bitmapped images to include in the Metadata, Francis Rolle,
System Administrator, Robyn Starr, Consultant, Patricia Combs, Joann
Ward, and all those Domain Experts that documented their data using the
KMDD.
Author Information. Laurens Robinson, MA, MBA, is an Information
Resource Consultant specializing in the area of Information Asset
Management. He is currently the Data Administrator for the City of
Oakland, and the Repository Administrator for the MIERS Project.
Laurens also works with other Public and Private Sector Agencies, as a
management consultant, in an effort to help them develop effective asset
management strategies so that their data may be transformed into useful
information which enhances their decision-making ability. He thinks that
organizations will one day value data and information as quantifiable
assets on an organization's balance sheet. He can be reached at
Laurens@ix.netcom.com or 510 238 6860.