Thomas C. East

Looking for Buried Treasures

or

Your Geographic Information System: More than Meets the Eye

When developing a GIS Database from scratch, it is prudent to carefully plan what information, particularly feature attribute information, should be captured during the collection phase and how it will be collected, and also to explore how it will be used and maintained.

But what does one do when an essential item is overlooked? This paper describes various ideas, techniques, and methods of pattern recognition and artificial intelligence which can be used to capture some of this information which, at first glance, may not be readily available, but in reality, can be accessed and used. And, best of all, the computer can do most of the work for you!


INTRODUCTION

Origins of the Paper

GIS databases often incorporate graphic and non-graphic information from many different sources. These sources vary in complexity and can cross traditional or established disciplinary lines. It is therefore not unusual to discover, eventually, that critical data elements or relationships were overlooked or omitted in the data collection process.

However, sometimes the needed information may actually be just below the surface of the map, so to speak, or may be masquerading as something else. With the proper tools and techniques, this information can usually be extracted, derived, or otherwise uncovered. This paper discusses general concepts which can be applied in many situations and provides real examples of solutions developed to collect information missing from a GIS database.

A LITTLE BACKGROUND

Graphic interpretation - Human vs. Automated

The human eye and mind together make an incredibly powerful system for interpreting graphic information. What is easily observed, interpreted, understood, and taken for granted by young children and even infants, can prove befuddling and impossible for the most powerful computers. Of course, what is often overlooked is that what happens in the human mind involves processes of enormous complexity. However, we do so many things so routinely, and make so many assumptions and connections within the databases in our minds, that we are not conscious of having performed them at all. The complexity is hidden.

Computers can be programmed to quickly, repetitively and accurately execute many well-defined tasks previously performed by humans. But increasing levels of complexity and depth often require correspondingly complex and deep computer databases which typically are not available.

Two Guidelines

Nevertheless, with a little thought, preparation, and sometimes a simpler mind-set, it is possible to discover more in an available database than meets the eye.

Guideline No. 1: Stop and think about the assumptions made by the eye and mind as they interpret the information taken in.

Although computers may seem to be very complex, they are actually incredibly simple devices. The phrase I heard most often in computer science courses was "Computers are dumb!" Contrary to popular opinion, they cannot think for themselves. They simply do exactly what they are told to do, and therein lies a key to getting the results desired.

Guideline No. 2: Take nothing for granted - think, literally, as a computer thinks.

Questions, Suggestions and Thought Experiments

Although I don't recall the first time I saw a map, when I took my first airplane ride I do remember expecting to see large letters on the ground announcing the names of the various cities, rivers and roads below. Having literally interpreted what I had seen on maps, I was very disappointed! Does a computer know any better? No. Therefore, think as the computer thinks to discover the false assumptions it makes in dealing with the information on the map. Listed below are some key questions, suggestions, and thought experiments which I have found helpful in getting the proper mind-set to discover the information I need when interpreting maps or other graphic information. Think about these questions before going on to the suggestions and thought experiments.

Questions to ask about your maps or data

1. What do you want your GIS to do?

2. What is your model? Do you even have one?

3. What is the model of your source map or data?

4. What is right about your map or data? What is wrong? Why?

5. What do you know to be true? What do you know to be false?

6. In what context was the map or data created or collected? What was the desired goal? What was not desired?

7. What is the goal today? What is to be avoided?

8. Do you know everything about your map or data? Is anything hidden? Are you sure?

9. Are there patterns in your map or data? Are they systematic? Why?

10. Are there unique situations or problems? Where are they found? Why?

11. What do you know about other related sources of data? Unrelated sources?

Suggestions

1. Describe your goals and objectives in simple terms.

2. Define what the model should be.

3. Know and understand both the good and bad in your data. Sometimes the bad can be useful or provide a clue.

4. Apply what you know or expect from experience (heuristics.)

5. Use your knowledge of related and unrelated information.

6. Use smart guesses - what you know to be right and what you know to be wrong.

Thought experiments

1. Imagine what you would encounter walking across the map, not the earth.

2. Imagine what you would see flying over the map, not the earth.

3. Ask yourself questions about what you see or encounter.

a. What happens when you cross a road or stream or follow a contour line?

b. Does the angle of approach make a difference? Why?

c. Are you entering or leaving an area or both?

d. What can you find nearby which gives clues about what you are looking for?

e. What can't you find nearby which gives clues about what you are looking for?

f. What can you find that is unrelated to what you are looking for? Is it really unrelated?

g. Do you know something special about this area of the map, perhaps from another layer, that is important or useful?

h. Would you encounter or see anything differently if you imagined the map as a negative image, reversed, inverted, three dimensional or any combination of these?

CATEGORIES OF SOLUTIONS

There are many categories into which solutions of interpretation can fall. The following list is not exhaustive but gives a good idea of the types of solutions available. Each category will be illustrated by actual examples.

Derived Solutions

This method involves deriving and creating new graphic information from information already available. An example is the creation of street centerline coverages from coverages containing only double-edge roads.

In 1987 the Northern Kentucky Area Planning Commission (NKAPC) completed 1"=200' scale planimetric base mapping for portions of a three-county area in Northern Kentucky, directly across the Ohio River from Cincinnati, Ohio. After the base mapping was completed, it was discovered that street centerline information had been omitted in the data collection process. After searching unsuccessfully for an automated solution to this problem, NKAPC staff developed a derived solution.

The process involved the intersection of regularly spaced horizontal and vertical arcs spanning the entirety of a map tile, with double-edge road arcs in the road layer. This step created intersection nodes on the road edges (see Figure 1.)

Once the intersecting nodes were generated, the center point of each short arc which spanned the road edges was calculated. This center point represented a vertex on the centerline to be generated (see Figure 2.)

The final step involved connecting these vertices by snapping existing road edges to these vertices, creating centerline arcs that closely followed the shape of the road as delineated by the road edges. The inspiration for this derived solution came while using Thought Experiment No. 1 listed above. The method works quite well although minor editing is required near intersections. The NKAPC has provided this solution to many organizations around the world.
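The midpoint step described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the NKAPC implementation: the function names, the representation of road edges as straight segments, and the max_width test used to ignore gaps between separate roads are all assumptions made for the example. Only the horizontal pass is shown; the vertical pass would be the same with x and y exchanged.

```python
from typing import List, Tuple

Point = Tuple[float, float]
Segment = Tuple[Point, Point]


def crossings_on_scan_line(y: float, edges: List[Segment]) -> List[float]:
    """Return the sorted x-coordinates where a horizontal line at y crosses the road edges."""
    xs = []
    for (x1, y1), (x2, y2) in edges:
        if y1 == y2:
            continue  # edge runs parallel to the scan line; no single crossing point
        # Half-open test so a vertex shared by two segments is counted only once.
        if min(y1, y2) <= y < max(y1, y2):
            t = (y - y1) / (y2 - y1)
            xs.append(x1 + t * (x2 - x1))
    return sorted(xs)


def centerline_vertices(edges: List[Segment], y_min: float, y_max: float,
                        spacing: float, max_width: float) -> List[Point]:
    """Midpoints of the short spans between neighbouring edge crossings.

    max_width is a hypothetical threshold (an assumption of this sketch):
    spans wider than a plausible road are taken to lie between roads.
    """
    vertices = []
    steps = int((y_max - y_min) / spacing)
    for i in range(steps + 1):
        y = y_min + i * spacing
        xs = crossings_on_scan_line(y, edges)
        for left, right in zip(xs, xs[1:]):
            if 0 < right - left <= max_width:
                vertices.append(((left + right) / 2.0, y))
    return vertices
```

For a straight road whose edges lie at x = 0 and x = 10, every scan line yields one vertex at x = 5. Near intersections the spans become wide or irregular, which is consistent with the minor editing noted above.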

Deduced Solutions

This technique uses readily available information along with other methods such as heuristic techniques to deduce information which is obvious to the human mind through assumptions or understood knowledge, but is not recorded by the computer. An example is tagging uncoded contour lines with elevation values when these values are not attached to the lines themselves.

Although contours were collected at a 5' interval in the original NKAPC base mapping, elevation values were not attached to the arcs. The thought of individually selecting and tagging thousands of contour lines was enough to inspire a search for an automated solution. Once again, finding no readily available tools, NKAPC staff developed a solution.

Fortunately, for our purposes, 25' contours were coded separately from the 5' intermediate contours, though they also had no associated elevation values. However, elevation annotation was available for the 25' contours. This annotation was manually placed during map compilation by breaking the contour arc at the desired location, coding the underlying arc with a symbol which would not display, and placing the contour elevation annotation over this invisible arc. Questions 8 and 9, Suggestions 3 and 4, and Thought Experiment No. 1 provided the starting point for this solution.

After creating a new coverage using extracted 25' contours, the elevation annotation text was used to tag elevation values to the underlying invisible arcs. These arcs, now containing elevation values, were chained to adjacent arcs having no elevation values (see Figure 3), propagating the elevation value along the entire length of a contour.

Minor editing was required to ensure proper topology. In this way, all 25' contours were tagged with their proper elevation values. A side effect of this process was a reduction of approximately 15% in the number of contour arcs being stored. This resulted in lower storage requirements and faster display capabilities.
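The chaining of tagged arcs to their untagged neighbours can be illustrated with a short Python sketch. It assumes, purely for illustration, that each arc is stored as a from-node, a to-node, and an elevation (or None when untagged); the function name and data layout are hypothetical and are not taken from the NKAPC tools.

```python
from collections import defaultdict, deque


def propagate_elevations(arcs):
    """Spread known elevations along chains of connected arcs.

    arcs: dict of arc_id -> (from_node, to_node, elevation or None).
    Returns a dict of arc_id -> elevation, or None where no tagged arc
    could be reached.
    """
    # Index the arcs by the nodes they touch so neighbours can be found quickly.
    node_to_arcs = defaultdict(list)
    for arc_id, (from_node, to_node, _) in arcs.items():
        node_to_arcs[from_node].append(arc_id)
        node_to_arcs[to_node].append(arc_id)

    elevation = {arc_id: elev for arc_id, (_, _, elev) in arcs.items()}

    # Start from every arc already tagged from the annotation text.
    queue = deque(arc_id for arc_id, elev in elevation.items() if elev is not None)
    while queue:
        current = queue.popleft()
        from_node, to_node, _ = arcs[current]
        for node in (from_node, to_node):
            for neighbour in node_to_arcs[node]:
                if elevation[neighbour] is None:
                    elevation[neighbour] = elevation[current]
                    queue.append(neighbour)
    return elevation
```

The sketch does not check for two different elevations meeting on one chain. Such a conflict would indicate a topology error of the kind that required the minor editing mentioned above.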

The next step, tagging the intermediate 5' contour arcs, has not yet been implemented. By traversing (walking across) the coverage, much as was done in the centerline creation example, and by using both common knowledge and heuristic techniques, elevations can be added to these contours as well.

As a 25' contour is crossed, the previously coded elevation value and the arc's internal-id are noted before continuing to the next arc, accumulating contour arcs along the way (see Figure 4.)

When the next 25' contour is encountered, it can be determined whether the intermediate contours have increased or decreased by comparing the elevations of the two 25' contours. Intermediate values can then be assigned. In some cases, the 25' contour may be the same arc that was last found, and so the internal-id must also be compared to resolve the situation. Also, it must be noted that it may not be possible to determine a contour's elevation immediately. As in the centerline generation process, a second pass in a vertical direction is required to resolve uncertainties. Therefore, keeping track of internal-ids is essential.
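Since this step has not yet been implemented, the following Python sketch is only one possible reading of the traverse logic described above. It assumes that a single traverse has already been reduced to an ordered list of (internal-id, elevation) pairs, with None for the untagged 5' contours; the function name, the parameters, and the rule of leaving ambiguous gaps untagged for a later pass are assumptions of the sketch.

```python
def tag_intermediates(crossings, interval=5, index_interval=25):
    """Assign elevations to intermediate contours crossed between two index contours.

    crossings: list of (arc_id, elevation or None) in the order crossed.
    Returns a dict of arc_id -> elevation for the intermediates that could
    be resolved on this traverse; anything ambiguous is left out.
    """
    resolved = {}
    per_gap = index_interval // interval - 1  # four 5' contours between 25' contours
    last_index = None  # position of the most recent index contour in the list

    for i, (arc_id, elev) in enumerate(crossings):
        if elev is None:
            continue  # an intermediate contour; handled when the next index contour is found
        if last_index is not None:
            prev_id, prev_elev = crossings[last_index]
            between = crossings[last_index + 1:i]
            step = elev - prev_elev
            # Resolve only the clear case: two different index contours, one
            # index interval apart, with the expected number of intermediates.
            if (prev_id != arc_id and abs(step) == index_interval
                    and len(between) == per_gap):
                sign = 1 if step > 0 else -1
                for k, (mid_id, _) in enumerate(between, start=1):
                    resolved[mid_id] = prev_elev + sign * interval * k
        last_index = i
    return resolved
```

When the same 25' contour is met twice in succession, the intermediates between the two crossings are left untagged, since the traverse alone cannot tell whether the ground rose or fell. These are the cases a second pass in the other direction would be expected to resolve.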

Transferred or Related Information Solutions

This approach takes advantage of information which is available in another form and which can be transferred or related to create the desired information. Examples include the conflation of feature attribute information from one coverage to another, more spatially accurate coverage. However, conflation is generally a very edit-intensive process. Sometimes the solution can be far simpler.

During the 1991 Comprehensive Plan Update process, NKAPC staff needed to create a thematic coverage of the plan's proposed land use categories for Kenton County, Kentucky. This coverage consisted of thousands of small, individual polygons and several larger, extremely intricate and complex polygons, each coded as one of approximately twenty (20) general land use categories. One such category was designated "Physically Restrictive Development Area" (PRDA.) These areas, because of their greater slopes and their potential for landslides, were difficult and hazardous to develop (not to mention digitize!) After several confusing, frustrating and unsuccessful attempts to digitize this information from a previous Comprehensive Plan Update Map, an alternative solution was in order. Questions 6, 7, and 11, Suggestion 5, and Thought Experiment 3.g provided the inspiration for the solution.

NKAPC staff had recently obtained digitized soils information for Kenton County from the local Natural Resources Conservation Service (NRCS) - formerly the Soil Conservation Service (SCS) - and remembered that slope characteristics information was available in the accompanying soils report. A test plot of soils having the appropriate slope range was made for comparison to the previous Comprehensive Plan Update Map used in the laborious digitizing efforts.

The result was a nearly perfect match, requiring only minor editing to fine tune the PRDA areas. Moreover, the use of the soils information provided an additional measure of support for the designation of these areas as physically restrictive. A good deal of time was saved while strengthening the basis for the designation. The process was a simple matter of relating knowledge available from a different source.
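The relate itself can be expressed very simply. The Python sketch below assumes that each soil polygon carries a map unit symbol and that the slope ranges from the soils report have been keyed by that symbol. The symbols, the slope ranges, and the 20 percent threshold are hypothetical values chosen for illustration; the paper does not state the slope range actually used.

```python
def select_steep_units(polygons, slope_table, min_slope_pct):
    """Return the ids of soil polygons whose map unit has a minimum slope at or above the threshold.

    polygons: list of (polygon_id, map_unit_symbol).
    slope_table: dict of map_unit_symbol -> (low_slope_pct, high_slope_pct),
                 as transcribed from a soils report.
    Polygons whose symbol is missing from the table are skipped, not guessed.
    """
    return [
        polygon_id
        for polygon_id, symbol in polygons
        if symbol in slope_table and slope_table[symbol][0] >= min_slope_pct
    ]


# Hypothetical data: the symbols and slope ranges are invented for this example.
polygons = [(1, "A"), (2, "B"), (3, "A"), (4, "C")]
slope_table = {"A": (20, 35), "B": (0, 6)}
steep_ids = select_steep_units(polygons, slope_table, min_slope_pct=20)
```

The selected polygons would then be plotted against the existing plan map for comparison, as was done with the test plot described above.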

CONCLUSION

It should be noted that no automated process is 100% effective in every case. The goal is not to achieve perfection, but to let the computer perform the tedious, well-defined tasks of recognizing the obvious conditions which occur most frequently, and let the human mind resolve the less frequent uncertainties and ambiguities which are uncovered in the process. The importance and inescapability of the human factor in problem solving cannot be diminished. While computers can, and do, make many tasks easier to accomplish, the human mind will always provide the tasks and the methods of solution.

By concentrating on the Guidelines, Questions, Suggestions, and Thought Experiments provided, as well as others which may prove beneficial, it is possible to discover the assumptions made by the human mind. Through the application of imaginative ideas and techniques, it is possible to reveal the information hidden in your GIS!


Thomas C. East, GIS Program Services Manager
Northern Kentucky Area Planning Commission
2332 Royal Drive
Ft. Mitchell, KY 41017-2088
Phone: (606) 331-8980
Fax: (606) 331-8987
email: teast@mail.iac.net