Robert P. Comer and W. Thomas Ofenstein

An ArcView Extension for Interactive Stereo Processing

The 3-D real-world location of an object may be determined from its relative position in two or more aerial images taken from different perspectives, once the camera positions and attitudes have been determined. With airborne acquisition of digital imagery, the process can be performed entirely in a softcopy mode. Naturally, a software framework is required to manage the measurements and processing. This paper describes an ArcView(r)-based framework -- an ArcView extension to manage interactive stereo photogrammetric processing of aerial imagery. The extension supports interactive selection of the raw aerial images to be analyzed, and automatic creation of a view and image theme for each raw image. It also provides a tool for selecting and labeling points of interest, and executes an external batch process to compute the latitudes, longitudes, and elevations of real-world object locations. The results may be visualized in conjunction with base map data and/or image mosaics. Esri's(r) ArcView Spatial Analyst extension can be used to contour and grid the output from the stereo processing.


Introduction

The Stereo Extension was developed to enable exploitation of data acquired from a low-cost airborne imaging system developed by TASC, and to understand the extent to which terrain information can be extracted from these data. The system acquires three-band digital imagery and is flown on a Cessna 172 equipped with an AccuPhoto(tm) GPS-based pilot aiding system. The on-board computer independently records GPS data plus the output of a low-cost inertial measurement unit (IMU). Post-mission processing of these data provides high-accuracy camera attitude and position estimates. These results enable automatic generation of geo-registered composite mosaic images. Furthermore, the navigation data and digital imagery can be combined for stereo processing to derive terrain elevations without a classical photogrammetric resection (block triangulation) process.

This paper consists of six main sections, which describe: the creation of single-image views and their image-coordinate world files; the automatic generation of those views for user-selected images; the definition and marking of conjugate points; the computation of three-dimensional point locations; the software and system environment; and a summary with concluding remarks.

Single-Image Views

The first step in the stereo processing of a data set from TASC's airborne imaging system is to bring individual image frames into ArcView for display and interactive selection of conjugate points. In the case of images acquired from three independent video cameras, the three image bands are brought into alignment, and the images are saved in RGB TIFF format (one of the formats accepted as an ArcView image theme). The Stereo Extension aligns the bands according to pre-calibrated angles specifying the relative alignment of the three video cameras. In addition, an ArcView TIFF world file is created for each image. This allows users to interpret the location of graphically-selected point features in the image in terms of coordinates in the imaging system's focal plane.
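
The relationship between the calibrated alignment angles and the pixel shifts applied to each band can be illustrated with a short sketch. The C++ fragment below is only illustrative: it assumes a pure translational offset between cameras and a simple pinhole model, and all names and numbers are hypothetical rather than taken from the extension's actual calibration, which may also account for rotation and scale differences.

    // Sketch: convert pre-calibrated inter-camera boresight angles into integer
    // pixel offsets for band alignment.  Assumes a pure translational offset and
    // a pinhole model; the extension's actual calibration may be more general.
    // All names and numbers are illustrative.
    #include <cmath>
    #include <cstdio>

    struct BandOffset { int dCol, dRow; };

    // angleX/angleY: small boresight misalignment angles (radians) of one video
    // camera relative to the reference camera; f: focal length (mm);
    // pixelPitch: detector pixel size (mm).
    BandOffset OffsetFromAngles(double angleX, double angleY,
                                double f, double pixelPitch)
    {
        BandOffset b;
        b.dCol = static_cast<int>(std::floor(f * std::tan(angleX) / pixelPitch + 0.5));
        b.dRow = static_cast<int>(std::floor(f * std::tan(angleY) / pixelPitch + 0.5));
        return b;
    }

    int main()
    {
        // Hypothetical numbers: 16 mm lens, 0.01 mm pixels, 0.1 degree misalignment.
        const double deg = 3.14159265358979 / 180.0;
        BandOffset b = OffsetFromAngles(0.1 * deg, -0.05 * deg, 16.0, 0.01);
        std::printf("shift band by %d columns, %d rows\n", b.dCol, b.dRow);
        return 0;
    }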

Image-Coordinate World Files

A world file is a plain text file that tells ArcView the scale and origin of a geo-registered image. This enables images to be properly viewed with respect to other images and vector GIS data when displayed as an image theme. Normally, the coordinates used are the map display coordinates (x,y) of the view. However, raw aerial images cannot simply be scaled and shifted to the coordinates of a map projection. Instead, they present a perspective view of the earth and also include the effects of aircraft roll, pitch, and heading, as well as the three-dimensional nature of the terrain. Our world files for airborne images therefore relate the rows and columns of the pixels being viewed to (x,y) coordinates in the imaging system's focal plane, rather than to map coordinates.
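
To make the construction concrete, the following C++ sketch writes a six-parameter TIFF world file that maps pixel (column, row) positions to focal-plane (x,y) coordinates. The six-value layout (A, D, B, E, C, F) is the standard world-file convention; the image dimensions and file name are hypothetical, and the coordinate extents are taken from the example below.

    // Sketch: write a TIFF world file (.tfw) whose six parameters map pixel
    // (column, row) to focal-plane coordinates (x, y) rather than to map
    // coordinates.  World-file convention:  x = A*col + B*row + C,
    //                                       y = D*col + E*row + F,
    // with the six values stored one per line in the order A, D, B, E, C, F,
    // and (C, F) giving the coordinates of the CENTER of the upper-left pixel.
    // Image dimensions and the file name here are illustrative only.
    #include <cstdio>

    int main()
    {
        const int    nCols = 640, nRows = 480;        // hypothetical frame size
        const double xMin = -0.195, xMax = 0.195;     // focal-plane extents
        const double yMin = -0.16,  yMax = 0.16;      // (normalized by focal length)

        const double A = (xMax - xMin) / (nCols - 1); // x step per column
        const double E = -(yMax - yMin) / (nRows - 1);// y step per row (negative: y decreases downward)
        const double C = xMin;                        // x of upper-left pixel center
        const double F = yMax;                        // y of upper-left pixel center

        std::FILE* tfw = std::fopen("frame_1234.tfw", "w");  // hypothetical name
        if (!tfw) return 1;
        std::fprintf(tfw, "%.9f\n%.9f\n%.9f\n%.9f\n%.9f\n%.9f\n",
                     A, 0.0, 0.0, E, C, F);
        std::fclose(tfw);
        return 0;
    }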

Example

Figure 1 presents an ArcView View of a single image frame. It shows a portion of Lilypad Pond and is typical of the imagery acquired over the Huntington Wildlife Forest in the central Adirondacks of New York. The imagery was acquired on a late afternoon in November 1996. The aircraft was traveling west on an east-west flight line; west is toward the top of the image and north is to the right. The image was taken from an altitude of approximately 2000 meters above sea level. The mean terrain height in the image is approximately 500 meters, and the mean ground sampling distance is just under one meter. Other illustrations in this paper utilize data from the same mission.


Figure 1 A Single Image Frame

The image is a combination of data from three video cameras, each with an aperture filter for a different spectral band.

The bright red patches in the image are the crowns of healthy spruce trees and other conifers, all deciduous trees in the area having dropped their leaves some weeks before the image was captured.

Note that the image appears as the only theme in the ArcView View. It has an associated TIFF world file, which provides image coordinates (x,y) in the image plane, normalized by the focal lengths of the lenses (roughly 16 mm) used on the video cameras. These coordinates range from about (x = -0.195, y = 0.16) at the upper left corner of the image to (x = 0.195, y = -0.16) at the lower right corner. Doubling the inverse tangents of these values gives the angular field of view: about 22 degrees from side to side (across the flight track) and about 18 degrees from top to bottom. The area viewed on the ground is roughly 580 by 480 meters.
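
These figures can be checked with a few lines of arithmetic. The short C++ sketch below reproduces the field-of-view and ground-footprint numbers, assuming the roughly 1500 meters of flying height above the mean terrain implied by the altitudes quoted above.

    // Worked check of the field-of-view and footprint figures quoted above.
    // Assumes the 1500 m flying height above mean terrain implied by the text
    // (2000 m altitude minus 500 m mean terrain elevation).
    #include <cmath>
    #include <cstdio>

    int main()
    {
        const double xHalf = 0.195, yHalf = 0.16;  // focal-plane half-extents
        const double H = 2000.0 - 500.0;           // height above mean terrain (m)
        const double rad2deg = 180.0 / 3.14159265358979;

        std::printf("FOV across track: %.1f deg\n", 2.0 * std::atan(xHalf) * rad2deg); // ~22.1
        std::printf("FOV along track:  %.1f deg\n", 2.0 * std::atan(yHalf) * rad2deg); // ~18.2
        std::printf("ground footprint: %.0f x %.0f m\n", 2.0 * H * xHalf, 2.0 * H * yHalf); // ~585 x 480
        return 0;
    }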

Creating Image Views Automatically

As mentioned above, the Stereo Extension supports automatic creation of single-image TIFF files. The choice of which images to process is made by the user via graphical selection of image nadir points. Figure 2 shows nadir points for images from five east-west flight lines over Huntington Forest, with eight having been chosen using ArcView's selection tool. The view shown is projected in Universal Transverse Mercator (UTM) coordinates, Zone 18. (North is up.) The image backdrop is a portion of an automatically generated mosaic of approximately 600 individual images of Huntington Forest and the surrounding area. The area shown is approximately 2.6 by 1.8 kilometers.


Figure 2 Selected Image Nadir Points

In addition to automatically creating a band-aligned TIFF image and the associated image-coordinate world file, the Stereo Extension creates a separate, dedicated ArcView View for each image. It then adds the appropriate image to the view as an image theme. Each view is named with the numerical identifier of the image it displays. Figure 3 illustrates a stack of automatically created views and their image themes.


Figure 3 Automatically Generated Image Views

Defining and Marking Conjugate Points

The Stereo Extension supports interactive selection of conjugate point pairs, triplets, etc. (depending on the number of images overlapping a given point). A custom ArcView tool allows users to graphically select a point feature in any image by clicking on its location in the image. Once a point is selected, the user is prompted to assign a numerical identifier for the point.

The user can work in any order, selecting features in a single image or switching back and forth among overlapping images. The user may choose any positive integer value for the identifier (within the standard range supported by the machine), but must not use a given identifier more than once within a single image. The same identifier must be used when selecting the same feature in two or more images.

The points selected by the user are stored in point shapefiles, one per image. Each point carries the user-provided identifier as an attribute. Because of the image-coordinate world files described above, the point locations are retrieved directly in image coordinates and inserted into the shapefiles without further transformation.
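
Before the three-dimensional computation described below can proceed, observations of the same identifier must be gathered across images. The C++ sketch below shows one simple way such bookkeeping might be done; the record layout and helper names are assumptions for illustration, not the extension's internal format, and reading of the actual shapefiles is omitted.

    // Sketch: group per-image conjugate-point observations by their user-assigned
    // identifier, so that any ID seen in two or more images can be processed.
    #include <map>
    #include <string>
    #include <vector>

    struct Observation {
        std::string imageId;   // which image frame the point was marked in
        double x, y;           // focal-plane coordinates from the shapefile
    };

    typedef std::map<int, std::vector<Observation> > ObservationTable;

    void AddObservation(ObservationTable& table, int pointId,
                        const std::string& imageId, double x, double y)
    {
        Observation obs;
        obs.imageId = imageId;
        obs.x = x;
        obs.y = y;
        table[pointId].push_back(obs);
    }

    // Points observed in at least two images are candidates for 3-D computation.
    std::vector<int> TriangulatablePoints(const ObservationTable& table)
    {
        std::vector<int> ids;
        for (ObservationTable::const_iterator it = table.begin(); it != table.end(); ++it)
            if (it->second.size() >= 2)
                ids.push_back(it->first);
        return ids;
    }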

Example

Figure 4 illustrates a pair of views containing two overlapping images and the conjugate points selected in each. Each view shows the full extent of its image theme. Within the area of overlap, there is a one-to-one correspondence between points in the upper and lower images. One such correspondence is indicated by the white circle in each image; in both images the circle encloses point number 114. To the extent possible, the points were selected to represent features on or near the ground (e.g., distinct shadows, low vegetation, and smaller conifers rather than the crowns of taller conifers that stand well above the ground).


Figure 4 Conjugate Points in Two Image Frames

Although the features surrounding the points appear similar in the two images, their absolute positions (i.e., image coordinates) shift considerably. This is due to the change in perspective as the aircraft flies overhead. The lower image was acquired first, and the direction of flight is toward the top edge of the images. In the lower image, point 114 appears close to the top (leading) edge; in the upper image it shifts toward the image center. This shift in image coordinates, coupled with knowledge of the aircraft position and attitude at the acquisition time of each image, fully constrains the three-dimensional location of point 114.

Going To Three Dimensions

When the user decides there is sufficient information in the point location shapefiles, the Stereo Extension can be used to launch a batch process. This process computes a three-dimensional location (latitude, longitude, and elevation) for each distinct point identifier (as long as that point has been identified in two or more images). The camera positions and attitudes resulting from the post-mission navigation processing, mentioned in the Introduction, are used together with the image coordinates to find best-fit estimates of location.
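
One standard way to carry out such a computation is sketched below in C++: each marked image position, combined with the camera position and attitude, defines a ray in object space, and the point is estimated as the least-squares closest point to all of its rays (the result would then be converted to latitude, longitude, and elevation). This is a generic formulation under simplifying assumptions (a local Cartesian frame, with unit ray directions already rotated into object space), not necessarily the exact algorithm used by the batch process.

    // Sketch: least-squares intersection of rays ("midpoint" method).  Each
    // observation contributes a ray with origin at the camera position and a
    // unit direction obtained by rotating the focal-plane look direction into
    // object space using the camera attitude.  The point minimizing the sum of
    // squared perpendicular distances to all rays satisfies
    //     [ sum (I - d d^T) ] X = sum (I - d d^T) c .
    #include <cstddef>
    #include <vector>

    struct Vec3 { double v[3]; };

    struct Ray {
        Vec3 origin;     // camera position in a local Cartesian frame
        Vec3 direction;  // unit look direction in the same frame
    };

    // Solve the 3x3 system A x = b by Cramer's rule.
    static Vec3 Solve3x3(const double A[3][3], const Vec3& b)
    {
        double det =  A[0][0]*(A[1][1]*A[2][2]-A[1][2]*A[2][1])
                    - A[0][1]*(A[1][0]*A[2][2]-A[1][2]*A[2][0])
                    + A[0][2]*(A[1][0]*A[2][1]-A[1][1]*A[2][0]);
        Vec3 x;
        for (int k = 0; k < 3; ++k) {
            double M[3][3];
            for (int i = 0; i < 3; ++i)
                for (int j = 0; j < 3; ++j)
                    M[i][j] = (j == k) ? b.v[i] : A[i][j];
            double detk =  M[0][0]*(M[1][1]*M[2][2]-M[1][2]*M[2][1])
                         - M[0][1]*(M[1][0]*M[2][2]-M[1][2]*M[2][0])
                         + M[0][2]*(M[1][0]*M[2][1]-M[1][1]*M[2][0]);
            x.v[k] = detk / det;
        }
        return x;
    }

    // Least-squares point closest to all rays (requires two or more rays,
    // not all parallel).
    Vec3 IntersectRays(const std::vector<Ray>& rays)
    {
        double A[3][3] = {{0,0,0},{0,0,0},{0,0,0}};
        Vec3 b = {{0,0,0}};
        for (std::size_t n = 0; n < rays.size(); ++n) {
            const double* d = rays[n].direction.v;
            const double* c = rays[n].origin.v;
            for (int i = 0; i < 3; ++i) {
                for (int j = 0; j < 3; ++j) {
                    double Pij = ((i == j) ? 1.0 : 0.0) - d[i]*d[j];  // I - d d^T
                    A[i][j] += Pij;
                    b.v[i]  += Pij * c[j];
                }
            }
        }
        return Solve3x3(A, b);
    }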

The results are stored in a comma-delimited file and include latitude, longitude, point identifier, and elevation. The file can be imported into an ArcView project as a table, transformed into an event theme, and displayed using native ArcView capabilities. The data can also be further manipulated using Spatial Analyst to interpolate an elevation grid (surface) or create a contour map.

Example

Figure 5 shows results from processing the points identified on the two images shown in Figure 4, plus points on the preceding four images from the same flight line. (This view is in UTM coordinates with north up.) A total of 334 points were marked on the six images; most appeared on three overlapping images rather than just two. The total number of distinct points was 125, and an 'object-space' location (latitude, longitude, elevation) was computed for each one.

The map locations of these object-space points are shown by the 'Point Elevations' event theme, and their elevations are color-coded (as a 'Graduated Color' legend in ArcView). Behind the point symbols is a grayscale-shaded grid theme representing a surface interpolated with Spatial Analyst. The shading changes in discrete steps of 15 meters of elevation. The grid covers the 800-by-520-meter extent bounding the points. The 4-meter-resolution mosaic displayed in Figure 2 is used as a background to provide a frame of reference. Point 114, circled in Figure 4, is again circled here.


Figure 5 3-D Point Locations and Interpolated Surface

The display covers the area from Lilypad Pond on the left to the somewhat higher Arbutus Pond in the upper right. The total local relief is slightly over 100 meters. The two ponds are separated by a ridge, striking northwest-southeast. The presence of this ridge is indicated independently by the late-afternoon shadows on its northeast slope.

Software and System Environment

All the work described in this paper was performed in the Windows NT(r) 4.0 system environment. Most of the interactive stereo processing capability is written in Avenue(tm), but more specialized and compute-intensive photogrammetric operations are performed by native-code executables written in C++. One such instance is the conversion of raw image data to band-aligned TIFF image files. The other is the computation of three-dimensional point locations from the conjugate point shapefiles and the camera stations, which were provided independently by the navigation post-processing system.

For convenience, the ArcView scripts, menu items, and conjugate point selection tool are bundled in a single user-defined ArcView extension. This allows the Stereo functionality to be included in a new ArcView project simply by invoking the Extensions dialog (accessed from the ArcView File menu when the project window is active). User-defined extensions are a feature of ArcView 3.0. The extension capability was a significant convenience for this project, compared with importing and compiling individual scripts and then manually creating the stereo menu, menu items, and point selection tool. For further information about developing user-defined extensions, see Ofenstein (1996).

Summary and Concluding Remarks

Our Stereo Extension provides a convenient way to process imagery and navigation data from TASC's low-cost airborne imaging system in order to estimate point terrain elevations. In addition, it provides a framework in which to explore automated approaches to terrain extraction from aerial imagery.

The extension takes advantage of ArcView world files in a novel way to conveniently extract point locations in image coordinate space. It automatically creates views for user-selected images, provides a tool for selecting and labeling conjugate points in overlapping images, and launches a C++ process to determine three-dimensional object-space locations. The results may be imported as an event theme for display as point-wise elevations or for interpolation using Spatial Analyst.

Possible improvements include checks to help manage the user-assigned conjugate point IDs (e.g., preventing users from repeating an ID within a given image) and additional display capabilities. One possibility is to show the epipolar line segments in one image, given a point location in another image, to guide or constrain the user's search for the conjugate point.

Acknowledgements

The Stereo Extension was developed as part of work performed by TASC for Rome Laboratory/IRRE under contract number F30602-96-C-0036. David Marcus developed most of the three-dimensional location software. Kim Rauenzahn provided editorial input and ArcView guidance. The development of TASC's airborne imaging system was a joint effort of many persons, including Gerry Kinn, Steve Karels, Gary Matchett, and Tom Parr. All registered trademarks and trademarks mentioned herein are the property of their respective manufacturers.

References

Ofenstein, W. Thomas. Implementing a User-Defined ArcView Extension. Presented at the Esri User Conference, 1996.


Author Information

Robert P. Comer
Principal MTS
TASC
55 Walkers Brook Drive
Reading, MA 01867
Telephone: (617)942-2000 x2543, (508)262-0669
Email: rpcomer@tasc.com

W. Thomas Ofenstein
Principal MTS
TASC
55 Walkers Brook Drive
Reading, MA 01867
Telephone: (617)942-2000 x2192, (508)262-0656
Email: wtofenstein@tasc.com