Abstract
A TIN (Triangular Irregular Network) representing surface elevation data is usually one of the largest datasets in a GIS. Large datasets have unique problems which manifest themselves solely because of the size of the data volume. These problems include physical data storage limitations, longer access and display time, and memory allocation issues.
An effective way to address these issues with large
vector datasets in ArcInfo has been to utilize Librarian. One
of Librarian's data management concepts is to break a large dataset
into smaller, rectangular "tile size" chunks.
Unfortunately, a TIN is not a vector coverage and
therefore does not lend itself to direct inclusion and management
by Librarian. However, one can use the Librarian directory structure
if the TIN is broken, like the vector data, into tile size chunks.
At first reading, the CLIP option of the CREATETIN
command sounds like it will provide exactly what is needed: define
a box as a CLIP window and extract just the data required for
each tile. The CLIP option, however, creates an uneven outside
edge (hull) on the resulting TIN, leaving a gap between the clip
box and the data points. The CLIP option does not interpolate
elevations along the clip box, but merely ignores any data points
outside the box.
In order to break the TIN into pieces, elevation
data must be maintained along the tile boundaries to ensure data
continuity across the tiles. Creating cross-section profiles
(one for each tile edge) with the ARCPLOT command SURFACEXSECTION
interpolates elevations at regular intervals along each profile.
These elevations are included as mass points in the input to the
CREATETIN command, allowing the TIN to be clipped correctly.
The purpose of this paper is to present a methodology
using ArcInfo commands and AML procedures to split a TIN into
multiple, rectangular pieces which retain the characteristics
of the contours before the split.
Large Dataset Management Issues
Managing very large datasets poses a set of problems which are not apparent with small ones. These problems arise solely because of the size of the data volume:
* Access time to select specific records is longer.
* Blocks of scratch disk space large enough to copy
or sort the files are hard to find.
* Sorts, computations, and display of the data take longer.
* Updates to a large dataset require that all other users be
"locked out" during a transaction.
* Network traffic is increased.
* Plot files are often too large to plot.
* A single modification requires the entire dataset to be loaded
and saved again.
* More page swapping is required to move chunks of the data in
and out of memory.
The CAD department at Shell Oil, New Orleans, had
an Intergraph design file which contained all the offshore leases
for the Gulf of Mexico. This file was too big to keep on the system
at all times. It was stored on half a dozen 9-track tapes. Every
year after the offshore lease sale, they would back everything
up off the system, load this one file, make all the changes, produce
a series of hardcopy plots, load it back onto tape and remove
it from the system. Not a very efficient way to manage a large
dataset, but they had no other choice.
Any GIS has the potential to grow and incorporate
more and more unwieldy data. Elevation data is one of the most
volume-intensive (and therefore problematic) information sets
in a GIS. The data volume breakdown of the GIS at the Erie County
Water Authority is summarized in table 1. Over 70 percent
of the disk space is devoted to storing the TIN data.
Librarian Data Management Concepts
One data management solution is to utilize Librarian.
Librarian is a software package from Esri designed to manage large
vector datasets in ArcInfo. The primary data management concept
behind Librarian is to break a large dataset into smaller, rectangular
tiles.
Figure 1 adapted from Esri, 1991d, pg. 1-2, 2-3
A library is really a collection of ArcInfo workspaces
(physical directories) which contain coverages or layers of GIS
data. Each directory or workspace represents a tile in the library.
The tile size is determined by the data density: the denser the
data, the smaller the tiles should be. The objective is to break
the data into manageable sizes for editing, analysis, display,
and storage (Esri, 1991d).
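The tiling idea can be illustrated outside ArcInfo. The following is a minimal Python sketch, not part of Librarian, that divides a rectangular extent into a grid of rectangular tiles. The function name make_tiles and the (xmin, ymin, xmax, ymax) tuple layout are assumptions made for illustration only.

```python
import math
from typing import List, Tuple

BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax), hypothetical layout


def make_tiles(extent: BBox, tile_width: float, tile_height: float) -> List[BBox]:
    """Split a rectangular extent into row-major rectangular tiles.

    Tiles in the last column and last row are trimmed to the extent,
    so neighbouring tiles always share an identical common edge.
    """
    if tile_width <= 0 or tile_height <= 0:
        raise ValueError("tile dimensions must be positive")
    xmin, ymin, xmax, ymax = extent
    ncols = math.ceil((xmax - xmin) / tile_width)
    nrows = math.ceil((ymax - ymin) / tile_height)
    tiles = []
    for row in range(nrows):
        for col in range(ncols):
            x0 = xmin + col * tile_width
            y0 = ymin + row * tile_height
            tiles.append((x0, y0,
                          min(x0 + tile_width, xmax),
                          min(y0 + tile_height, ymax)))
    return tiles
```

Each returned box could then serve as the coordinate window for one tile directory. Where the data density varies, a smaller tile size would be chosen, as described above.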
TIN Specific Problems
Not Vector Coverage
Given Librarian's concept of data tiling, the answer
would appear to be simple: load the TIN into a map library. Even
though TINs share characteristics of vector coverages (lines,
nodes and topology), a TIN is not a vector coverage and cannot
be managed directly by Librarian. However, because a map library
is a physical organization of directories and subdirectories
(figure 1), the TIN, if broken correctly, can be stored in the
library directory structure.
CREATETIN CLIP Doesn't...
The problem lies in correctly splitting the TIN into
tile size chunks. The ARC CREATETIN command (Esri, 1991a) has
a CLIP option that would appear to provide the needed functionality.
Usage: CREATETIN <out_tin> {weed_tolerance} {proximal_tolerance}
{z_factor} {bnd_cover | xmin ymin xmax ymax}
The user can provide either a bounding polygon coverage
or a min-max coordinate window. The CLIP option, however, creates
an uneven outside edge (hull) on the resulting TIN, leaving a gap
between the clip box and the data points. The CLIP option does
not interpolate elevations along the clip box, but merely ignores
any data points outside the box. The screen images in the figures
below show a progression of data errors introduced by the CREATETIN
command.
Figure 2 (below) illustrates the behavior of the
contours before the TIN was divided. The horizontal line represents
the tile boundary where the split will take place.
Figure 3. A single TIN was divided using two rectangular
clip windows which shared a common edge. Note the gap along the
clip line edge.
Although the data gap seems most severe at the corners of the clip box in figure 3 (above), the less visible gap between the northern and southern TINs in figure 4 (below) clearly disrupts the contour patterns.
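The origin of the gap can be reproduced with a few points. The sketch below is plain Python, not CREATETIN itself. It assumes that clipping simply discards points outside the window, as described above, and it measures the gap crudely as the distance from each window edge to the extent of the retained points. The names clip_points and edge_gaps are hypothetical.

```python
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]        # (x, y, z)
BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def clip_points(points: List[Point], box: BBox) -> List[Point]:
    """Keep only the points inside or on the clip window; nothing is interpolated."""
    xmin, ymin, xmax, ymax = box
    return [p for p in points if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax]


def edge_gaps(points: List[Point], box: BBox) -> Dict[str, float]:
    """Distance from each window edge to the extent of the retained points."""
    if not points:
        raise ValueError("no points inside the clip window")
    xmin, ymin, xmax, ymax = box
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return {
        "west": min(xs) - xmin,
        "east": xmax - max(xs),
        "south": min(ys) - ymin,
        "north": ymax - max(ys),
    }
```

Unless a data point happens to fall exactly on a window edge, every gap is greater than zero, so a TIN built from the retained points cannot reach the edge of the window.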
Elevation Interpolation
In order to avoid data gaps and disruption of the
contours, additional elevation points need to be inserted along
the clip edge. This is not a direct command option, but a series
of steps which can be defined as a procedure and encoded as a
macro.
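To make the idea of an interpolated elevation concrete, the following Python sketch computes the elevation at a location inside a single TIN triangle. It assumes simple linear interpolation across the triangle face (barycentric weights); this is an illustrative assumption, not a statement of how ArcInfo interpolates. The function name interpolate_z is hypothetical.

```python
from typing import Tuple

Vertex = Tuple[float, float, float]  # (x, y, z)


def interpolate_z(triangle: Tuple[Vertex, Vertex, Vertex], x: float, y: float) -> float:
    """Linearly interpolate z at (x, y) inside or on the edge of a triangle."""
    (x1, y1, z1), (x2, y2, z2), (x3, y3, z3) = triangle
    det = (y2 - y3) * (x1 - x3) + (x3 - x2) * (y1 - y3)
    if det == 0:
        raise ValueError("degenerate triangle")
    # Barycentric weights of the query location with respect to each vertex.
    w1 = ((y2 - y3) * (x - x3) + (x3 - x2) * (y - y3)) / det
    w2 = ((y3 - y1) * (x - x3) + (x1 - x3) * (y - y3)) / det
    w3 = 1.0 - w1 - w2
    if min(w1, w2, w3) < -1e-9:
        raise ValueError("location lies outside the triangle")
    return w1 * z1 + w2 * z2 + w3 * z3
```

A point where the clip edge crosses a triangle receives an elevation consistent with the surface on both sides of the edge, which is the property the inserted points need.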
The ARCPLOT command SURFACEXSECTION (Esri, 1991b)
allows the user to define a profile interactively or by passing
a set of xy coordinate pairs (which in this case form a rectangular
clip window). The user also defines the sampling interval for
interpolation points along the profile.
Usage: SURFACEXSECTION <line_cover | * | xy...xy> <surface_id>
<profile_info_file> {sample_distance}
The output consists of X, Y, and Z values stored
in an INFO file. A routine, clip_xsection.aml, automates the creation
of profile data.
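The xy locations sampled along a rectangular clip window can be sketched as follows. This is a Python stand-in for illustration and does not reproduce clip_xsection.aml or SURFACEXSECTION. It assumes each edge is divided into equal steps no longer than the requested spacing, which may differ from how SURFACEXSECTION applies its sample distance. The elevations themselves would come from the surface and are not computed here. The names sample_edge and sample_rectangle are hypothetical.

```python
import math
from typing import List, Tuple

XY = Tuple[float, float]
BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def sample_edge(start: XY, end: XY, spacing: float) -> List[XY]:
    """Evenly spaced locations from start up to, but not including, end."""
    (x0, y0), (x1, y1) = start, end
    length = math.hypot(x1 - x0, y1 - y0)
    steps = max(1, math.ceil(length / spacing))
    return [(x0 + (x1 - x0) * i / steps, y0 + (y1 - y0) * i / steps)
            for i in range(steps)]


def sample_rectangle(box: BBox, spacing: float) -> List[XY]:
    """Sample locations around the window perimeter, each corner included once."""
    if spacing <= 0:
        raise ValueError("spacing must be positive")
    xmin, ymin, xmax, ymax = box
    corners = [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]
    locations: List[XY] = []
    for i in range(4):
        locations.extend(sample_edge(corners[i], corners[(i + 1) % 4], spacing))
    return locations
```

Including the four corners explicitly matters because, as figure 3 shows, the data gap is most visible at the corners of the clip box.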
The CREATETIN command cannot read this file directly,
so an intermediate routine is required to unload the XYZ values
from INFO into a TIN generate file (xsection_to_gen_points.aml).
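The unloading step amounts to writing one record per profile point. The Python sketch below is not the xsection_to_gen_points.aml routine. It assumes a point generate layout of one "id,x,y,z" record per line followed by a closing END line, and that layout should be checked against the CREATETIN documentation before use. The function name format_generate_points and the three-decimal precision are also assumptions.

```python
from typing import Iterable, Tuple

Point = Tuple[float, float, float]  # (x, y, z)


def format_generate_points(points: Iterable[Point], start_id: int = 1) -> str:
    """Format XYZ values as point records, one per line, closed by END.

    The "id,x,y,z" layout is an assumption for illustration only.
    """
    lines = [f"{point_id},{x:.3f},{y:.3f},{z:.3f}"
             for point_id, (x, y, z) in enumerate(points, start=start_id)]
    lines.append("END")
    return "\n".join(lines) + "\n"
```

The returned text can be written to a file with an ordinary open(path, "w") call. Sequential ids are sufficient here because the points carry no attributes other than their elevation.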
How the profile elevation points are handled is important.
In general, TINs are defined by two basic data types: mass points
and breaklines (Esri, 1991c, pg. 4-2).
Mass points
Are the basic elements used to build a TIN. Each
point has an xy location and a Z value, and every point is treated
with equal importance in terms of defining the surface. Mass points
may also be used to represent the nodes and vertices of line and
polygon features.
Breaklines
Represent linear features used to define and control
surface smoothness and abrupt changes in slope (e.g. the surface
of a lake). Z values along a breakline can be constant or vary
over the length. Breaklines are categorized as either hard or
soft. Soft breaklines maintain linear features in the TIN, but
do not influence the smoothness of the surface. Hard breaklines
define linear features and define interruptions in the surface
smoothness.
The TIN manual (Esri, 1991c, pg. 4-2) states that soft breaklines should be "used to represent study area boundaries, survey lines, and other linear features which do not influence surface smoothness." Nevertheless, my experience during the course of this project has shown that including the clip line profile elevations as either hard or soft breaklines severely alters the behavior of the contours. Figure 5, below, illustrates the disruption of the contours by the inclusion of profile elevations as breakline data.
By extracting the XYZ values from the SURFACEXSECTION
INFO file and creating a TIN point generate file, the profile
elevations can be loaded as mass points using CREATETIN. Figure 6
below shows the altered TIN and the correct contour behavior.
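Why shared profile points keep neighbouring tiles consistent can be shown with a small Python sketch. It is an illustration under simplifying assumptions, not the CREATETIN procedure: each tile receives the original mass points plus the profile points that fall inside or on its window, and no triangulation is performed. The name tile_input is hypothetical.

```python
from typing import Iterable, List, Tuple

Point = Tuple[float, float, float]        # (x, y, z)
BBox = Tuple[float, float, float, float]  # (xmin, ymin, xmax, ymax)


def tile_input(mass_points: Iterable[Point],
               profile_points: Iterable[Point],
               box: BBox) -> List[Point]:
    """Points for one tile: mass points plus profile points, inside or on the window."""
    xmin, ymin, xmax, ymax = box
    merged = set(mass_points) | set(profile_points)
    return sorted(p for p in merged
                  if xmin <= p[0] <= xmax and ymin <= p[1] <= ymax)


# Two tiles sharing the edge y = 50, with profile points interpolated along it.
mass = [(20.0, 30.0, 105.0), (70.0, 45.0, 112.0), (40.0, 80.0, 121.0)]
profile = [(0.0, 50.0, 108.0), (50.0, 50.0, 113.5), (100.0, 50.0, 117.0)]
south = tile_input(mass, profile, (0.0, 0.0, 100.0, 50.0))
north = tile_input(mass, profile, (0.0, 50.0, 100.0, 100.0))
shared = set(south) & set(north)
```

In this example the points common to both tiles are exactly the profile points, so both tiles carry identical elevations along the shared edge and each tile's data reaches its window boundary there.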
Clip Results
The remaining figures (7 and 8 below) show the successful results of the clip and the consistent nature of the contours even at the TIN edge.
Conclusion
Large datasets require special attention to limit
their impact on GIS performance and data management.
The Librarian data management concept of breaking
up large datasets with a wide geographic extent into smaller,
manageable tiles can be applied to TINs, but not directly.
To use the Library directory structure, the TIN must
be divided in such a way that gaps or breaks in the contours are
not introduced.
Additional elevation points must be interpolated
along the clip line using SURFACEXSECTION.
The profile data must be included as MASS POINTS
for CREATETIN to handle the CLIP option correctly.
References
Esri, 1991a. ARC Command References. ArcInfo Users Guide; Version 6.0, Environmental Systems Research Institute, Inc., 380 New York Street, Redlands, CA 92373.
Esri, 1991b. ARCPLOT Command References. ArcInfo Users Guide; Version 6.0, Environmental Systems Research Institute, Inc., 380 New York Street, Redlands, CA 92373.
Esri, 1991c. Surface Modeling with TIN. ArcInfo Users Guide; Version 6.0, Environmental Systems Research Institute, Inc., 380 New York Street, Redlands, CA 92373.
Esri, 1991d. Using Map Libraries. ArcInfo Users Guide; Version 6.0, Environmental Systems Research Institute, Inc., 380 New York Street, Redlands, CA 92373.
Graham S. Hayes
President/Owner
GIS Resource Group, Inc.
21 S. Grove Street, Suite 130
East Aurora, New York 14052
Phone: (716) 655-5541
Fax: (716) 655-5540
E-mail: graham@gisrg.com