Jeff Cargin, John Dwyer
The process for locating a site for Pennsylvania's Low-Level Radioactive Waste Disposal Facility (PALLRWDF) requires the use of a wide-ranging array of publicly available information obtained from a variety of agencies at the Federal, State, County and Municipal levels. In addition to data collection, the process also requires the dissemination of project information to the public. Although the earliest stages of the Project were performed at smaller scales, the standard base scale for the majority of the Project has been 1:24,000. This paper will discuss a number of the problems solved and lessons learned in using a Geographic Information System (GIS) to manage the spatial database necessary to perform a state-wide site screening at such a large scale. Among these are included: locating the agency that is the true source of the data being used; identifying and correcting mis-located data in an agency's database without validating the entire data set; processing very large GIS coverages; and making project information usable, understandable and accessible to the public.
1.0 INTRODUCTION
Federal law requires that each state be responsible for ensuring that low-level radioactive waste generated within its borders is disposed of safely. A state may do this individually, or may combine its efforts with those of other states and form a compact. Pennsylvania, Maryland, West Virginia and Delaware have joined to form the Appalachian Compact, with Pennsylvania agreeing to site the first low-level radioactive waste disposal facility for the Compact. The screening process, designed to locate a suitable site for the facility, is now underway. Screening is divided into two major portions: Disqualification (determining where in the Commonwealth the facility cannot be located), followed by Evaluation (determining where in the Commonwealth the facility should be located). The Disqualification portion of the process, which was accomplished in three stages (statewide, regional and local), has been completed, and the first portion of Evaluation is underway.
The state-wide screening process for locating a site for Pennsylvania's Low-Level Radioactive Waste Disposal Facility (PALLRWDF) requires the use of a wide-ranging array of publicly available information obtained from a variety of agencies at the Federal, State, County and Municipal levels. Although the earliest stages of Disqualification were performed at smaller scales, the standard base scale for the Stage Three Disqualification and Evaluation is 1:24,000. In addition to data collection, the process also requires the dissemination of project information to the public. This paper concentrates on the Disqualification portion of the siting process to discuss a number of the problems solved and lessons learned in using a Geographic Information System (GIS) to manage the spatial database necessary to perform a state-wide site screening at such a large scale.
These include: locating the agency that is the true source of the data being used; identifying and correcting mis-located data in an agency's database without validating the entire data set; processing very large GIS coverages; and making project information usable, understandable and accessible to the public.
2.0 DATA SOURCES
The siting process for the Pennsylvania Low-Level Radioactive Waste Disposal Facility has been an incredible learning experience from the standpoint of GIS involvement. The data requirements for a statewide screening at a scale of 1:24,000 are onerous, involving a wide range of topics and a similarly wide range of agencies. The necessary data is available somewhere in one form or another. The trick is identifying the form and the source.
2.1 Who's got the Data?
An important consideration in gathering data for use on a large, multi-year project such as the PALLRWDF Project is: what is the original source agency for each data set used? Although certain data sets are available from many agencies, and while it may be very convenient to obtain some data from an agency other than the source, there are still important reasons for identifying the primary source agency for each data set, as borne out by experience on this project.
Reason number one is: the source agency actually generated the data, and so is most likely to be aware of any limitations associated with a particular data set. This information is especially important in the planning stages of a project. Many of our first estimates of the amount and quality of data available for this siting project came from agencies once or twice removed from the source, and tended to overestimate the efficacy or availability of a given data set.
For example, at the start of this project a number of years ago, conversations with Soil Conservation Service (SCS) personnel indicated that eight Pennsylvania counties had Soil Survey Geographic Database (SSURGO) digital soils information available, with more to follow shortly. However, further conversations with those at the SCS responsible for the digital files indicated that this was not the case. In fact, digital linework for eight counties was released only in the fall of 1994, some four years after the initial conversations, and none of it is official SSURGO data. The lesson? It may take four or five phone calls, but be persistent: the path eventually leads back to the source.
A second reason for contacting the source agency is that it is aware of the currency of the data, and can provide accurate information on the availability of updates. For example, the Federal Emergency Management Agency (FEMA) provides a subscription service for Flood Insurance Rate Maps (FIRM) showing 100-year floodplains (which are disqualified under the PALLRWDF siting process), allowing the Project Team to receive updated information on a regular basis. Available updated information for the different data layers has been used to set up a schedule for continued disqualification, to be carried out in parallel with the Evaluation portion of the process. This dynamic approach to disqualification helps in avoiding unpleasant surprises as the search narrows to three potentially suitable sites.
A third reason for contacting the source agency is that it is also most likely to be aware of any changes in regulation or policy that would affect the data. A recent example involves the disqualification of Exceptional Value Wetlands, which are in part defined as wetlands containing threatened or endangered (T&E) species. Recent changes have been made to the Pennsylvania list of T&E species: some species have been added to the list, and others have been removed.
This, in turn, removes some wetlands from EV Wetland consideration, while adding others. LAW has been in continuous contact with both the Nature Conservancy and the Western Pennsylvania Conservancy, as well as with the State Bureau of Forestry, which maintains the Pennsylvania Natural Diversity Inventory. As a result, we knew of the impending change well in advance, and were able to incorporate the changes to the EV Wetlands Disqualification coverage in a timely manner.
LAW has made an effort in each case to contact source agencies for each of the thirty-five data layers involved in the three stages of Disqualification. These source agencies are listed in Table 1. Maintaining a good working relationship with the source agencies over the life of the project is also an important step in maintaining the currency of the database. LAW is fortunate in that the information used for this project is considered publicly available information. There is therefore no obstacle to providing the source agencies with any of the coverages we have developed for this project using their data. This value-added data exchange has allowed us to use original source maps from, and in return provide boundary coverages to, a variety of state agencies, such as the Bureau of Forestry, Bureau of Parks, and the Pa. Game Commission.
TABLE 1
Source Agencies By Disqualifying Criteria/Data Layer
Disqualifying (DQ) Criterion: Source Agency
DQ-01 Masking Facilities: US Environmental Protection Agency; US Nuclear Regulatory Commission; DER Bur. of Radiation Protection; DER Bur. of Solid Waste Management; DER Bur. of Water Quality Management
DQ-02 Active Faults: PAGS
DQ-03 Geologic Stability: PAGS; US Soil Conservation Service
DQ-04 Slope: US Geological Survey
DQ-05 Carbonate Lithology: PAGS; DER Bur. of Land and Water Conservation; DER Bur. of Stormwater Management; USGS Water Resources Division; County Planning Commissions; Pa. Dept. of Transportation; Township Engineers/Managers; County and Municipality Roads Dept.s
DQ-07 Coastal Floodplains: Federal Emergency Management Agency; DER Bur. of Water Resources
DQ-08 Exceptional Value Wetlands: The Nature Conservancy; Western Pa. Conservancy; DER Bur. of Forestry; US Dept. of the Int., Fish and Wildlife; Pa. Fish and Boat Commission
DQ-09 Dam Inundation: DER Bur. of Dams and Waterway Mgment.; US Army Corps of Engineers
DQ-10 Public Water Supply: DER Bur. of Water Supply and Community; DER District Offices
DQ-11 Surface Water Intakes: DER Bur. of Water Supply and Community; DER District Offices
DQ-12 Wildlife Area Boundaries
  - National Park Systems: US Dept. of the Int., Nat. Park Service
  - National Forests: US Dept. of Agriculture, Forest Service
  - National Wildlife Refuges: US Dept. of the Int., Fish and Wildlife
  - National Fish Hatcheries: US Dept. of the Int., Fish and Wildlife
  - National Wild and Scenic Rivers: US Dept. of the Int., Nat. Park Service
  - National Wilderness Preservation System: US Dept. of the Int., Fish and Wildlife; US Dept. of Agriculture, Forest Service
  - Pa. Wild and Scenic Rivers: DER Bur. of Water Resources Management
  - Pa. Natural and Wild Areas: DER Bur. of Forestry
DQ-13 State Forests and Game Lands
  - State Forests: DER Bur. of Forestry
  - State Game Lands: Pa. Game Commission
DQ-14 Watersheds: DER Bur. of Water Quality Management
DQ-15 Oil and Gas Areas
  - Gas Storage Areas: DER Bur. of Oil and Gas Management
  - Oil and Gas Fields: PAGS
  - Oil and Gas Wells: PAGS; DER Bur. of Oil and Gas Management
DQ-16 Agricultural Land
  - Agricultural Security Areas: Pa. Dept. of Agriculture, Bur. of Farm; County Ag. Land Preservation Boards; County Planning Commissions; County Assessor or Mapping Offices; County Recorder of Deeds
  - Class I Soils: US Soil Conservation Service
DQ-17 Mines: DER Bur. of Mining and Reclamation; DER Bur. of Abandoned Mine Reclamation; PAGS; US Geological Survey; US Dept. of the Int., Bur. of Mines
DQ-18 Protected Area Boundaries
  - National Natural Landmarks: US Dept. of the Int., Nat. Park Service
  - National System of Trails: US Dept. of the Int., Nat. Park Service
  - National Historic Register: US Dept. of the Int., Nat. Park Service
  - Pa. State Park Systems: DER Bur. of State Parks
  - County Park Systems: County Governments
  - Municipal Park Systems: Pa. Dept. of Community Affairs; Municipal Governments
  - Pa. Historic & Museum Commission Lands: Pa. Historic and Museum Commission
NOTE: PAGS = Pa. DER Bureau of Topographic and Geologic Survey
2.2 What Format is the Data in, How Good is it, and When can I Have it?
Once the source agency for a given data set has been located, all your data questions can be answered. However, you should be prepared not to like all the answers you receive. This is a period of transition in the move to a digital universe, and many digital data sets are "almost there", or not there at all. During the past three years of this project, the majority of the information received from the source agencies has been in the form of maps. Digital data sets have for the most part involved point locations for given features.
In requesting data, there are a number of considerations that must be addressed. The first is: what is the format of the data? While there is a tendency to react more favorably to digital data than to maps, in this time of transition we have found that such a reaction can be premature. One of our first experiences of digital disappointment involved the PA Bureau of Forestry. When we first contacted them for information on the State Forest boundaries (State Forests are disqualified features), we were informed that the National Forestry Service (NFS) had just digitized the State's maps, and that we were welcome to use these digital files. However, on close examination, these NFS files had not been digitized for the same purpose nor to the same tolerance required by our project.
As a result, it was easier to simply re-digitize the Bureau of Forestry maps than to correct the digital file.
This example brings us to the next consideration, locational control. Just as locational control is not always available digitally, it is not always available from maps. For example, during Stage Three Disqualification, a 100-year floodplain disqualification coverage was developed using FIRM maps. However, many of these maps are not georeferenced, and direct digitization was not appropriate. At a statewide scale, neither were detailed flood studies. As a compromise, the 100-year floodplains were delineated onto USGS 7.5-minute topographic quadrangle maps, using visual control points (e.g. road intersections) and base flood elevation lines if available. These delineated floodplain boundaries were then digitized to produce the 100-year floodplain coverage. This coverage carries a caveat: it is meant to represent the 100-year FEMA floodplains only for the purpose of siting the PALLRW Disposal Facility; it is not a digital FIRM map.
A difficulty that arises when locational problems are found in publicly available data on a project such as this is: how does one correct the problem without the need to validate the entire data set? One approach is given above: carefully prescribe the use of the data set, spelling out its limitations. Another approach, the one we follow in most cases, is to bring the problem to the attention of the source agency, and let them make the correction.
A case in point on this project was Public Water Supplies. The PALLRW Disposal Facility may not be sited within 1/2 mile of a well or spring used as a Public Water Supply. In order to produce a Public Water Supply disqualification coverage, we contacted the Department of Environmental Resources (DER) Bureau of Water Supply and Community Health and obtained a digital file containing point locations and identification numbers (IDs) for the Public Water Supplies across the Commonwealth.
Upon examination, we discovered that many of the Public Water Supplies lacked location information. In a cooperative effort, the sanitarians for the various DER districts provided the missing locations. (This is an example of a hole in a data set; data sets must always be examined for, and the source agency questioned about, such weak links.) A point coverage was made of the Public Water Supply data, and the coverage was examined for errors. On the first pass, several Public Water Supplies were located outside the Commonwealth, and were obvious mis-locations. The Public Water Supply locations were then compared to the county code that was part of their ID. Again, it was found that there were wells that were not located in the county indicated by their ID. These sets of mis-located wells were again submitted to the Bureau of Water Supply and Community Health, and the sanitarians in the appropriate DER districts provided location corrections. The result of this exercise is that we were able to use a corrected data set that was publicly available and still provided by the source agency.
2.3 A Cautionary Tale
In some cases the search for data sources can be compared to the search for El Dorado: at each village (department) you come to, the villagers (technical personnel) and village elders (department heads) know of the golden city (digital map/database), and although they themselves have never seen it, they know of someone in the next village (some other department) who has. However, at the end of the quest, after many such encounters in many such villages, what was seen from a distance as a golden city turns out, when viewed up close, to be simply the sun shining on adobe huts (you can fill in your favorite analogy here: paper maps/lists that never were made digital; AutoCAD drawings in table inches with indifferent layering and linework; digital data that "can't be officially released until it undergoes full agency review"; etc.).
The two-fold point of this little tale is that, before you plan an expedition (or project) where you're relying on outside data sources: a) make sure you're talking to the primary source of the data; and b) find out in detail how good the data is.
3.0 USING LARGE DATA SETS
A siting project as large as the PALLRW disposal facility siting project, involving a state-wide screening at a scale of 1:24,000, can put a severe strain on available GIS resources. Many of the problems faced and surmounted on this project have derived from the sheer volume of data. In this section we review some of these problems and solutions.
3.1 Keeping Track of Large Data Sets
While the project team attempts to use data that is already available in digital format whenever possible, the primary means of entering spatial data into the GIS is manual digitizing. It is estimated that over 33,000 individual source maps have been digitized to date on the project. At one point, nearly 60 digitizers in 13 different offices across the country were digitizing maps for input into the GIS. Each of these digitized maps additionally has an associated checkplot and coverage tracking forms. In order to efficiently handle such large volumes of work and paper, specific Quality Assurance (QA) procedures were developed, not only for performing the digitizing work itself, but also for the preparation and transmittal of maps to and from the digitizing staff, and for handling the inevitable errors and omissions that crop up after the work has been processed. It is these procedures that make processing such large amounts of data in a timely fashion possible.
A great deal of effort was put into developing a set of procedures that could be used to standardize digitizing tasks and allow for efficient digitizing while still meeting the project's quality standards. Two digitizing procedures were developed, one for ArcInfo and one for AutoCAD.
The QA procedures ensured that the digital files received from one digitizer would be identical in format to those received from another, and allowed the conversion process from digitized files to completed coverages to be automated using AML.
Considering the number of maps being digitized, and the amount of associated paperwork, it is clear that the process for preparing and transmitting source maps is as crucial to project quality control as digitizing procedures (it always helps to know that you got back as many maps as you sent out, and that all of them have been properly processed). Under the current data tracking procedure, as the delineators complete the identification of disqualified features, the delineated maps are grouped into a logical collection of maps referred to as a batch. Once a group of maps is placed into a batch, they remain together as a batch for the remainder of the project. Each batch is assigned a unique identification number, and a tracking form is associated with each batch to track its progress through the digitizing and coverage building process. A batch checkplot also helps to ensure that the entire batch has been properly processed.
The automation of the digitizing and coverage building process, combined with the detailed data tracking provided by the batch processing, has enabled the project team to efficiently manage extremely large volumes of maps and optimize staff resources. This efficient approach to data management has proven to be a key component of the siting process.
3.2 Avoiding Unnecessary Work
One method of dealing with large numbers of large data sets in a timely manner is to use data sets that are already in the system to reduce the level of effort involved for those that have yet to be entered. This philosophy has been followed wherever possible on this project. The most obvious use for previously entered data layers is to tell us where not to look for data.
While some of the data collected during Disqualification are represented as points or lines, the vast majority are area features represented as polygons. A key characteristic of the disqualification process is that once an area has been disqualified by any given feature, it remains disqualified. This means that once an area has been disqualified by a particular feature, that area no longer needs to be investigated for the presence of other disqualifying features from other criteria. The GIS is perfectly suited to take advantage of this component of the project.
Because the GIS has the ability to produce plots at any scale, LAW developed a technique early in the disqualification process to produce maps of previously identified disqualified areas which could be overlaid onto a source map which was being delineated for a new disqualifying feature. These plots, referred to as DQ Overlays, allowed staff members delineating new features to determine which areas had already been identified as disqualified, and therefore did not require delineation for the new feature. While quad-scale plots are the most common DQ Overlays produced, the GIS is often used to produce overlays at a number of different scales, such as county or municipality scales.
Many times, the exact scale or projection of the source map cannot be determined, or the source map is not georeferenced, which makes producing a matching DQ overlay difficult. To circumvent this problem, it is often possible to add geographic references, such as roads, streams or political boundaries, to the DQ plot. While not actually used as overlays, these plots help the delineators to identify which areas on the source map have been previously identified as disqualified, and can therefore be avoided during the current delineation effort.
A second way of using previously entered data layers is to combine several of them to produce a new data layer, thereby saving effort in delineation and digitizing.
An example of this is the development of the Exceptional Value (EV) Wetland data layer. EV Wetlands are wetlands that intersect or are wholly contained within certain features specified in the regulations, such as National Natural Landmarks or Threatened or Endangered Species. The PALLRW disposal facility may not be placed within 1/2 mile of an EV Wetland. Digital data were already available for these specified features, most of which were Disqualification features for the purposes of the PALLRW disposal facility siting. National Wetland Inventory (NWI) maps were used together with plots of the specified features to identify and delineate the EV Wetlands, which were then digitized and buffered by 1/2 mile. However, for approximately one-third of the Commonwealth, digital NWI maps were available. An ArcInfo AML was therefore developed to take the digital NWI maps, use the specified feature coverages to select out the EV Wetlands, and then buffer them by 1/2 mile and combine them to produce an EV Wetlands coverage. This use of previously available data turned months of labor-intensive effort into weeks of CPU time.
3.3 Putting It All Together
One of the more challenging aspects of the disqualification process was the preparation of a single composite disqualification coverage that incorporated each of the individual DQ data layers. From a theoretical point of view, this would appear to be a relatively routine operation - the polygon coverages would be unioned together in pairs until eventually all disqualified areas were unioned together into one composite coverage. However, due to the immense size of some of the individual data layer coverages and the size limitations inherent to ArcInfo Version 6, which was the version in place at the time, this became a rather involved task.
The initial unioning scheme for combining the 30 disqualification coverages with polygon features is illustrated in Figure 1; however, because of the 10,000 arcs-per-polygon limitation present in the software at that time, it quickly became apparent that this method of joining the coverages would not be feasible. Therefore, as an alternative, the coverages were organized into groups whose features overlapped substantially. The groups were selected strategically to allow the composites to be dissolved at various stages of the unioning process, which reduced the size of the coverage to a manageable level.
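The grouping idea can be illustrated with a toy sketch in Python. This is not the ArcInfo workflow used on the project: it assumes each coverage is reduced to a list of grid cells tagged with a layer name, and the layer names, cell values and function names are all hypothetical. It only shows why dissolving after each group keeps the working composite smaller than carrying every record through a single chain of unions.

```python
def make_layer(name, cells):
    """Build a toy coverage: one (cell, layer_name) record per disqualified grid cell."""
    return [(cell, name) for cell in cells]


def union(a, b):
    """Overlay two coverages, keeping every record from both inputs."""
    return a + b


def dissolve(coverage):
    """Drop the layer attribute and keep a single record per disqualified cell."""
    cells = sorted({cell for cell, _ in coverage})
    return [(cell, "DQ") for cell in cells]


def grouped_composite(groups):
    """Union each group, dissolve it, then fold it into the running composite."""
    composite = []
    for group in groups:
        merged = []
        for coverage in group:
            merged = union(merged, coverage)
        composite = dissolve(union(composite, dissolve(merged)))
    return composite


# Hypothetical layers; the first two overlap each other, as do the last two.
slope = make_layer("slope", [(0, 0), (0, 1), (1, 1)])
mines = make_layer("mines", [(0, 1), (1, 1), (2, 2)])
parks = make_layer("parks", [(5, 5), (5, 6)])
forests = make_layer("forests", [(5, 5), (6, 6)])

# Naive approach: union everything and never dissolve.
naive = union(union(union(slope, mines), parks), forests)

# Grouped approach: overlapping layers are grouped and dissolved early.
final = grouped_composite([[slope, mines], [parks, forests]])

print(len(naive), len(final))  # 10 7
```

Both results cover the same cells; the grouped version simply never holds more records than it needs, which is the property the grouping and dissolving strategy relies on.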
Even with the careful ordering of unions and successful dissolves, however, the dissolved group coverages still proved to be too large to combine due to size limitations of the software. Therefore, the coverages were subdivided into six regional coverages corresponding to the six Pennsylvania DER region boundaries. This subdivision allowed the coverages to dissolve properly and become as small as possible; however, the dissolved regional coverages still contained polygons too large to combine back into an entire statewide coverage. To circumvent this problem, 7.5-minute quadrangle boundaries were unioned into each of the dissolved regional coverages. This caused the large polygons in the dissolved coverages to be split into smaller, quad-sized polygons. Once this step was performed it was possible to join the six regional coverages together into a single composite coverage. The final process used to combine the 30 coverages is illustrated in Figure 2. Even with this technique, however, the size of the final Stage Three disqualification coverage was well over 100 MB, with a total of over 100,000 polygons.
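The effect of unioning quadrangle boundaries into a coverage can be sketched in a few lines of Python. The sketch below is only an illustration under simplifying assumptions: a disqualified area is reduced to an axis-aligned rectangle in decimal degrees, quadrangles are taken as a regular 7.5-minute (0.125 degree) grid, and the function names and coordinates are hypothetical. Real coverages contain arbitrary polygons and would need a proper overlay operation.

```python
import math

QUAD = 0.125  # 7.5 minutes of arc expressed in decimal degrees


def split_by_quads(rect, size=QUAD):
    """Split a rectangle (xmin, ymin, xmax, ymax) along a regular quadrangle grid."""
    xmin, ymin, xmax, ymax = rect
    pieces = []
    for col in range(math.floor(xmin / size), math.ceil(xmax / size)):
        for row in range(math.floor(ymin / size), math.ceil(ymax / size)):
            # Clip the rectangle to the current grid cell.
            piece = (
                max(xmin, col * size),
                max(ymin, row * size),
                min(xmax, (col + 1) * size),
                min(ymax, (row + 1) * size),
            )
            if piece[0] < piece[2] and piece[1] < piece[3]:
                pieces.append(piece)
    return pieces


def area(rect):
    """Area of a rectangle in square degrees (adequate for this illustration)."""
    xmin, ymin, xmax, ymax = rect
    return (xmax - xmin) * (ymax - ymin)


# Hypothetical large disqualified area that straddles several quadrangles.
big = (-77.3125, 40.0625, -76.9375, 40.3125)
pieces = split_by_quads(big)

print(len(pieces))  # 12
print(math.isclose(sum(area(p) for p in pieces), area(big)))  # True
```

The total disqualified area is unchanged, but no single piece is larger than one quadrangle, so a per-polygon size limit is much less likely to be reached when the regional pieces are joined.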
As a final note, we mention that, while the 10,000 arcs-per-polygon limitation has since been removed in ArcInfo Version 7, ostensibly removing our combinatory problems at the same time, the unioning of such large coverages can still be difficult due to significant disk space requirements. When problems such as this arise, the methods discussed above can still be effective in combining large coverages.
4.0 GETTING DATA TO THE PUBLIC
There is a two-fold purpose to the distribution of information from this Project to the public. The first purpose is to keep the public informed on the progress of the siting process itself. The GIS, with its ability to easily create themed maps, plays an essential role in this effort. The second purpose is to disseminate to the public, for their use, the GIS databases collected and generated by this Project. Since the Project is funded through the Commonwealth, the information collected is treated as public domain data. As each portion of the Project is completed (for example, each of the three stages of Disqualification), the digital GIS coverages are made available to the public as well.
4.1 Maps
As each portion of the Project has been completed, a series of public meetings has been held at various locations throughout the Commonwealth. To date, public meetings have been held at the completion of each of the Stages of Disqualification: Stage One, in November of 1991; Stage Two, in February of 1993; and, most recently, Stage Three, in May of 1994. The Stage Three public meetings involved an extensive map production effort.
For each of the Stages a series of maps was produced which illustrated the disqualification process at the appropriate scale. For Stage One, two maps were produced at a statewide scale of 1:750,000. On the first, each individual disqualification criterion was plotted in a different color, and on the second, all criteria were plotted in the same shade.
Between the two maps the public could visualize which areas of the state were identified as disqualified and which particular disqualifiers eliminated the most area in different regions of the state. At this point approximately 23 percent of the Commonwealth had been identified as disqualified.
For Stage Two, the statewide maps were again produced, but this time the same information was also produced for each of the six DER regions at a scale of 1:400,000. At the end of Stage Two, approximately 46 percent of the Commonwealth had been identified as disqualified.
It was in Stage Three that the fullest use was made of the capabilities of the GIS. The Statewide and Regional maps were again produced, as they had been for Stages One and Two. However, Stage Three represented the last full Stage of the disqualification effort (although, as explained above, disqualification layers will continue to be updated throughout the siting effort). At this point, 75 percent of the Commonwealth had been identified as disqualified, and the remaining portion of the Commonwealth was to pass on to the Evaluation portion of the siting process. It was understood that public interest would be more intense at this stage than for the previous two stages, and that the question most likely to be asked would be "Am I disqualified?". As a result, a 1:150,000 scale series was prepared for each of the 67 counties in the state. These maps showed the composite disqualified areas in gray, but also added local features such as roads, streams, and USGS 7.5-minute quadrangle boundaries. These maps allowed the members of the public to generally locate themselves within a county, and, if they required further information, to identify which specific USGS quadrangle they were interested in. This further information was available because the Project team had prepared each of the 909 quadrangles in the state at a scale of 1:24,000, which is the scale of the Stage Three data.
These maps included the disqualified areas in gray, as well as the roads, streams, and municipal boundaries. Using these maps, members of the public were able to find the precise location in which they were interested, and determine for themselves whether or not it had been disqualified.
The project team continues to receive requests for maps even now at a rate of approximately 15 to 20 maps per week. It is estimated that over 10,000 original maps have been produced and submitted to various public agencies and citizens since the completion of Stage Three Disqualification. The ability to disseminate Project information at this level of detail is an important element in developing public trust, and it is an ability that is not available without the use of GIS. The project team has received numerous compliments on the clarity and usefulness of the display maps from members of the public, including those who may have been opposed to the project in general but still appreciated the level of effort put forth to present the project information in an understandable manner.
4.2 Digital data
As was mentioned earlier, a good deal of the data that has been collected and processed for this Project has never before been in digital form. For this reason, there has been a great deal of interest from both inside and outside of DER in obtaining this data. As a result of this interest, it has been a Project practice to provide to DER the digital coverages for each of the Disqualification stages at the completion of that particular Stage. The DER Bureau of Information Systems (BIS) disseminates the coverages throughout the DER. In addition, the Project team also processes requests for data received from outside agencies through the DER Bureau of Radiation Protection (BRP), which oversees the PALLRWDF project.
The coverages developed for this Project do have limitations.
It was determined at an early point in the Project that the desired end was to site a facility, not to develop databases for the Commonwealth. The information important to the Project is whether or not an area is disqualified; as a result, the coverages produced during disqualification, while having excellent boundary information, do not contain a great deal of attribute information (although each of the coverages contains extensive documentation). In addition, since the use of DQ Overlays, as described above, was prevalent in Stage Three, not all of the Stage Three coverages are complete for the entire Commonwealth. Even with these stated limitations, however, the coverages have still provided a GIS jump-start to a number of agencies, and are receiving extensive use throughout the Commonwealth.
ACKNOWLEDGEMENT
The authors wish to acknowledge the management and staff of Chem-Nuclear Systems, Inc., for their cooperation and support in the production of this paper, and throughout the entire GIS siting effort.