Robert A. Chastain Jr.

GIS Methods for Developing an Exposure Metric for Electromagnetic Fields in Cancer Studies.

Author: Robert A. Chastain Jr., North Carolina State Center for Health Statistics

Defining Issue: A complex exposure metric for electromagnetic fields (EMFs) that involves the spatial relationship between residences, linear high voltage power line features, and geomagnetic fields is developed using a GIS. This investigation requires the capture and derivation of an array of physical variables from different data sources. The storage of these data, their conversion between formats, and the identification and rectification of errors in them are important challenges to meet when designing appropriate methodologies for research of this nature.

GIS Solution: A GIS is used to 1) locate (geocode) an epidemiological sample of brain cancer cases and controls and produce a digital map layer of high voltage power lines; and 2) capture, derive, and combine a number of data sources needed to apply a physical model that incorporates total geomagnetic field (GMF) intensity, dip angle, the angle of residences relative to nearby power lines, and the orientation of power lines relative to magnetic north into the calculation of an exposure metric which is referred to as the parallel component of the total geomagnetic field. The calculation of this exposure metric may be used to identify potential risk levels at the geographic locations of individual residences within an epidemiological sample within a GIS environment. Various techniques are employed to identify and control error introduction among the input digital map layers.

Application or Methodology: Whereas the primary component measured in many power line epidemiological studies has been AC electrical and/or magnetic field intensity, this study involves the derivation and measurement of a more complex exposure metric linked to a parallel component of the GMF surrounding power lines so that its relevance to the spatial distribution of brain cancer may be tested. A major advantage afforded by the use of GIS in the application of this metric to study the possible etiologic role of EMF exposure is the fact that residential access is not required to collect the necessary data at each site. Also, the computational strength of a GIS makes it feasible to study large samples of recorded brain cancer cases where appropriate epidemiological data is available.







1. Introduction and Background

1.1 Introduction

The steady increase of electrical power use in industrialized nations has taken place within only the last 50 years or so. This has brought an increased presence of high voltage power lines on the landscape. An ever increasing body of laboratory and epidemiological research has been generated over the last 20 years due to increasing public concerns about the potential health effects of electromagnetic field (EMF) exposure. A seminal epidemiological study concerning residential exposure to EMFs was performed in the greater Denver area in 1977 (Wertheimer and Leeper 1979). The results of this early study suggested that there was in fact a spatial relationship between cancer in children and power lines carrying high current. This study did not find an etiologic link between the presence of high current residential environments and a particular type of cancer, but assumed that such environments may indirectly affect the development of cancer, perhaps through inhibiting cellular growth or inducing alteration in cellular electrical potentials. In this as well as other residential studies that reported positive findings, higher incidences of certain types of cancer were linked to elevated magnetic fields rather than electrical fields (Wartenberg et al. 1993).

One limitation of many earlier studies that examine the link between cancer and EMFs is the fact that the AC magnetic field is investigated alone as the potential etiologic factor. More recently, exposure metrics have been developed that consider the interaction between the AC field generated in the vicinity of power lines and the ambient DC geomagnetic field (GMF) (Liboff 1985, Blackman et al. 1990, Liboff et al. 1989, 1990, Poole and Trichopoulos 1991, Bowman et al. 1995, Liboff and McLeod 1995). The relative orientation of the static GMF and extremely low frequency EMFs has been empirically shown to give rise to specific biological activity in laboratory settings (Blackman et al. 1990). Combinations of static and AC fields have also been investigated in at least one epidemiological cancer study (Bowman et al. 1995).

The primary goal of this paper is to discuss the methods used to automate a hypothesized exposure metric for extremely low frequency EMFs in the form of high voltage power lines using a Geographic Information System (GIS). This metric can then be tested through epidemiological investigation to further the ongoing research into the possible etiologic role of EMF exposure linked to brain cancer or other cancer types. Although a large brain cancer epidemiological case/control sample is used as the human component of this research, it is not the intention of this paper to evaluate the relation between EMFs and brain cancer. The scope of this research is confined to automating an EMF exposure metric. In the course of this discussion, the difficulties inherent in combining spatial data from various sources will be elucidated, and the solutions applied to these difficulties will be described.

The hypothesis underlying the exposure metric used in this research states that the combination of static GMFs and extremely low frequency EMFs can lead to conditions where ion cyclotron resonance (ICR) occurs (Liboff 1985, Liboff et al. 1989, 1990, Liboff and McLeod 1995). The ICR hypothesis was developed by Abraham Liboff (1985) to explain electromagnetic interaction with biological material whereby the movement of charged ions is affected through geomagnetic and AC field interactions. According to ICR theory, certain combinations of GMF intensity and the AC magnetic field frequency (60 Hz) that surround power lines are said to affect the transport of certain ions and thereby cause physiological changes linked to human lymphoma cell culture proliferation. The ICR harmonic ranges for these ions are of specific interest. To test for geomagnetic involvement, the exposure metric employed examines the component of the total GMF that is parallel to the circuital magnetic field surrounding the power line. This EMF exposure metric does not assume that a linear relationship exists between residential distance to power lines and exposure level, but instead allows an examination of interactions between:

• Orientation of power line segments relative to magnetic north (i.e. the spatial orientation of the AC field and GMF)

• Relative angle of residences to overhead power lines

• Total GMF intensity

• Dip angle of the GMF


The advantages of using a GIS include the ability to derive, compile, and synthesize the physical variables needed to compute the mean parallel component of the GMF for a large number of residences in a sample and to do so without requiring access to these residences. Also, a GIS affords the ability to conceive of a study area in three dimensions, so elevation can be used to compute the physical components of this metric. The relative angle between the residences and nearby power lines and the power line orientation variables were derived using these GIS capabilities. The orientations of the mapped power line segments relative to magnetic north were obtained using the COGO module of ArcInfo. Finally, the GMF measurements were calculated for the residences in the sample using the 1995 EPOCH World Magnetic Model provided by the USGS.


1.2 Study Area

The study area is confined to Davidson, Forsyth, Guilford, and Randolph counties in North Carolina. This study area was chosen because an actual brain cancer epidemiological sample was available from the North Carolina Central Cancer Registry and reliable power line data from the Duke Power Company were also available before the initiation of the project. The area is in the upper piedmont of the southern Appalachian mountains, and the landscape is therefore characterized by a great deal of local relief. The geology of the area is complex due to its situation in a transitional zone between the North Carolina piedmont and the mountain regions. This contributes to the spatial variability of the total geomagnetic field intensity and dip angle in the study area. The study area is depicted in Figure 1 using a hillshaded digital elevation model (DEM) as a background to emphasize the amount of elevational relief present.

FIGURE 1: Study Area Map with Cases and Controls from the Epidemiological Sample

The study area is also characterized by a great deal of demographic variability. To the south, in Davidson and Randolph counties, much of the area can be characterized as rural. The large cities of Winston-Salem and Greensboro lie to the north, in Forsyth and Guilford counties, respectively. Many high voltage power lines traverse the study area due to the fact that power for Winston-Salem and Greensboro originates to a great extent from a large power plant located on the Yadkin River north of High Rock Lake on the southwestern edge of the study area. Also, more high voltage transmission lines are present due to the fact that the study area is served by both Duke Power and Carolina Power and Light, both utilizing different source plants.


1.3 Organization

The bulk of this paper entails a discussion of the methods employed to relate geocoded residences in an epidemiological sample to mapped power line features in a digital spatial database. First, the creation of the digital base map layers will be discussed. Some of the quality control methods employed on them are also covered. The paper then continues with a discussion of the derivation of the different physical variables that are the elemental components of the exposure metric as well as other variables that were necessary as collateral information for their calculation. These values were computed on a point-by-point basis for all of the mapped residences in the study area using ArcInfo GIS and other software. Finally, some issues relating to the application of the methods developed in this study within an epidemiological framework are discussed.


2. Data and Methods

2.1 Mapping the Base Data

Two fundamental digital map layers were necessary to perform this research. One was a point coverage representing the locations of the residences in an epidemiological case/control sample. The other was a line coverage representing the locations of the high voltage power lines in the study area. These two digital map layers are illustrated in Figure 1. The accurate mapping of these data was the most important step performed in preparation for this research, since many of the subsequent calculations were based on the geographic locations of these features relative to each other.


2.1.1 Addressmatching

The first step in this investigation involved mapping as many of the individuals in the epidemiological brain cancer case/control sample as possible. This was accomplished through the use of a geocoding technique known as addressmatching. With this method, the address attribute record for an individual was used to place them in geographic space so that database relationships could be established between addresses and coverage features. The address items in the records of a data set were matched to those in a road network digital map layer that contained lines attributed with street names, types, and the address ranges for the right and left sides of the line segments. Limitations to the effectiveness of geocoding with the addressmatching method include: 1) incomplete road network attributes; 2) lack of agreement between the road network map layer and address items in a data set due to temporal variation; and 3) the presence of post office box or rural route addresses. A specific problem encountered in this study area is the lack of attribute data in more rural areas. This condition can lead to an underrepresentation of addresses in these rural areas. In this study, this bias was overcome as much as possible through alternative geocoding methods.
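The address-range matching described above can be illustrated with a simple linear interpolation along a street segment. This is only a minimal sketch under stated assumptions, not the ArcInfo addressmatching implementation: the function name, the straight-line segment geometry, and the address ranges are hypothetical, and the left/right side and odd/even parity handling used by real geocoders is omitted.

```python
def interpolate_address(house_number, from_addr, to_addr, start_xy, end_xy):
    """Place a house number along a straight street segment by linear interpolation.

    from_addr and to_addr are the address range for one side of the segment
    (hypothetical values); start_xy and end_xy are the segment end points.
    Returns None when the number falls outside the range (an unmatched address).
    """
    low, high = sorted((from_addr, to_addr))
    if not low <= house_number <= high:
        return None
    if to_addr == from_addr:
        fraction = 0.0
    else:
        fraction = (house_number - from_addr) / (to_addr - from_addr)
    x = start_xy[0] + fraction * (end_xy[0] - start_xy[0])
    y = start_xy[1] + fraction * (end_xy[1] - start_xy[1])
    return (x, y)


# Hypothetical segment: addresses 100-198 run from (0, 0) to (98, 0).
print(interpolate_address(149, 100, 198, (0.0, 0.0), (98.0, 0.0)))  # (49.0, 0.0)
print(interpolate_address(250, 100, 198, (0.0, 0.0), (98.0, 0.0)))  # None
```

An address that returns None in this sketch corresponds to a record that would fall to reject processing or to one of the alternative geocoding methods.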

Due to the numerous difficulties encountered during the initial automated process of addressmatching, a number of alternate procedures were carried out to geocode the residences in the epidemiological sample. First, the US Census Bureau's Topologically Integrated Geographic Encoding and Referencing (TIGER) system 1992 roads coverages for the four counties in the study area were used as primary address coverages with which to geocode the addresses in the sample data set. After the completion of the initial automatic addressmatching process, an additional set of the remaining addresses was successfully matched in reject processing mode by modifying the address items in the records and by using collateral information to force some of the individual records to road arcs. A great deal of collateral information was derived from the DeLorme North Carolina Atlas and Gazetteer (scale 1:150,000) and various city street maps.

The North Carolina Department of Public Instruction's Transportation Information Management System (TIMS) roads coverages for the four counties in the study area were also used as supplemental address coverages, because of the inadequacy of the attribute information within the 1992 TIGER roads files, especially in rural areas. The TIMS matches were then moved to correspond with their location relative to the TIGER roads, due to the superior spatial accuracy of the TIGER roads coverages. Because many rural addresses contain only rural route and box numbers and lack a street or road name identifier, many addresses could not be matched using automated GIS addressmatching methods. Field work was therefore performed to locate some of the rural route addresses and street addresses that could not be matched automatically. While performing field work, a list of rural route addresses was given to a postal supervisor in Randolph county. This list was circulated to the appropriate rural mail carriers, and they were able to provide information that allowed additional rural route and P.O. box addresses to be geocoded. Finally, when the 1994 TIGER road coverages became available, additional addresses could be geocoded due to the additional attribute information contained within them. The addressmatching success rates were as follows:

855 / 1004 successfully matched (85.16%)
216 / 251 cases (86.06%)
639 / 753 controls (84.86%)

After the completion of the geocoding process, checks for positional and attribute accuracy were performed. Checking the geocoded residences for positional accuracy involved selecting points in ARCEDIT by the city field in the address information and deciphering incorrect address matches by interactive viewing. Various attribute accuracy and completeness checks were also performed using the database management capacity of INFO.


2.1.2 Power Line Mapping

The second base mapping task involved identifying all of the high voltage transmission lines in the study area, obtaining some key attribute information about them, and creating a positionally accurate digital map layer containing this information. Many of the high voltage power lines within the study area are maintained by the Duke Power company, but other regional and local providers maintain additional transmission and distribution lines in the area, especially in the more rural portions of the study area to the south and west. The various agencies (municipal and Electric Membership Corporations) responsible for the maintenance of high voltage transmission lines in the study area were identified through consultations with the North Carolina Rural Electrification Authority. Figure 2 graphically depicts the service areas for the different utility organizations from which high voltage lines were mapped for this study.


FIGURE 2: Study Area Map with Power Line Sources

The sources for the power lines mapped for this study are:

Duke Power
Carolina Power and Light (CPL)
Davidson Electric Membership Corporation (EMC)
Randolph EMC
City of High Point

It should be noted that deriving a complete and accurate digital map layer of the power lines in the study area would not have been possible without the thorough cooperation of these utility organizations.

The power lines from the different source agencies were generated using various methods, depending on the source material provided. The Duke Power, CPL, and Davidson EMC transmission line data were provided in CAD DXF format and required little modification for the creation of ArcInfo coverages. The Randolph EMC data was also in CAD DXF format, but the data was judged to be too spatially coarse and generalized to suit the purposes of this research. The corresponding transmission lines were therefore located on USGS 1:24000 topographic quadrangle maps and manually digitized to create a series of digital map layers. These coverages were then appended into one coverage and the kilovoltage attribute information for the individual power lines was transferred to the arcs from the CAD data. The power line information obtained from the City of High Point was digitized from a paper street map, as this information was not available in digital form.

The numerous sources from which the power line digital map layer was produced and the apparent lack of consistency among these sources warranted further scrutiny to assure both the completeness and the spatial accuracy of this fundamental element of the research. Various map and collateral information sources were applied to the task of checking these data for positional as well as attribute errors. First, the transmission digital map layer contained in the US Census Bureau's TIGER data was overlaid for a qualitative look at the completeness of the mapped power line data. Also, the USGS 1:24,000 topographic maps for the study area were perused to check for data completeness. Further completeness checks were performed by checking the data against a windshield survey conducted along I-85, which bisects the study area. The notes taken during this survey were oriented using the DeLorme North Carolina Atlas and Gazetteer (1:150,000). Finally, the Davidson EMC data were both digitized manually and read in from their CAD DXF files. These coverages were overlaid and compared for positional accuracy. This vigilance was successful in discovering one missing transmission line in the study area. The line was digitized from USGS 1:24,000 quads, mapped, and sent to the agency that maintained the rest of the lines in that portion of the study area. They verified that it was one that they maintain. This line was of recent origin and was therefore missing from the digital data that they had provided to us for this study.


2.2 Methods for Deriving Physical Variables

After the two primary digital map layers were mapped, relating them to each other using the exposure metric under examination in this study was possible. With this exposure metric, it may be possible to investigate a potential etiologic link between high voltage power line AC field exposure and certain types of cancer. This link is hypothesized to involve the combination of the AC field surrounding power lines with specific levels of GMF intensity as well as the orientation between the AC field and GMF. This exposure metric is the component of the total GMF that is parallel to the circuital magnetic field that surrounds the power line. It quantifies the degree to which the GMF intensity is modified by the AC field surrounding a power line. The equation for the parallel geomagnetic component, with q restricted to the range 0 to 90 degrees, is:

B = Bt (sin a cos b + cos a sin b sin q)

where Bt is the total GMF intensity, a is the dip angle, b is the angle made by the elevation of the power line relative to a residence, and q is the angle of orientation of the power line relative to magnetic north (Liboff and McLeod 1995).
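The equation above can be evaluated directly once the four inputs are known for a residence. The following is a minimal sketch, not the INFO calculation used in this study: the function and argument names are hypothetical, angles are assumed to be supplied in degrees, and the input values in the example are illustrative only.

```python
import math


def parallel_component(bt, dip_deg, elev_deg, orient_deg):
    """Evaluate B = Bt (sin a cos b + cos a sin b sin q).

    bt         : total GMF intensity (any unit; the result has the same unit)
    dip_deg    : dip angle a, in degrees
    elev_deg   : angle b of the power line relative to the residence, in degrees
    orient_deg : orientation q of the line relative to magnetic north, in degrees
    """
    a = math.radians(dip_deg)
    b = math.radians(elev_deg)
    q = math.radians(orient_deg)
    return bt * (math.sin(a) * math.cos(b) + math.cos(a) * math.sin(b) * math.sin(q))


# Illustrative (not measured) inputs: intensity in nanotesla, angles in degrees.
print(parallel_component(52000.0, 66.0, 10.0, 45.0))
```

Note that when b is zero the expression reduces to Bt sin a, so the orientation q only matters when the line is elevated relative to the residence.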

A number of procedural steps and combinations of data from different sources were performed in the process of automating this equation in a GIS environment. This process was actually initiated by the tasks of geocoding the residences and mapping the power line features in the study area. The next step involved the acquisition of the GMF variables required in the equation for each geocoded residence point. The values for Bt, a, and the declination of magnetic north were acquired from the USGS EPOCH world magnetic model, which is described below. Since one of the parameters required as input for this model was elevation, it was first necessary to derive the elevation of the residence points from a USGS Digital Elevation Model (DEM). The gridded DEM was converted into a polygon coverage so that an overlay of the residence points could be performed using IDENTITY to compute their geometric intersection. If the TIN module had been available, the LATTICESPOT command could have been used on the gridded version.
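Overlaying points on a DEM that has been converted to one polygon per cell amounts to looking up the grid cell that contains each point. The sketch below shows that lookup on a small in-memory grid. It is an illustration under stated assumptions rather than the IDENTITY overlay itself: the function name, the grid origin, the cell size, and the elevation values are all hypothetical.

```python
def elevation_at(dem, x_min, y_max, cell_size, x, y):
    """Return the elevation of the DEM cell containing point (x, y).

    dem is a list of rows with row 0 at the northern edge; (x_min, y_max) is
    the upper-left corner of the grid. Returns None for points off the grid.
    """
    col = int((x - x_min) // cell_size)
    row = int((y_max - y) // cell_size)
    if not (0 <= row < len(dem) and 0 <= col < len(dem[0])):
        return None
    return dem[row][col]


# Hypothetical 2 x 2 grid of 30 m cells with its upper-left corner at (0, 60).
dem = [[200, 210],
       [220, 230]]
print(elevation_at(dem, 0.0, 60.0, 30.0, 45.0, 10.0))   # 230
print(elevation_at(dem, 0.0, 60.0, 30.0, 100.0, 10.0))  # None
```

An interpolating lookup (for example bilinear) would give smoother values between cell centers; the cell-containment lookup is shown here because it mirrors the polygon overlay described in the text.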

EPOCH is a global scale computer model that calculates diverse measures of the earth's GMF at user-specified points at or above its surface. The effects of the earth's outer fluid core are represented in this model. Although this accounts for 90 percent of the GMF at any given time, there are three other components of the earth's geomagnetic field. The first is associated with characteristics and features of the earth's crust. On land, magnetic anomalies can be produced by mountain ranges, ore deposits, ground struck by lightning, geological faults, and cultural features (e.g., railroad tracks or power lines). Magnetic anomalies linked to these sources may be rather high, but are isolated and of small spatial extent. The final two components of the GMF are associated with processes that are driven by the solar wind in the magnetosphere and ionosphere. Magnetic anomalies caused by these events can be severe and vary in temporal extent. Because the EPOCH model characterizes only the portion of the GMF generated by the fluid outer core of the earth, spatial and temporal anomalies will be observed when comparing ground measurements with output from the model (Quinn et al. 1995). The output of the EPOCH model proved to be more detailed than the information contained within the USGS published GMF charts (1985), in that a spatial gradient was apparent for Bt, a, and the declination of magnetic north in the study area from the EPOCH output. The USGS GMF charts depict the area as comparatively uniform with respect to these values.

Several parameters are needed as input into the mathematical model employed in EPOCH to calculate GMF component measures. First, latitude and longitude are required as spherical locators on the earth's surface. Elevation is also used as a radial measure from the center of the earth. Finally, a date must be entered so that the temporal variation in GMF values may be accounted for. June 1992 was chosen as input for the model, because this date approximated the original compilation period of the epidemiological data. An ASCII file was produced containing these values for all of the geocoded residences, and was entered into the model. After the EPOCH model generated an output of various measures and components of the GMF, Excel was used to extract the desired GMF variables into a format importable into INFO. These variables were then related back to the original residence point file using the record number.
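Relating the model output back to the residence points is a keyed join on the record number. The sketch below illustrates that step only. The field names and the values are hypothetical, and it makes no assumption about the actual EPOCH input or output file layout, which is not described here.

```python
def join_by_record(residences, model_rows):
    """Attach model output to residence records using the shared record number.

    Both inputs are lists of dicts that carry a "record" key. Residences with
    no matching model row are left out of the result.
    """
    by_record = {row["record"]: row for row in model_rows}
    joined = []
    for res in residences:
        out = by_record.get(res["record"])
        if out is None:
            continue  # no model output for this residence
        merged = dict(res)
        merged.update({k: v for k, v in out.items() if k != "record"})
        joined.append(merged)
    return joined


# Hypothetical records; the field names and numbers are illustrative only.
residences = [{"record": 1, "lat": 35.9, "lon": -80.0, "elev_m": 250.0},
              {"record": 2, "lat": 36.1, "lon": -79.8, "elev_m": 270.0}]
model_rows = [{"record": 2, "bt": 52100.0, "dip_deg": 66.2, "decl_deg": -6.5}]
print(join_by_record(residences, model_rows))
```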

Relating the geocoded residences to nearby power line features necessitated measuring the distance from each residence point to its nearest power line segment. This was performed using the NEAR command in ArcInfo (Esri 1996). The search radius for this operation was set wide enough so that all of the residence points would match to a power line feature. The power line features that were identified as being proximal to a residence were separated out from the power line coverage, and the x and y coordinates of the point where the perpendicular distance (d) line intersects the power line were used to identify the ground elevation of that line feature. This elevation value was subsequently used as a component of the total elevation of the power lines.
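The geometry behind this step is the projection of a point onto each segment of a line and the selection of the closest result. The following is a minimal sketch of that calculation in planar coordinates. It is not the ArcInfo NEAR implementation, and the function names and coordinates are hypothetical.

```python
import math


def nearest_on_segment(p, a, b):
    """Return (distance, foot point) from point p to segment a-b."""
    ax, ay = a
    bx, by = b
    px, py = p
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        t = 0.0  # degenerate segment: both end points coincide
    else:
        # Projection parameter, clamped so the foot point stays on the segment.
        t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
        t = max(0.0, min(1.0, t))
    nx, ny = ax + t * dx, ay + t * dy
    return math.hypot(px - nx, py - ny), (nx, ny)


def nearest_on_line(p, vertices):
    """Return (distance d, foot point, segment index) for the closest segment."""
    best = None
    for i, (a, b) in enumerate(zip(vertices, vertices[1:])):
        dist, foot = nearest_on_segment(p, a, b)
        if best is None or dist < best[0]:
            best = (dist, foot, i)
    return best


# Hypothetical power line with one bend and a residence at (40, 30).
line = [(0.0, 0.0), (100.0, 0.0), (100.0, 100.0)]
print(nearest_on_line((40.0, 30.0), line))  # (30.0, (40.0, 0.0), 0)
```

The returned foot point plays the role of the intersect location at which the ground elevation of the line is sampled, and the segment index identifies the segment whose orientation is later measured.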

The total elevation of the power line features matched to residence points was expressed as the sum of the transmission tower height of the power lines and the ground elevation at the perpendicular distance (d) intersect points. This elevation was subsequently used in the calculation of the relative angle between residences and nearby power line features (b). The average height of the lines themselves was calculated by category of kilovoltage, based on information from the various utility companies regarding tower heights, as follows:

Line voltage    Average line height
44 kV           15.55 meters
69 kV           15.55 meters
100 kV          15.55 meters
115 kV          15.55 meters
230 kV          24.30 meters
500 kV          27.80 meters

The difference in elevation (h) between the matched residences and power line features was calculated in INFO. The angle of elevation of the residences relative to their nearest power line features was then calculated using the relation tan b = h/d.
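The relation tan b = h/d can be worked through in a few lines. This sketch is an illustration only: the function name is hypothetical, the line heights are taken from the kilovoltage categories listed above, d is assumed to be the horizontal distance to the line, and the elevations in the example are invented.

```python
import math

# Average line height in meters by kilovoltage category, as listed in the text.
LINE_HEIGHT_M = {44: 15.55, 69: 15.55, 100: 15.55, 115: 15.55, 230: 24.30, 500: 27.80}


def elevation_angle_deg(residence_elev_m, line_ground_elev_m, kilovolts, distance_m):
    """Angle b (degrees) of the power line relative to the residence.

    h is the line elevation (ground elevation plus line height) minus the
    residence elevation; d is the horizontal distance to the line.
    """
    line_elev = line_ground_elev_m + LINE_HEIGHT_M[kilovolts]
    h = line_elev - residence_elev_m
    # atan2(h, d) equals atan(h / d) for d > 0 and stays defined when d is 0.
    return math.degrees(math.atan2(h, distance_m))


# Hypothetical case: level ground, a 230 kV line 24.30 m away, so h equals d.
print(elevation_angle_deg(250.0, 250.0, 230, 24.30))  # approximately 45 degrees
```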

Calculating the orientation of the power line features relative to geomagnetic north was the final step performed to obtain the physical variables required by the parallel component of the total GMF exposure metric equation. The ARCCOGO command in the ArcInfo coordinate geometry module COGO was used to derive the orientation relative to geomagnetic north of the power line segments specified by the perpendicular distance algorithm (NEAR command). The deviation of these arc segments from magnetic north was subsequently modified to range from 0 to 90 degrees using the database management capabilities of INFO.
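One way to obtain an orientation in the 0 to 90 degree range is to compute the azimuth of a segment, subtract the magnetic declination, and fold the result. The sketch below is an assumption-laden illustration, not the ARCCOGO or INFO procedure: the function name is hypothetical, grid north is treated as true north (grid convergence is ignored), and declination is taken as positive to the east.

```python
import math


def orientation_to_magnetic_north(start_xy, end_xy, declination_deg):
    """Acute angle (0 to 90 degrees) between a line segment and magnetic north.

    declination_deg is the angle from true north to magnetic north,
    positive east (an assumed sign convention).
    """
    dx = end_xy[0] - start_xy[0]
    dy = end_xy[1] - start_xy[1]
    azimuth_true = math.degrees(math.atan2(dx, dy))  # clockwise from grid north
    # A power line has no direction, so azimuths 180 degrees apart are the same.
    azimuth_mag = (azimuth_true - declination_deg) % 180.0
    # Fold 90-180 back onto 0-90.
    return min(azimuth_mag, 180.0 - azimuth_mag)


# Hypothetical segments with a declination of 7 degrees west (-7).
print(orientation_to_magnetic_north((0.0, 0.0), (0.0, 100.0), -7.0))
print(orientation_to_magnetic_north((0.0, 0.0), (-100.0, -100.0), 0.0))
```

Because sin(180 - q) equals sin q, folding in this way leaves the magnitude of the sin q term in the exposure equation unchanged.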


3. Discussion

The scope of this research was limited to operationalizing a complex exposure metric designed to characterize the potential risk posed by EMFs in the form of high voltage power lines. The intended use of the brain cancer sample geocoded for this research was to serve only as base map data with which to develop GIS methods to automate this exposure metric, not to support an evaluation of the relationship between EMFs and brain cancer. Although specific methods were delineated for the spatial automation of this metric, alternative techniques exist for all of the constituent steps developed in this paper for the automation of the exposure metric. A number of issues must be considered before selecting an appropriate methodology for the application of this exposure metric within an epidemiological framework. They involve both the quality and lineage of the input geographic data and the sensitivity of the elemental physical variables in the exposure metric.

There is an adage that states that the results of any analysis are only as good as the input data involved (garbage in, garbage out). The high voltage power line digital data supplied by the organizations responsible for their maintenance were in some instances incomplete and/or inaccurate. Although a great deal of time and effort went into quality checking these data, some inaccuracies may still be present. Since a number of different sources of information existed to draw upon while assembling the power line digital data, decisions were made as to which ones represented the closest approximation to ground truth. In this study, much of the power line data was either obtained by or quality checked through digitization from USGS 1:24,000 topographic maps, as this source was judged to be most reliable for positional accuracy. In any study where redundant information sources exist for base mapping, a hierarchy of map sources should be decided upon based on the accuracy of the source materials so that decisions can be made as to which sources to use for subsequent spatial analysis.

Ascertaining the most sensitive components of the exposure metric would be valuable in determining the rigor necessary in collecting the data for its application in future studies. For example, if the relative angle between the residence and overhead power line proves to be a sensitive element in the measure of exposure, then perhaps geocoding residences using addressmatching may not yield a sufficient level of positional accuracy. The use of a global positioning system in the collection of residence and/or power line tower locational data may even be warranted. If the total GMF intensity emerges as the most significant element in the exposure metric, then perhaps a more detailed geomagnetic survey should be initiated over a study area. Spot measurements inside and outside of the residences in a sample population may be necessary to suitably account for this variable. Likewise, if the spatial orientation of the linear AC field associated with the power lines is important enough, then a finer spatial representation of declination angles may be required. GIS provides valuable tools for testing this metric and its elemental components within an epidemiological framework and its capabilities are indispensable in synthesizing the physical variables compiled for such research.
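A simple way to begin exploring which inputs matter most is to perturb one variable at a time and observe the change in the metric. The sketch below is only an illustration of that idea: the function names, the base values, and the perturbation sizes are hypothetical, no such analysis was performed in this study, and a real sensitivity study would draw the perturbations from estimated error distributions.

```python
import math


def parallel_component(bt, dip_deg, elev_deg, orient_deg):
    """B = Bt (sin a cos b + cos a sin b sin q), with angles in degrees."""
    a = math.radians(dip_deg)
    b = math.radians(elev_deg)
    q = math.radians(orient_deg)
    return bt * (math.sin(a) * math.cos(b) + math.cos(a) * math.sin(b) * math.sin(q))


def one_at_a_time_sensitivity(base, deltas):
    """Change in the metric when each input is perturbed on its own.

    base and deltas are dicts keyed by the argument names of
    parallel_component; the result maps each name to the change in B.
    """
    reference = parallel_component(**base)
    changes = {}
    for name, delta in deltas.items():
        perturbed = dict(base)
        perturbed[name] += delta
        changes[name] = parallel_component(**perturbed) - reference
    return changes


# Illustrative base values and perturbations (not measured data).
base = {"bt": 52000.0, "dip_deg": 66.0, "elev_deg": 10.0, "orient_deg": 45.0}
deltas = {"bt": 100.0, "dip_deg": 0.5, "elev_deg": 5.0, "orient_deg": 5.0}
for name, change in one_at_a_time_sensitivity(base, deltas).items():
    print(name, round(change, 1))
```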

It is hoped that methods for using a GIS to operationalize the GMF parallel component exposure metric, such as those described in this paper, will be applied in the future. These methods should prove effective given an epidemiological sample with sufficiently complete epidemiologically relevant information. In epidemiological case/control studies where accurate geocoding and good occupational and residential history data exist, the computation of an exposure metric has a great deal of potential and is superior to the use of proximity as a surrogate for exposure. This study has demonstrated the capability of GIS to develop an exposure metric that can be used in epidemiological studies. It is hoped that this study helps pave the way for the development or automation of additional exposure metrics using GIS.





References


Blackman, C.F., Benane, S.G., House, D.E., and Elliot, D.J. 1990. Importance of Alignment Between Local DC Magnetic Field and an Oscillating Magnetic Field in Responses of Brain Tissue In Vitro and In Vivo, Bioelectromagnetics, 11: 159-167.

Bowman, J.D., Thomas, D.C., London, S.J., and Peters, J.M. 1995. Hypothesis: The Risk of Childhood Leukemia is Related to Combinations of Power-Frequency and Static Magnetic Fields, Bioelectromagnetics, 16: 48-59.

Environmental Systems Research Institute (Esri). 1996. ArcInfo 7.0: Online Documentation, Redlands CA: Esri.

Liboff, A.R. 1985. Geomagnetic Cyclotron Resonance in Living Cells, Journal of Biological Physics, 13: 99-102.

Liboff, A.R., McLeod, B.R., and Smith, S.D. 1989. Ion Cyclotron Resonance Effects of ELF Fields in Biological Systems. In Wilson, B.W., Stevens, R.G., and Anderson, L.E. (eds): Extremely Low Frequency Electromagnetic Fields: The Question of Cancer, Columbus, OH: Battelle Press, pp 251-289.

Liboff, A.R. and McLeod, B.R. 1995. Power Lines and the Geomagnetic Field, Bioelectromagnetics, 16: 227-230.

Poole, C. and Trichopoulos, D. 1991. Extremely Low-Frequency Electric and Magnetic Fields and Cancer, Cancer Causes and Control, 2: 267-275.

Quinn, J.M., Coleman, R.J., Shiel, D.L., Nigro, J.M. 1995. The Joint US/UK EPOCH World Magnetic Model, Naval Oceanographic Office, Stennis Space Center, MS.

United States Geological Survey. 1985. The Magnetic Field in the United States. Charts GP-986D, GP-986F, GP-986I, Box 25046, MS 968, Denver Federal Center, CO 80225.

Wartenberg, D., Greenberg, M., and Lathrop, R. 1993. Identification and Characterization of Populations Living Near High-Voltage Transmission Lines: A Pilot Study, Environmental Health Perspectives, 101(7): 626-632.

Wertheimer, N. and Leeper, E. 1979. Electrical Wiring Configurations and Childhood Cancer, American Journal of Epidemiology, 109(3): 273-284.






Robert A. Chastain, Jr.
GIS Analyst
North Carolina State Center for Health Statistics
P.O. Box 29538
Raleigh, NC 27626-0538
Tel: (919) 715-4473
FAX: (919) 733-8485
email: rchastai@gis.sches.ehnr.state.nc.us