Lessons Learned in the Migration of GIS from UNIX to a Windows Environment
Like many other organizations, the US Environmental Protection Agency - Region 2
is in the process of migrating GIS applications from UNIX to a Microsoft Windows 95 platform.
This paper describes our difficulties in accessing GIS information in a UNIX environment from
a MS Windows desktop and our initiatives to resolve them. The information consists of spatial
libraries and non-spatial Oracle databases on local and on remote servers. We will describe the
current technical architecture which has simplified our operational environment. We will also
describe the features of a new GIS user interface that has significantly facilitated the use of GIS
technology by our professional and para-professional staff. Lastly, we will describe the future
direction for providing access to GIS technology, which consists of web-based applications
served over an Intranet using Esri's ArcView Internet Map Server.
Introduction
The Region 2 GIS effort began in 1989 when the Office of Policy and Management (OPM)
began planning for GIS implementation. Since that time we have established a full complement
of equipment, software and communications infrastructure for the use of this technology. We
have also established GIS data development partnerships with state counterparts and federal
agencies which have yielded a robust set of GIS data coverages. So far, however, use of
GIS in Region 2 has been limited to a small number of applications by staff able to cope with a
relatively difficult-to-use UNIX-based technology.
Problems with GIS Application in UNIX Environment:
- - Screens look foreign when compared to Microsoft Windows-based applications. The software
interface does not have the visual and functional consistency found in standard Windows menus.
- - Unstable and slow access to GIS information. It is a complex environment requiring data
transfers across networks (UNIX server to Windows desktop). We were able to stabilize our
environment by replacing the software for accessing the UNIX file system from PCs (Novell's
NFS software) with a freely available open source product called "Samba". Samba is running on an
interim basis while we move the GIS libraries from the UNIX server to Novell servers. This
change significantly improved performance and reliability.
- - GIS applications take a long time to load because data is distributed over three networks
(Novell, and UNIX both here and in North Carolina). Queries typically take longer than 30 seconds
to respond.
- - Difficult to learn and teach because X terminal emulation software is not intuitive and, as
mentioned earlier, the application menu navigation looks foreign to PC users.
- - Aging (Motorola chip based) GIS infrastructure is not suitable for Region-wide GIS
implementation. We are in the process of expanding the use of this technology from a handful of
users to over a thousand potential users. Current technological improvements in hardware and
operating systems provide a unique opportunity for the standard desktop platform to also support
computationally intensive applications like GIS.
- - Excessive capital and maintenance costs. An NT workstation costs about one quarter as much
as a comparable UNIX workstation. Having a single Intel-based PC architecture for
the desktop and GIS applications also provides order-of-magnitude savings in maintenance and
support.
Above Problems Being Addressed by Two Initiatives:
- - Migrate all GIS data holdings and applications to a Microsoft Windows Environment
- - Design and Implement a new Microsoft Windows based GIS User Interface
Considerations in Migration of GIS data holdings to a Microsoft Windows Environment:
Because image data requires a very large amount of disk space, we looked at
different technologies for compressing this data and considered three options:
1. Allow the Novell operating system to compress the image data, which shrinks files at about a
4 to 1 ratio but incurs additional CPU time during image processing.
2. Re-sample the images using GRID to unproject the data to geographic NAD83 (originally in
UTM) and save the resulting image in JPEG format.
3. Re-sample the images using a commercial product like ERDAS or Esri's Spatial Data
Analyst and create a set of images at different resolution scales that can be subsequently selected
by an application.
We have adopted option 2 and are able to compress the data at a 10 to 1 ratio. We have also
created an image catalog for selecting available images and an index map to manage the image
creation process. We will also be piloting option 3 using LizardTech's MrSID software package.
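To illustrate what the 4 to 1 and 10 to 1 ratios mean for storage planning, the following is a minimal sketch in Python. The 400 GB raw volume is a hypothetical figure chosen only for the example; it is not a measurement of the Region 2 image holdings.

```python
def compressed_size_gb(raw_gb: float, ratio: float) -> float:
    """Return the size after compression at a ratio of `ratio` to 1."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return raw_gb / ratio


# Hypothetical uncompressed image volume (assumption, not from the paper).
RAW_GB = 400.0

# Ratios quoted in the text for the two options that were measured.
RATIOS = {
    "Option 1 (Novell volume compression)": 4.0,
    "Option 2 (re-sampled JPEG)": 10.0,
}

for label, ratio in RATIOS.items():
    print(f"{label}: {compressed_size_gb(RAW_GB, ratio):.0f} GB")
```

Under these assumptions the same imagery needs 100 GB with option 1 and 40 GB with option 2.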
- Migration provided the opportunity to standardize GIS coverages to a common projection. We
are creating coverages in unprojected geographic coordinates because the original data sources
used various projections (NJ State Plane, NY UTM-18, PR State Plane, EPA Hqts Albers Equal
Area). By keeping the coordinates in geographic NAD83 we are able to standardize the input
and then project data within the GIS application.
- The process for creating unprojected images requires a lot of working disk space (about 7 times
the size of the files). It is also highly CPU intensive, since each pixel has to be
decoded to geographic coordinates and then reprojected.
- The Novell network pre-assigns many server drives, particularly if the client PC uses other Oracle
applications. This leaves a very limited number of network drives for GIS applications.
- GIS libraries in the New York office were also made available to the Edison, New Jersey facility,
but the performance of data transfers over a high speed (T-1) line was unacceptable. This problem
was solved by installing a Novell server and creating a mirror copy of the libraries in the
Edison, NJ facility. Since many of our coverages are static, we don't anticipate data
synchronization problems.
- When we initially moved the GIS libraries to Novell servers we were unable to access these
libraries because we didn't have copies of ArcInfo on an NT server. It became necessary to
keep two sets of libraries during this transition. Changes were applied to the libraries on the
UNIX server and subsequently copied over to the Novell servers using Samba.
- We used an NT server to create an initial copy of the GIS libraries, which we then tried moving
to the Novell server, but discovered that the file system formats were incompatible. Windows NT
Server is a complex operating system requiring a system administrator with professional
knowledge of this product.
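The working-space requirement noted above for creating unprojected images (about 7 times the size of the files) can be turned into a simple batch plan. The sketch below is illustrative only: the image names, sizes and free-space figure are hypothetical, and the greedy grouping is one possible approach rather than the procedure actually used.

```python
SCRATCH_FACTOR = 7  # working space is about 7x the image size (figure from the text)


def plan_batches(images, free_scratch_mb, factor=SCRATCH_FACTOR):
    """Group (name, size_mb) pairs into sequential batches.

    Each batch is sized so that the working space for all of its images
    (size * factor) fits within the free scratch space.
    """
    batches, current, used = [], [], 0
    for name, size_mb in images:
        need = size_mb * factor
        if need > free_scratch_mb:
            raise ValueError(f"{name} needs {need} MB of working space")
        if used + need > free_scratch_mb:
            # Current batch is full; start a new one.
            batches.append(current)
            current, used = [], 0
        current.append(name)
        used += need
    if current:
        batches.append(current)
    return batches


# Hypothetical image tiles and free space, for illustration only.
tiles = [("tile_a", 100), ("tile_b", 100), ("tile_c", 50)]
print(plan_batches(tiles, free_scratch_mb=1500))
```

With 1500 MB free, the first two tiles (700 MB of working space each) fit in one batch and the third is processed in a second pass.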
Architectural Components of Intel Based GIS System:
The GIS is housed on four separate servers: two high-end servers running NetWare 4.11 store
the GIS coverage libraries and the image data, and two mid-scale servers running Windows NT
provide ArcView Internet Map Server functions and, in the future, the Oracle GIS
database. (The Oracle GIS database still resides on a local UNIX server.)
Configurations:
1. Server containing ArcInfo Libraries, Shapefiles and other vector data:
IBM 325 server running Novell NetWare 4.11
- 200 MHz processor; upgradable to two processors
- 2.25 GB hard drive; 16X CD-ROM drive
- 512 MB RAM (4 x 128 MB DIMMs)
- 16 GB disk space using 5 GB drives in a Core RAID (Redundant Array of
Independent Disks) Lightning Array
2. Image server containing Digital Orthophotography, satellite images and other raster data:
IBM 325 server running Novell NetWare 4.11
- 200 MHz processor; upgradable to two processors
- 2.25 GB hard drive; 16X CD-ROM drive
- 1 GB RAM (8 x 128 MB DIMMs)
- 120 GB of disk space using 9 GB drives in Core RAID Lightning Arrays
3. Intranet Map Server configured with ArcView, ArcView Internet Map Server, MapObjects
and MapObjects Internet Map Server:
Data General AV4800 server running Windows NT 4.0
- Dual 166 MHz Pentium processors
- 2.25 GB hard drive; 16X CD-ROM drive
- 256 MB RAM
- 64 MB VRAM video board interfacing the server console
- 8 GB additional disk space using 4 GB drives in a Core RAID Lightning Array
4. Oracle server housing the GIS database:
Data General AV4800 server running Windows NT 4.0
- Dual 166 MHz Pentium processors
- 2.25 GB hard drive; 16X CD-ROM drive
- 256 MB RAM
- 8 GB additional disk space using 4 GB drives in a Core RAID Lightning Array
Topology:
Connectivity to these servers is via 16 Mbit/s Token Ring using Cisco routers and
Cabletron hubs. Over the next 2-3 years we plan to migrate from Token Ring to Fast Ethernet
(100 Mbit/s), which will provide the capacity and bandwidth we will need to prevent bottlenecks.
Because our new building was designed with this foresight, the cabling is in place, but there are
significant costs in upgrading the routers and hubs.
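As a rough illustration of why link capacity matters for GIS data, the sketch below compares best-case transfer times over the T-1 line mentioned earlier, the current Token Ring and the planned Fast Ethernet. It uses nominal line rates only (1.544 Mbit/s is the standard T-1 rate) and ignores protocol overhead and contention, so real transfers will be slower. The 50 MB coverage size is a hypothetical value.

```python
# Nominal link rates in megabits per second.
LINK_MBIT_PER_S = {
    "T-1": 1.544,
    "Token Ring": 16.0,
    "Fast Ethernet": 100.0,
}


def transfer_seconds(size_mbytes: float, rate_mbit_per_s: float) -> float:
    """Best-case seconds to move size_mbytes (10**6 bytes each) over a link."""
    return size_mbytes * 8 / rate_mbit_per_s


# Hypothetical coverage size (assumption, not from the paper).
COVERAGE_MB = 50.0

for name, rate in LINK_MBIT_PER_S.items():
    print(f"{name}: {transfer_seconds(COVERAGE_MB, rate):.1f} s")
```

Even in the best case, a file that moves in about 4 seconds over Fast Ethernet takes 25 seconds over Token Ring and more than 4 minutes over a T-1 line.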
GIS Workstations:
The new Intel-based GIS workstations are 266 MHz Pentium II computers with 64 MB of
RAM, a 4.4 GB disk, an SVGA PCI video card with 4 MB of VRAM and a 21 in. Trinitron
color monitor. Some of these workstations are dual boot (Windows NT and 95) for users working
with ArcInfo and/or image processing applications. Otherwise they run a standard Windows 95
image that includes ArcView. This standard Windows 95 image is also being installed on all
other Pentium class desktop computers.
GIS User Interface General Attributes:
- - Includes standard views with minimal selected data as starting point for users (such as
Northeast with state boundaries and state names, Caribbean with shorelines and island names,
and National with EPA Regions, Canadian, Mexican and Caribbean shorelines).
- - Provides intuitive access to Regional data.
- - Regional libraries are presented in a hierarchy by type of coverage. Each coverage has a
meaningful name and is linked to its metadata record via HTML-based help screens.
- - Is fully linked to Agency, Census and other data via transparent connections to Oracle
(Envirofacts Warehouse; Demographics database; EPA Spatial Libraries; R. GIS database).
- - Modular: applications, tools and project/application-specific elements can be added in a
consistent way.
- - Based on ArcView 3.1 as the PC desktop tool for EPA staff.
- - Optimized for Pentiums but with acceptable response on 486s, including portable PCs accessing
the system remotely.
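The hierarchy of Regional libraries described above, with each coverage carrying a meaningful name and a link to its metadata help screen, could be modeled along the following lines. This is a sketch only: the category names, coverage names, paths and help-page file names are all hypothetical, and the actual interface is built in ArcView rather than Python.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Coverage:
    name: str          # user-facing name shown in the menu
    library_path: str  # location of the coverage (hypothetical path)
    metadata_url: str  # HTML help page describing the coverage (hypothetical)


# Coverages grouped by type; all entries are illustrative.
CATALOG = {
    "Boundaries": [
        Coverage("State boundaries", "lib/bound/states", "help/states.html"),
    ],
    "Hydrography": [
        Coverage("Rivers and streams", "lib/hydro/rivers", "help/rivers.html"),
    ],
}


def find_coverage(catalog, name):
    """Return the coverage with the given display name, or None."""
    for coverages in catalog.values():
        for coverage in coverages:
            if coverage.name == name:
                return coverage
    return None


def menu_lines(catalog):
    """Render the catalog as an indented, alphabetized menu."""
    lines = []
    for category in sorted(catalog):
        lines.append(category)
        for coverage in catalog[category]:
            lines.append(f"  {coverage.name} [{coverage.metadata_url}]")
    return lines


print("\n".join(menu_lines(CATALOG)))
```

Keeping the display name, storage location and metadata link in one record means a new coverage can be added to the menu and the help system in a single step.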
Required functionality for the Interface is broken down into the following categories:
- - Project/View Management: Functions include opening/saving projects, views, tables, charts,
documentation, etc. These functions are for the most part provided by ArcView without
additional programming, but drop-down menus may need to be added to give users access to all
general use projects and applications.
- - Position, Map Extent and Data Selection: Functions include defining the area on the earth to be
included in the user's view.
* The map extent should be determined by the geographical area defined by the user, not the
range of data values.
- - Data Selection and Display: Functions include determining what data will be displayed,
showing relevant documentation about the data to allow the user to decide what layers should be
shown, how the data will be represented or symbolized, and how legend information will appear.
Also includes the ability for users to add their own data.
- - Querying Tools: Functions that allow users to extract information from selected data layer(s),
ranging from simple identify functions to tools that generate statistical summaries and charts,
such as the Population Estimation and Characterization Tool (PECT); buffering analysis; trend
analysis; identifying facilities meeting particular criteria; etc.
- - Regional Applications: Functions include ability to select from a library of Regional
applications or projects to open (e.g., Environmental Justice Analysis, Enforcement Targeting,
etc.).
- - On-line Help: Functions include Web-browser based help screens for special functions of the
Region 2 customization of ArcView and user-friendly metadata for all coverages. (Does not
replace basic help already available within ArcView, but supplements it.)
- - Mapping/Layout:
Users can define a view's area of interest by:
- - Scale (already built into ArcView)
- - EPA Region
- - State(s)
- - City or town
- - Street address/intersection
- - Zip Code(s)
- - Lat/Long
- - Coverage (already built into ArcView)
- - Predefined Project Areas (e.g., LI Sound)
- - HUC Code(s)
- - Quad(s)
- - Quarter Quad
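The requirement that the map extent follow the user's chosen area rather than the range of the data can be sketched for the Lat/Long case as follows. This is an illustration under stated assumptions, not the interface's implementation: the function names are hypothetical, the conversion uses the approximate figure of 111.32 km per degree of latitude, and the sample coordinates are made up.

```python
import math

KM_PER_DEGREE_LAT = 111.32  # approximate length of one degree of latitude


def extent_from_point(lat, lon, radius_km):
    """Return (xmin, ymin, xmax, ymax) in decimal degrees around a point.

    Longitude degrees shrink with the cosine of latitude, so the box is
    widened accordingly. Intended for use away from the poles.
    """
    dlat = radius_km / KM_PER_DEGREE_LAT
    dlon = dlat / math.cos(math.radians(lat))
    return (lon - dlon, lat - dlat, lon + dlon, lat + dlat)


def points_in_extent(points, extent):
    """Keep only the (lon, lat) points that fall inside the extent."""
    xmin, ymin, xmax, ymax = extent
    return [(x, y) for x, y in points if xmin <= x <= xmax and ymin <= y <= ymax]


# The extent comes from the user's area of interest...
extent = extent_from_point(lat=40.0, lon=-74.0, radius_km=11.132)

# ...and the data is clipped to it, rather than the data stretching the view.
facilities = [(-74.05, 40.02), (-70.0, 42.0)]  # hypothetical (lon, lat) pairs
print(points_in_extent(facilities, extent))
```

A distant facility in the data layer does not enlarge the view; it simply falls outside the extent the user asked for.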
Future Direction...
We anticipate by the end of this calendar year having a standard Windows 95/NT environment
with a full complement of GIS analytical tools that will serve well the mission of our Agency.
After we complete this EPA Region-wide implementation, our next priority will be to provide
access to our GIS applications and data over our Intranet using Esri's ArcView Internet Map
Server.
We have adopted Visual Basic as our primary software development environment and we
anticipate future GIS applications to be based on this tool set. We will also be exploring
techniques to directly access the GIS libraries that are maintained by the originators of the data
and eliminate many of the GIS data layers we currently maintain locally.
We are confident the new GIS user interface will be successful because it is being implemented
in the context of all our other work. As this application evolves to a web-based tool, it will be
just another component of the existing desktop. This is in contrast to other information systems
projects currently being developed in unfamiliar groupware technology like Lotus Notes that are
not gaining user acceptance.
Acknowledgements
The user interface specifications described in this paper were developed by the EPA Region 2
ArcView Interface Design Team that also included: Linda Timander, Harvey Simon, Daisy
Tang, Carlos Kercado, Roch Baamonde, William Hansen, Robert Simpson and Stanley Stephansen.
The authors also wish to thank Raj Samayam, Robert Simpson, William Hansen and Frank
DeMarco for their review and valuable comments.
George A. Nossa
Team Leader, Environmental Systems Team
US Environmental Protection Agency - Region 2
Information Systems Branch
290 Broadway
New York, NY 10007
Telephone: (212) 637-3325
Fax: (212) 637-3354
E-mail: nossa.george@epamail.epa.gov
Robert Eckman
GIS Network Administrator
US Environmental Protection Agency - Region 2
Information Systems Branch
290 Broadway
New York, NY 10007
Telephone: (212) 637-3324
Fax: (212) 637-3354
E-mail: eckman.robert@epamail.epa.gov