Lessons Learned in the Migration of GIS from UNIX to a Windows Environment

Like many other organizations, the US Environmental Protection Agency - Region 2 is in the process of migrating GIS applications from UNIX to a Microsoft Windows 95 platform.

This paper describes our difficulties in accessing GIS information in a UNIX environment from an MS Windows desktop and our initiatives to resolve them. The information consists of spatial libraries and non-spatial Oracle databases on local and remote servers. We describe the current technical architecture, which has simplified our operational environment. We also describe the features of a new GIS user interface that has made GIS technology significantly easier to use for our professional and para-professional staff. Lastly, we describe the future direction for providing access to GIS technology, which consists of web-based applications served over an Intranet using Esri's ArcView Internet Map Server.


Introduction

The Region 2 GIS effort began in 1989 when the Office of Policy and Management (OPM) began planning for GIS implementation. Since that time we have established a full complement of equipment, software and communications infrastructure for the use of this technology. We have also established GIS data development partnerships with state counterparts and federal agencies that have yielded a robust set of GIS data coverages. So far, however, use of GIS in Region 2 has been limited to a small number of applications by staff able to cope with a relatively difficult-to-use UNIX-based technology.

Problems with GIS Application in UNIX Environment:

Above Problems Being Addressed by Two Initiatives:

Considerations in Migration of GIS data holdings to a Microsoft Windows Environment:

Because of the large amount of disk space needed to store image data, we evaluated different technologies for compressing this data and considered three options:

  1. Allow the Novell operating system to compress image data, which shrinks files at a 4 to 1 ratio but incurs additional CPU time during image processing.

  2. Re-sample the images using GRID to unproject the data to geographic NAD83 (originally in UTM) and save the resulting image in JPEG format.

  3. Re-sample the images using a commercial product like ERDAS or Esri's Spatial Data Analyst and create a set of images at different resolution scales that can subsequently be selected by an application.

We have adopted option 2 and are able to compress the data at a 10 to 1 ratio. We have also created an image catalog for selecting available images and an index map to manage the image creation process. We will also be piloting option 3 using LizardTech's MrSID software package.
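
The storage effect of the two ratios quoted above is easy to estimate. The following is a minimal sketch only: the 100 GB raw volume is an assumed figure chosen for illustration, not a measurement of our holdings, and the function name is hypothetical.

```python
def compressed_size_gb(raw_gb: float, ratio: float) -> float:
    """Return the stored size of raw_gb of data compressed at ratio-to-1."""
    if ratio <= 0:
        raise ValueError("ratio must be positive")
    return raw_gb / ratio


RAW_GB = 100.0  # assumed raw image volume, for illustration only

# Ratios taken from the text: 4 to 1 for option 1, 10 to 1 for option 2.
OPTIONS = {
    "option 1 (NetWare compression, 4 to 1)": 4.0,
    "option 2 (re-sample to JPEG, 10 to 1)": 10.0,
}

for label, ratio in OPTIONS.items():
    print(f"{label}: {compressed_size_gb(RAW_GB, ratio):.1f} GB")
```

Under these assumptions the same imagery would occupy 25 GB with option 1 and 10 GB with option 2.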

- Migration provided the opportunity to standardize GIS coverages to a common projection. We are creating coverages in unprojected geographic coordinates because the original data sources used various projections (NJ State Plane, NY UTM-18, PR State Plane, EPA Hqts Albers Equal Area). By keeping the coordinates in geographic NAD83 we are able to standardize the input and then project data within the GIS application.

- Creating unprojected images requires a large amount of working disk space (about 7 times the size of the files). The process is also highly CPU-intensive, since each pixel has to be decoded to geographic coordinates and then reprojected.

- The Novell network pre-assigns many server drives, particularly if the client PC uses other Oracle applications. This leaves a very limited number of network drives for GIS applications.

- The GIS libraries in the New York office were also made available to the Edison, New Jersey facility, but data transfer performance over a high-speed (T-1) line was unacceptable. This problem was solved by installing a Novell server and creating a mirror copy of the libraries in the Edison, NJ facility. Since many of our coverages are static, we don't anticipate data synchronization problems.

- When we initially moved the GIS libraries to Novell servers we were unable to access these libraries because we didn't have copies of ArcInfo on an NT server. It became necessary to keep two sets of libraries during this transition... Changes were applied to the libraries on the UNIX server and subsequently copied over to the Novell servers using Samba.

- We used an NT server to create an initial copy of the GIS libraries, which we then tried moving to the Novell server, but discovered that the file system formats were incompatible. Windows NT Server is a complex operating system that requires a system administrator with professional knowledge of the product.
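
Keeping two sets of libraries, or a mirror copy at a second site, raises the question of which files have changed and need to be copied. The following is a minimal sketch of one way to answer it, not a description of the procedure we used: it compares two directory trees by content hash, and the function names and directory layout are hypothetical.

```python
import hashlib
import os


def file_digests(root: str) -> dict[str, str]:
    """Map each file path (relative to root) to the SHA-256 of its contents."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            sha = hashlib.sha256()
            with open(full, "rb") as fh:
                # Read in 1 MB chunks so large image files are not loaded whole.
                for chunk in iter(lambda: fh.read(1 << 20), b""):
                    sha.update(chunk)
            digests[os.path.relpath(full, root)] = sha.hexdigest()
    return digests


def files_to_copy(master_root: str, mirror_root: str) -> list[str]:
    """Return relative paths that are missing from, or differ in, the mirror."""
    master = file_digests(master_root)
    mirror = file_digests(mirror_root)
    return sorted(rel for rel, digest in master.items() if mirror.get(rel) != digest)
```

Calling `files_to_copy(master_root, mirror_root)` with the two library roots returns the files that would have to be transferred. Files that exist only in the mirror are not reported, which matches a one-way copy from the master set.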

Architectural Components of Intel Based GIS System:

The GIS is housed on four separate servers. Two high-end servers running NetWare 4.11 store the GIS coverage libraries and the image data. Two mid-scale servers running Windows NT provide ArcView Internet Map Server functions and, in the future, will host the Oracle GIS database. (The Oracle GIS database still resides on a local UNIX server.)

Configurations:

1. Server containing ArcInfo Libraries, Shapefiles and other vector data:

IBM 325 server running Novell Netware 4.11
- 200 MHz Processor; upgradable to two processors
- 2.25 GB Hard Drive; 16X CD-ROM Drive
- 512 MBytes RAM (4x128 DIMMs)
- 16 GBytes of disk space using 5 GByte drives in Core RAID (Redundant Array of Independent Disks) Lightning Array

2. Image server containing Digital Orthophotography, satellite images and other raster data:

IBM 325 server running Novell Netware 4.11
- 200 MHz Processor; upgradable to two processors
- 2.25 GB Hard Drive; 16X CD-ROM Drive
- 1 GByte RAM (8x128 DIMMs)
- 120 GBytes of disk space using 9 GByte drives in Core RAID Lightning Arrays

3. Intranet Map Server configured with ArcView, ArcView Internet Map Server, MapObjects and MapObjects Internet Map Server:

Data General AV4800 server running Windows NT 4.0
- Dual 166 MHz Pentium Processor
- 2.25 GB Hard Drive; 16X CD-ROM Drive
- 256 MBytes RAM
- 64 MByte VRAM video board interfacing the server console
- 8 GBytes of additional disk space using 4 GByte drives in Core RAID Lightning Array

4. Oracle server housing the GIS database:

Data General AV4800 server running Windows NT 4.0
- Dual 166 MHz Pentium Processor
- 2.25 GB Hard Drive; 16X CD-ROM Drive
- 256 MBytes RAM
- 8 GBytes of additional disk space using 4 GByte drives in Core RAID Lightning Array
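
The four configurations above can be summarized as a small inventory, which makes the aggregate capacity easy to check. This is a minimal sketch: the class and field names are hypothetical, the figures are copied from the lists above, and 1 GByte of RAM is taken as 1024 MBytes (8x128).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Server:
    role: str
    operating_system: str
    ram_mb: int
    array_disk_gb: int  # RAID array capacity, excluding the 2.25 GB internal drive


SERVERS = [
    Server("ArcInfo libraries and vector data", "Novell NetWare 4.11", 512, 16),
    Server("image data", "Novell NetWare 4.11", 1024, 120),
    Server("intranet map server", "Windows NT 4.0", 256, 8),
    Server("Oracle GIS database", "Windows NT 4.0", 256, 8),
]

total_disk_gb = sum(s.array_disk_gb for s in SERVERS)
total_ram_mb = sum(s.ram_mb for s in SERVERS)
print(f"RAID capacity: {total_disk_gb} GB, RAM: {total_ram_mb} MB")
```

The totals come to 152 GBytes of array disk space and 2048 MBytes of RAM, with the image server accounting for 120 of the 152 GBytes.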

Topology:

Connectivity to these servers is via 16 Mbit Token Ring using Cisco routers and Cabletron hubs. Over the next 2-3 years we plan to migrate from Token Ring to Fast Ethernet (100 Mbit), which will provide the capacity and bandwidth we will need to prevent bottlenecks. Because our new building was designed with this foresight, the cabling is in place, but there are significant costs in upgrading the routers and hubs.
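
To put these link speeds in perspective, the best-case transfer time for a single file can be estimated from the nominal rates. This is a rough sketch under stated assumptions: the 50 MByte file size is an arbitrary illustration, a megabyte is taken as 10^6 bytes, the T-1 rate is the standard 1.544 Mbit/s, and protocol overhead and contention are ignored, so real transfers would be slower.

```python
# Nominal link rates in megabits per second.
LINK_MBIT_PER_S = {
    "T-1": 1.544,
    "Token Ring": 16.0,
    "Fast Ethernet": 100.0,
}


def ideal_transfer_seconds(size_mbytes: float, link_mbit_per_s: float) -> float:
    """Best-case time to move size_mbytes over a link at its nominal rate."""
    if link_mbit_per_s <= 0:
        raise ValueError("link rate must be positive")
    return size_mbytes * 8 / link_mbit_per_s


FILE_MBYTES = 50.0  # assumed file size, for illustration only

for name, rate in LINK_MBIT_PER_S.items():
    seconds = ideal_transfer_seconds(FILE_MBYTES, rate)
    print(f"{name}: {seconds:.0f} s")
```

Under these assumptions the same file needs roughly 259 seconds over a T-1 line, 25 seconds over Token Ring and 4 seconds over Fast Ethernet.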

GIS Workstations:

The new Intel-based GIS workstations are 266 MHz Pentium II computers with 64 MBytes of RAM, a 4.4 GByte disk, an SVGA PCI video card with 4 MBytes of VRAM and a 21 in. Trinitron color monitor. Some of these workstations are dual boot (Windows NT and 95) for users working with ArcInfo and/or image processing applications. Otherwise they run a standard Windows 95 image that includes ArcView. This standard Windows 95 image is also being installed on all other Pentium-class desktop computers.

GIS User Interface General Attributes:

Required functionality for the Interface is broken down into the following categories:

  1. - Project/View Management: Functions include opening/saving projects, views, tables, charts, documentation, etc. These functions are for the most part provided by ArcView without additional programming, but drop-down menus may need to be added to give users access to all general-use projects and applications.

  2. - Position, Map Extent and Data Selection: Functions include defining the area on the earth to be included in the user's view.

    * The map extent should be determined by the geographical area defined by the user, not the range of data values.

  3. - Data Selection and Display: Functions include determining what data will be displayed, showing relevant documentation about the data to allow the user to decide what layers should be shown, how the data will be represented or symbolized, and how legend information will appear. Also includes the ability for users to add their own data.

  4. - Querying Tools: Functions that allow users to extract information from selected data layer(s), including simple identify functions; tools that generate statistical summaries and charts, such as the Population Estimation and Characterization Tool (PECT); buffering analysis; trend analysis; identification of facilities meeting particular criteria; etc.

  5. - Regional Applications: Functions include the ability to select from a library of Regional applications or projects to open (e.g., Environmental Justice Analysis, Enforcement Targeting, etc.).

  6. - On-line Help: Functions include Web-browser-based help screens for special functions of the Region 2 customization of ArcView and user-friendly metadata for all coverages. (This supplements, but does not replace, the basic help already available within ArcView.)

  7. - Mapping/Layout:
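
The map extent rule in item 2 above can be illustrated with a short sketch. It is not the interface's actual implementation: the class and function names are hypothetical, and the coordinates are approximate values chosen only to show a New York area view alongside a data point in Puerto Rico.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Extent:
    min_lon: float
    min_lat: float
    max_lon: float
    max_lat: float

    def contains(self, lon: float, lat: float) -> bool:
        return (self.min_lon <= lon <= self.max_lon
                and self.min_lat <= lat <= self.max_lat)


def data_extent(points: list[tuple[float, float]]) -> Extent:
    """Bounding box of the data values themselves (the approach to avoid)."""
    lons = [lon for lon, _lat in points]
    lats = [lat for _lon, lat in points]
    return Extent(min(lons), min(lats), max(lons), max(lats))


def build_view(user_area: Extent, points: list[tuple[float, float]]):
    """Return the view extent and the points drawn in it.

    The extent is the user's area, unchanged. Data outside that area is
    not drawn and never stretches the view.
    """
    visible = [(lon, lat) for lon, lat in points if user_area.contains(lon, lat)]
    return user_area, visible


# Approximate, illustrative coordinates (lon, lat).
user_area = Extent(-74.3, 40.5, -73.7, 40.9)   # around New York City
facilities = [(-74.0, 40.7), (-66.1, 18.4)]    # New York; San Juan, PR

extent, visible = build_view(user_area, facilities)
print(extent)                    # the user's area, unchanged
print(visible)                   # only the New York point
print(data_extent(facilities))   # would span from New York to Puerto Rico
```

With these values the view stays on the New York area, whereas an extent derived from the data would stretch across the whole span between the two points.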

Future Direction...

By the end of this calendar year, we anticipate having a standard Windows 95/NT environment with a full complement of GIS analytical tools that will serve the mission of our Agency well. After we complete this Region-wide implementation, our next priority will be to provide access to our GIS applications and data over our Intranet using Esri's ArcView Internet Map Server.

We have adopted Visual Basic as our primary software development environment and we anticipate future GIS applications to be based on this tool set. We will also be exploring techniques to directly access the GIS libraries that are maintained by the originators of the data and eliminate many of the GIS data layers we currently maintain locally.

We are confident the new GIS user interface will be successful because it is being implemented in the context of all our other work. As this application evolves into a web-based tool, it will be just another component of the existing desktop. This is in contrast to other information systems projects, currently being developed in unfamiliar groupware technology like Lotus Notes, that are not gaining user acceptance.

Acknowledgements

The user interface specifications described in this paper were developed by the EPA Region 2 ARCView Interface Design Team, which also included: Linda Timander, Harvey Simon, Daisy Tang, Carlos Kercado, Roch Baamonde, William Hansen, Robert Simpson and Stanley Stephansen.

The authors also wish to thank Raj Samayam, Robert Simpson, William Hansen and Frank DeMarco for their review and valuable comments.


George A. Nossa
Team Leader, Environmental Systems Team
US Environmental Protection Agency - Region 2
Information Systems Branch
290 Broadway
New York, NY 10007
Telephone: (212) 637-3325
Fax: (212) 637-3354
E-mail: nossa.george@epamail.epa.gov

Robert Eckman
GIS Network Administrator
US Environmental Protection Agency - Region 2
Information Systems Branch
290 Broadway
New York, NY 10007
Telephone: (212) 637-3324
Fax: (212) 637-3354
E-mail: eckman.robert@epamail.epa.gov