Barbara J. Blubaugh and Charlie Ware

Isolating Software Environments and Conquering Version Control - A Model for Software Development Management


ABSTRACT

Application development and maintenance pose specific challenges to system analysts and programmers. Several copies of the application software may be needed at once to satisfy users, programmers, and system developers: users want to work uninterrupted with the current version of the application, programmers need to fix bugs and re-test the current version, developers need to write enhancements, and new releases need to be tested. Version control problems arise when all these activities happen simultaneously, and analysts are overwhelmed by the task of tracking multiple copies of application software and multiple data processing workspaces. The Washington Department of Natural Resources (DNR) has developed a method to facilitate the management of the programs and data associated with large custom applications. This method supports four separate software environments and multiple data environments, and is managed with UNIX environment variables, symbolic links, and a menu-driven DNR utility called SCUM (Software Control Utility Manager).

INTRODUCTION

The Washington Department of Natural Resources (DNR) manages approximately 5 million acres of state forest, agricultural, urban, and aquatic lands. The primary mission of DNR is to manage the resources on state lands to generate revenue for schools, universities, counties, and other beneficiaries. DNR also provides natural resource protection, including regulation of all forest-related activities on 1.25 million acres of state and private lands. A GIS was installed in 1983 to provide land management information and to assist in resource planning efforts. DNR's GIS has over 400 users, including DNR personnel, other governmental agencies, Indian tribes, environmental groups, and the general public.

DNR's GIS consists of a centrally managed database and processing facilities with a large number of remote users across the state. The centralized, shared database contains the DNR corporate data. A central core of programmers develops some of the major applications for the agency. These applications are used to update, display, and analyze the corporate data. Seven DNR regional offices are directly linked to the centralized GIS in Olympia. These regional offices are largely responsible for updating data layers in the centralized database, and they do so with the applications developed by the programming staff.

DNR's GIS hardware consists of a SUN 690 file server and SUN workstations currently running SUNOS 4.1.3. The 690 serves solely as a file server via NFS. All processing is done on 35 SUN SPARC10 and IPX workstations, which are located in Olympia and are accessed by users with Xwindows and other terminals.

THE PROBLEM

The Geographic Information Section of DNR's Information Technology Division currently maintains many layers of tabular and spatial data in a centralized corporate database. These data layers are created and maintained by DNR using in-house application software. New application software is continually being developed to support new data layers or applications. Current application software must also be maintained and enhanced.

The need often arises to have several versions of certain application software. This occurs when an application that is currently being accessed by a user community is in need of a bug fix or enhancement. The users need a stable production environment in order to work uninterrupted while the enhancement is being developed and installed. This requires two versions of the application software - the current version for the users to access and a second version for programmers to install and test the programming changes.

If enhancements or new requirements are significant and take a few weeks or more to develop, a third version of the application software may be needed. While the program changes are being developed, bug fixes may be necessary to the current version of the software. This requires three separate versions of the software - one for production, one for bug fixes, and another for long-term enhancements (a new release).

The existence of multiple versions of software has long been the bane of the computer software industry. Much time, effort, and sanity are required to keep track of several versions and keep them current with each other. Days of work may be lost if the wrong version is deleted or, worse yet, installed in the production environment. Maintaining several copies of the application also uses precious disk space. These problems multiply when several large application systems are being worked on simultaneously by many programmers and analysts.

DNR recently battled the out-of-control version monster while working on a new release of a large software application for the DNR's Forest Practices Division. The version monster was gaining the upper hand until DNR's computer analysts put their noggins together and developed a systematic approach to version control that works! Necessity IS the mother of invention.

THE SOLUTION: PART 1 - THE VERSION CONTROL MODEL

Our solution to version control relies on the following: UNIX environment capabilities (symbolic links and environment variables), UNIX SCCS software, and SCUM (Software Control Utility Manager), a menu-driven program for application maintenance that was designed in-house.

The version control model supports four separate application software environments and multiple data processing environments. The application environment refers to the program storage area and data processing area specific to each version of the application software.

These environments are:

1) PRODUCTION, which contains the application software and data accessed by the user community to process and display data.

2) PATCH, which contains application software and data used by programmers and analysts to fix bugs and re-test software before re-installing it to the production environment.

3) DEMO, which contains application software and data accessed by a controlled set of users to test requested enhancements or new software releases before they are installed in the production environment.

4) DEVELOPMENT, which contains the application software and data used by programmers and analysts to develop enhancements to current applications, to develop new software releases, and to develop new application software.

Additional environments may be created as needed by mixing the data processing workspace of one environment with the application software storage area of another environment. For instance, DNR has identified the need for a user training environment for new software releases. If users are trained in the PRODUCTION environment, PRODUCTION data would be tainted by untrained users. In this case, a new environment called TRAIN can be created by setting pointers to the application software in the PRODUCTION environment, which contains the current software release, and creating a separate data processing workspace with a controlled set of training data. By simply toggling the data pointer to the new location (this process is explained later), a new environment can be created.

THE SOLUTION: PART 2 - THE ROLE OF UNIX

Even though the version control model uses four separate application environments, it does not require four complete copies of all application software. The PRODUCTION environment contains the production version of the application software. The PATCH environment (where bugs are fixed and re-tested) contains UNIX symbolic links which point to the PRODUCTION environment. Only the programs that actually need to be edited are checked out of PRODUCTION via SCUM and edited. The changes can easily be system tested without first having to install them back in the PRODUCTION environment, since all of the PRODUCTION software is accessible from the PATCH environment by way of symbolic links.

The DEMO and DEVELOPMENT environments are set up the same way. The DEVELOPMENT environment contains UNIX symbolic links which point to all the software in the DEMO environment. Only the programs that need to be edited, or new programs that are being added to the new release, exist in the DEVELOPMENT environment. All other programs in the application are accessed through symbolic links. Figure 1 diagrams the version control model.
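
The link arrangement described above can be sketched in a few lines of shell. This is an illustration under stated assumptions, not DNR's own program: the function name link_farm, the directory names, and the rule that an existing regular file is left alone are all hypothetical.

```shell
#!/bin/sh
# Sketch only: fill a PATCH-style area with symbolic links to every program
# in a PRODUCTION-style area. All paths and names below are hypothetical.

# link_farm SRC DEST
# Mirror SRC's directory layout under DEST and link each file back to SRC.
# SRC must be an absolute path so the links resolve from anywhere.
link_farm() {
    src=$1
    dest=$2
    (cd "$src" && find . -type f -print) | while read -r rel; do
        rel=${rel#./}
        # A regular file already in DEST is treated as a checked-out copy: keep it.
        if [ -f "$dest/$rel" ] && [ ! -L "$dest/$rel" ]; then
            continue
        fi
        mkdir -p "$dest/$(dirname "$rel")"
        ln -sf "$src/$rel" "$dest/$rel"
    done
}

# Demonstration in a scratch directory.
root=$(mktemp -d)
mkdir -p "$root/production/soils/aml" "$root/patch"
echo "/* program1 */" > "$root/production/soils/aml/program1.aml"
echo "/* program2 */" > "$root/production/soils/aml/program2.aml"

link_farm "$root/production" "$root/patch"
ls -l "$root/patch/soils/aml"    # both entries are links into production
```

Because only links are stored, the second area costs almost no disk space until a program is actually copied in for editing.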

Figure 1. Software environments in the version control model.

An example of the use of these environments is shown in Figure 2. An AML named program1.aml has been found to contain a bug. During its execution, program1.aml calls program2.aml and program3.aml. Program1.aml is checked out of the PRODUCTION environment and placed into the PATCH environment. The checkout process entails removing the link from PATCH to PRODUCTION and copying the program from PRODUCTION to PATCH. The bug in program1.aml is fixed and tested while still in the PATCH environment. Program2.aml and program3.aml can still be called from program1.aml because they exist in the PATCH environment as links to the actual programs in PRODUCTION.

Figure 2. Example of environment checkout procedure.
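
The two file operations in the checkout (remove the link, copy the program) might look like the following. This is a minimal sketch that assumes hypothetical paths and a hypothetical function name; SCUM also records the checkout through SCCS, which is left out here.

```shell
#!/bin/sh
# Sketch only: the link-swap part of a checkout. SCCS bookkeeping is omitted.
# Function name and paths are hypothetical.

# checkout PROD PATCH FILE
# Replace PATCH/FILE (a link into PROD) with an editable copy.
checkout() {
    prod=$1
    patch=$2
    file=$3
    if [ ! -L "$patch/$file" ]; then
        echo "checkout: $file is not a link (already checked out?)" >&2
        return 1
    fi
    rm "$patch/$file"                      # remove the link
    cp -p "$prod/$file" "$patch/$file"     # copy the program from PRODUCTION
    chmod u+w "$patch/$file"               # make sure the copy is editable
}

# Demonstration: three AMLs in PRODUCTION, all linked from PATCH.
root=$(mktemp -d)
mkdir "$root/production" "$root/patch"
for n in 1 2 3; do
    echo "/* program$n, production version */" > "$root/production/program$n.aml"
    ln -s "$root/production/program$n.aml" "$root/patch/program$n.aml"
done

checkout "$root/production" "$root/patch" program1.aml
echo "/* program1, bug fixed */" > "$root/patch/program1.aml"

cat "$root/patch/program1.aml"         # the patched copy
cat "$root/patch/program2.aml"         # still the production version, via its link
cat "$root/production/program1.aml"    # production is untouched
```

Refusing to check out a file that is no longer a link is a design choice of this sketch: it keeps a second checkout from overwriting edits already in progress.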

UNIX environment variables provide the fastest method of changing the working environment. All of the software applications being developed use a standard set of environment variables designed by the GIS staff. These environment variables are used in place of hardcoded pathnames and data locations in all of our application software (AML, C, FORTRAN, shell scripts). For example, the variable $EXECHOME stores the uppermost directory level for one of the four application software environments. Beneath this upper-level directory all environments are structured identically. $EXECHOME is used as the front end of every pathname pointing to an application software storage area, e.g. $EXECHOME/soils/aml. Because these environment variables replace hardcoded upper-level directory pathnames, the program code for the location of programs and data is identical in all software environments - there's no need to change the program code in order to change the executable environment. This saves our programming staff much time and effort which can be devoted to other matters.

However, before executing any application software these environment variables must be set. A program to set them is executed at the start of every application that uses this methodology. This program is a Korn shell script called dnrenv, which is executed as a dot script. A dot script runs in the current shell rather than in a child process, so the environment variables set by dnrenv are retained until they are reset by another call to dnrenv or until the user logs out. The script uses a switch to toggle between environments. For example, the command ". dnrenv -dv" sets the user's environment variables to "point to" the DEVELOPMENT environment. Figure 3 shows examples of what is stored in the environment variables. These variables may also take on a default setting if initialized at login.

Figure 3. Settings of UNIX environment variables for each of DNR's software environments.
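
A switch-driven setter of this kind could be sketched as follows. It is written as a shell function so the block is self-contained; a function, like a dot script, runs in the calling shell. Only $EXECHOME and the -dv switch come from the text. The other switches, the $DATAHOME variable, and the directory names are assumptions made for illustration.

```shell
#!/bin/sh
# Sketch only: a stand-in for a dnrenv-style setter. -dv and EXECHOME come
# from the paper; -pr, -pa, -dm, DATAHOME, and all paths are hypothetical.

dnrenv_sketch() {
    case "${1:-}" in
        -pr) env_name=production ;;
        -pa) env_name=patch ;;
        -dm) env_name=demo ;;
        -dv) env_name=development ;;
        *)   echo "usage: dnrenv_sketch -pr|-pa|-dm|-dv" >&2
             return 2 ;;
    esac
    # Every environment has the same layout beneath its top-level directory.
    EXECHOME=/gis/$env_name
    DATAHOME=/gisdata/$env_name
    export EXECHOME DATAHOME
}

dnrenv_sketch -dv
echo "$EXECHOME/soils/aml"    # /gis/development/soils/aml

# Run in a child process instead, and the setting is lost when it exits.
( dnrenv_sketch -pa )
echo "$EXECHOME"              # /gis/development
```

The last two commands show why the script has to be run as a dot script: a setting made in a child process never reaches the shell that launches the application.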

An example of the use of the UNIX environment variables is shown in Figure 4. The values of the variables depend on which software environment the user has set before starting the program. This determines which version of the software is used and which data set is accessed.

Figure 4. Usage of UNIX environment variables in an AML.

Not all users have access to all environments. Each application displays a menu from which users select the environment they want to work in. Programmers may choose any of the software environments (PRODUCTION, PATCH, DEMO, DEVELOPMENT). Users who are not members of the programming staff are limited to the PRODUCTION and DEMO environments.

THE SOLUTION: PART 3 - SCUM

SCUM (Software Control Utility Manager) is a menu-driven application used by the programming staff to move software from one environment to another, to get information regarding which programs are currently being edited and who is working on them, and to get software editing histories. SCUM was designed by our programming staff and written as a Korn shell script by Software Technologies, Inc., a software contractor based in Westchester, Illinois. Figure 5 shows the full SCUM menu options.

Figure 5. SCUM menu options.

SCUM uses the UNIX SCCS utility to "check out" software from either the PRODUCTION or DEMO areas for editing, to "check in" software that is ready to be installed back to the PRODUCTION or DEMO areas, to display software history, and to track which programmers are working on which programs. The UNIX SCCS utility also keeps track of all changes made to a piece of software each time it is "checked in" to the PRODUCTION or DEMO areas. Previous versions of the software can be retrieved at any time through SCCS. SCUM also performs a number of management duties associated with administering the version control model, such as creating and deleting the symbolic links between environments at appropriate times, setting and maintaining software access levels and SCUM privileges, and maintaining the UNIX SCCS directories.
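
The paper does not show how SCUM invokes SCCS, so the following is only a guess at the correspondence between the four duties named above and standard subcommands of the sccs front end. The action names are hypothetical, and the commands are printed rather than executed so the sketch runs without SCCS installed.

```shell
#!/bin/sh
# Sketch only: a dry-run mapping from SCUM-style actions to sccs subcommands.
# Action names are hypothetical; commands are printed, not executed.

scum_sketch() {
    action=$1
    file=${2:-}
    case "$action" in
        checkout) echo "sccs edit $file" ;;     # editable copy; records who holds it
        checkin)  echo "sccs delget $file" ;;   # record the change, leave a read-only copy
        history)  echo "sccs prs $file" ;;      # print the change history
        editing)  echo "sccs info" ;;           # list files being edited, and by whom
        *)        echo "scum_sketch: unknown action '$action'" >&2
                  return 2 ;;
    esac
}

scum_sketch checkout program1.aml
scum_sketch checkin program1.aml
scum_sketch history program1.aml
scum_sketch editing
```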

USING ENVIRONMENT VARIABLES FOR ATOOL DIRECTORIES

The PATCH and PRODUCTION ARC atool directories can also be accessed without changing program code by using environment variables. We use two variables called $ARCATOOL1 and $ARCATOOL2 to access ARC atool directories. The contents of these variables in each environment are described in Figure 3. ARC software allows more than one atool path to be defined in the &ATOOL directive. This allows every AML program to set the ARC atool path as described in Figure 6.

Figure 6. Lines of AML code used to set pathnames to ATOOL program directories using UNIX environment variables.

ARC software checks the location of the first pathname in the &ATOOL directive for the existence of the AML. If the AML does not exist there, it checks the second pathname. In the PRODUCTION environment $ARCATOOL1 contains the pathname to the PRODUCTION atool directory and $ARCATOOL2 contains the pathname to the PATCH atool directory. In the PATCH environment $ARCATOOL1 contains the pathname to the PATCH atool directory and $ARCATOOL2 contains the pathname to the PRODUCTION atool directory. When executing from the PATCH environment it is therefore possible to run a "test" or "patch" version of an atool that exists in PRODUCTION without having to change the name of the AML or the pathname to the atool directory. ARC checks the PATCH atool directory ($ARCATOOL1) first and executes the version of the AML it finds there. If that AML is simply a link to the PRODUCTION atool directory, the PRODUCTION version is executed. If, however, the link has been removed and replaced with an updated version of the AML, the updated version is executed.
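
The search order can be imitated in shell to see how a link and a checked-out copy behave differently. This emulates the lookup described above and is not ARC's implementation. The function name, atool names, and directories are hypothetical.

```shell
#!/bin/sh
# Sketch only: emulate the two-directory search order described above.
# This is not ARC's implementation; names and paths are hypothetical.

# find_atool NAME
# Print the first NAME found in $ARCATOOL1, then $ARCATOOL2.
find_atool() {
    for dir in "$ARCATOOL1" "$ARCATOOL2"; do
        if [ -f "$dir/$1" ]; then    # -f follows symbolic links
            echo "$dir/$1"
            return 0
        fi
    done
    echo "find_atool: $1 not found" >&2
    return 1
}

root=$(mktemp -d)
mkdir "$root/prod_atool" "$root/patch_atool"
echo "/* report, production */" > "$root/prod_atool/report.aml"
echo "/* buffer, production */" > "$root/prod_atool/buffer.aml"
# PATCH holds a link for report.aml and a checked-out copy of buffer.aml.
ln -s "$root/prod_atool/report.aml" "$root/patch_atool/report.aml"
echo "/* buffer, patched */" > "$root/patch_atool/buffer.aml"

# Settings for the PATCH environment: PATCH first, PRODUCTION second.
ARCATOOL1=$root/patch_atool
ARCATOOL2=$root/prod_atool

cat "$(find_atool buffer.aml)"    # the patched copy
cat "$(find_atool report.aml)"    # production content, reached through the link
```

Swapping the two variable values gives the PRODUCTION ordering, in which the production copy of buffer.aml is found first.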

ADMINISTERING THE VERSION CONTROL MODEL

The version control model is relatively simple to create for a software application. The application is written and tested in the DEVELOPMENT environment. It is then moved to the DEMO environment (via SCUM or directly with UNIX SCCS commands). Once an exact copy of the application resides in DEMO, the programs in the DEVELOPMENT area are replaced with UNIX symbolic links to the DEMO area. (Links are not created to the UNIX SCCS directories.) We wrote a program to do this step for us.
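
DNR's program for this step is not shown, so the following is one way such a step might be written. The function name and paths are hypothetical. The check that skips files differing from DEMO is an added safety assumption, not something the paper describes.

```shell
#!/bin/sh
# Sketch only: replace DEVELOPMENT programs with links to their DEMO copies,
# skipping SCCS directories. Names and paths are hypothetical.

# relink_to_demo DEMO DEV   (DEMO must be an absolute path)
relink_to_demo() {
    demo=$1
    dev=$2
    # -prune keeps find from descending into any directory named SCCS.
    (cd "$demo" && find . -name SCCS -prune -o -type f -print) |
    while read -r rel; do
        rel=${rel#./}
        target=$dev/$rel
        # Assumption: a file that differs from DEMO may hold unreleased edits.
        if [ -f "$target" ] && [ ! -L "$target" ] &&
           ! cmp -s "$demo/$rel" "$target"; then
            echo "skipped (differs from DEMO): $rel" >&2
            continue
        fi
        mkdir -p "$(dirname "$target")"
        ln -sf "$demo/$rel" "$target"
    done
}

# Demonstration in a scratch directory.
root=$(mktemp -d)
mkdir -p "$root/demo/aml/SCCS" "$root/dev/aml"
echo "/* program1 */" > "$root/demo/aml/program1.aml"
echo "history"        > "$root/demo/aml/SCCS/s.program1.aml"
cp "$root/demo/aml/program1.aml" "$root/dev/aml/program1.aml"

relink_to_demo "$root/demo" "$root/dev"
ls -l "$root/dev/aml"    # program1.aml is now a link; no SCCS entry was created
```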

Next, the DEMO software storage area is copied to the PRODUCTION storage area. A copy of the UNIX SCCS directories and all of the application software now exists in PRODUCTION. Finally, symbolic links to each program in PRODUCTION are created in the PATCH software storage area. The model is now complete.

As programs are altered in the DEMO environment to support a new software release, the DEMO environment software version becomes out of sync with the PRODUCTION environment. When the new release is ready to be installed to PRODUCTION, the DEMO software storage area is simply copied to the PRODUCTION area, overwriting the old PRODUCTION version. The links from the PATCH environment to PRODUCTION environment are then re-established. The new version of the application software is now installed in PRODUCTION.

SPECIAL CONSIDERATIONS

Some directory structures, such as INFO directories, work together as a unit, and it would be senseless to create individual links to each file in these directories. SCUM handles INFO directories as a unit: they are checked out, edited, and checked in as a whole. Also, INFO programs will not compile with embedded UNIX environment variables. Since these variables cannot be used, pathnames must be hardcoded in INFO programs. Instead of changing these pathnames in the INFO program code by hand, we have incorporated this task into SCUM. Thus, when an INFO directory is checked out of the PRODUCTION environment and into the PATCH environment, SCUM searches for all occurrences of PRODUCTION-specific pathnames and changes them to PATCH pathnames. SCUM also enters INFO and re-compiles all the programs to reflect the pathname changes.

When INFO directories are checked back into the PRODUCTION environment, the INFO "src" files are scanned for PATCH-specific pathnames. If any are found, a message is displayed to inform the programmer doing the installation. The programmer must then change the pathnames (by way of a program) back to PRODUCTION pathnames and re-compile the INFO programs before installing them back to the PRODUCTION environment. This step is not handled automatically by SCUM, in order to protect the PRODUCTION environment from damage in case a program compilation is not successful.
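
The pathname rewrite at checkout and the scan at check-in could be approximated with sed and grep. This sketch treats the INFO source as plain text. The directory prefixes, the file name, and the sample line are made up for illustration, and the INFO re-compile step is not shown.

```shell
#!/bin/sh
# Sketch only: rewrite and detect environment-specific pathnames in source
# files. Prefixes, file names, and file contents are hypothetical.
# Prefixes must not contain "|" or sed metacharacters.

# retarget_paths OLD NEW FILE...
# Replace every occurrence of OLD with NEW in each file.
retarget_paths() {
    old=$1
    new=$2
    shift 2
    for f in "$@"; do
        sed "s|$old|$new|g" "$f" > "$f.tmp" && mv "$f.tmp" "$f"
    done
}

# stale_paths PREFIX FILE...
# List the files that still mention PREFIX; fail if none do.
stale_paths() {
    prefix=$1
    shift
    grep -lF -- "$prefix" "$@"
}

root=$(mktemp -d)
echo "placeholder line using /gis/production/soils/info/lookup" > "$root/prog1.src"

# Checkout direction: PRODUCTION pathnames become PATCH pathnames.
retarget_paths /gis/production /gis/patch "$root"/*.src
cat "$root/prog1.src"

# Check-in direction: warn while PATCH pathnames remain.
if stale_paths /gis/patch "$root"/*.src; then
    echo "PATCH pathnames found; convert and re-compile before installing" >&2
fi
```

Keeping the check-in side as a report rather than an automatic rewrite mirrors the reasoning above: the programmer confirms a clean compile before anything reaches PRODUCTION.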

CONCLUSION

The GIS staff developed this software management model to ease the burden of managing large software applications. Since the model was installed in February 1994, it has worked well and has not caused any significant problems. This software management technique is currently being used by eight programmers to develop and maintain one large software application consisting of approximately 160 Korn shell scripts, 60 FORTRAN programs, 110 AMLs, 10 C programs, and 260 INFO programs and input forms.

REFERENCES

Bolsky, Morris I., and David G. Korn. The Kornshell Command and Programming Language. New Jersey: Prentice-Hall, 1989.

Silverberg, Israel. Source File Management with SCCS. New Jersey: Prentice-Hall, 1992.

Environmental Systems Research Institute, Inc. AML Users Guide. Redlands: Esri, 1991.


Barbara J. Blubaugh, Computer Analyst/Programmer
Charlie Ware, GIS Specialist
State of Washington
Department of Natural Resources
P.O. Box 47020
Olympia, WA 98504-7020
Telephone: (206)902-1500
Fax: (206)902-1790