Christi Stevens

MANAGING PARCELS WITH ARCSTORM

Abstract: This paper highlights experiences developing an ArcStorm PARCEL management system. Managing parcels in an ArcStorm environment holds many advantages over the ArcInfo Librarian system. Concurrent transaction management and feature locking show great potential, but poor performance becomes an issue when managing a large, multi-feature parcel database.

Improvements in performance will go a long way toward making ArcStorm a truly functional transaction management and storage facility for spatial data. This paper presents the results of numerous benchmarks that show how dual or multi-CPU servers may help, along with tips that may help in the development of your own ArcStorm PARCEL management system.


  

Introduction  

  

Because of a much needed reworking of the Louisville & Jefferson County   

Information Consortium's (LOJIC) property database, and the pending   

implementation of ArcInfo 7.0, the Property Valuation Administration   

(PVA) GIS PROPERTY database was redesigned in 1995.  Originally housed in   

the ArcInfo Librarian database subsystem, the PROPERTY Library consisted   

of 990 tiles.  Now residing in ArcStorm 7.03, the database consists of 45 tiles, 

representing approximately 300,000 land parcels.  The data was originally 

digitized from various PROPERTY block maps of differing scale and accuracy,

and graphically adjusted to fit the existing planimetric basemap data residing in

the Consortium's GIS.  (For a description of the PROPERTY database schemas,

see Appendix A.)  

  

The primary goal of the PVA PROPERTY Management System is to update and   

maintain property data that accurately describes ownership, shape and   

location of land parcels in Jefferson County for tax assessment purposes.    

Faced with a software upgrade to ArcInfo 7.0, a parcel database and   

management application originally designed in 1991, and a much needed   

reworking of the original parcel database design,  a transition to the new   

ArcStorm Spatial Data Management System had to be considered.     

  

At this time, the data was far from user friendly.  A complex coding system was

used to assign descriptive line attributes in 127 categories of various combinations

of block and lot lines.  To coherently symbolize the data graphically, an equally 

complex look up table was needed to determine the type of line you were working 

with.  For example, a line feature with an attribute code of 1 was a street right of way.

A code of 14 meant that a single line feature represented a railroad right of way,

a subdivision boundary line, and also a tax block boundary line.  Making matters worse,

much of this information was repeated in

other GIS layers found in the LOJIC GIS libraries.  

  

After almost 10 years of collecting and refining the GIS, LOJIC data was maturing, 

and the number of users was increasing rapidly.  It was becoming increasingly

important to provide Consortium members with easy access to LOJIC data.  Our

goals were changing from a focus on data creation to a focus on our clients'

need for more user friendly data and easier access to that data.  To meet this

responsibility it was necessary to redesign the PROPERTY database.  It was only 

after lengthy investigation that ArcStorm was chosen as the new PROPERTY

Management System.  By highlighting the challenges experienced during these 

processes, this paper will reveal practical development practices and methods that 

may be beneficial to anyone wishing to migrate to a new spatial data management 

system.  

  

ArcStorm was chosen as the PARCEL Management System for a variety of reasons.  Because as many as 8 PVA staff require concurrent access to update Jefferson County's PARCEL data, concurrent transaction management is critical.  Additionally, the ability to lock individual features, rather than extracting an entire tile and making its extraneous features unavailable, meant less processing and scheduling time.  Finally, system recovery mechanisms would allow the database to be returned to a consistent state if system problems occurred during transaction check-ins.

  

Benefits and Challenges of ArcStorm  

  

Prior to making the decision to migrate from the ArcInfo Librarian   

system, a partial copy of the PROPERTY database, representing about 

30,000 parcels, was loaded into ArcStorm 7.0 for integrity and performance   

testing by LOJIC staff.   Working closely with PVA, LOJIC staff spent 2 

months developing the initial application interface and testing ArcStorm 

performance and integrity before the commitment was made to migrate to the 

ArcStorm system.  This testing was followed by 6 months of client interviews, 

demos, and application development before the new ArcStorm PROPERTY 

Management application rolled out to PVA for preliminary use.    

  

The original property data consisted of three layers.  The primary layer,   

the parcel polygons,  consisted of both arc and polygon attributes.  The   

second layer consisted of historic and block boundary lines, many of   

which were duplicated in the PARCEL layer.  And the third layer,   

representing administratively merged parcel polygons, was called tax   

areas.  Because the tax area data could be easily derived from the existing   

PARCEL layer, the decision to drop this data was made.  With the exception   

of block lines that intersect right of ways, block boundaries could also be   

derived from the PARCEL layer.  The line attributes were simplified as well,   

resulting in only two data layers, the PARCEL layer, and the historic and   

non-coincident block line data, called the Propline layer.  The resulting   

database schemas are shown in Appendix A.  

  

Implementing the new database design proved to be a challenge.  First the   

data had to be modified to meet the objective of providing spatial   

information that was easy to interpret and use.  Aside from the initial   

problem of implementing the changes to the parcel data itself, it was   

immediately evident that the continuous nature of  the right of way   

polygon was going to be a problem.  The 990 tile Librarian grid was used   

to isolate right of way, thereby limiting the impact of transaction   

checkouts.  Block boundary lines were also taken from the historic and   

non-coincident block line layer, called PROPLINE, and used to further   

isolate the impact of right of ways.   Because  Jefferson County contains   

some 300,000 parcels spread across approximately 385 square miles,   

various tiling structures were tested and tuned for optimum checkout and   

check in times.  After numerous tests, the data was partitioned by feature   

density into 45 square and rectangular tiles.   

  

The next step was to measure performance.  Transaction check out and   

check in times were measured over the WAN, with the ArcStorm   

database residing on a remote server.  Due to a shortage of disk space,   

preliminary testing was limited to a partial copy of the PROPERTY database   

consisting of approximately 30,000 parcels.  Preliminary benchmarks   

looked good.  Scheduling and locking went smoothly and the nuts and   

bolts of working with property data in an ArcStorm environment were   

being worked out.  Concentration soon shifted from working on the data   

itself and getting it right for ArcStorm, to application development.  

Many challenges followed and various difficulties with ArcStorm were   

solved with creative programming, but  as of this writing, others are   

pending.  A few of the challenges faced during the various processes of   

migrating to ArcStorm are outlined below:  

  

 - Transactions involve two layers, both with annotation: the primary PARCEL layer and the secondary PROPLINE layer, which archives historic property lines.  The PROPLINE layer may contain no features in the area covered by the parcels being checked out.  In this case it is necessary to copy the .bnd file of the transaction's primary PARCEL layer to the transaction's historic layer.  Oddly, when two or more users check out an area with no historic features, ArcStorm always schedules the locks to tile_1.  This happens regardless of where the transaction location falls in the tile grid.

  

-  During the locking phase, viewing users who wish to draw parcel data   

residing in the ArcStorm database experience a "wait state" when   

nothing happens.  This occurs even if the view is well away from the   

area being locked or unlocked.  Users accessing data for view and   

query that resides on the ArcStorm PROPERTY database server could   

experience waits in excess of 20 minutes during locking activity.  To   

alleviate this problem, the property data is copied nightly into a   

Librarian structure specifically for view and query access.    

   

-  After PVA staff were on line with the new ArcStorm PARCEL   

Management system for a couple of months, unrelated feature locking   

started to occur and became more and more frequent as time went on.    

As a result, ArcStorm client processes would bottleneck and the   

asmaster would have to be killed and restarted.  Monitoring   

transaction activity in Schemaedit showed that non-adjacent and   

unreasonably high numbers of tiles were being locked during check   

out.  Further analysis showed that duplicate object-ids existed on

distinctly different objects across the database.  To correct the problem 

the database had to be rebuilt, and to prevent it from recurring, the 

object-ids for all feature classes are calculated to 0 prior to check-in.

  

-  For reasons unknown, the database periodically goes into an   

unrecoverable state.  RECOVERDB must be used to put the  database   

back into a consistent state.  

  

-  For reasons unknown, the wservice sometimes dies leaving an   

orphaned asmaster.  The orphaned process must be killed before the   

wservice can be restarted.  These episodes became less frequent after   

moving the ArcStorm server to a dual CPU machine.  

  

-  Transaction recovery during check-in is sometimes unable to return a

message that the transaction check-in was successful, causing confusion as

to whether the transaction succeeded or not.

  

 - Generally poor check-out and check-in performance.  
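
The duplicate object-id problem in the list above lends itself to a simple audit. The Python sketch below assumes the object-ids have already been exported from the database as (tile, feature class, object-id) records; that export format and the function name are hypothetical, and nothing here uses an ArcStorm or ArcInfo API. It only shows the bookkeeping needed to find ids that are shared by more than one feature.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple

# Hypothetical export format: (tile name, feature class, object-id)
Record = Tuple[str, str, int]


def find_duplicate_object_ids(records: Iterable[Record]) -> Dict[int, List[Tuple[str, str]]]:
    """Return every object-id found on more than one feature, with where it was found."""
    seen: Dict[int, List[Tuple[str, str]]] = defaultdict(list)
    for tile, feature_class, object_id in records:
        seen[object_id].append((tile, feature_class))
    # Keep only the ids that occur at least twice.
    return {oid: places for oid, places in seen.items() if len(places) > 1}
```

An empty result would suggest the ids are unique. Any entry lists the tiles and feature classes that share an id, which is the kind of link that could cause non-adjacent tiles to be locked together.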

  

Continued poor performance led to extensive benchmarking on different system configurations to gauge performance and stability.  Because some ArcStorm processes are extremely I/O intensive while others are CPU intensive, the optimum platform was anticipated to be one with dual or multiple CPUs and very fast disks.  Results of these benchmarks follow.

  

Benchmarks  

  

Benchmarks were performed against the ArcStorm PARCEL database on 4   

different system configurations. Client processing time was subtracted   

from transaction check out and check in times to more accurately gauge   

the performance of the ArcStorm server itself.  The tables found below   

list various server times for transaction check outs and check ins.  To   

ensure consistency, the following conditions were observed:  

  

-  Check-in times were modified by subtracting 30 seconds from all times to account for processing that takes place locally on the client machine.

-  To eliminate potential CPU usage by extraneous server processes, all testing was done after hours while CPU usage was flat and no other processes were running against these machines.

-  To keep client performance constant and accurately measure the PVA ArcStorm server's performance, all testing was done on the same SPARC Station IPX machine.

   

Hardware configurations tested include:  

  

-  ULTRA SPARC, Model 170, 33 MHz, Single CPU

-  SPARC Station 20, Model 71, 75 MHz, Single CPU

-  SPARC Station 10, Model 41, 33 MHz, Dual CPU

-  SPARC Station 10, Model 20, 33 MHz, Single CPU

  

Transaction logs accumulated since PVA went on line with the ArcStorm   

database server were analyzed to determine actual processor time   

occurring on the server during check out and check in processes.  With   

the exception of Graph D, all graphs show time in minutes plotted over   

the number of days that the original, single CPU ArcStorm server was   

on-line at PVA. Graph D illustrates the difference between processing   

times for the original PVA ArcStorm single CPU server, versus the dual   

CPU server now used.  This data was tabulated and graphed into four   

categories, described below:  

  

- Graph A:  Total transaction times on the original PVA ArcStorm   

single CPU server.  

   

-  Graph B:  Total check-out time on the original PVA ArcStorm single   

CPU server.  

   

-  Graph C:  Total check-in time on the original PVA ArcStorm single   

CPU server.  

   

-  Graph D:  Total transaction times on the original PVA ArcStorm   

single CPU server compared to the currently used, dual CPU PVA ArcStorm   

server.  



(Graphs B and C represent a user's point of view.  If three transactions start at the same time and all finish in 15 minutes, these graphs show three 15-minute transactions.)

  

Performance Graphs  

  

Graph A)    

  

Graph A shows total average PVA ArcStorm server transaction   

processing times for both data check outs and check ins.  This data was   

derived from the actual PVA transaction logs and shows time in minutes   

over the number of days the PVA's ArcStorm database was in use on the   

single CPU server.  Average transaction time is approximately 8.5   

minutes.  The extreme spikes that occur showing abnormally long   

transaction times appear to correlate to times when bottlenecks occurred   

due to unusually high numbers of simultaneous transaction processes.  

  

Graph B)    

  

Graph B depicts total ArcStorm transaction times for check-out   

processing occurring on the original PVA single CPU ArcStorm server.    

Again the extreme spikes appear to correlate to times when unusually   

high numbers of simultaneous transactions were occurring. The variance   

seen in the transaction times is due to fluctuating numbers of   

simultaneous transactions.  

  

Graph C)    

  

Graph C shows total ArcStorm transaction times for check-in processing   

occurring on the original PVA ArcStorm single CPU server.  Again the   

extreme spikes appear to correlate to times when unusually high numbers   

of simultaneous transactions were occurring.  The variance seen in the   

transaction times is due to fluctuating numbers of simultaneous   

transactions.    

  

Graph D)    

  

The final graph, Graph D, shows total transaction time for both check-in   

and check-out processing on both the original, single CPU PVA   

ArcStorm server, and the new, temporary replacement, dual CPU   

ArcStorm server.  Times were plotted in minutes over a 20 day period.    

The first 10 days show the original single CPU server times.  The   

remaining 10 days show processing times on the dual CPU server that is   

currently being used until a permanent solution is identified.  This graph   

is the most revealing as it shows a very distinct improvement in   

transaction processing time since the installation of the dual CPU   

ArcStorm server.  

  

ArcStorm Server Performance Measurements Under   

Increasing CPU Load  

  

As stated previously, graph information was derived from the ArcStorm   

server logs and shows actual computer processing time.  From the user's   

point of view, times may not correlate directly due to many factors,   

including network traffic, sub-standard workstations, multi-tasking, and   

other daily tasks that can impact the performance of the local machine.  

  

To gauge ArcStorm server performance under increasing load, the following benchmarks were performed, showing average times for 1 to 3 simultaneous transactions during check-out and check-in processing, run from a SPARC Station IPX workstation.  Currently, up to 7 transactions can occur simultaneously, one for each PVA client.  Because this scenario is extremely rare and difficult to model, and analysis of PVA logs shows that the average maximum number of simultaneous transactions is 3, three simultaneous transactions were chosen as the upper limit for these benchmarks.
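
The figure of 3 simultaneous transactions comes from analysis of the PVA logs. The Python sketch below shows one way such a figure could be computed, assuming each day's log has been reduced to a list of (start, end) times in minutes. That reduced format and both function names are assumptions made for illustration; the actual ArcStorm log layout is not described in this paper.

```python
from typing import Iterable, List, Tuple

Span = Tuple[float, float]  # (start, end) of one transaction, in minutes


def peak_simultaneous(transactions: Iterable[Span]) -> int:
    """Return the largest number of transactions open at the same moment."""
    events = []
    for start, end in transactions:
        events.append((start, 1))   # a transaction opens
        events.append((end, -1))    # a transaction closes
    # Sorting puts a close (-1) before an open (+1) at the same instant,
    # so back-to-back transactions are not counted as overlapping.
    events.sort()
    current = peak = 0
    for _, delta in events:
        current += delta
        peak = max(peak, current)
    return peak


def average_daily_peak(days: List[List[Span]]) -> float:
    """Average the daily peaks over several days of logged transactions."""
    peaks = [peak_simultaneous(day) for day in days]
    return sum(peaks) / len(peaks)
```

For example, three transactions logged as (0, 15), (5, 20) and (10, 12) overlap at minute 10, so the peak for that day is 3.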

  

Three Simultaneous ArcStorm Transactions  

  

ULTRA SPARC

Check Out Time      Check In Time
1:48                7:40
2:40                9:10
2:59                13:40
Avg. = 2:27         Avg. = 10:00

SPARC 20

Check Out Time      Check In Time
4:57                5:00
4:58                6:17
5:09                7:55
Avg. = 5:01         Avg. = 6:24

SPARC 10, Single CPU

Check Out Time      Check In Time
7:04                10:23
7:05                13:33
7:06                17:00
Avg. = 7:05         Avg. = 13:38

SPARC 10, Dual CPU

Check Out Time      Check In Time
1:55                4:40
2:19                6:33
3:10                7:55
Avg. = 2:28         Avg. = 6:03

  

Two ArcStorm Transactions  

  

ULTRA SPARC

Check Out Time      Check In Time
1:40                6:56
2:20                8:16
Avg. = 2:00         Avg. = 7:18

SPARC 20

Check Out Time      Check In Time
3:28                4:15
3:31                5:30
Avg. = 3:30         Avg. = 4:53

SPARC 10, Single CPU

Check Out Time      Check In Time
6:00                8:50
6:02                11:37
Avg. = 6:01         Avg. = 10:14

SPARC 10, Dual CPU

Check Out Time      Check In Time
2:05                3:19
2:15                4:25
Avg. = 2:10         Avg. = 3:52

  

One ArcStorm Transaction  

  

ULTRA SPARC

Check Out Time      Check In Time
Avg. = 1:10         Avg. = 3:30

SPARC 20

Check Out Time      Check In Time
Avg. = 1:55         Avg. = 3:00

SPARC 10, Single CPU

Check Out Time      Check In Time
Avg. = 4:00         Avg. = 9:00

SPARC 10, Dual CPU

Check Out Time      Check In Time
Avg. = 2:30         Avg. = 4:30
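
The averages in the tables above are simple means of minute:second timings. The Python sketch below shows how such an average can be computed; the helper names are illustrative only, and rounding to the nearest second (halves rounded up) is an assumption, since the paper does not state how its averages were rounded.

```python
from typing import List


def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string such as '4:57' to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def to_mmss(total_seconds: int) -> str:
    """Format whole seconds back into 'minutes:seconds'."""
    return f"{total_seconds // 60}:{total_seconds % 60:02d}"


def average_time(times: List[str]) -> str:
    """Mean of several 'minutes:seconds' timings, rounded to the nearest second."""
    total = sum(to_seconds(t) for t in times)
    n = len(times)
    # Integer arithmetic that rounds halves up, avoiding float rounding surprises.
    return to_mmss((2 * total + n) // (2 * n))
```

Applied to the SPARC 20 rows of the three-transaction benchmark, `average_time(["4:57", "4:58", "5:09"])` gives `5:01` and `average_time(["5:00", "6:17", "7:55"])` gives `6:24`.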

  

Check-Out Benchmark For Local Processing Times  

  

With one exception, client or local processing time used during ArcStorm   

transaction processing is insignificant.  The exception occurs during the   

selection process when the client chooses the area of interest to be   

extracted from the ArcStorm database.  Client processing during these   

times varies greatly according to the client system.   The following 4   

commonly used system configurations were measured and are listed   

below:  

  

SPARC 20 - 1:15  

SPARC 5  -  1:32  

SPARC IPX - 4:32  

SPARC IPC - 5:20      

  

Conclusion  

  

It is evident that multi-CPU servers are the optimum choice for ArcStorm processing.  The dual-processor server is approximately twice as fast, although performance degrades gradually under increasing load.  It should be expected that a dual-processor SPARC 20 or ULTRA SPARC, or a multi-processor server, will perform best, but will also show some degradation when transactions are processed simultaneously.
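
As a rough check on the "approximately twice as fast" figure, the speedup can be worked out from the reported averages for the single CPU and dual CPU SPARC 10. The Python sketch below does this arithmetic; the function names are illustrative, and the input values are the averages reported in the benchmark tables.

```python
def to_seconds(mmss: str) -> int:
    """Convert a 'minutes:seconds' string to whole seconds."""
    minutes, seconds = mmss.split(":")
    return int(minutes) * 60 + int(seconds)


def speedup(single_cpu: str, dual_cpu: str) -> float:
    """Ratio of single CPU time to dual CPU time; 2.0 means twice as fast."""
    return to_seconds(single_cpu) / to_seconds(dual_cpu)


# Reported SPARC 10 averages: (single CPU, dual CPU)
one_checkout = speedup("4:00", "2:30")     # one transaction, check-out
one_checkin = speedup("9:00", "4:30")      # one transaction, check-in
three_checkout = speedup("7:05", "2:28")   # three transactions, check-out
three_checkin = speedup("13:38", "6:03")   # three transactions, check-in
```

With one transaction the ratios are 1.6 for check-out and 2.0 for check-in. With three simultaneous transactions they are about 2.9 and 2.3, so on these figures the advantage of the second CPU is at least as large under load.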

  

Another factor affecting the performance of the ArcStorm PARCEL database   

tasks performed at PVA is the use of sub-standard workstations.  Currently only   

one PVA staff member has access to a SPARC Station 5, Model 70   

machine, the minimum system recommended by Esri.  Upgrading  PVA   

staff workstations in addition to using a dual or multi-CPU dedicated   

server may go a long way in increasing the productivity of the PVA's GIS   

staff.  

  

Many lessons were learned about application and database development   

through this experience.  Foremost is the need to emulate, as closely as   

possible, how the client will use the data.  Client load and database size can   

significantly impact the performance of any system.  If the impact of new   

technology causes significant performance decreases, weigh carefully any   

decision to migrate to a new platform.  The investment in development   

time is too costly to turn out a product that adversely impacts   

productivity.  As a general rule, as software development progresses, the   

need for new, faster and more powerful hardware becomes something to   

consider.  The success of LOJIC's conversion of the property data from a   

Librarian system to an ArcStorm system depends heavily on the ability to   

acquire appropriate hardware.   

  

If data integrity is an issue in your current configuration, then a minor decrease in performance may be acceptable.  In this respect, ArcStorm holds many benefits over ArcInfo Librarian.  Database recovery mechanisms ensure that the data will revert to a consistent state if a transaction fails to check in successfully, and performance decreases can be remedied if budget allows for improved hardware.

  

In the PVA's case, productivity has increased.  To increase productivity even

more, the ArcStorm PROPERTY data server was changed from a single-CPU

machine to a dual-CPU machine.  Plans are under way to acquire a dedicated dual

or multi CPU server, and client machines are being upgraded as well.   

  

Acknowledgments  

  

Special thanks to the property mapping staff at the Property Valuation Administration   

of Louisville and Jefferson County KY for their valuable feedback and help in

making this project a success.

  

Appendix A  

Property Library Data Dictionary
  

  

Christi Stevens GIS Analyst LOJIC 700 W Liberty St Louisville,KY 40203-1913 (Phone) 502-540-6383 (Fax) 502-540-6562