Palle Due Larsen

Graphics the hard way: Time Series in Avenue


Abstract

In environmental information systems, one of the most important charting features is the ability to visualise temporal trends in the data. Time series plots, however, are not available in ArcView. This paper presents a number of Avenue scripts that autoscale and display time series in the ArcView environment. In addition, the process of optimising the scripts for very large data sets is outlined.

1. Introduction

Most software packages for data analysis include charting abilities. However, these are mainly business graphics such as pie charts and bar charts, which do not suffice in environmental information systems. For the biologist who wants to view the development in eelgrass biomass, a time series plot is crucial when estimating the current and future situation. Like many other software packages, ArcView does not include time series plots.

In 1995 VKI Water Quality Institute, in collaboration with Danish Hydraulic Institute (DHI), undertook an assignment to build an environmental information system for Oresundskonsortiet (OSK), which is responsible for building a bridge/tunnel link between Denmark and Sweden. This system, EAGLE /1/, would rely heavily on time series plots to visualise trends in data such as eelgrass biomass, mussel biomass, sediment spill and the like. To this end, we first attempted to develop a solution using a DDE connection to an external graphics program. However, it soon became clear that a solution using the built-in graphics capabilities of ArcView was both possible and feasible. This paper deals with the development and application of the scripts used to view time series plots in the EAGLE system.

2. Plotting basics

Figure 1 shows the execution flow of the main script, TS.Graph. This script is passed all the necessary parameters, and from there the task is broken up into subtasks that are located in other scripts, all named with the TS prefix.

2.1 Display setup

The basic task of the plotting script is to set up a view and obtain the tools to plot lines, markers and text on it. This is done by creating a new View and using its GetGraphics request to get a pointer to the graphics of the view. This request returns a list to which graphic shapes can be added. Graphic shapes are created using the GraphicShape.Make request, as in figure 2. For performance reasons, graphics are added using the AddBatch request instead of the Add request, as every Add request triggers a repaint of the view. With AddBatch it is possible to add all graphics while causing only one repaint of the view.

' theGraphics is the list returned by the view's GetGraphics request
aLine = Line.Make(0.1@0.1,0.5@0.5)
gLine = GraphicShape.Make(aLine)
theGraphics.AddBatch(gLine)



Figure 2: The process of adding a line to the view.

2.2 Coordinate system

It was decided that the coordinate system should be independent of the size of the view. Hence all graphics must be scaled to fit the view they are to appear in. This can be achieved by considering the view as 1 unit by 1 unit, which is actually the default coordinate system of a newly created view. To draw in these coordinates, we came up with a design in which variables are defined for all the "anchor" points and dimensions of the graph. We defined the variables XaxisLength, YaxisLength, OriginX, OriginY and TickLength, among others; see figure 3. All plotting takes place relative to these values. For instance, plotting an X-axis consists of drawing a line from the point (OriginX, OriginY) to the point (OriginX + XaxisLength, OriginY). Then, for each tick mark, a line is drawn from the point made up of its x-coordinate and OriginY to the point made up of its x-coordinate and OriginY + TickLength. How the tick marks are established is described later.

By defining these constants it is easy to change the size of the plot. When we found that in a special situation it was necessary to display three time series independently, we just put up three views with a plot in each of them, and scaled the views to fit the screen. The graphs scaled according to the view size.

Width = theExtent.GetWidth
Height = theExtent.GetHeight
OriginX = Width * 0.1
OriginY = Height * 0.2
XaxisLength = Width * 0.8
YaxisLength = Height * 0.6
TickLength = XaxisLength * 0.02
MarkerSize = XaxisLength * 0.015



Figure 3: Defining "anchor points" and dimensions
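
The layout idea can be tried outside ArcView. The following is a minimal sketch in Python rather than Avenue, using the proportions from figure 3. The function names make_layout and x_axis_segments are hypothetical and do not appear in the EAGLE scripts, and the tick positions are assumed to arrive as fractions of the axis length.

```python
def make_layout(width=1.0, height=1.0):
    """Anchor points and dimensions, using the proportions from figure 3."""
    x_axis_length = width * 0.8
    return {
        "origin_x": width * 0.1,
        "origin_y": height * 0.2,
        "x_axis_length": x_axis_length,
        "y_axis_length": height * 0.6,
        "tick_length": x_axis_length * 0.02,
    }


def x_axis_segments(layout, tick_fractions):
    """Return the X-axis line followed by one short line per tick mark.

    tick_fractions are positions along the axis, 0.0 (origin) to 1.0 (end).
    Each segment is ((x1, y1), (x2, y2)) in view coordinates.
    """
    ox, oy = layout["origin_x"], layout["origin_y"]
    segments = [((ox, oy), (ox + layout["x_axis_length"], oy))]
    for fraction in tick_fractions:
        x = ox + fraction * layout["x_axis_length"]
        segments.append(((x, oy), (x, oy + layout["tick_length"])))
    return segments


segments = x_axis_segments(make_layout(), [0.0, 0.5, 1.0])
```

Because every coordinate is derived from the width and height, calling make_layout with a different extent rescales the whole axis without touching the drawing code.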

2.3 Defining the interface

The interface to the TS.Graph script is rather complicated. All options are transferred as parameters to the script. There are quite a lot of parameters, as it is crucial that the script is flexible enough to be used in a variety of areas. The module plots data from ArcView's internal data type, the VTab, and it also has the option to plot a limit, passed as a single value, as a line. VTabs and constant values can be plotted using several different line styles, marker styles and colours. Furthermore, there is an option to plot normal or logarithmic axes.

It was decided that these options should be passed as parameters to a central script, rather than through global variables. In this way, 20-40 parameters are passed to the script, depending on how many data series there are on the plot. This makes calling TS.Graph rather complicated, but the script is meant for in-house developers, not for end-users, so clarity is less important.

3. Drawing the Time Series

3.1 Axis scaling

Special attention was paid to the scaling of the axes. In our experience, illogical scaling is an annoyance to users. The scaling of date axes in particular must be done in a comprehensible way. To this end, three different scripts have been developed. Each receives the endpoints of the particular axis and returns a list of points at which to draw tick marks, together with the labels to draw with them. Each point is calculated from the fraction of the logical axis it takes up. This fraction is multiplied by the length of the axis to obtain the actual point where the tick should be located.

Scaling of the date values on the x-axis is done by the script TS.ScaleTim. Usually when scaling, it is possible to determine a start value, an end value and a step value. On a time series axis, however, there is no such thing as a step value. If, for instance, the aim is to display a tick and a label on the first day of every month, the "step" would take values such as 31, 28, 31, 30. So instead of a step, the script must find "nice" dates to return. These dates depend heavily on the time range. If the time span of the plot is a week, a tick is shown for every day. If the time span is a year, once a month or every second month is more appropriate. Therefore a scheme has been set up to let "custom logic" decide the plotting. We decided that certain ranges would return certain dates as tick marks. The ranges were tested for usability and adjusted accordingly. This process continued through the beta testing of EAGLE.
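
The range-based choice of tick dates can be sketched as follows. This is an illustration in Python, not the TS.ScaleTim script. The thresholds (14 and 200 days) are assumptions made for the example, since the ranges actually used in EAGLE are not listed here, and the function names are hypothetical. The sketch assumes the end date is later than the start date.

```python
from datetime import date, timedelta


def month_starts(start, end, every=1):
    """First day of every `every`-th month within [start, end]."""
    ticks = []
    year, month = start.year, start.month
    if start.day > 1:  # the current month start lies before the axis
        month += 1
    while True:
        if month > 12:
            year, month = year + 1, month - 12
        tick = date(year, month, 1)
        if tick > end:
            return ticks
        ticks.append(tick)
        month += every


def date_ticks(start, end):
    """Pick tick dates from the length of the time span (thresholds assumed)."""
    span_days = (end - start).days
    if span_days <= 14:  # short spans: one tick per day
        return [start + timedelta(days=i) for i in range(span_days + 1)]
    if span_days <= 200:  # medium spans: first day of every month
        return month_starts(start, end)
    return month_starts(start, end, every=2)  # long spans: every second month


def tick_fraction(tick, start, end):
    """Position of a tick as a fraction of the logical axis."""
    return (tick - start).days / (end - start).days
```

For a plot from 15 January to 15 April 1996 this returns the first of February, March and April, which are 29 and 31 days apart, illustrating why a fixed step value cannot be used.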

On the Y-axis we used code that we have used in several other applications. The general idea, which goes back at least 15 years, is to divide the minimum and maximum values by 10 until they are in the range 0 to 9, then set predefined values for the minimum, maximum and step, and multiply these by 10 until they are back in the original range. It was fairly simple to translate the Pascal equivalent into Avenue. The only problem was that we did not know how to raise a number to a power. Not until one of the final modifications did we learn of the ^ request.
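
A common variant of this idea is sketched below in Python. It is not a translation of the TS.ScaleVal script: instead of repeated division it uses a logarithm to bring the step into the range 1 to 10, and the candidate steps (1, 2, 5, 10) and the target of about five intervals are assumptions chosen for the example.

```python
import math


def nice_scale(data_min, data_max, target_ticks=5):
    """Return (axis_min, axis_max, step) with round values covering the data."""
    if data_max <= data_min:
        raise ValueError("data_max must be greater than data_min")
    raw_step = (data_max - data_min) / target_ticks
    exponent = math.floor(math.log10(raw_step))
    mantissa = raw_step / 10 ** exponent  # roughly 1 <= mantissa < 10
    for nice in (1, 2, 5, 10):  # predefined step values (assumed)
        if mantissa <= nice:
            break
    step = nice * 10 ** exponent  # scale the step back to the data range
    axis_min = math.floor(data_min / step) * step
    axis_max = math.ceil(data_max / step) * step
    return axis_min, axis_max, step
```

For data between 3 and 97 the raw step is 18.8, which is rounded up to 20, so the axis runs from 0 to 100 in steps of 20.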

As it turned out, all logarithmic axes would need to be scaled from 1 to 1000. Hence the script TS.ScaleLog returns a hard-coded list of tick marks and labels. If necessary, a slight modification could be made to the TS.ScaleVal script so that it could return logarithmic values.
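
On a logarithmic axis the fraction of the axis taken up by a value is computed from its logarithm. A minimal Python sketch for the fixed range of 1 to 1000 follows; the tick values at each power of ten and the name log_fraction are assumptions made for illustration, not the contents of TS.ScaleLog.

```python
import math

LOG_TICKS = [1, 10, 100, 1000]  # one tick per decade (assumed)


def log_fraction(value, axis_min=1, axis_max=1000):
    """Fraction of the axis length at which `value` falls on a log axis."""
    return (math.log10(value) - math.log10(axis_min)) / (
        math.log10(axis_max) - math.log10(axis_min)
    )


# Each entry pairs a position along the axis with the label to draw there
ticks = [(log_fraction(v), str(v)) for v in LOG_TICKS]
```

The decades end up evenly spaced, at zero, one third, two thirds and the full length of the axis.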

3.2 Drawing the Curve

The curve is generated by a script called TS.GenCurve. It cycles through all the points of all the VTabs and computes an x-point and a y-point for each. These values are returned to TS.Graph as a list. The points are calculated from the axis minimum and maximum and the anchor points. For instance, the y-point is OriginY+(y-MinY)/(MaxY-MinY)*YaxisLength. When the list is returned to the main script, lines are drawn from point to point and markers are added.
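
The conversion from data values to view coordinates can be sketched in Python as follows. The names generate_curve and segments are hypothetical, and the data is assumed to be a list of numeric (x, y) pairs rather than a VTab. The sketch subtracts the axis minimum before scaling, so that a value equal to the minimum lands exactly on the origin.

```python
def generate_curve(values, min_x, max_x, min_y, max_y,
                   origin_x, origin_y, x_axis_length, y_axis_length):
    """Convert (x, y) data pairs into view coordinates."""
    x_scale = x_axis_length / (max_x - min_x)
    y_scale = y_axis_length / (max_y - min_y)
    return [
        (origin_x + (x - min_x) * x_scale, origin_y + (y - min_y) * y_scale)
        for x, y in values
    ]


def segments(points):
    """Pair consecutive points into the line segments to be drawn."""
    return list(zip(points, points[1:]))


points = generate_curve([(0, 10), (5, 20), (10, 30)],
                        0, 10, 10, 30, 0.1, 0.2, 0.8, 0.6)
lines = segments(points)
```

Three data points give two line segments, and a marker would be placed at each of the three points.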

4. Optimising for better performance

When the first version of the time series script was finished, performance on an average plot was satisfactory. However, a certain spill graph had 7000 points or more. This slowed repaint time to about 17 minutes on a 75 MHz Pentium. We were forced to face this problem, as a plot of this variable would be viewed every day. The first step was to use AddBatch instead of Add to add the graphic shapes to the view. This way the view was only repainted once.

Next, all loops were examined closely. It turned out that calling another script using av.Run was quite time-consuming, so small scripts that existed only for the clarity of the code were sacrificed in order to optimise the loops. Another concern was the repeated calculation of values. Being used to programming in Pascal, where the compiler optimises repeated expressions, I was not aware of this cost in the beginning. Avenue is not a compiled language, so it made quite a difference whether constant values were calculated before the loop or within it. After these optimisations the script was able to plot 7000 points in 45 seconds, which is still a long time, but acceptable. Most time series still have fewer than 50 points.
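
The effect of moving constant calculations out of a loop can be illustrated with a small Python sketch. Python is also interpreted, so the same reasoning applies, but the functions below are hypothetical examples and not taken from the EAGLE scripts. Both return the same result; the second simply performs one division in total instead of one per point.

```python
def scale_points_slow(ys, min_y, max_y, origin_y, y_axis_length):
    """Recomputes the loop-invariant scale factor for every point."""
    points = []
    for y in ys:
        points.append(origin_y + (y - min_y) * (y_axis_length / (max_y - min_y)))
    return points


def scale_points_fast(ys, min_y, max_y, origin_y, y_axis_length):
    """Computes the scale factor once, before the loop."""
    y_scale = y_axis_length / (max_y - min_y)
    points = []
    for y in ys:
        points.append(origin_y + (y - min_y) * y_scale)
    return points
```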

5. Application

The time series scripts have been utilised to plot about 50 different variables in the EAGLE system. Among these are biological variables such as eelgrass growth and mussel biomass, as well as physical parameters such as sediment spill. The range of the X-axis is at least 24 hours and at most 10 years, and all date axes display in a comprehensible format. Y-values range from -1000 to several million. An example of a plot can be seen in figure 4.

After completion of the time series plot script, it became clear that ArcView was not able to display profile plots. Profile plots are used to visualise a parameter through the water column, hence depth is the Y-axis and the parameter is displayed along the X-axis. They are special in that there can be several Y-values for each X-value. When we found out that ArcView's built-in charting capabilities would draw lines between the points in profile plots, TS.Graph was adapted to produce profile plots as well.

Figure 4: An example of a time series plot.

6. Conclusions

The time series scripts have proved fast and efficient in the application they are part of. They have also proved flexible and easy to modify. Although Avenue is an interpreted language, its graphics primitives are strong enough to support complex graphics. The time series scripts alone make up 900 lines of Avenue code in a very large project, yet the graphics are fast enough in most cases. It is my belief that the original DDE-based solution would have been considerably slower, especially for small data sets, as a result of longer startup times.

7. References

/1/ René Andersen: EAGLE - A GIS for Environmental Impact Assessment, in Geographical Information, Vol. 1, IOS Press, 1996.


Palle Due Larsen, Programmer, B.Sc.
VKI/Water Quality Institute
Agern Allé 11
DK-2970 Hørsholm
Denmark
Phone +45 4286 5211
Fax +45 4286 7273
e-mail pdl@vki.dk