Per Thorlacius

Time-and-space modelling of public transport systems using the new features of the ArcInfo 7.1 NETWORK module

Modelling the movement of vehicles and passengers in time and space in public transport systems is generally a rather difficult task. When finding the shortest path through a public transport network, not only the link impedance has to be taken into account, but also the time at which the link occurs, i.e. the time the bus actually rides the actual segment. This type of modelling requires an otherwise complex topological network. This paper describes how the new features of the 7.1 NETWORK module can be used to create this kind of topology, and discusses the experience gained from a large-scale implementation.
 

Contents


Introduction

When comparing public and individual transport systems one significant difference is the degree of dependence between the travel duration and the exact time one wants to travel. In public transport systems these parameters are closely related to each other, for individual transport this is much less the case.

This difference makes the movement of persons and vehicles in public transport systems more difficult to model than in individual systems.

The duration of a travel in a public transport system (the travelling time) is defined as the sum of the terms waiting time, in-vehicle travel time and interchange time (1).
 

 
(1)
These terms again, are defined as follows (with busses as an example): When examining public transport the time aspect has to be taken into account both in a relative and in an absolute way: The three different types of times defined above are all relative, but they are determined by the absolute time at which the passenger wants to travel and the absolute time at which a connection between the origin and destination stops actually exists.

Conventional models of public transport systems do not take the exact absolute time into consideration and therefore only average values of travelling times can be calculated. Furthermore these models can neither take connection correspondences nor connection frequency distribution in time into account. A conventional model can hence not be used to answer questions like Which is the fastest way to go from A to B, Wednesday morning at 8:17?

To do this, a model of a more complex topological nature is required. Complex networks of this kind have been rather difficult to represent in a GIS prior to ArcInfo 7.1. However, using the data structure proposed in this paper, the new topological features of the ArcInfo 7.1 NETWORK module provide a simple way of solving this problem.


1. Conventional Models for Public Transport Systems

In conventional models of public transport systems each transit line is typically represented by a set of links between the stops that that particular line visits. [Harris et al. (1992), Ortúzar (1994), Hansen (1996)] This type of representation is shown in figure 1.

Figure 1 shows two transit lines serving seven stops in all. Line 1 starts at A, ends at E and visits the stops B, C and D. Line 2 starts at I, ends at H and also visits the stops B, C and D. The representation is not directed, each line on the figure and accordingly in the model represents both directions of that line. To represent the two transit lines in the figure, eight links are needed in the model.
 

Figure 1: The traditional way of modelling public transport in a transport model

These links have feature attributes describing the in-vehicle travel time and the frequency by which the service of transport is provided. The frequency is calculated for the period of time the model is valid, e.g. the morning rush or the whole day. The commonly known transport models EMME/2 and TRIPS both follow this principle. [Ortúzar et al. (1994), HTM (1996)]

The in-vehicle travelling times are taken directly from the link attributes, waiting and interchange times at the stops are calculated from the link frequencies. However, since the frequencies represent the number of connections between stops for a certain period of time, the calculated waiting and interchange times are correspondingly average values for that specific period of time.

Neither can a model of this type take correspondences between transit connections into account. Therefore, models of this type often lead to particularly inaccurate results when used in transit systems with low frequencies, i.e. in rural areas where the few connections often are equally well co-ordinated to correspond with each other.

Furthermore, a model of this type can not take the frequency distribution in time into account. If the model covers the whole day, the travelling times calculated with the model will correspondingly be average travelling times over the day. Typically, the waiting and interchange times in the rush hour are shorter due to the high frequencies with which the services are provided than for example in the evening. Equally the in-vehicle travel times are typically longer in the rush hour due to congestion.


2. The Proposed Public Transport Model

To calculate travelling times more accurately, a different type of model is hence needed. The data model proposed in this paper is called the Three Dimensional Vehicle Run Representation, and is developed by NERI as a part of the ALTRANS project.

2.1. The Principal Concept of the Proposed Model

The conceptual idea behind the proposed model is to look upon a transport model as a projection of streets, bus and train lines onto the plane. In this sense, conventional transport models are two dimensional models.

The idea behind the proposed model is to represent time as the third dimension, and to take each physical vehicle running on the transit lines into consideration. Each vehicle is then represented by a link between points not only in the plane but also in time. The concept of the proposed data model is shown in figure 2.


Figure 2: The proposed 3D vehicle run representation of modelling public transport

The lower part of figure 2 shows the maps of the bus lines from figure 1. Above this map a grid of dotted lines is spanned. The perpendicular lines in this grid represent the same position in the plane, the horizontal lines the same position in time.

In this grid the links of three different vehicle runs are shown. A vehicle run is defined as the movement of a physical vehicle (train, bus, etc.) in one direction from the first stop the vehicle visits to the last. The links thus represent the physical movement of the individual vehicles in time and space.

Due to the fact that one link is needed for each vehicle run between two stops, a total of twelve links is needed to model these three vehicle runs. As opposed to the model shown in figure 1, the proposed model is a so called directed graph as all links are one-way links. [Cormen et al. (1989)]

For reasons of simplicity only the vehicle runs in the general direction of the X-axis are shown in figure 2, in reality vehicles would be running in both directions on each particular line. Note that the direction of the time axis on figure 2 is down, so that the only possible direction of movement is downward.

In figure 2 line 1 runs from A to E via B, C and D and vice versa. The bus connection with the vehicle run number 1a, departs from A at 8 o'clock and arrives at H at 8:20. On the way the stops D, C and B are visited at 8:05, 8:10 and 8:15 respectively.

If a passenger wants to travel from A to I, and arrives at the stop at A at 8:05, he or she has to wait five minutes before the first bus is departing from the stop at 8:10. This bus has vehicle run number 1b and arrives at B at 8:25. Here the passenger has to wait five minutes until a connection to I is provided at 8:30. This connection arrives at I at 8:35. The total travelling time is thus 30 minutes, of which five minutes are waiting time, five minutes interchange time and 20 minutes in-vehicle travel time. The same journey could also be accomplished by changing busses at D or C instead of B.

2.2. The Representation of Changing Possibilities in the Proposed Model

A model as the one shown in figure 2 can be used to determine the movement of vehicles in time and space. However, to model the movement of passengers, changing possibilities at each stop must be provided in the model. These possibilities are represented by the so called interchange links.

In conventional models for public transport systems, each stop is represented by one node, in some models by one node for each line serving the stop. In the proposed model however, there is one node for each unique point of time at which a vehicle arrives or departs from the stop. A transit stop is then represented by a group of nodes as shown in figure 3.


Figure 3: The representation of a transit stop in the proposed model as a group of nodes

As shown in figure 4, the representation of interchange links in a model like the one proposed can be accomplished using several principles.

Figure 4: Different principles of representing the interchange links (grey) at each stop

a) The arriving and departing links can be coupled in a combinatorial manner, so that each arriving link is connected to all of the departing links. If n denotes the number of arriving links, and m the number of departing ones, this results in a total number of interchange links i given by (2).
 

 
(2)
One can assume, that the number of arriving vehicles equals the number of departing ones, otherwise the vehicles would accumulate at the stop. Therefore it is reasonable to assume, that:
 
 
(3)
b) It is not possible however, to change to a connection that has already left the stop. If this is taken into account, the number of interchange links needed can be reduced. Assuming that the number of departing vehicles approximately equals the number of arriving ones, and that the distribution of arrivals and departures is approximately equal, the approximate number of interchange links needed is then given by (4). [Spiegel (1968)]
 
 
 
(4)
c) It is however, possible to reduce the number of necessary links additionally by using a serial approach. If the arriving and departing nodes are ordered in time, each node can be coupled to the next one in time. This gives a total number of necessary interchange links given by (5).
 
 
(5)
Figure 5 shows the path for a change from arrival number 2 to departure number 4 for each of the three interchange link representation principles shown in figure 4.
 

Figure 5: The same path (red) through the graphs of the different interchange link representation principles

The number of necessary interchange links from the different principles of representation can be seen in figure 6. As seen, it is associated with significant advantage to chose the last representation principle, as the number of necessary interchange links only increases proportionally to the number of arrivals. Using the first two representation principles the number of necessary interchange links increases with the square of the number of arrivals.

Figure 6: The number of interchange links needed (i) for the different types of interchange link representation as a function of the number of arrivals to a stop (n)

The PATH command of ArcInfo uses the modified Dijkstra path finding algorithm with a binary heap priority queue. According to Cormen et al. (1989), the complexity of this algorithm for sparse graphs is given by (5), where P is the number of points (nodes) in the graph, E the number of edges (links) and O the order of complexity.
 

 
 
(6)
It may therefore be concluded that both computational and storage advantages are obtained by choosing the interchange link representation principle c).

Principle c) however, has a small disadvantage due to its serial nature. The disadvantage is, that a path finding algorithm will need slightly more iterations to find a path, because an interchange in the model may be represented by a series of several links. This is seen from figure 5. Using principle a) and b) each change is represented by a unique link.

For practical reasons, the interchange links are represented as spirals, which when projected onto the plane as on figure 7 below, become circles.

2.3. The Calculation of Waiting Times in the Proposed Model

As with conventional transport models, it is necessary to connect the network with points at which travels (trips) have their origin and destination, i.e. where travels are generated and attracted. As with conventional models this is done with connection links to the centroids of the so called transport analysis zones, as shown in figure 7. In the proposed model however, these links are also used to calculate the waiting times at the stops, and to model the changes between services using separate networks, e.g. bus and train.
 

Figure 7: Objects in the implemented model

Figure 8a shows the two types of connection links to zone nodes (grey) in the model: The arriving links to the zone node (dark red) and the departing links from the zone node (dark green). Figure 8b shows the interchange links (puple) and figure 8c all the link types in the model around a stop attached to a zone node (zone centroid).
 

Figure 8: Link types of the proposed model

The link impedances on the connection links going from the zone node to the departing links are calculated dynamically according to the exact time at which the travel starts, i.e. the time the passenger wants to travel. The travel start time is subtracted from the departure time of the nodes of the departing links. The result of this subtraction is the time the passenger has to wait from the travel start time until each vehicle departs. The impedance of the connection links is thus the waiting time. If the waiting time for a particular departure is negative, that departure has already left the stop, and the link can not be traversed.


3. Implementation

The model described in chapter 2 has been implemented for the use in relation to the ALTRANS project, initiated by the Department of Policy Analysis at NERI.

The objective of the model is to calculate travelling times corresponding to a set of survey travel data (trips). These travelling times are then used to calibrate an econometric transport mode choice and trip length generation model to investigate political instruments that can lead to change of transport behaviour, and the environmental changes as a result hereof.

3.1. Topological Implementation

The proposed 3D representation is implemented using the new topological features of the RENODE command in the ArcInfo 7.1 NETWORK module. With these features, it is possible to build line topology having more than one node at the same position in the plane.

The third dimension in the model is established through this topological capability. Line topology is built with the RENODE command using elevation items from the arc attribute table (AAT). Thus, the departure time for each link (arc) in the model is used as  from-elevation-item and the arrival time as to-elevation-item.

Due to the fact, that arcs of zero planar length are not permitted in ArcInfo coverages, the interchange links can not be implemented as straight arcs from one node to the next in 3D. Therefore the interchange links are generated as circles. These circles however, represent the projection of 3D spirals onto the plane.

Figure 9 shows a 3D view of the proposed model for a stop with four arriving vehicles and four departing ones. To model the interchange possibilities at the stop, a total of seven interchange links is needed. The green lines show the projection of the model onto the plane represented by the rectangle.
.

Figure 9: 3D view of the proposed model showing vehicle run and interchange links

The projection of the model onto the plane (i.e. the green lines on the figure) is the only visible part of the model when seen from inside ArcInfo as a line coverage. Figure 9 was created using an AML-program to extract the data from the line coverage and attribute table using the UNGENERATE and CURSOR commands to generate a data file. This data file was then plotted with the plotting program GnuPlot.

3.2. The Construction and Operation of the Model

The model is constructed in several steps using data from a variety of different sources. Certain data are stored in tabular form in a relational database, others are geographical data stored in ArcInfo coverages. Figure 10 shows the data flow in both the construction and operation phases of the model.


Figure 10: Data flow in the implemented model for the calculation of travelling times

The different data sets seen on figure 10 are shown in the following section 3.3.

The following pseudo code describes the constructing of the public transport network model:
 

For each transport company timetable database 
  For each vehicle run  
    For each segment between two stops  
      Transform timetable data using the appropriate timetable database interpreter  
      Find the shortest path between stops on appropriate road or railroad network  
      Convert the shortest path route feature to arc feature  
      Join departure and arrival times from timetable to the arc attribute table  
    Next segment  
  Next vehicle run  
Next transport company timetable database  

For each stop  
  Order the nodes to and from the current stop in time  
    For each node  
      Create interchange links by connecting the current node to the next node in time  
    Next node  
Next stop 

Select stops that are to be connected to zones  
For each stop  
  Select all arriving links  
  Create connection links from the arriving nodes to the zone node  
  Select all departing links  
  Create connection links from the zone nodes to the departing nodes  
Next stop  

Select stops that are to be connected with connection links (stations and bus stops)  
For each stop  
  Create connection links  
Next stop  

Build line topology using departure and arrival time as elevation items 
Calculate the link (arc) impedance values as arrival time minus departure time 
 

The following pseudo code describes the operation of the model. The travelling times are calculated for each trip with differentiation between waiting time, in-vehicle travel time and interchange time and calculation of the number of changes.
 
Build internal network in ArcPlot using arc traversing time as impedance item  
For each trip in the travel data  
  Select all connection links (arcs) from the origin zone node  
  Calculate the waiting time on currently selected links (arcs)  
  Update internal network with waiting times for currently selected links (arcs)  
  Find shortest path from origin zone node to destination zone node  
Next trip  

For each route in the shortest path route system  
  Join trip#, link type items and traversing impedance items from RAT and AAT to SEC  
  Calculate the sum of the impedance item for each link of the three link types  
  Count number of unique vehicle run numbers in the SEC, i.e. the number of changes  
  Write waiting, in-vehicle travel and interchange time and number of changes to output file  
Next route 

 
The model construction and operation is accomplished by a series of AML-programs. The above pseudo code only describes the principles of these programs. For various reasons, the command sequence in the AML-programs is slightly different than shown above. Totally more than 4,000 lines of AML code was written for the implementation of the model.

At present the model has only been implemented for  half of the country due to lack of data. The estimated total number of arcs for the complete implementation is 2,500,000.

3.3. Data

Figure 11 below shows the four different geographical data sets used for the implementation of the model.

The upper left corner of figure 11 shows the road network. This data set is managed by the Danish State Road Directorate and the National Survey and Cadastre and contains different line attributes, e.g. road type, width and speed limit. The road network consists of about 85,000 arcs.

In the upper right corner of Figure 11 the railroad network used for the implementation is shown. This data set is managed by the Danish National Survey and Cadastre, however, the data set contains no attributes.


Figure 11: The road map, the railroad map, the public transport stop map and the transport analysis zone map. The extracts of maps belonging to The National Survey and Cadastre are reproduced according to the agreement G18/1997

The two maps shown in the lower part of figure 11 are maps digitised by NERI itself. The destination map (left) consists of about 3,200 destinations (bus stops and train stations) and the transport analysis zone map (right) consists of about 1,500 zones.

Figure 12 shows an example of the timetable data used in the project. Totally, the five different database types contain about 50 MB of timetable data from 17 different transport companies.

Figure 12: A simplified example of the time table data

3.4. Software and Hardware Issues

For the implementation of the model the following operating systems have been used throughout the course of the project: Digital VMS 6.1, Digital UNIX OSF/1 and Windows NT 4.0. At present the whole implemented model is operable under Windows NT 4.0. The present computer running the model is a HP Vectra Pentium Pro 200 MHz with 144 MB RAM.

The timetable database interpreters are implemented using SQL-queries in ORACLE 7.3.2.2.1 and MS Access 7.0.

The digitising of destinations and transport analysis zones has been performed using ArcView 3.0a and ArcInfo 7.1.2.

The model itself is implemented in ArcInfo 7.1.2 using Arc, ArcEdit, ArcPlot, Tables, AML and the NETWORK module.


Discussion

Due to the fact, that the 3D implementation of the model was not possible prior to ArcInfo 7.1, different 2D approaches were investigated in the initial phase of the project.

The first approach was to use turntables to model the interchange possibilities at each stop. Using turntables however, the number of turn records increases with the square of the number of arcs at each node. Given the size of the model, an excessive number of turns thus had to be calculated. The potential turntable was so large that it exceeded the limitations of available soft and hardware. For this reason the approach failed and the method was given up.

The next approach was to implement a model with 2D line topology without the use of turntables. This meant that all arcs representing vehicle runs had to be separated from each other, so that no coincident nodes existed, i.e. so that all nodes were dangling nodes.

The separation of arcs was first performed using the ArcEdit MOVE PARALLEL command. However, given the size of the model and the number of vertices per arc, this process was so time consuming that other methods had to be investigated.

The next method investigated was to use the UNGENERATE command to generate a text file containing the arc geometry and then altering only some of the arc geometry using AML text file functions. This method proved much faster than the MOVE PARALLEL method. However, the problem of ensuring no coincident nodes was still consuming an unacceptable amount of time.

Several ways of solving this problem were investigated, including the building of spatial indices outside the ArcInfo coverage data model. The fastest method investigated was to implement a random generator in AML and to move the nodes a distance chosen at random. This method ensured that only a few nodes were coincident after moving them all the first time.

The resulting model of these investigations is seen on figure 13, showing the interchange links of a bus terminal of a provincial town using principle b) for the representation of interchange links as described in section 2.2.
 

Figure 13: An early 2D version of the model

The implementation of the proposed 3D model proved no analogous problems using the new features in the ArcInfo 7.1 NETWORK module.

With this in mind it may be concluded that there are considerable advantages associated with modelling public transport using a 3D approach in which time is represented as the third dimension rather than trying to force an equivalent model type into only two dimensions.

The most obvious disadvantage of the implemented model however, is still the model data requirements and processing time.

With the present hardware (described in section 3.4.) the model can calculate travelling times at a speed of about 2,000 travels (trips) pr. hour, which is acceptable for most modelling purposes. With the size of the present travel data this is equal to a total processing time of 50 hours to calculate travelling times for all trips in the data set.

The construction process of the model (described in section 3.3.) however still demands considerable processing time. For this reason several optimisation strategies must be investigated in the future. The processing time benefits of using Esri's Spatial Database Engine (SDE) for the storage of data is one of these.

Nevertheless the implemented model will found the basis of several NERI transportation research projects in the near future including projects involving scenario analyses, benchmarking indicators in public transport and others.


Acknowledgements

I would like to thank my friends and colleagues at the Department of Policy Analysis for their support during the course of this project. In particular I would like to thank Henrik Gundorph Bruun, Linda Christensen, Bernhard Fabricius, Henning Sten Hansen, Ole Kveiborg, Jeppe Husted Rich, Christoffer Soltau and Morten Winter.

I addition I would like to thank Peter Ohm and Kurt Andersen from Informi GIS for their persistent technical support in the initial, somewhat problematic phase of the project.

The various tips and tricks of Andreas Bærentzen have been greatly appreciated.

The extracts of maps belonging to The National Survey and Cadastre are reproduced according to the agreement G18/1997.


References

Cormen et al. (1990); Introducion to Algorithms, Thomas H. Cormen, Charles E. Leiserson, Ronald L. Rivest, The MIT Press, Cambridge, Massachusetts 1990

Hansen (1996 ); Comparing the Accessibility Patterns for Public Versus Private Transport Networks, Henning Sten Hansen; In: Rumor, M., McMillan, R. & Ottens, H.F.L. (eds.): Geographical Information. From Research to Application through Cooperation. Volume 1. Second Joint European Conference & Exhibition on Geographical Information, Barcelona, Spain, 1996. IOS Press. pp. 703-707.

Harris et al. (1992); Transit Network Modelling - A New Approach, Nigel Harris, London Underground Ltd., Martin Bach, MVA Systematica; In: Proceedings of the 4th International Conference on Microcomputers in Transportation, Baltimore 1992

HTM (1996); Hovedstadstrafikmodel Version 3.0 - Beskrivelse af døgntrafikmodel, Firma Anders Nyvig A/S, København 1996 (Danish)

Ortúzar et al. (1994); Modelling Transport, Second Edition, Juan de Dios Ortúzar, Luis G. Willumsen, John Whily & Sons, Chichester, England 1994

Spiegel (1968); Mathematical Handbook of Formulas and Tables, Murray R. Spiegel, McGraw-Hill Publishing Company, New York 1968


Author Information

Per Thorlacius
Traffic Analyst, M. Sc.
Department of Policy Analysis
National Environmental Research Institute
Frederiksborgvej 399, P.O. Box 358
DK-4000 Roskilde
Denmark
Phone: + 45 46 30 12 98
Fax: + 45 46 30 12 12
e-mail: pth@dmu.dk
URL: http://www.dmu.dk


 Back to Top
 Back to Contents