Data considerations for location-allocation modelling of public school districts in Copenhagen.

Lasse Møller-Jensen, Institute of Geography, University of Copenhagen.

Abstract

This paper discusses the following issues related to location-allocation modelling of public school districts in Copenhagen:

- Availability of relevant data and issues related to the combination of different data sources.

- Strategies for data aggregation due to either lack of data or restrictions imposed by legislation.

- Network design: strategies for assigning relevant impedance values to reflect the movements of students and to incorporate 'spatial objectives' other than minimizing travel costs (e.g. solutions that are desirable in terms of traffic).

Concepts

In a general GIS context the word allocation indicates the process of identifying the specific areas, districts, road segments etc. surrounding a centre that should 'serve' that centre with respect to the in- or outflow of people or goods, generally known as demand. There may be more or less demand than the centre can 'handle', as specified by the supply or capacity of the centre. The word location, on the other hand, indicates the process of identifying optimum locations for new centres given a set of objectives related to, e.g., accessibility.

Being a part of the GIS network analysis tool package, allocation modelling assumes that the area surrounding a centre is represented as a network of linear features. These linear features identify the location of possible transport routes and indicate how these are connected to each other, and reflect the fact that transportation of resources on the ground normally takes place along certain pre-defined corridors. Each linear feature in the network has an associated impedance value indicating the cost (measured in metres, minutes, petrol etc) of moving along this line. This impedance value will often be related to specific means of transportation.

Location-allocation modelling is not new; research into the theory of this concept was conducted during the 1960s and early 1970s in connection with the growth of computer science in general. What was missing back then were detailed digital maps and broad access to the necessary computing power, which meant that few applications of the methods to realistic and complex situations were seen. This has certainly changed. Computers on the desks of general users are now powerful enough to perform these tasks, and the number of digital maps is growing every day.

Public school districts in Copenhagen

The process of location-allocation modelling requires a set of digital data of sufficient scale and accuracy to generate information that is relevant to the user by clearly showing the spatial effects of the chosen strategies. In other words, both the network properties (connectivity, assigned impedance) and the distribution of demand must be modelled as accurately and in as much detail as possible in order to produce valid results.

The following sections discuss some of these issues in connection with the process of assigning children from portions of a street network to the nearest school for a selected area of Copenhagen. The capacity of each school, the demand for school seats in the surrounding areas and other considerations, e.g. a maximum acceptable travel time for any student, are to be taken into account. The general goal is to maximize accessibility while at the same time ensuring that no one has to travel an excessively great distance to attend school (Esri, 1995a).

As shown later in this paper, other 'spatial objectives' may be reflected in the network allocation modelling, including the encouragement of 'desired' traffic behaviour: limiting street crossings and giving high priority to roads with bicycle paths and to paths separated from motor traffic.

This kind of spatial analysis provides the means for quantifying the relation between school location and capacity on the one hand and the demand for seats in the surrounding areas on the other hand. The number of students allocated to each school, the mean and maximum travel time etc. are reported by the allocation tool and the corresponding school districts are drawn on a map. The consequences of various changes in student location, transport lines, school location or school capacity may thus be visualized and quantified immediately, assuming that the applied digital data are suitable for this purpose. This will enable school planners to make decisions concerning where to increase or decrease school capacity while considering any 'spatial objectives'.

Availability of digital data

The Road Network

Digital maps of roads in Denmark may be obtained from several sources. Detailed technical maps of larger urban areas must be obtained directly from the urban authorities responsible for map production, while The Danish National Survey and Cadastre may provide road maps (road centre line) at the national level. Several private companies are in the process of establishing and marketing digital maps for various purposes (e.g. a frequently updated database for in-car navigation systems).

The current analysis is carried out using a digital line map provided by KRAK (KRAK, 1995), which was originally created as a basis for producing map books of the Copenhagen area and was consequently not produced with network applications in mind. Due to the origin of the map, each line represents the centre line of some road feature that is visible on the surface, while a few transportation links, e.g. roads going into tunnels, are not present. This may lead to identification of incorrect 'best routes', and the network has therefore been edited using the ArcInfo Arcedit software to take these situations into account.

The KRAK map is subsequently converted into a network model of transportation costs by assigning impedance values to each road segment. The impedance value is in this case equal to the length of the road segment, which is an acceptable strategy since many children walk to school. A road segment is defined as the stretch of road between two node points, which normally represent road intersections.
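The idea of a length-based network model can be illustrated with a minimal sketch. It assumes a hypothetical five-node network with made-up segment lengths and two centres; the function name and the data are illustrative only, and the multi-source shortest-path search shown here is a generic technique, not the ArcInfo implementation.

```python
import heapq

def allocate_to_nearest_centre(edges, centres):
    """Label every node with (distance, centre) of its nearest centre.

    edges: iterable of (node_a, node_b, length_in_metres); roads are
    treated as two-way, and impedance equals segment length.
    """
    graph = {}
    for a, b, length in edges:
        graph.setdefault(a, []).append((b, length))
        graph.setdefault(b, []).append((a, length))

    best = {}  # node -> (distance, centre)
    # Start the search from all centres at once (multi-source Dijkstra).
    heap = [(0.0, centre, centre) for centre in centres]
    heapq.heapify(heap)
    while heap:
        dist, node, centre = heapq.heappop(heap)
        if node in best:
            continue  # already reached by a closer centre
        best[node] = (dist, centre)
        for nxt, length in graph.get(node, []):
            if nxt not in best:
                heapq.heappush(heap, (dist + length, nxt, centre))
    return best

# Hypothetical street: n1 - n2 - n3 - n4 - n5, with schools at n1 and n5.
edges = [
    ("n1", "n2", 200.0),
    ("n2", "n3", 300.0),
    ("n3", "n4", 150.0),
    ("n4", "n5", 250.0),
]
result = allocate_to_nearest_centre(edges, ["n1", "n5"])
for node in sorted(result):
    print(node, result[node])
```

In this toy network node n3 is 500 m from the school at n1 but 400 m from the school at n5, so it is allocated to the latter.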

The Student Demand

Because a GIS provides the means for computer-based spatial analysis, one of its main advantages is the potential for analysing very large and detailed datasets compared to what is possible by manual methods. To exploit its full potential, a GIS therefore often invites spatial analysis involving information on the individual person. In Denmark this will normally present a problem because it conflicts with the legislation on statistical 'discretion', which implies that statistics should be computed on populations aggregated to a point where individuals cannot be identified.

The municipality of Copenhagen will, of course, collect data concerning age and addresses of individuals living within the municipality and make use of these data for various 'internal' planning purposes. In the present context, however, a method is applied which aggregates personal data at a better spatial resolution than can be achieved from published statistics, without going to the individual level, thus maintaining 'geographical discretion'.

What is needed for the location-allocation tasks is information about the location of children of a certain age. Statistics on age distribution in Copenhagen are published frequently at the level of the administrative statistical unit rode (hereafter termed RU). Although an RU normally does not cover a very large area, use of mean figures for complete RUs would be inadequate for the purpose of allocation modelling, giving unreliable information about the extent of the optimum school districts, mean travel time for the students etc.

Data at road level (e.g. 14 children aged 6 living at 'Main Street') would seem adequate for small roads but useless for larger roads going through the area. For the current study the Statistical Office of the Copenhagen Municipality has provided data on the age distribution of the population along each road (identified by a road-id), with the enhancement that roads that run through several RUs (identified by an RU-id) are split into separate segments where they cross the RU boundaries, see fig. 1. In order to identify the resulting road/RU-segments, a digital map of RUs in the selected areas has been established. On the basis of the KRAK line map and using GIS editing facilities, the RU-id is added as attribute information to areas bordered by roads or other linear structures in the line map.

Fig. 1. Impedance values in terms of school children living along a road segment within the administrative unit 'rode'.

The location of the road/RU-segments corresponding to the population data can be found by combining the road network map with the digital RU map, and the demand, in terms of students, along each road segment can now be established. This is consistent with common practice when using the ArcInfo Network software, namely that network demand (here the students) is assumed to be evenly distributed along the length of a road. This is of course very often not the case: there will be an uneven distribution of persons if the residential housing is located only along a fraction of a road or if the type of housing changes from single-family to multi-family housing. Within an RU, however, one will often find similar housing types.
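The even-distribution assumption can be made concrete with a small sketch. It assumes that one road/RU-segment is made up of several network arcs and spreads the reported number of children over them in proportion to arc length; the function name, arc identifiers and lengths are hypothetical and only serve as an illustration.

```python
def spread_demand(children, arc_lengths):
    """Distribute the children counted on one road/RU-segment evenly
    along its length, i.e. in proportion to the length of each arc.

    children: number of children reported for the road/RU-segment.
    arc_lengths: {arc_id: length_in_metres} for the arcs in the segment.
    """
    total_length = sum(arc_lengths.values())
    if total_length <= 0:
        raise ValueError("segment must have a positive total length")
    return {
        arc_id: children * length / total_length
        for arc_id, length in arc_lengths.items()
    }

# Hypothetical example: 14 children along a 700 m road/RU-segment
# that consists of three arcs between intersections.
demand = spread_demand(14, {"arc_a": 100.0, "arc_b": 250.0, "arc_c": 350.0})
print(demand)
```

The fractional counts that result are a consequence of the assumption itself: the model describes an expected density of students per metre, not actual addresses.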

Better accuracy may be achieved if the exact demand locations (i.e. student addresses) are used. Address matching is an alternative technique for associating population data with a specific location. A prerequisite for this approach is the rather laborious task of producing an address map in some form, and so far no such map is available at the national level. It will, however, appear within the next few years.

Address matching may be based on simple address maps containing house numbers at each road intersection, and geocoding of an address is in this case done by linear interpolation. The result of this would be very similar to the approach described above, and some of the same considerations about a possible uneven distribution of people along a road would apply. Alternatively, the exact coordinates of the demand locations, i.e. the children's addresses, may be represented in the network as nodes, which would provide the best basis for precise allocation modelling.
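Geocoding by linear interpolation can be sketched as follows. The example assumes that the house number and coordinates are known at both ends of a road segment and that house numbers increase uniformly along it; the function name, the coordinates and the house numbers are hypothetical.

```python
def interpolate_address(house_number, start, end):
    """Estimate the coordinates of a house number on a road segment.

    start, end: (house_number, x, y) at the two intersections that
    bound the segment. Numbers are assumed to be evenly spaced.
    """
    n0, x0, y0 = start
    n1, x1, y1 = end
    if n0 == n1:
        raise ValueError("end points must carry different house numbers")
    low, high = sorted((n0, n1))
    if not low <= house_number <= high:
        raise ValueError("house number lies outside this segment")
    # Fraction of the way from the start intersection to the end one.
    t = (house_number - n0) / (n1 - n0)
    return (x0 + t * (x1 - x0), y0 + t * (y1 - y0))

# Hypothetical segment: number 1 at (0, 0) and number 21 at (200, 0).
location = interpolate_address(11, (1, 0.0, 0.0), (21, 200.0, 0.0))
print(location)
```

Here house number 11 lies halfway between numbers 1 and 21 and is therefore placed at the midpoint of the segment, whether or not the real building stands there.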

The School Centres

Information about actual school capacity for 1996/97 has been provided by the Copenhagen Municipality's Department for Schools. The location of each school is entered into the network and the corresponding student capacity, residing in a relational database, is assigned to each school location.

A number of different location-allocation experiments were performed on the dataset, as described below, and table 1 shows the main results in terms of the overall spatial properties of the identified districts.

(A) The spatial properties of the existing school districts were explored initially by allocating students within the primary district to each school based on population data for each road within an RU, i.e. the number of children aged 6.

This task of establishing current conditions proved to be the most demanding because it requires a separate run for each school centre, while the following cases (B-E) can be done in a single run.

(B) Free choice of school: allocation modelling that does not take any district boundaries into consideration but employs school capacity limitations equal to the capacities of the current schools. 39 students are not allocated to a school because the capacities of the neighbouring schools are exhausted before the total demand is met.
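How capacity limits can leave some students unallocated may be illustrated with a deliberately simplified sketch. It uses a greedy rule (shortest segment-to-school distances are served first) on made-up figures; this is an assumption for illustration and not the procedure used by the ArcInfo allocation tool, and all names and numbers are hypothetical.

```python
def allocate_with_capacity(distances, demand, capacity):
    """Greedy capacity-constrained allocation.

    distances: {(segment, school): metres}
    demand:    {segment: number of students}
    capacity:  {school: number of seats}
    Returns ({(segment, school): students}, unallocated_students).
    """
    seats_left = dict(capacity)
    students_left = dict(demand)
    assigned = {}
    # Serve the shortest trips first until seats or students run out.
    for (segment, school), _ in sorted(distances.items(), key=lambda kv: kv[1]):
        n = min(students_left[segment], seats_left[school])
        if n > 0:
            assigned[(segment, school)] = n
            students_left[segment] -= n
            seats_left[school] -= n
    return assigned, sum(students_left.values())

# Hypothetical data: 18 students, but only 15 seats in total.
distances = {
    ("s1", "A"): 100.0, ("s2", "A"): 150.0,
    ("s1", "B"): 400.0, ("s2", "B"): 300.0,
}
assigned, unallocated = allocate_with_capacity(
    distances, demand={"s1": 10, "s2": 8}, capacity={"A": 12, "B": 3}
)
print(assigned, unallocated)
```

Note that this sketch may split the students of one segment between two schools, which a district-based solution would normally avoid.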

(C) Free choice of school: no school capacity limits are imposed during the allocation modelling.

Fig. 2. Allocation of students to 11 existing schools and 1 new school which is located where the total travel costs are minimized.

(D-E) This example deals with the hypothetical situation that a site should be selected for a new school in the area in order to improve the overall accessibility to schools. Location-allocation modelling tools may be used to point out the best location, given 11 existing schools and a specific objective, which in this case is to minimize the total travel time for all students. Location-allocation modelling in ArcInfo will presently not take school capacity into account, and the capacity of each school must therefore be adjusted subsequently. It has been shown that the best location will be at a node point (Hakimi, 1964), and all node points in the area are identified as candidate sites, i.e. possible locations for the new school. Unfortunately, it is necessary to represent the student locations differently for location-allocation modelling in ArcInfo than was the case in the previous examples. Student demand has to be associated with node points rather than road segments. The transformation has been done by assigning half the demand from one street to each of its end node points, but in principle one could have a node for each student. Due to the differences in demand representation it is not possible to compare the mean travel distances directly with those of the previous examples. (D) shows the spatial properties of an allocation to the existing schools, similar to (C). Fig. 2 and (E) show the results of locating a 12th school in the area.
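The two steps described above, moving segment demand to end nodes and testing every candidate node as a site for one additional school, can be sketched as follows. This is a brute-force illustration on a hypothetical five-node street with one existing school; the function names and all figures are assumptions, capacity is ignored as in the text, and the search is not the ArcInfo algorithm.

```python
import heapq

def shortest_distances(graph, source):
    """Dijkstra: network distance from source to every reachable node."""
    dist = {source: 0.0}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist[node]:
            continue  # stale heap entry
        for nxt, length in graph[node]:
            nd = d + length
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                heapq.heappush(heap, (nd, nxt))
    return dist

def node_demand_from_segments(segment_demand):
    """Assign half of each segment's students to each of its end nodes."""
    demand = {}
    for (a, b), students in segment_demand.items():
        demand[a] = demand.get(a, 0.0) + students / 2
        demand[b] = demand.get(b, 0.0) + students / 2
    return demand

def best_new_site(graph, node_demand, existing, candidates):
    """Return (total_cost, node) for the best single additional site."""
    sites = set(existing) | set(candidates)
    dist_from = {s: shortest_distances(graph, s) for s in sites}

    def total_cost(open_sites):
        # Every demand node travels to its nearest open site.
        return sum(
            weight * min(dist_from[s].get(n, float("inf")) for s in open_sites)
            for n, weight in node_demand.items()
        )

    return min(
        (total_cost(list(existing) + [c]), c)
        for c in candidates if c not in existing
    )

# Hypothetical street n1 - n2 - n3 - n4 - n5, each segment 100 m long.
segment_demand = {("n1", "n2"): 4, ("n2", "n3"): 4,
                  ("n3", "n4"): 10, ("n4", "n5"): 10}
graph = {}
for a, b in segment_demand:
    graph.setdefault(a, []).append((b, 100.0))
    graph.setdefault(b, []).append((a, 100.0))

node_demand = node_demand_from_segments(segment_demand)
cost, site = best_new_site(graph, node_demand, existing=["n1"],
                           candidates=list(graph))
print(site, cost)
```

With the existing school at n1 and most of the demand at the far end of the street, the search selects n4, where the summed student-metres are lowest.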

Experiment   Mean student travel   Mean student travel   Longest student travel
             distance, all         distance, remotest    distance, remotest
             schools (m)           school (m)            school (m)
----------   -------------------   -------------------   ----------------------
A            641                   971                   2435
B            612                   911                   -
C            574                   750                   -
D            560                   747                   1676
E            500                   668                   1440

Table 1. Overall spatial properties of school districts identified by the different location-allocation experiments.

Alternative network models

The location-allocation examples described above assume that travel cost may be expressed as distance along centre-of-the-road lines. It is questionable, however, whether this type of network models the real transport lines of pedestrians and cyclists very well. Moreover, minimizing travel distance is only one concern when laying out school districts. Traffic safety for school children is another, and in many cases a longer distance will probably be accepted if a safer route can be found.

In order to show how the choice of best route may be influenced by, e.g., safety considerations, a strategy for incorporating risk assessment into the travel cost modelling will be shown. The general goal is to assign higher impedance values to road segments which are less desirable to use for transportation.

A more sophisticated network model has been established for a part of the study area, see fig. 3. Roads and paths are now represented not by their centre line but by parallel lines that represent the actual transport lines of pedestrians and cyclists. In this case the conversion is done by creating buffer zones along the road lines, but a more precise method would be to use more detailed technical maps of the infrastructure. Lines that represent possibilities for street crossing have been added to the network at street corners and other suitable places.

Traffic modelling and risk assessment have been the subject of numerous investigations and a detailed review of the theories behind these will not be given here. Two major factors seem relevant in the current context: 1) risk associated with walking or bicycling along a road, and 2) risk associated with road crossings. Risk data is assigned to the network lines by adapting methodologies published by The Danish Road Directorate (1992) which present the main principles involved in the calculations of risks.

The following road parameters are added to the network:

YDT: One-year average number of cars per 24 hours

V: Average car speed

F: Existence of sidewalk: 0.1 = yes, 0.5 = no

C: Existence of cycle paths: 0.1 = yes, 0.5 = no

RC: Facilities for road crossing, e.g. zebra crossing, tunnels etc.: 0.0 = tunnel, 0.2 = zebra crossing, traffic light etc., 1.0 = no facilities

The YDT figures are supplied by the Municipality of Copenhagen (1997) for a selection of roads. Data for the rest of the roads are estimated from the known YDT. The fraction of heavy vehicles in the YDT is also part of the risk assessment but will be ignored here.

The risk (R1) of moving along a road side is computed by the following formula:

R1 = 1 + (0.05 * sqrt(YDT) * (V/50)**3 * (C + F))

For the network in question the resulting range is 1 (safe) to 6.6 (most unsafe).
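The R1 formula translates directly into a small function. The sketch below assumes that V is given in km/h, since the formula divides it by 50; the function name and the traffic figures in the example are hypothetical, while the coefficient values for C and F are those listed above.

```python
import math

def risk_along_road(ydt, v, c, f):
    """R1 = 1 + 0.05 * sqrt(YDT) * (V/50)**3 * (C + F).

    ydt: one-year average number of cars per 24 hours
    v:   average car speed (assumed to be in km/h)
    c:   cycle path coefficient (0.1 = yes, 0.5 = no)
    f:   sidewalk coefficient   (0.1 = yes, 0.5 = no)
    """
    if ydt < 0 or v < 0:
        raise ValueError("traffic volume and speed must be non-negative")
    return 1 + 0.05 * math.sqrt(ydt) * (v / 50) ** 3 * (c + f)

# Hypothetical road with 10,000 cars per day travelling at 50 km/h.
with_facilities = risk_along_road(10_000, 50, c=0.1, f=0.1)
without_facilities = risk_along_road(10_000, 50, c=0.5, f=0.5)
no_traffic = risk_along_road(0, 50, c=0.5, f=0.5)
print(with_facilities, without_facilities, no_traffic)
```

For this hypothetical road the value is about 2 with both a sidewalk and a cycle path and about 6 with neither, while a road without traffic gets the minimum value of 1. The cubic speed term means that speed changes affect R1 much more strongly than changes in traffic volume, which enters only through its square root.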

The risk of crossing a road (R2) is computed from the amount of traffic and the average speed by:

R2 = 0.1 * sqrt(YDT) * (V/50)**3 * RC

Through the RC factor, the risk of crossing is thus reduced to 20% in the case of a zebra crossing and to 0 in the case of a tunnel. For the network in question the resulting range is 0 (safe) to 26.2 (most unsafe). 90% of the crossings have an R2 value of 5 or less.
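The crossing risk can be sketched in the same way, with the RC factor looked up from the type of crossing facility. The facility keys ("tunnel", "zebra", "none"), the function name and the traffic figures are hypothetical labels for this illustration; the RC values are those given in the parameter list, and V is again assumed to be in km/h.

```python
import math

# RC values from the parameter list; the dictionary keys are made up here.
RC_BY_FACILITY = {"tunnel": 0.0, "zebra": 0.2, "none": 1.0}

def risk_of_crossing(ydt, v, facility):
    """R2 = 0.1 * sqrt(YDT) * (V/50)**3 * RC."""
    if ydt < 0 or v < 0:
        raise ValueError("traffic volume and speed must be non-negative")
    rc = RC_BY_FACILITY[facility]
    return 0.1 * math.sqrt(ydt) * (v / 50) ** 3 * rc

# Hypothetical major road: 10,000 cars per day at 50 km/h.
unprotected = risk_of_crossing(10_000, 50, "none")
zebra = risk_of_crossing(10_000, 50, "zebra")
tunnel = risk_of_crossing(10_000, 50, "tunnel")
print(unprotected, zebra, tunnel)
```

For this hypothetical road an unprotected crossing has a value of about 10, a zebra crossing one fifth of that, and a tunnel zero.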

The calculated risk values apply to a complete road unit for which the parameters are identical. The risk of travelling along this road will in some way depend on the distance travelled, and the risk values are therefore multiplied by a length factor in order to produce impedance values for the best route examples given below. Fig. 3 shows the network used for modelling the transport lines of pedestrians and cyclists.

Fig. 3. Identification of best routes in terms of risk and distance for pedestrians and cyclists.

It is, of course, a question of individual preferences how much longer one is prepared to travel in order to minimize risk, and it is, therefore, impossible to produce one absolute 'best route'. Three examples of best routes that reflect different preferences concerning distance vs. risk are indicated in fig 3.

Green route is the shortest route between the two points. The total length is 1600 m.

Red route takes into account the risk of travelling along the road (R1) weighted with road length using the expression: impedance = R1 * length (m) / 100

This route runs along small roads in residential areas with little traffic, but has to cross several major roads. The red route is only approx. 6% longer than the shortest, green route.

Blue route also includes the risk associated with road crossings using the expression: impedance = (R1 * length (m) / 100 ) + R2

This route is longer because it avoids dangerous crossings. It takes a detour for the purpose of using a tunnel for crossing one of the major roads. The blue route is approx. 60% longer than the green route.
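The effect of the three impedance expressions can be reproduced on a toy network. The sketch below assumes three hypothetical alternatives between S and T whose lengths, R1 and R2 values are invented so that each expression favours a different route; it does not use the data behind fig. 3, and the shortest-path routine is a generic Dijkstra search.

```python
import heapq

def best_route(edges, source, target, impedance):
    """Dijkstra search using impedance(attrs) as the cost of each line."""
    graph = {}
    for a, b, attrs in edges:
        cost = impedance(attrs)
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))
    dist, prev = {source: 0.0}, {}
    heap = [(0.0, source)]
    while heap:
        d, node = heapq.heappop(heap)
        if node == target:
            break
        if d > dist[node]:
            continue  # stale heap entry
        for nxt, cost in graph[node]:
            nd = d + cost
            if nd < dist.get(nxt, float("inf")):
                dist[nxt], prev[nxt] = nd, node
                heapq.heappush(heap, (nd, nxt))
    if target not in dist:
        raise ValueError("target cannot be reached from source")
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return path[::-1]

# Hypothetical network: length in metres, r1 = risk along the line,
# r2 = risk of the crossing represented by the line (0 if none).
EDGES = [
    ("S", "ga", {"length": 800, "r1": 6.0, "r2": 0.0}),    # busy main road
    ("ga", "T", {"length": 800, "r1": 6.0, "r2": 0.0}),
    ("S", "ra", {"length": 840, "r1": 1.5, "r2": 0.0}),    # quiet streets
    ("ra", "rb", {"length": 20, "r1": 1.0, "r2": 20.0}),   # unprotected crossing
    ("rb", "T", {"length": 840, "r1": 1.5, "r2": 0.0}),
    ("S", "ba", {"length": 1270, "r1": 1.2, "r2": 0.0}),   # detour
    ("ba", "bb", {"length": 20, "r1": 1.0, "r2": 0.0}),    # tunnel
    ("bb", "T", {"length": 1270, "r1": 1.2, "r2": 0.0}),
]

green = best_route(EDGES, "S", "T", lambda e: e["length"])
red = best_route(EDGES, "S", "T", lambda e: e["r1"] * e["length"] / 100)
blue = best_route(EDGES, "S", "T",
                  lambda e: e["r1"] * e["length"] / 100 + e["r2"])
print(green, red, blue)
```

Because only the impedance function changes between the three runs, the same network and search routine can serve any weighting of distance against risk.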

Conclusions and acknowledgements

The examples shown above are all computed based on the present population distribution. The actual computations only require a few minutes, once the digital map base has been established, making the tools suitable for scenario and forecasting purposes.

The current study shows how to perform location-allocation modelling in a situation where only aggregated data are available. It may be possible to obtain a more accurate result in the near future following the establishment of a database of geocoded addresses in Denmark.

Other spatial objectives, e.g. concerning traffic safety, may be incorporated by establishing suitable network models.

The data collected are suitable for experiments with a wide range of other issues related to network analysis. Other means of transportation, e.g. bus, may be incorporated, and more emphasis may be put on the suitability of the transportation links.

I would like to thank the Statistical Office and the Department for Schools, Copenhagen Municipality, and KRAK's Map Publishing Firm for their help in providing the necessary data.

References

Danish Road Directorate (1992): Evaluation of highway investment projects. Review of appraisal methodology. Economic-Statistical Department, The Danish Road Directorate.

Dept. for schools (1996): Bilag til styrelsesvedtægt, Direktoratet, Københavns Skolevæsen

Esri (1995a): Location-Allocation in ArcInfo. Guide. Environmental Systems Research Institute, 1995.

Esri (1995b): Network Guide. Environmental Systems Research Institute

Ghosh, A. & G. Rushton, eds. (1987): Spatial analysis and location-allocation models. New York.

Haggett, P., A. D. Cliff & A. Frey (1977): Locational Models II: Allocating. London.

Hakimi, S. L. (1964): Optimum location of switching centres. In Operations Research, 12. Here cited from A. Markan, 1974.

KRAK (1995): The digital maps are used by permission (R/950726/1).

Markan, Anette (1974): Location-Allocation-Models og deres anvendelse i lokaliseringsplanlægning. København, 1974.

Municipality of Copenhagen (1997): Færdselstællinger og andre trafikundersøgelser 1992-1996.

Author

Lasse Møller-Jensen, Assc.Prof.

Institute of Geography, University of Copenhagen

Øster Voldgade 10

1350 København K

Tel: 35322500

Fax 35322501

Email: LMJ@GEOGR.KU.DK