Using Shape Analyses for Placement of Polygon Labels

Polygons may be convex or non-convex, and they may or may not have holes. Label placement algorithms that locate a label at a centroid have only been fully successful for convex polygons without holes. The automatic labeling function in ArcView does not work well for non-convex polygons or for polygons with holes. The objective of our proposed method is to provide a better label placement strategy to improve the probability that labels are placed in meaningful locations inside all polygons, convex or otherwise, with or without holes. The algorithm we develop uses shape analyses such as compactness, weighted areas, region decompositions, and region skeletons, and is implemented in Avenue. Our implementation has been used in Ohio for Delaware County's Current Agriculture Use Valuation (CAUV) program. In this large-scale mapping application, labels were placed on areas representing different combinations of soils and land cover within each parcel participating in the program.

Text labels play an important role in identifying map objects. Labels placed at an unsuitable location might lead the map maker and the map user to fail in map communication. The problem of finding a label position inside a convex or non-convex polygon that may or may not have holes has been resolved by geometric methods or rule-based methods. For example, finding naive center point, centroid, bounding box, and largest empty rectangle inside a polygon are classified into geometric based algorithms. Rule-based methods use label placement rules based on a small number of predefined candidate label positions. Geometric based methods have the advantage that they do not require predefined label positions. If a polygon is convex without holes, the simple geometric methods such as finding a simple center point, centroid, or the center point of bounding box usually provide an acceptable label position. However, it is far less likely that a calculated label position is inside a polygon or at a location that would represent the entire area of a polygon if the polygon is non-convex or has holes. More advanced methods such as finding the largest area rectangle or circle inside a polygon and then finding a center point of the rectangle or circle could be considered.

Polygons created by Delaware County's Current Agriculture Use Valuation (CAUV) program result from the polygon intersection between soils shape polygon and parcels shape polygon. The soils shape file contains polygon boundaries based changes of soil type and the polylines frequently bend and wind and almost appear to be smooth curves, possibly with holes. The parcels shape file contains property boundaries, most of which are convex polygons. The intersection operation between the polygon layers creates a combination of soils/land use polygons that are inside of a parcel boundary and the boundary of the parcel subdivides the original soils/land use polygons (Figure 1).

The labeling options provided by ArcView do not provide a perfect performance for the CAUV application. "Use theme's text label placement property" provides an unacceptable label position that is very close to the boundary of polygon. "Find best label placement" provides more acceptable positions but it does not satisfy all forms of polygons (Figure 2). Some cases are acceptable while others are not acceptable in either options (unacceptable polygons are highlighted in yellow color). In "Use theme's text label placement property", the label is placed very close to the polygon boundary. This situation is overall improved in "Find best label placement", but the label is placed on the narrow path that connects two large areas in one of the highlighted polygons. It would have been better to place a label on the largest sub area in the case of a concave polygon so that the label accomplishes its role to identify a polygon accurately. The other highlighted polygon illustrates that the label is also close to the inner boundary of the polygon. Another solution is needed for label placement of CAUV project to provide an optimum label placement for any forms of polygons. The results of this solution should satisfy two conditions: the resulting label location should stay inside in a polygon and also should represent the entire area of a polygon (skeleton of a polygon).

This paper consists of five sections. Section 1 is the current section. Section 2 describes several existing algorithms for finding a label position for a polygon. Section 3 introduces our proposed algorithm, and Section 4 describes applying the proposed algorithm to the CAUV application as well as the analysis of the algorithm. Finally, section 5 concludes the paper with a summary of the algorithm.

The simplest method calculating a center point is as follows [1][2]. It averages points by n, the total number of points.

If n is 3, the calculated position is the center of gravity of triangle. If n is increased and shape is no longer convex, this simple method picks an incorrect location because it finds the center of mass of a massless polygon with equal point masses at its points [2]. Thus, this simple method is suitable if the polygon is simple convex and the distribution of points are favorable.

The above equation can be described by barycentric combination [3]. These are weighted sum of points where the weights sum to one:

In the case of triangle, the barycentric combination is the centroid ( xc, yc) of a triangle with three points.

The term "barycentric combination" is derived from barycenter or center of gravity [3]. If the Pi are centers of gravity of objects with masses mi, then their center of gravity Pc is given by:

This is the same equation given for barycentric combination if the normalization is applied. The centroid could be found if iteration is done until weights (subdivision area with centroid) for each points are equivalent or within the threshold.

The center of gravity is also calculated using the first moments of a closed planar region R along the x- and y-coordinates and an area of polygon [2].

Where A is area of polygon and µx and µy are first moment along x and y, respectively.

However, this method provides an incorrect result if a polygon has holes. Finding the largest area axis-parallel rectangle in a polygon [4] or find a label location using bounding box could be also considered. The way to find a center point of a polygon in Arc View is explained in its help section. It is described as "the center of polygon is determined by taking the center point of a polygon's bounding box. If the center of the bounding box does not fall inside a polygon, then the point is moved in the shortest X direction to put it inside a polygon".

Many methods have been discussed and the comparison for each method using polygons of CAUV is given in Figure 3. In the next section, we discuss the proposed method for CAUV.

Figure 3. Naive center(left), Centroid(center), and center of bounding box(right)

3. Inside buffer based label placement for CAUV polygon

Label placement is known to be an NP-hard or intractable problem [5]. In order to deal with provably intractable problems and find an acceptable solution, not the exact best solution, several reasonable approaches have been suggested [5].

1. Look for special cases of the general problem that apply to your particular situation.
2. Add constraints or make limiting assumptions.
3. Change the problem domain.
4. Change your expectation for a solution by accepting piecewise or sub-optimal solutions.
5. Accept approximations after studying the latest results in approximation theory.

In order to provide an acceptable label placement for CAUV polygon, two conditions have been discussed in the introduction. First, the label location should stay inside in a polygon, whenever possible. Second, it also should represent the entire area of a polygon. To satisfy these two conditions, we developed the following method.

One of fundamental operations in Geographic Information Systems (GIS) is buffering. Buffering is used to create a zone of specified width around a point, line, or polygon. The resulting buffer forms a new polygon that is used in proximity analysis to find which entities occur within or outside the buffer area. For example, find all wells within 100 feet from a proposed location. The proposed algorithm is based on inside buffering of polygon to find the largest area inside of a polygon no matter what it is convex, concave, or with holes.

Inside buffering can be considered as the skeleton of a shape such as point, line, or polygon. For example, the buffering of a point creates a circle. Vice versa, if we increase the buffer distance for the inside (not outside), the inside buffering of a circle will end up at a point. The main concept of this proposed algorithm is to find the largest sub polygon by inside buffering until it is convex without holes and then find a label position inside it. The resulting label is definitely placed inside the polygon and it also represents the largest area part of the polygon.

Compactness and rectangularity [6] are used to describe a region and determine the pseudo-convex type polygon. The term "pseudo-convex type polygon" means a polygon that is suitable to find a label location using algorithms for convex polygons and the result satisfies the objectives of label placement. Compactness is a ratio between the area and the square of the border length of a shape. For example, a circle is the most compact region in a Euclidean space. The higher values represent more winding shape. Rectangularity is the ratio between the area of any polygon and the area of its bounding rectangle. It represents 1 for a perfectly rectangular shape. It allows us to filter candidate shapes for label placement by using the combination of both criteria. It is enough to use pseudo-convex type polygon to find a label position for CAUV instead of creating a convex polygon from pseudo-convex type polygon because pseudo-convex type polygon is created from inside buffer so it is located inside the polygon and there is no big difference in label positions in both cases. This is not a necessary condition but a sufficient condition to become a perfect convex polygon.

There are several scenarios for inside buffering. First, in the case of non-convex polygons without holes, inside buffering can create multiple polygons depending on the shape of a polygon. The largest one among them is selected if multiple polygons are created and this operation is iterated until compactness and rectangularity conditions are met(if each multiple polygons has almost same area, multiple labeling could be considered). Those multiple polygons consist of skeleton region of the original polygon. If the compactness and rectangularity conditions are not met after iteration, the iteration is ended if the area of the subpolygon is small enough compared to the area of the original polygon. Second, the case of convex with holes, inside buffering creates multiple polygons with no holes if the shape of polygon is not symmetrical. Narrow neck is not connected if the buffer distance is increased. In the case of symmetrical polygons, for example, a circle with a smaller circular hole, this method neither creates pseudo-convex type polygon nor any polygon without a hole. Even though this case rarely occurs in CAUV application, the alternative solution is discussed for that case in section 4. The case of concave with or without holes can be applied to the first and the second case, respectively. Figure 4 shows each case with animated effects of inside buffering. Animation is only shown in the HTML version of the paper.

Figure 4-1 Figure 4-2 Figure 4-3 Figure 4-4

Figure 4. Inside Buffering and pseudo-convex type polygon

4. CAUV project in Delaware County, OHIO

The purpose of Ohio's Current Agricultural Use Valuation (CAUV) is to provide tax breaks to farmers who keep their land in farming. The CAUV value of a parcel is determined by a formula that involves five factors such as yield information, crop patterns, non-land production costs, and capitalization rates. It is also critical to identify soil types, land uses, and the area for each soil type in the parcel to calculate the final CAUV value. Avenue is used to create the CAUV application for Delaware County Auditor's office. Figure 5 shows the Avenue based Graphic User Interface (GUI) for this application. The main dialog consists of "Call Find Table", "Search by Pin", "Manual Label", "Auto Label", "Show Map", and "Print Map".

Figure 5. CAUV Application System

The "Call Find Table" assists in finding a Parcel Identification Number (PIN), cross-referenced with the owner name, by providing a search table that is sorted by PIN. After inserting the PIN and selecting the "Search by Pin" button, the search engine finds the proper parcel and intersects it with the soils/land use layer and calculates each soils/land use combinations'acreage. This application requires that the parcel's deeded acreage will be maintained and therefore, the ratio of the total calculated acreage and total deeded acreage is calculated and then it is applied to each combination to then be inserted into the final table on the layout. The operator only sees the results which is a parcel highlighted by a blue outline and centered on the window. At this point, there are two options to label a combination of soil types and land use texts on each intersected soil/land use type polygons. The manual option guides the operator by highlighting each polygon selected from the table. The auto option uses Arc/View's labeling options. The "Show Map" button, displays the layout including the map of the parcel overlaid with the soils and land uses, a table including each soils/land use combinations'acreages, and finally the "Print Map" button sends the customized layout to the printer.

In order to solve problems mentioned in the introduction, the proposed algorithm is applied to the CAUV program to provide optimal label locations for each polygon. ArcView provides very useful classes and their associated functions, which allow the users to create customized applications using Avenue scripts. The proposed algorithm is implemented in Avenue script. The results for several parcels are shown in Figure 6.

Figure 6. New Label Location by Inside Buffering.

The proposed algorithm uses inside buffer to get skeleton or sub polygons and shape analysis to find a proper skeleton polygon for calculating a label location. The problem of finding label location for any cases of polygon in CAUV can be solved in this proposed algorithm. However, the disadvantage of this method is that there is a special case that cannot be solved. For example, if a circle has circle hole at the center of the circle or a square has square hole at the center of the square, this algorithm will create circle boundary or rectangle boundary inside the region instead of pseudo-convex type polygon. For that case, other criteria should be added. One possible solution may involve finding the label location on the skeleton line instead of pseudo-convex type polygon. Inside buffer creates skeleton line instead of point if the shape has a symmetry hole (Figure 7). In other words, the skeletons of a polygon by inside buffer consist of line and point. The other possible solution is to find the middle location between outer and inner boundary.

Figure 7. Skeleton line in the case of symmetry shape.

5. Conclusion

In this paper, we presented an algorithm to find a desirable label location for all cases of polygons for the CAUV application in Delaware County. This algorithm provides a simple but reliable solution to find label locations to identify soil type region. The main advantages of this algorithm are: (i) it does not require convex polygon to guarantee finding a label location inside a polygon; (ii) it uses buffer, one of fundamental operations of GIS, and shape analysis to find pseudo-convex type polygons; (iii) it provides the skeleton of polygons by either points or lines; (iii) we can guarantee that the resulting label location is inside of a polygon and that it represents the largest area of a polygon.

6. Reference

[1] Andrew S. Glassner, 1990, Graphics Gems, AP Professional.

[2] Paul Heckbert, 1994, Graphics Gems IV, AP Professional.

[3] Gerald Farin, 1996, Curves and Surfaces for Computer Geometric Design: A Practical Guide Fourth Edition, Academic Press.

[4] Karen Daniels, Victor Milenkovic, and Dan Roth, 1997, Finding the largest area axis-parallel rectangle in a polygon, Computational Geometry, No. 7, 1997, pp125-148.

[5] Alan Saalfeld, 2000, Complexity and Intractability: Limitations to Implementation in Analytical Cartography, Cartography and Geographical Information Science, Vol. 27, No. 3, 2000, pp.239-249.

[6] Milan Sonka, Vaclav Halvac, and Roger Boyle, 1998, Image Processing, Analysis, and Machine Vision, PWS Publishing.

Hoseok Kang, Ph. D. Candidate
Civil and Environmental Engineering and Geodetic Science
The Ohio State University
470 Hitchcock Hall, 2070 Neil Avenue
Columbus, Ohio 43210-1275
E-mail: kang.77@osu.edu

Shoreh Elhami, GIS Director
Delaware County Auditor/DALIS Project
94 n. Sandusky Street
Delaware, Ohio 43015
Phone: 740-833-2070
Fax: 740-833-2069
Email: selhami@co.delaware.oh.us