Gregory A. Newkirk

Determining Vacant Buildable Lands Using Vector and Cadastral Data: Problems in Model Building

Abstract

GIS can assist in urban growth management by identifying vacant and buildable lands within a given area. However, numerous assumptions about the data can result in a model which is complex and may not accommodate later refinements without a corresponding loss in historic and comparative representation. This paper examines these problems and how model building can move beyond analytical considerations toward establishing a long-term comparative history.


Introduction

At the heart of land use planning is the development of a long-range plan (20-50 years) which identifies the demands of urban growth and how to accommodate them. The plan consists of a population projection and the corresponding need for land to accommodate residential, commercial and industrial development as well as public services such as roads, parks and schools. The plan begins with an inventory of all land uses within and surrounding an urban area. Every parcel of land is coded regarding its type and intensity of use (including vacant and buildable land) to determine if there is sufficient capacity to absorb the projected growth. However, this determination is filled with assumptions that result in a fuzzy logic that permeates the entire process. The logic is fuzzy because the assumptions are largely qualitative and reach into the distant future. Reasonable people could disagree about them, while the slightest adjustment could have substantially differing effects. Historically, this was not a great concern since development constraints were much less substantial then. At that time only the location of vacant and buildable lands was of interest so that urban services such as roads and schools could be planned in areas where they were likely to be built. This is no longer the case as all urban development can be restricted within a tight geographic boundary. Concurrently, environmental concerns can remove a substantial amount of land from within that boundary. Now the question arises: is there sufficient vacant and buildable land to accommodate future development? The answer lies in the ability to accurately determine vacant and buildable land as well as to track it over time so that it is not used up faster than projected.

Because of this demand for greater accuracy and given the complexity of land capacity analysis, GIS, with its power and speed, has become the tool of choice. Yet the assumptions that created the fuzzy logic remain, and the slightest change in one of them can substantially alter the analysis. This would not be a problem if a sole determination were made as part of the long-range plan. However, yearly determinations are expected, both because GIS renders them possible and because of the need to identify and track vacant and buildable land over time. Comparisons of yearly determinations can be unreliable if any of the assumptions are changed from year to year. And, given that these assumptions are based upon an ever-changing world, there is little doubt that the assumptions will change during the planning period.

Assumptions and Variables

Vacant Land

In beginning a land capacity analysis, a number of assumptions must be made so that numbers can be plugged into the variables. The first assumption requires a determination of what constitutes vacant land. It can be identified as any land that is either clear of buildings or any land where buildings possess insufficient value to constitute usefulness. Setting this value is an assumption that may or may not prove to be correct and can change over time. However, once a value has been set, it can be plugged into the equation. Land can also be partially developed and likely to experience additional development during the planning period. The process of determining underdevelopment includes many variable conditions regarding land size, zoning category and building size. For example, a five-acre parcel within a small-lot, single-family zone with a 1,500 square foot house is assumed to be available for residential subdivision. However, the same parcel with a 10,000 square foot house would be considered an estate and not likely to be further subdivided. Again, the same parcel with a 10,000 square foot apartment building in a high-density multi-family zone (e.g. 20 units per acre) would likely experience additional development.
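The three five-acre examples above can be expressed as a small rule set. The following Python sketch is illustrative only: every threshold, field name and zone label in it is a hypothetical placeholder, since the actual values are exactly the assumptions that must be settled through the process described in this paper.

```python
from dataclasses import dataclass

SQFT_PER_ACRE = 43_560

# Hypothetical thresholds (assumptions, not values from this paper).
MIN_BUILDING_VALUE = 10_000       # dollars; below this a building is treated as having no useful value
ESTATE_HOUSE_SQFT = 5_000         # a house at least this large marks the parcel as an estate
MIN_SUBDIVIDABLE_ACRES = 1.0      # smaller single-family lots are not considered for subdivision
MIN_MULTIFAMILY_COVERAGE = 0.25   # building floor area / lot area below this suggests room to build


@dataclass
class Parcel:
    acres: float
    building_value: float   # assessed improvement value in dollars
    building_sqft: float
    zone: str               # "single_family" or "multi_family" (hypothetical labels)


def classify(parcel: Parcel) -> str:
    """Return 'vacant', 'underdeveloped' or 'developed' for one parcel."""
    if parcel.building_value < MIN_BUILDING_VALUE:
        return "vacant"
    if parcel.zone == "multi_family":
        coverage = parcel.building_sqft / (parcel.acres * SQFT_PER_ACRE)
        return "underdeveloped" if coverage < MIN_MULTIFAMILY_COVERAGE else "developed"
    if parcel.acres >= MIN_SUBDIVIDABLE_ACRES and parcel.building_sqft < ESTATE_HOUSE_SQFT:
        return "underdeveloped"
    return "developed"
```

Keeping the thresholds as named constants, separate from the rules, makes each assumption visible and easy to record alongside a given year's results.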

Buildable Land

The second assumption requires a determination of what constitutes buildable land. Some parcels contain severe or unstable slopes. Others contain floodplains, wetlands, and other environmental features that can restrict development. However, there is often a problem of scale regarding these data. They are usually obtained from state and federal sources that capture them at much smaller (coarser) scales than those at which parcel data are captured. This limits the ability to determine the precise extent of the overlay. Even where data are captured at scales closer to that of the parcel data, later field surveys can reveal a very different extent than what has been developed from aerial photography or computer models.

Development Density

The third assumption requires the incorporation of the zoning geodataset into the model so that it can be classified into density categories. This pertains to the range of densities allowed within various zoning categories as well as multi-use zones (i.e. zones that allow both residential and commercial development). Assumptions need to be made regarding the density at which vacant and buildable land within each zoning classification will develop. However, they can be made at a later time. For now, each parcel needs only to be coded with the applicable zone.
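Once each parcel carries its zone code, the density assumptions can be applied later as a simple lookup. The Python sketch below shows one way this might be done. The zone codes, densities and the residential share for the multi-use zone are all hypothetical values chosen for illustration.

```python
# Hypothetical density assumptions, in dwelling units per buildable acre.
ASSUMED_DENSITY = {"R-6": 6.0, "R-20": 20.0, "MX": 12.0}

# Hypothetical share of a multi-use zone assumed to develop as residential.
# Zones not listed here are treated as fully residential.
RESIDENTIAL_SHARE = {"MX": 0.5}


def capacity_by_zone(parcels):
    """Sum dwelling-unit capacity per zone.

    `parcels` is an iterable of (zone_code, buildable_acres) pairs.
    """
    totals = {}
    for zone, acres in parcels:
        share = RESIDENTIAL_SHARE.get(zone, 1.0)
        units = acres * share * ASSUMED_DENSITY[zone]
        totals[zone] = totals.get(zone, 0.0) + units
    return totals


example = [("R-6", 10.0), ("R-6", 5.0), ("R-20", 2.0), ("MX", 4.0)]
capacity = capacity_by_zone(example)
```

Because the densities live in one table rather than in the parcel records, changing an assumption does not require recoding the parcels themselves.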

Figure 1 below shows how data from these three data types are merged together to produce a single geodataset that can be used in determining the amount of vacant and buildable land within a delineated urban growth boundary. The actual process can be extremely complex as it merges multiple geodatasets and queries numerous Cadastral records before processing each parcel to determine to what extent it can absorb urban growth.

Figure 1

Developing a Methodology

Developing and implementing a methodology is a substantial challenge. With the use of urban growth boundaries and other techniques to curtail urban sprawl, land capacity analysis is no longer just a technical exercise. It has become subject to extreme political pressure because of its impact on the development industry. Development interests want large tracts of land available to pursue economic opportunity. Yet, advocates for environmental protection and open space want growth restricted within tight geographic boundaries and away from sensitive areas. As well, neighborhood and livable-community advocates want to regulate all aspects of development to reduce its overall impact. In all, each group exercises whatever political muscle it may have and is prepared to challenge in a court of law any part of the process, including GIS analysis.

If GIS is to become part of a long-term comparative analysis, certain problems must be cleared up at the beginning. First, all assumptions must be settled at a high level. Second, metadata documentation must be substantial. Third, data summaries must be rounded to generalized levels. Fourth, derived data must be field checked by GIS staff. Figure 2 below identifies the steps to achieving a long-term comparative analysis.

Figure 2

Settle Assumptions at a High Level

When developing the assumptions that go into the vacant and buildable lands model, process is critical. A few technicians building the model without public scrutiny is a recipe for disaster. A diverse technical committee is needed to develop the assumptions, which are then explained to and approved by policy boards where environmental and development advocates have the opportunity to provide public input. Without this process, there will be inadequate support for the model's outcome. As well, local jurisdictions and other stakeholders should be included in the process. Failure to provide them with adequate participation undermines their support, which will be needed to sustain credibility for the process. Lastly, a concern for openness and public participation needs to be a consistent part of the entire process, including which data are chosen for the analysis and other aspects of developing the model.

Another matter that needs to be settled at the highest level is the date of the data measurement. The long-range plan has a tradition of measuring data up to the very last moment possible and deriving it from as many sources as possible. For example, Cadastral data on vacant parcels may not reflect construction in progress. Usually, once a building is completed and a final inspection or certificate of occupancy is issued, a building value is updated in the Cadastral database. However, data can be acquired from other sources to indicate construction in progress, land that is being platted though not yet recorded, and so on. Often this data is obtained manually and from various sources. A builder may be aware of construction on a site or a surveyor may be aware of land that is currently undergoing the platting process. As well, another data source could be introduced such as permit tracking software. The word-of-mouth data is especially problematic for maintaining consistency of methodology from year to year and the data within a permit tracking system may be as elaborate as Cadastral data. If permit data is to be used, it should undergo the same rigorous documentation as Cadastral data before use.

Document Data and Methodologies

Extensive metadata is needed for all data that is used. If the National Wetland Inventory is used in determining buildable area on vacant parcels, it is essential that data characteristics be identified. For example, if I were using NWI data from Washington State, I would identify that it was created by the Washington State Department of Ecology from aerial photography at a scale of 1:100,000 without any field checks. However, my parcel data is developed at a scale of 1:2,400 from field surveys and legal descriptions. When I create a spatial join between these two geodatasets, I know from the beginning that there will be limits to the data's usefulness. These limitations should be identified up front and made part of the public discussion. Lack of funding may prevent the gathering of wetlands data at the same scale as the parcels data, and the choice to use other data poses limitations to the methodology or its outcome. Knowing and publicly discussing these limitations will assure a necessary rigor as the model is developed and implemented.
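One way to make such limitations explicit is to carry a small metadata record with each geodataset and check it before any spatial join. The Python sketch below is a minimal illustration, not a metadata standard. The record fields, the `join_warnings` helper, the `max_ratio` threshold and the parcel source label are assumptions introduced here. The wetlands and parcel characteristics are those described above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LayerMetadata:
    name: str
    source: str
    scale_denominator: int   # 100000 means a capture scale of 1:100,000
    capture_method: str
    field_checked: bool


def join_warnings(a, b, max_ratio=5.0):
    """List the limitations to disclose before joining two layers.

    `max_ratio` is a hypothetical tolerance for the difference in scale.
    """
    coarse, fine = sorted((a, b), key=lambda m: m.scale_denominator, reverse=True)
    warnings = []
    ratio = coarse.scale_denominator / fine.scale_denominator
    if ratio > max_ratio:
        warnings.append(
            f"{coarse.name} (1:{coarse.scale_denominator:,}) is about "
            f"{ratio:.0f} times coarser than {fine.name} (1:{fine.scale_denominator:,})"
        )
    for layer in (a, b):
        if not layer.field_checked:
            warnings.append(f"{layer.name} has not been field checked")
    return warnings


WETLANDS = LayerMetadata("NWI wetlands", "Washington State Department of Ecology",
                         100_000, "aerial photography", False)
PARCELS = LayerMetadata("parcels", "local cadastral records (assumed label)",
                        2_400, "field surveys and legal descriptions", True)

limitations = join_warnings(WETLANDS, PARCELS)
```

The resulting list of warnings can be published with the results so that the limitations become part of the public discussion.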

As Cadastral data is particularly rich, metadata on each data field should be equally rich. Property usage is finely distinguished, including as many as 1,000 categories. Vacant land can be represented by as many as 10 categories and other uses such as mobile homes are often classified into more than one category. Understanding these distinctions can increase the level of refinement, sophistication and robustness of the model. For example, all vacant land could be combined into one category for analysis or it could be differentiated according to acreage categories, type of vegetation (e.g. timber or brush), existence of vacant or abandoned buildings, and so on. In addition to data fields, operations should be fully documented, such as the type of spatial join that is chosen and the expected outcome.

Finally, the entire methodology should be documented and become part of the public debate. After it has been scrutinized by environmental, development and other interests and accepted by public officials, it should be published. A good example of a published methodology can be found at the following web site: http://www.metro.dst.or.us/growth/doclibrary.html.

Data Reporting

Because of different scales and coordinate reference points, the level of precision at which the data is reported should be reduced. For example, a spatial join can be done on the 1:100,000 scale wetlands data with the 1:2,400 scale parcel data and report findings to the nearest hundredth of an acre. However, doing so would indicate a level of precision that does not exist. For this reason, summary data obtained from such a spatial join should be rounded to the nearest 10 acres or a higher increment. Also, raw data obtained after a spatial join of differently scaled data should not be reported unless these parcels are sufficiently large to accommodate this level of generalization. A similar approach for mapping should also be used. For example, if both wetlands (1:100,000) and parcel (1:2,400) data are depicted on the same map, the wetlands data should be shown as a thick fuzzy line or a granular shade pattern which would provide better representation of the data's accuracy.
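The rounding rule can be applied at the point where summary figures are produced. The Python sketch below assumes the overlay results are available as (category, acres) pairs. The function names and sample figures are hypothetical. Totals are summed at full precision and rounded only once, at the reporting step.

```python
import math


def round_to_increment(acres, increment=10):
    """Round to the nearest multiple of `increment`, with halves rounding up."""
    return int(math.floor(acres / increment + 0.5)) * increment


def summarize(overlay_rows, increment=10):
    """Total the overlay acreage per category, then round each total.

    `overlay_rows` is an iterable of (category, acres) pairs.
    """
    totals = {}
    for category, acres in overlay_rows:
        totals[category] = totals.get(category, 0.0) + acres
    return {cat: round_to_increment(total, increment) for cat, total in totals.items()}


rows = [("wetland", 3.27), ("wetland", 12.41), ("floodplain", 48.9)]
report = summarize(rows)
```

An explicit helper is used instead of Python's built-in `round`, which rounds halves to the nearest even value and could report 1,235 acres as 1,240 in one place and 1,245 acres as 1,240 in another.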

While summary data should be rounded to a level of generalization, no other generalization of the data should be made at the level of GIS analysis. Such generalization requires assumptions about the data and the methodology is already loaded with assumptions. Instead, report the data according to its various categories.

Checking the Data

If maps are produced that indicate the vacant and buildable status of individual parcels, it is critical that the data indicating vacancy is field checked. If errors are found by antagonistic interests, they are not beyond using these errors to undermine the credibility of the GIS. However, instead of field checking, a more productive way to check the data is to acquire aerial photography that can be registered to parcel data. High-resolution photography (sub-meter) is a valuable addition to any GIS database that can be used for many purposes. For this particular process, it provides a source of documentation and the ability to perform in-house reviews.

Conclusion

Once the model is developed, writing code for vacant and buildable lands analysis can be quite complex. An elaborate and interrelated set of AMLs (or a CASE-developed model) can involve hundreds of lines of code, and a concern arises when the code must be altered from year to year as assumptions are changed. This concern is not in modifying the AMLs or CASE model, but in comparing the outcome of one process against another. It may be possible to capture all of the raw data to CD-ROM or in versioning software so that the new process can be run on historic data as well as current data for comparison, but this is not advisable. Instead, if the above steps are followed, the methodology that is developed will be better situated to resist change brought about by political pressure and internal manipulation.


Gregory A. Newkirk, AICP
Geographic Information Systems Coordinator
City of Vancouver
P.O. Box 1995
Vancouver, WA 98668-1995
Telephone: (360)696-8012
Fax: (360)696-8029
Email: greg.newkirk@ci.vancouver.wa.us