Tong Zhou
SHOW ME THE LOCATION
- A GIS Approach on Discount Store Location Study
ABSTRACT
Store location research is a relatively young field. It is becoming increasingly important to a company's survival and success, and increasingly popular because of the application of information technology. This study focuses on the locational characteristics of large discount stores in five counties of Metropolitan Atlanta. The common factors that affect discount store locations are identified and form the basis of the study.
One part of this study describes the factors apparently affecting the distribution of discount stores in the study area. The spatial patterns of store distribution are presented and analyzed using GIS. Descriptive statistics of major variables are presented, as are correlation and regression analysis on the variables.
Although the selected variables affect discount store locations to varying degrees, Population Density, Number of Highway Exits, and Housing Units Built in 1980-1990 show statistical significance under certain conditions.
Finally, this paper explains the findings and their implications for both developers and planners. It also discusses the aspects to be taken into account in future location evaluations, the limitations of the study, and potential directions for future research.
INTRODUCTION
Store location research started about the beginning of this century and gradually evolved into an independent field with its own theories and methodologies. Its purpose is to define the internal relationship between a store's location and its performance, and its external effects on land development patterns and urban systems. The findings have been widely used by various retailers, especially retail chain store firms, to pick the most feasible site for their investments, and by planners to promote public welfare, safety, and health (Applebaum, 1968). While store location research may not guarantee the best site for a new store, it definitely can help avoid bad locations which may cause a huge loss for retailers and diseconomy for the public since retail investment is a costly long-term economic and social commitment.
PURPOSE AND SCOPE OF THIS STUDY
Previous researchers have done a great deal of work on store location, and several models have been developed and widely applied in the real world. However, most of the studies are site-specific and look at individual stores or store chains from a micro viewpoint. This is probably because store location research started on behalf of retail chain firms (Applebaum, 1968) and was concerned with the practical application of the research rather than with establishing theory. Consequently, the findings are limited and difficult to generalize.
Another reason is that, historically, fewer chain stores were in operation, resulting in a lack of available data for research at a larger scale. With the national expansion of several big stores, some research did start to look at this topic at a broader scale, such as the Metropolitan Statistical Area (MSA) level (Farris, 1996; Ingene, 1984). While the findings at this scale could be generalized to explain the situations of other stores, meaningful findings are in fact limited because of the heterogeneous nature of regions.
The scale of this study is the census tract level, which is expected to avoid the limitations of the previous scales. The size of census tracts allows some site-oriented analysis, as well as analysis in the context of the MSA; it is neither too detailed nor too coarse. The type of store studied is the large discount store, for which location studies are limited. The following list summarizes the primary objectives of this study:
CHARACTERISTICS OF DISCOUNT STORES
As their name indicates, discount stores discount prices on many merchandise items. This is a major factor for their survival and success. Discount Merchandiser (June 1993, 37) defines a discount store as:
A limited service retail establishment utilizing many self-service techniques to sell hardgoods, health & beauty aids, apparel and other soft goods, and other general merchandise with centralized checkout service. It operates at uniquely low margins, has a minimum annual volume of $1 million, and has at least 10,000 square feet of total space.
From this definition some differences between discount stores and traditional retail stores can be noted. First, the store scale is much larger; second, the strategy of the store is to maximize profit depending mainly on the quantity of merchandise sold rather than high profit on individual products; third, the range of merchandise is more complete and diversified; finally, the operation is highly efficient with active use of modern technology.
The evolution of the discount industry produced several giant corporations with national or even international markets. The three biggest, Wal-Mart, Kmart, and Target, have shown the strongest expansion of their market share. They are full-line discount department stores with a variety of goods and services, which is the largest retail form in the discount industry. The big three captured almost half of the total discount market share with $151.1 billion in sales in 1995 and were expected to reach $157.3 billion in sales in 1996 (DSN, July 1, 1996). (see Table 1)
Table 1. Sales and Store Count of Wal-Mart, Kmart, and Target, 1995-1997
| Store | Sales 1995 (in million $) | Sales % change from 1994 | Store count 1/1995 | Store count 1/1996 | Store count 1/1997 |
|---|---|---|---|---|---|
| Wal-Mart | 54,330 | 1.84 | 1,990 | 2,090 | 2,110 |
| Kmart | 26,779 | -0.77 | 2,256 | 2,096 | 2,076 |
| Target | 15,800 | 16.18 | 611 | 676 | 743 |
Source: Discount Store News, July 1, 1996.
The numbers in the table show the important position of the discount industry in retailing and the increase of new stores nationally. This development trend reaches beyond the USA, into the international retail arena: Wal-Mart predicts that 25% of its total earnings will come from international markets in the next five years (DSN, July 15, 1996).
This global and national business expansion will result in more stores being built on sites with varied economic, social, geographical, political, and legal factors. Understanding the common factors which may affect every new store location decision is thus crucial to establishing a guideline for decisions. Some previous research on discount stores has identified such factors.
One is Childs's (Roca ed., 1980) study on discount-anchored shopping centers. The factors he discusses are the most important ones, though the list is not exhaustive. (see Table 2)
Childs (Roca ed., 1980) also discusses customers' shopping behavior. He points out that the discount store customer considers convenience of location to be the most important reason for shopping at a particular outlet, which implies the importance of accessibility.
Table 2. Key Factors to Be Considered for Location of A Discount-Anchored Shopping Center
| Key Factor | Issues to Be Taken into Account | Data Sources |
|---|---|---|
| Population | Population volume; Past and future trends; Area affected by population change | U. S. Census of Population; Current Population Reports |
| Income | Composition of total or partial market | Survey of Current Business; Current Population Reports |
| Competition | Competitors and their location; Competitors' strengths and weaknesses | Shopping Center Directory; Census of Retail Trade |
| Economy | Historical and future employment situations; Composition of employment sectors and employers | U. S. Census of Population |
Source: Roca, Ruben A. ed.. "Discount-Anchored Centers." Market Research for Shopping Centers. 1980.
While Childs's study is conceptual without data testing, a dissertation by Farris (1996) tested his hypothesis on discount store location. He focused his research on the structural determinants of discount store locations in the central cities of the top 50 MSAs in the USA.
Farris (1996) applied linear multiple regression to his data analysis. The results show that only the variable C%80-90H (percentage of housing structures built between 1980 and 1990) is significant, with an adjusted R square of 0.47. This indicates that land availability is an important factor for large discount store development in central cities, since the amount of housing construction reflects the degree of land availability.
METHODOLOGY OF THIS STUDY
The methodology for this study stems from the previous research and my own perspectives on store locations. It applies both GIS and statistical approaches.
Dependent and Independent Variables
My selection criteria for variables are based on the importance and generalization of the variables, the scale of my study, and the data availability. I included the variables relevant to demographics, competition and accessibility. However, subjective variables, such as managerial judgment, are not chosen because of the difficulty and doubtful validity of quantifying the values to be incorporated into a multiple regression formula. (see Table 3)
Table 3. Selected Variables and Their Sources
| Type | Variable | Source |
|---|---|---|
| Independent | Population Density (persons/square mile) | Bureau of Census |
| Independent | Median Household Income ($) | Bureau of Census |
| Independent | Percentage of Black (%) | Bureau of Census |
| Independent | Percentage of Poverty (%) | Bureau of Census |
| Independent | Housing Units Built, 1980-90 | Bureau of Census |
| Independent | Total Area of Competitive Stores (sf) | Chain Store Guide |
| Independent | Number of Highway Exits | Tiger/Line |
| Independent | Edge | Tiger/Line |
| Dependent | Store Size (sf) | Chain Store Guide |
The dependent variable I chose for this study is store size rather than store sales. The square footage of discount stores shows the evolution of discount stores over time. It is also an indication of the attractiveness of a store in the gravity model and provides a better comprehension of the scale of discount store space in a community (Farris 1996). It is interpreted in terms of how large a trade area is required to support a store of a certain size. Another advantage of using store size as a dependent variable is that the findings can be applied to evaluate the feasibility of a retail development in terms of needed retail space.
The selected independent variables can be classified into three categories: Demographics, Competition, and Accessibility. All the values of the variables used in regression analysis are calculated from the values of the census tracts covered by a given service area. If the centroid of a census tract is within a service area, the whole census tract is considered to be within the service area, and any data associated with that tract are included in the calculations.
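The centroid rule described above can be illustrated with a minimal sketch. It assumes that a service area is available as a simple polygon of (x, y) vertices and that each tract is represented by its centroid; the function names and the sample coordinates are hypothetical, and the ray-casting test shown here is a generic technique rather than the procedure ArcView applies internally.

```python
def point_in_polygon(point, polygon):
    """Ray-casting test: True if point (x, y) lies inside polygon [(x, y), ...]."""
    x, y = point
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does a horizontal ray from the point cross this edge?
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def tracts_in_service_area(tract_centroids, service_area):
    """Return the IDs of tracts whose centroid falls inside the service area."""
    return [
        tract_id
        for tract_id, centroid in tract_centroids.items()
        if point_in_polygon(centroid, service_area)
    ]


# Hypothetical data: a square service area and three tract centroids.
service_area = [(0, 0), (10, 0), (10, 10), (0, 10)]
tract_centroids = {"T1": (5, 5), "T2": (12, 3), "T3": (9.5, 0.5)}
selected = tracts_in_service_area(tract_centroids, service_area)
```

Under this rule a tract is counted in full or not at all, so a tract that only partly overlaps a service area contributes all of its data if its centroid is inside and none otherwise.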
Number of Highway Exits is an important variable representing accessibility to each store. Whether a discount store is located near a highway with plenty of accessible exits reflects its attractiveness to consumers and the willingness or ability of consumers to make a specific trip, which is associated with transportation costs. If the costs cannot be offset by the benefits of the trip, consumers have no justification for making it. The number of highway exits in a service area is the number of arcs coded "A63" in the road network within that service area. I expect Number of Highway Exits to have a positive effect in the regression analysis.
Competition is another important factor in retail location. In my regression formula it is expressed as the total area of the competing selected stores within a given service area. The discount stores included in my study are the largest ones in the study area. Their dominant positions mean that competition exists mainly among themselves; smaller stores have little competitive impact on them. The value used in data analysis is the sum of the areas of all selected stores falling within a service area, excluding the store from which the service area is drawn. I expect a negative effect in the regression formula.
The Edge variable is added to differentiate a complete service area from an incomplete one. Because the study area has a boundary, some service areas will inevitably touch the boundary and be cut off. Any service area touching the boundary has an Edge value of 1; otherwise the value is 0.
Apart from Number of Highway Exits, Total Area of Competitive Stores, and Edge, the other five independent variables are demographic. Their values are associated with the centroids of census tracts.
Population Density is always an important factor in retail research because population forms the basis of demand. As mentioned earlier, a threshold of demand is required for the survival of a store of a certain size. This is especially important for large discount stores since they depend more on the quantity of goods sold. The population density of a service area is the sum of the population within the service area divided by the total area of the service area in square miles. I expect a positive effect of Population Density on store size.
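As a small illustration of the competition measure, the sketch below sums the floor area of every selected store inside a service area except the store that generated it. The store identifiers and sizes are hypothetical and only serve to show the calculation.

```python
def competitive_area(own_store_id, stores_in_service_area, store_sizes):
    """Sum the floor area (sq ft) of all other selected stores in a service area."""
    return sum(
        store_sizes[store_id]
        for store_id in stores_in_service_area
        if store_id != own_store_id  # exclude the store the area is drawn from
    )


# Hypothetical store sizes in square feet.
store_sizes = {"A": 120000, "B": 95000, "C": 110000}
total = competitive_area("A", ["A", "B", "C"], store_sizes)
```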
Percentage of Black and Poverty Rate are also important demographic variables to be looked at when social equality is an issue in location decisions. Both will be calculated as the sum of black people or people below the poverty line divided by total population in a certain trade area. A negative effect of both variables is expected.
Income is an important measure of potential retail expenditure. Higher expenditure may imply longer stays in the store and more shopping trips, which may indicate a need for more retail space. The average of the median household incomes of all census tracts falling within a service area is used in data analysis. I expect a positive effect on store size.
Finally, Housing Units Built in 1980-1990 is selected as a variable with an expected positive effect. It reflects two things. First, a place where more new housing is built has land available for new development and positive economic growth. Second, a place where development is occurring is in an early stage of the neighborhood life cycle. The sum of housing units in all census tracts within a service area is the value used in the analysis.
The values of all the above variables used in the regression calculations are defined by service areas, as explained later. The census tracts whose centroids fall within a service area are treated as the units of calculation, and the data associated with those tracts are either aggregated or averaged, depending on the characteristics of each variable.
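The aggregation rules for the five demographic variables can be summarized in a short sketch. It assumes each tract record carries raw counts (population, black population, persons below the poverty line, housing units built in 1980-1990) plus the tract's median household income, and that the service area size is given in square meters, as in the coverage attributes. The field names and sample figures are hypothetical.

```python
SQ_M_PER_SQ_MILE = 2_589_988.11  # square meters in one square mile


def summarize_service_area(tracts, service_area_sq_m):
    """Combine tract records into the service-area values used in the regression."""
    total_pop = sum(t["population"] for t in tracts)
    area_sq_miles = service_area_sq_m / SQ_M_PER_SQ_MILE
    return {
        # Sum of population divided by service area in square miles.
        "population_density": total_pop / area_sq_miles,
        # Group counts divided by total population, as percentages.
        "pct_black": 100 * sum(t["black"] for t in tracts) / total_pop,
        "pct_poverty": 100 * sum(t["poverty"] for t in tracts) / total_pop,
        # Average of the tract medians (not a true median of households).
        "median_household_income": sum(t["median_income"] for t in tracts) / len(tracts),
        # Simple sum across tracts.
        "housing_units_1980_1990": sum(t["housing_80_90"] for t in tracts),
    }


# Hypothetical tracts whose centroids fall inside one service area.
tracts = [
    {"population": 3000, "black": 300, "poverty": 200,
     "median_income": 30000, "housing_80_90": 150},
    {"population": 1000, "black": 500, "poverty": 200,
     "median_income": 50000, "housing_80_90": 250},
]
summary = summarize_service_area(tracts, 2 * SQ_M_PER_SQ_MILE)
```

Note that averaging tract medians weights every tract equally regardless of its population, which is a property of the method as described rather than of this sketch.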
Hypothesis and Testing Techniques
Summarizing the above discussion about variables, a hypothesis for testing can be formulated as follows.
$$
\begin{aligned}
\text{Store Size} = {} & \text{Constant} + a_1 \cdot \text{Population Density} + a_2 \cdot \text{Median Household Income} \\
& - a_3 \cdot \text{Percentage of Black} - a_4 \cdot \text{Percentage of Poverty} \\
& + a_5 \cdot \text{Number of Housing Units Built in 1980-1990} - a_6 \cdot \text{Total Area of Competitive Stores} \\
& + a_7 \cdot \text{Number of Highway Exits} - a_8 \cdot \text{Edge}
\end{aligned}
$$
The constant and $a_1$ to $a_8$ are parameters to be estimated using the least squares method. The values of the variables are calculated per service area.
Before putting all the variables into one formula, an individual regression analysis of each variable against store size is conducted to see whether each variable itself has statistical significance or not.
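The individual regressions were run in Microsoft Excel; the sketch below only illustrates the underlying least squares calculation for a single independent variable. The function name and the sample values are hypothetical, and the sketch stops at the slope, intercept, and R square without computing the significance tests that Excel reports.

```python
def simple_regression(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = (sxy * sxy) / (sxx * syy)  # share of variance in y explained by x
    return slope, intercept, r_squared


# Hypothetical values: one independent variable against store size.
x = [1.0, 2.0, 3.0, 4.0]
y = [3.0, 5.0, 7.0, 9.0]
slope, intercept, r_squared = simple_regression(x, y)
```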
The regression analysis for all variables looks at three sets of data, one for each set of service areas. For each set of service areas, one regression analysis is conducted without considering multicollinearity.
A second set of regression analyses takes multicollinearity into account. In this case, I first identify highly correlated independent variables, which are then not included simultaneously in the regression analysis.
Under each condition, the hypothesis will be tested. This process will be executed using the regression analysis capability of Microsoft Excel.
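One common way to identify highly correlated independent variables is to compute the Pearson correlation for every pair and flag those above a cutoff. The sketch below illustrates that idea; the cutoff of 0.8, the function names, and the sample values are assumptions for illustration and are not taken from this study.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    return sxy / sqrt(sxx * syy)


def highly_correlated_pairs(columns, threshold=0.8):
    """Return (name_a, name_b, r) for variable pairs with |r| >= threshold."""
    pairs = []
    for a, b in combinations(columns, 2):
        r = pearson(columns[a], columns[b])
        if abs(r) >= threshold:
            pairs.append((a, b, r))
    return pairs


# Hypothetical service-area values for three independent variables.
columns = {
    "income": [30, 40, 50, 60],
    "poverty": [20, 15, 10, 5],
    "density": [3, 1, 4, 2],
}
pairs = highly_correlated_pairs(columns)
```

Variables that appear together in a flagged pair would then be entered into separate regressions rather than the same one.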
Descriptive Analysis
Using ArcView's mapping capabilities, I conducted a series of analyses of the relationships between the selected variables and the stores. First I looked at store location in relation to the five demographic variables. Next, the analysis was broken down by store type to see whether different store chains have unique locational characteristics in relation to the variables.
In addition, I calculated descriptive statistics for the variables: the mean and standard deviation of each variable for all the stores and by individual store type. These statistics, computed with Microsoft Excel's statistical package, are used to support the conclusions from the analyses.
I also produced a service area analysis to see whether there are areas not served by these stores within service areas of 5, 10, and 15 minutes' driving time. How the service areas are determined is discussed later.
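For illustration, the same kind of summary can be sketched with Python's standard `statistics` module. The record layout, chain labels, and values are hypothetical, and the sketch assumes the sample standard deviation is the one wanted.

```python
from collections import defaultdict
from statistics import mean, stdev


def describe_by_chain(records, variable):
    """Mean and sample standard deviation of a variable, per chain and overall."""
    groups = defaultdict(list)
    for rec in records:
        groups[rec["chain"]].append(rec[variable])
        groups["All stores"].append(rec[variable])
    summary = {}
    for chain, values in groups.items():
        # A standard deviation is undefined for a single observation.
        sd = stdev(values) if len(values) > 1 else None
        summary[chain] = (mean(values), sd)
    return summary


# Hypothetical service-area population densities for four stores.
records = [
    {"chain": "Kmart", "density": 2000},
    {"chain": "Kmart", "density": 3000},
    {"chain": "Wal-Mart", "density": 1000},
    {"chain": "Wal-Mart", "density": 2000},
]
stats = describe_by_chain(records, "density")
```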
Data Preparation and Transformation
Prior to data analysis, several steps were taken to prepare the data. They cover information on the discount stores within the study area and the coverages for network analysis.
Study Area
The area chosen for study is Metropolitan Atlanta. There are two reasons for selecting this area. First, it is the largest regional center of politics, economics, and culture in the southeast USA. Second, it is one of the fastest growing areas in the nation.
Due to resource limitations, only those counties in the metro area with a population over 100,000 are included in the study area. Therefore, the study area includes five counties: Clayton, Cobb, DeKalb, Fulton and Gwinnett County. The City of Atlanta is in Fulton County. There are 367 census tracts within the five counties. (see Map 1)
Selection of Discount Stores
According to data from the Chain Store Guide (1996), there are a total of 273 discount stores in the State of Georgia, 137 of which are located in the Atlanta Metropolitan Statistical Area (AMSA). This study will focus on large national chain stores such as Wal-Mart, Kmart and Sam's Club and some regional chain stores such as Target. Some smaller discount stores such as "50-off" were omitted from this study. Consequently, 63 discount stores in the five counties of Metropolitan Atlanta were selected as study subjects.
Among the 63 stores, the addresses of 3 stores are neither shown in the map book nor included in the address ranges of the road network coverage, so they were also omitted. Therefore, 60 discount stores are study subjects. These stores represent approximately 95% of the large discount stores in the study area, so this study is representative for large discount stores in this area. (see Table 4)
Table 4. Number of Selected Discount Stores in Each County
| County | Kmart | Sam's Club | Target | Wal-Mart | Total |
|---|---|---|---|---|---|
| Clayton | 2 | 1 | 1 | 2 | 6 |
| Cobb | 7 | 2 | 3 | 4 | 16 |
| DeKalb | 4 | 1 | 3 | 1 | 9 |
| Fulton | 6 | 1 | 4 | 3 | 14 |
| Gwinnett | 5 | 1 | 4 | 5 | 15 |
| Total | 24 | 6 | 15 | 15 | 60 |
Source: Chain Store Guide, 1996.
Census Tract and Road Network Maps
Census tract and road network maps are essential in this study. These digital maps were produced from the 1995 Tiger/Line data files using ArcInfo 7.0 on a Unix platform. The following flow chart illustrates the process and the commands applied. Commands are in italics.
The final digital maps obtained from this process are a single census tract polygon coverage and a single road network line coverage of the five counties. The attributes associated with the census tract coverage include area and perimeter with units of square meters and meters respectively. The attributes associated with the road line coverage include address range and length in meters, as well as its FIPS code.
One item in the arc attribute table (AAT) of the line coverage is CFCC (Census Feature Class Codes). It identifies the most noticeable characteristic of a feature. CFCC is defined as a three-character code. The first character is a letter describing the feature class; the second character is a number describing the major category; the third character is a number describing the minor category. The CFCCs in my final road coverage include the major categories "A1" through "A4" and the category "A63", which represent "Primary highway with limited access", "Primary highway without limited access", "Secondary and connecting road", "Local, neighborhood, and rural road", and "Access ramp" respectively.
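The code structure described above lends itself to a simple lookup. The sketch below maps a CFCC value to its road class and counts the arcs coded "A63", which is how the Number of Highway Exits variable is defined. The function names and the sample list of arc codes are hypothetical.

```python
ROAD_CLASSES = {
    "A1": "Primary highway with limited access",
    "A2": "Primary highway without limited access",
    "A3": "Secondary and connecting road",
    "A4": "Local, neighborhood, and rural road",
    "A63": "Access ramp",
}


def classify_cfcc(cfcc):
    """Return the road class for a three-character CFCC such as 'A41' or 'A63'."""
    # Check the full code first so that 'A63' is recognized as an access ramp,
    # then fall back to the feature class letter plus major category digit.
    if cfcc in ROAD_CLASSES:
        return ROAD_CLASSES[cfcc]
    return ROAD_CLASSES.get(cfcc[:2], "Other")


def count_highway_exits(arc_cfccs):
    """Count the arcs coded 'A63' among the arcs inside a service area."""
    return sum(1 for code in arc_cfccs if code == "A63")


# Hypothetical CFCC values of the arcs falling inside one service area.
arc_cfccs = ["A11", "A63", "A41", "A63", "A31", "A21"]
exits = count_highway_exits(arc_cfccs)
```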
Geocoding Store Locations
After obtaining the road network coverage, the next step was to transform the addresses of the selected stores into corresponding point coverages on the road network. This task was done through ArcView 3.0's Address Geocoding application.
Because of the limitations of address ranges available in the road network coverage, some store addresses were out of the available address range. Therefore, for these stores their locations were visually estimated and adjusted. The reference for this estimation and adjustment comes from The Original Street Map Book for Metro Atlanta published by ADC.
Census Data Linked to Tracts
The values for demographic variables come from the 1990 Census data. I acquired the data at the census tract level from the Census Bureau for each county and each variable, then combined the individual dBASE files into one complete dBASE file. In ArcView, this dBASE file is joined with the Info attribute file associated with the census tract coverage.
Determination of Service Area
The defining point of a service area starts at the store location. From there the service area is defined as an area to the furthest point in any direction in the road network within a certain amount of driving time expressed in minutes, without exceeding the posted speed limits. All the furthest points are connected to form the service area. Therefore, the sizes of the service areas are defined by selected driving times under certain conditions and the speed limits of the roads in the road network.
According to Shopping Center and Other Retail Projects (White, 1996), the typical customer's driving time for a large discount store is about 8-12 minutes in a suburban area and 15-20 minutes in areas near regional malls. Since the study area is a mix of urban and suburban areas, with City of Atlanta in the center, a range of driving times may be needed to cover the different areas. I decided to pick three driving times as my criteria for defining the service area. They are 5, 10, and 15 minutes. Generally they represent the customers in urban, suburban, and relatively rural areas. The assumption here is that people in the study area will not make a one-way shopping trip to a discount store requiring more than 15 minutes' driving time.
According to The World Almanac and Book of Facts (1997), the posted state speed limit in Georgia in rural areas is 70 mph on interstate highways and 65 mph on other primary highways. Since my study area is in Metro Atlanta, the speed limits are lower. The speed limits for major highways in urban areas are 55-65 mph, for urban arterials 45-55 mph, and for local streets 25-35 mph. Those are all posted speed limits. Considering turns, stops, congestion, and other unexpected conditions, some reduction of the posted speed limits is necessary to reflect reality more accurately. The final speeds I assigned to the road network are 55 mph for roads coded "A1", 50 mph for roads coded "A2", 40 mph for roads coded "A3", and 25 mph for roads coded "A4". For highway exits with a CFCC of "A63", I assigned a speed of 25 mph because most of them connect a major highway with an urban arterial or local street, and in most cases there is a stop sign or traffic light.
After defining the speed limits for the road network, I added two items - "splm" for speed limit and "minutes" - to the arc attribute table. The following formula was used to calculate the necessary driving time for each arc.
Minutes = { 60 * [ ( Length / 1609.52 ) / Splm ] }
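The formula above, together with the speeds assigned to each road class, can be sketched as follows. The function names are hypothetical. The sketch keeps the meters-per-mile constant exactly as it appears in the formula, although the exact statute mile is 1609.344 meters; the difference changes the computed times by roughly 0.01 percent.

```python
METERS_PER_MILE = 1609.52  # constant as used in the formula above

# Assigned speeds in mph by CFCC major category, plus access ramps.
SPEED_MPH = {"A1": 55, "A2": 50, "A3": 40, "A4": 25, "A63": 25}


def assigned_speed(cfcc):
    """Return the assigned speed (the 'splm' item) for an arc's CFCC."""
    if cfcc == "A63":
        return SPEED_MPH["A63"]
    return SPEED_MPH[cfcc[:2]]


def arc_minutes(length_m, splm):
    """Driving time in minutes for an arc of length_m meters at splm mph."""
    return 60 * ((length_m / METERS_PER_MILE) / splm)


# Example: a 5-mile arc on a local road (25 mph) takes 12 minutes.
minutes = arc_minutes(5 * METERS_PER_MILE, assigned_speed("A41"))
```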
Now the road network coverage is ready to be used for defining service areas. This is done by the "Find a service area" module in ArcView 3.0. Using the locations of stores from the geocoding process as the center points and 5, 10, 15 minutes as travel costs, I obtain three rings of service areas for each store. Samples of the service areas of two stores are shown in Map 2 and Map 3.
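Conceptually, a drive-time service area contains every part of the network that can be reached from the store within the travel-cost limit. The sketch below illustrates this with a shortest-path search (Dijkstra's algorithm) over a small hypothetical network whose edge weights are driving minutes. It is not ArcView's implementation, whose internals are not described here, and it returns reachable nodes rather than a polygon.

```python
import heapq


def reachable_nodes(graph, start, max_minutes):
    """Return {node: minutes} for nodes reachable from start within max_minutes."""
    best = {start: 0.0}
    heap = [(0.0, start)]
    while heap:
        cost, node = heapq.heappop(heap)
        if cost > best.get(node, float("inf")):
            continue  # a shorter route to this node was already found
        for neighbor, minutes in graph.get(node, []):
            new_cost = cost + minutes
            if new_cost <= max_minutes and new_cost < best.get(neighbor, float("inf")):
                best[neighbor] = new_cost
                heapq.heappush(heap, (new_cost, neighbor))
    return best


# Hypothetical road network: node -> list of (neighbor, driving minutes).
graph = {
    "store": [("a", 3.0), ("b", 6.0)],
    "a": [("store", 3.0), ("c", 4.0)],
    "b": [("store", 6.0), ("c", 2.0), ("d", 8.0)],
    "c": [("a", 4.0), ("b", 2.0), ("d", 9.0)],
    "d": [("b", 8.0), ("c", 9.0)],
}
rings = {t: sorted(reachable_nodes(graph, "store", t)) for t in (5, 10, 15)}
```

Each larger driving time yields a superset of the nodes in the smaller one, which corresponds to the nested rings obtained for each store.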
FINDINGS AND RESULTS OF DESCRIPTIVE ANALYSIS
I examined the overall store location pattern as well as patterns of each store type and service areas. After putting together these location patterns with distribution patterns of independent variables, I obtained some interesting and meaningful findings for the stores overall and individual store types. Descriptive statistics of selected variables will also be introduced to support those findings. Some conclusions are drawn from these findings.
Store Location Patterns in Relation to Demographic Variables
The spatial distributions of the five demographic variables are plotted by census tract, with the 60 discount store locations overlaid on them. The following five tables summarize the relation between store location and each demographic variable. (see Tables 5-9) The corresponding maps are Map 4, Map 5, Map 6, Map 7, and Map 8.
Table 5. Store Location with Relation to Population Density in Metro Atlanta
| Population Density (persons/square mile) | Kmart # | Kmart % | Sam's Club # | Sam's Club % | Target # | Target % | Wal-Mart # | Wal-Mart % | Total # | Total % |
|---|---|---|---|---|---|---|---|---|---|---|
| 0-3,000 | 16 | 67 | 6 | 100 | 13 | 87 | 14 | 93 | 49 | 82 |
| 3,001-6,000 | 8 | 33 | 0 | 0 | 2 | 13 | 1 | 7 | 11 | 18 |
| 6,001-9,000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 9,001-12,000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 12,001-15,000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 15,001-30,000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 24 | 100 | 6 | 100 | 15 | 100 | 15 | 100 | 60 | 100 |
Table 6. Store Location with Relation to Median Household Income in Metro Atlanta
| Median Household Income ($) | Kmart # | Kmart % | Sam's Club # | Sam's Club % | Target # | Target % | Wal-Mart # | Wal-Mart % | Total # | Total % |
|---|---|---|---|---|---|---|---|---|---|---|
| 0-25,000 | 5 | 21 | 1 | 17 | 0 | 0 | 0 | 0 | 6 | 10 |
| 25,001-50,000 | 14 | 58 | 5 | 83 | 13 | 87 | 11 | 73 | 43 | 72 |
| 50,001-75,000 | 5 | 21 | 0 | 0 | 2 | 13 | 4 | 27 | 11 | 18 |
| 75,001-100,000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 100,001-125,000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 125,001-150,000 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 24 | 100 | 6 | 100 | 15 | 100 | 15 | 100 | 60 | 100 |
Table 7. Store Location with Relation to Percentage of Black in Metro Atlanta
| % of Black | Kmart # | Kmart % | Sam's Club # | Sam's Club % | Target # | Target % | Wal-Mart # | Wal-Mart % | Total # | Total % |
|---|---|---|---|---|---|---|---|---|---|---|
| 0-20 | 17 | 71 | 4 | 67 | 10 | 66 | 11 | 73 | 42 | 70 |
| 20.01-40 | 3 | 12 | 2 | 33 | 3 | 20 | 2 | 13 | 8 | 13 |
| 40.01-60 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 7 | 3 | 5 |
| 60.01-80 | 0 | 0 | 0 | 0 | 1 | 7 | 0 | 0 | 1 | 2 |
| 80.01-100 | 4 | 17 | 0 | 0 | 1 | 7 | 1 | 7 | 6 | 10 |
| Total | 24 | 100 | 6 | 100 | 15 | 100 | 15 | 100 | 60 | 100 |
Table 8. Store Location with Relation to Percentage of Poverty in Metro Atlanta
| % of Poverty | Kmart # | Kmart % | Sam's Club # | Sam's Club % | Target # | Target % | Wal-Mart # | Wal-Mart % | Total # | Total % |
|---|---|---|---|---|---|---|---|---|---|---|
| 0-15 | 21 | 88 | 5 | 83 | 15 | 100 | 14 | 93 | 55 | 91 |
| 15.01-30 | 1 | 4 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 2 |
| 30.01-45 | 2 | 8 | 1 | 17 | 0 | 0 | 1 | 7 | 4 | 7 |
| 45.01-60 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 60.01-75 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| 75.01-90 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 | 0 |
| Total | 24 | 100 | 6 | 100 | 15 | 100 | 15 | 100 | 60 | 100 |
Table 9. Store Location with Relation to Housing Units Built in 1980-1990 in Metro Atlanta
| Housing Units Built in 1980-1990 | Kmart # | Kmart % | Sam's Club # | Sam's Club % | Target # | Target % | Wal-Mart # | Wal-Mart % | Total # | Total % |
|---|---|---|---|---|---|---|---|---|---|---|
| 0-200 | 18 | 75 | 5 | 83 | 10 | 67 | 9 | 60 | 42 | 70 |
| 201-400 | 3 | 12.5 | 1 | 17 | 3 | 20 | 3 | 19 | 10 | 17 |
| 401-600 | 3 | 12.5 | 0 | 0 | 0 | 0 | 1 | 7 | 4 | 7 |
| 601-800 | 0 | 0 | 0 | 0 | 2 | 13 | 1 | 7 | 3 | 5 |
| 801-1000 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 7 | 1 | 1 |
| Total | 24 | 100 | 6 | 100 | 15 | 100 | 15 | 100 | 60 | 100 |
From the tables and maps, I conclude that most discount stores in the study area are located in areas with lower black and poverty rates and lower population density, mostly in suburban areas. They tend to locate in moderate income areas, and none of them is located in an extremely wealthy community. It is a little surprising that not many discount stores are located in newly growing areas where more new housing units are being built. This may be partly because those areas have not yet grown enough to establish a strong customer base for a large discount store.
Generally, Kmart is at one end of the location spectrum. It has the most stores located in higher density, higher black and poverty rate, lower income, and older central city areas. All four stores within the city limits of Atlanta are Kmart stores. This is because Kmart started in the central city, which was the center of prosperity before automobile-oriented suburban areas were developed. At present, however, many central cities are in the later stage of the life cycle and may lack large pieces of land for large discount stores. Even though the central city has been declining, a place within the central city where a discount store is operating is still relatively better off than the surrounding places in terms of poverty rates. This indicates that a large discount store could be a positive force to block decline and stimulate growth.
Since Wal-Mart started from small rural and suburban areas, its stores tend to locate in lower density, higher income, lower black rate and lower poverty rate suburban areas. Sam's Club has a similar situation. Target has the highest profile in merchandise among large discount stores, so it also tends to locate in lower density, higher income, lower black rate and lower poverty rate suburban areas.
No single variable is the dominant factor in discount store location decisions. Location is affected by combinations of these variables as well as by the historical evolution of the stores. However, the patterns shown in the figures and tables do reflect the effects of these variables on store locations.
Service Area and Demographic Variables Analysis
Looking at Maps 9 through 14, I observe some general patterns for each set of service areas. Most of the study area is covered by 5 minute service areas, except for tracts along its edge. However, more than half the area within the city limits of Atlanta, where the census tracts with the highest density, the highest black and poverty rates, the lowest income, and the fewest new housing units are concentrated, is not covered by 5 minute service rings. This indicates that people there need to travel more than 5 minutes to reach a large discount store. Some 5 minute service areas overlap because stores of different types are located close together, but there is almost no such overlap among stores of the same type.
Within 10 and 15 minute service areas, almost all the census tracts in the study area are served, and there are many overlaps among stores of the same or different types. Since overlaps in service areas mean competition among stores, each chain's strategy for determining store sizes and locations may be based on a driving time range of 5-10 minutes. In this way, a chain competes with other chains while avoiding competition among its own stores.
Overall, while residents in some census tracts on the edge of the study area need a little more time for a trip to large discount stores, the majority of customers in this area is within the reasonable time range for reaching a discount store, which is less than a 10 minute driving time.
Store Locations in Relation to Major Highways
Map 15 illustrates the relationship between store locations and major highways. Of the 60 stores, 50 (83.33%) are located on or close to major highways, and 32 of them (53.33% of all stores) are directly on major highways. This indicates that accessibility is a very important factor in the locations of these stores and that the stores are largely automobile-oriented.
Descriptive Statistics of Variables
The descriptive statistics show that the overall means of population density for all three service rings are in the lowest range of 0-3,000 persons/square mile, with the highest overall mean at 1,879 persons/square mile. For all three service rings, Kmart stores have the highest mean population density while Wal-Mart stores have the lowest. Target and Sam's Club are in between.
Looking at median household income, I discovered a similar situation. The overall means of median household income fall into the lower range of $25,001 - $50,000, with the highest overall mean of $40,962. For all three service rings, Kmart has the lowest mean of median household income except for its 5 minute service ring, while Wal-Mart has the highest. Target and Sam's Club are in between.
The overall means of percentage of black population for all three service rings are in the lower range of 20% - 40%. Still, Kmart has the highest mean of percentage of black population and Wal-Mart has the lowest, except for its 15 minute service area. Target and Sam's Club are in between.
The overall means of percentage of poverty for all three service rings are in the lowest range of 0 - 15%, with the highest mean of 9%. Once again, Kmart has the highest mean of percentage of poverty while Wal-Mart has the lowest percentage of poverty for all three service rings. Target and Sam's Club are in between.
The above findings indicate that, overall, the large discount stores are located in areas with lower densities, low to middle incomes, lower percentages of black population and lower percentages of poverty. Additionally, more Kmart stores are located in areas with higher densities, lower incomes, higher percentages of black population and higher percentages of poverty. On the other hand, Wal-Mart stores tend to be located in areas with lower density, higher income, a lower percentage of black population and a lower percentage of poverty.
Now consider the Housing variable. Generally, Wal-Mart stores are located in areas where the fewest housing units were built in 1980-1990, while Target stores are located in areas where the most were built. Sam's Club and Kmart are in between. This variable serves as a proxy for land availability, on the reasoning that areas with more housing construction have more vacant land, so it may be related to store size. Target and Sam's Club have larger average store sizes than Wal-Mart and Kmart; therefore, they may tend to locate in areas with more land available for construction.
Kmart stores are located in areas with the most highway exits. This is consistent with the fact that the chain expanded from the central city to suburban areas, where more highways are being built. On the other hand, Wal-Mart stores are located in areas with the fewest highway exits, since the chain expanded from small rural towns to suburban areas where fewer highways are being built. Sam's Club and Target are in between.
Finally, there is no clear pattern in the variable of total competitive store area for any of the three service rings. Overall, Wal-Mart stores are in areas with fewer competitive stores.
In summary, the findings from the descriptive statistics support the conclusions from the map analysis.
FINDINGS AND RESULTS OF LINEAR REGRESSION ANALYSIS
This analysis focuses on how well the selected variables fit the hypothesized linear model, where the selected variables are seen as explanatory factors in store locations. Results of the regression analysis for each set of service areas without consideration of multicollinearity are presented, followed by results of the regression analyses with consideration of the problem of multicollinearity.
Regression Analysis without Consideration of Multicollinearity
The statistical results for the 5 minute service area show the highest R value among the three service rings for all stores, but it is still lower than 0.5 (see Table 10). The coefficients of Population Density, Percent in Black, Percent in Poverty, and Number of Highway Exits generally have the expected positive or negative signs. However, none of these coefficients is very large. The results show Median Household Income has a slightly positive effect. Housing Units Built in 1980-1990 has a negative effect while Total Area of Competitive Stores seems to have no effect. All the F values are lower than the critical values, so none of the models as a whole is statistically significant.
However, when looking at the t statistics for each independent variable, I do observe some statistical significance. For the 5 minute service ring, Population Density and Number of Highway Exits show statistical significance. This indicates the general importance of these two variables for store location within the immediate service areas. For the 15 minute service ring, only Housing Units Built in 1980-1990 shows statistical significance. Since this service ring covers a larger area, it is reasonable that newly built housing units carry more weight in explaining the dependent variable. This may be related to land availability. For the 10 minute service ring, no variable shows statistical significance.
The low correlation in the regression analysis of all variables may be partly due to the problem of multicollinearity. To take the statistical analysis further, I also ran regression analyses that account for multicollinearity.
Table 10. Summary of Statistical Results of Regression Analyses
| Result Group | Statistic or Variable | 5 Min Service Area of All Stores | 10 Min Service Area of All Stores | 15 Min Service Area of All Stores |
|---|---|---|---|---|
| Regression Statistics | R | 0.47 | 0.27 | 0.35 |
| | R Square | 0.22 | 0.07 | 0.12 |
| | Adjusted R Square | 0.10 | 0 | 0 |
| | Standard Error | 19903 | 21702 | 21098 |
| Coefficients | Population Density | 9.15 | 9.62 | 3.73 |
| | Median HH Income | -0.09 | 0.08 | 0.03 |
| | Percentage of Black | -103.88 | -13.87 | 127.71 |
| | Percentage of Poverty | -729.86 | -364.05 | 254.78 |
| | Housing Units Built in 1980-1990 | -12.80 | -2.09 | 4.80 |
| | # of Highway Exits | 413.52 | 30.85 | -59.43 |
| | Total Area of Competitive Stores | 0.01 | 0.01 | -0.01 |
| t Statistics | Population Density | 2.10 | 1.28 | 0.41 |
| | Median HH Income | -0.28 | 0.18 | 0.05 |
| | Percentage of Black | -0.67 | -0.06 | 0.60 |
| | Percentage of Poverty | -0.91 | -0.23 | 0.13 |
| | Housing Units Built in 1980-1990 | -1.68 | -0.55 | 2.02 |
| | # of Highway Exits | 2.22 | 0.37 | -1.00 |
| | Total Area of Competitive Stores | 0.45 | 0.56 | -0.64 |
| Significance F Value | | 0.10 | 0.85 | 0.52 |
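The gap between R Square and Adjusted R Square in Table 10 can be checked with the standard adjustment formula. The following is a minimal sketch, not the author's procedure. It assumes one observation per store (n = 60) and eight predictors (the seven listed variables plus the Edge variable that appears in the later formula); both counts are assumptions chosen because they are consistent with the reported 5 minute value. Under these assumptions the 10 and 15 minute values come out slightly negative, which would match the zeros in the table if negative values were reported as 0.

```python
def adjusted_r_squared(r_squared: float, n_obs: int, n_predictors: int) -> float:
    """Adjusted R^2 = 1 - (1 - R^2) * (n - 1) / (n - k - 1)."""
    if n_obs - n_predictors - 1 <= 0:
        raise ValueError("need more observations than predictors + 1")
    return 1.0 - (1.0 - r_squared) * (n_obs - 1) / (n_obs - n_predictors - 1)


N_STORES = 60      # assumption: one observation per store in the study
N_PREDICTORS = 8   # assumption: seven listed variables plus Edge (EG)

# R Square values as reported in Table 10.
r_squared_by_ring = {"5 min": 0.22, "10 min": 0.07, "15 min": 0.12}

for ring, r2 in r_squared_by_ring.items():
    adj = adjusted_r_squared(r2, N_STORES, N_PREDICTORS)
    # Negative results are floored at 0 here (an assumption about the table).
    print(f"{ring}: adjusted R^2 = {max(adj, 0.0):.2f}")
```

The adjustment penalizes each additional predictor, which is why models with a modest R Square and many variables can end up with an adjusted value at or below zero.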
Regression Analysis with Consideration of Multicollinearity
First I did regression analysis for each independent variable against the dependent variable. A summary of the results is included in Table 11.
Table 11. Results of Regression Analysis of Individual Independent Variable against Dependent Variable
| Store Size against | PD-05 | MI-05 | BR-05 | PR-05 | HS-05 | EX-05 | SS-05 |
|---|---|---|---|---|---|---|---|
| R | 0.35 | 0.14 | 0.07 | 0.12 | 0.02 | 0.30 | 0.16 |
| R Square | 0.12 | 0.02 | 0.01 | 0.01 | 0 | 0.09 | 0.03 |
| Adjusted R Square | 0.11 | 0 | 0 | 0 | 0 | 0.07 | 0.01 |
| t Statistics | 2.84 | -1.08 | 0.56 | 0.92 | -0.16 | 2.37 | 1.26 |
| Significance F | 0.01 | 0.29 | 0.57 | 0.36 | 0.87 | 0.02 | 0.21 |

| Store Size against | PD-10 | MI-10 | BR-10 | PR-10 | HS-10 | EX-10 | SS-10 |
|---|---|---|---|---|---|---|---|
| R | 0.23 | 0.04 | 0.03 | 0.06 | 0.09 | 0.12 | 0.17 |
| R Square | 0.05 | 0 | 0 | 0 | 0.01 | 0.01 | 0.03 |
| Adjusted R Square | 0.04 | 0 | 0 | 0 | 0 | 0 | 0.01 |
| t Statistics | 1.80 | -0.27 | 0.19 | 0.48 | 0.73 | 0.91 | 1.28 |
| Significance F | 0.08 | 0.79 | 0.85 | 0.63 | 0.47 | 0.37 | 0.21 |

| Store Size against | PD-15 | MI-15 | BR-15 | PR-15 | HS-15 | EX-15 | SS-15 |
|---|---|---|---|---|---|---|---|
| R | 0.13 | 0.01 | 0.09 | 0.05 | 0.22 | 0.02 | 0.11 |
| R Square | 0.02 | 0 | 0.01 | 0 | 0.05 | 0 | 0.01 |
| Adjusted R Square | 0 | 0 | 0 | 0 | 0.03 | 0 | 0 |
| t Statistics | 1.03 | -0.06 | 0.66 | 0.36 | 1.71 | 0.18 | 0.80 |
| Significance F | 0.31 | 0.96 | 0.51 | 0.72 | 0.09 | 0.86 | 0.42 |
Note: PD - Population Density; MI - Median Household Income; BR - Percentage of Black; PR - Percentage of Poverty; HS - Housing Units Built in 1980-1990; EX - Number of Highway Exits; SS - Total Area of Competitive Stores; EG - Edge.
The results indicate that only Population Density and Number of Highway Exits for the 5 minute service ring show statistical significance. None of the other variables shows statistical significance. This is consistent with the results from the previous regression analysis.
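In a one-predictor regression, the magnitude of the slope's t statistic follows directly from the correlation coefficient R and the sample size, so the R and t rows of Table 11 can be cross-checked against each other. The sketch below assumes n = 60 observations (one per store), which the paper does not state explicitly for these regressions. Because Table 11 reports R rounded to two decimals and without sign, the computed values only approximate the magnitudes of the tabulated t statistics.

```python
import math


def t_from_r(r: float, n_obs: int) -> float:
    """t statistic of the slope in a one-predictor regression, from Pearson r.

    Uses t = r * sqrt(n - 2) / sqrt(1 - r^2).
    """
    return r * math.sqrt(n_obs - 2) / math.sqrt(1.0 - r * r)


N_STORES = 60  # assumption: one observation per store

# R values as reported in Table 11 (rounded, unsigned).
reported_r = [("PD-05", 0.35), ("EX-05", 0.30), ("PD-10", 0.23)]

for name, r in reported_r:
    print(f"{name}: |t| is approximately {t_from_r(r, N_STORES):.2f}")
```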
Next, I performed a regression analysis excluding independent variables with high correlation among themselves. To identify these independent variables, a correlation analysis was conducted.
The value of the Coefficient of Determination (CD) indicates the percentage of total variation that can be explained by the regression model. CD is the square of the Correlation Coefficient. I assume that at least 50 percent of total variation should be explained by the regression line if two variables are to be considered highly correlated, so the correlation coefficient should be approximately 0.7. Therefore, any pair of variables with a correlation coefficient higher than 0.7 will be considered highly correlated.
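The 0.7 cutoff comes from taking the square root of the 50 percent threshold (the square root of 0.5 is about 0.707). As an illustration of the screening step, the sketch below flags variable pairs whose correlation exceeds 0.7 in absolute value. The tract values are invented for illustration and are not data from this study, and the helper names are hypothetical; using the absolute value is also an assumption, made so that strongly negative pairs are caught.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def highly_correlated_pairs(columns, threshold=0.7):
    """Return (name_a, name_b, r) for every pair with |r| above the threshold."""
    pairs = []
    for a, b in combinations(columns, 2):
        r = pearson_r(columns[a], columns[b])
        if abs(r) > threshold:
            pairs.append((a, b, round(r, 2)))
    return pairs


# A CD of 0.5 corresponds to a correlation coefficient of about 0.707.
print(f"sqrt(0.5) = {math.sqrt(0.5):.3f}")

# Hypothetical values, invented for illustration only.
sample = {
    "MI": [52000, 41000, 38000, 30000, 27000, 45000],  # median income
    "PR": [4.0, 8.0, 9.5, 15.0, 18.0, 6.0],            # percent in poverty
    "HS": [120, 300, 90, 210, 60, 250],                # housing units built
}
flagged = highly_correlated_pairs(sample)
print(flagged)
```

Pairs flagged this way would then be kept out of the same regression model, as described in the text that follows.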
The results show that MI-05 & PR-05, BR-05 & PR-05, BP-05 & PR-05, MI-10 & BR-10, MI-10 & PR-10, BR-10 & PR-10, PR-10 & EX-10, MI-15 & BR-15, MI-15 & PR-15, BR-15 & PR-15, and PR-15 & EX-15 are pairs of independent variables with high correlation. None of the pairs will be included in the same regression model. The results of the adjusted regression analysis under this condition are listed in Table 12, Table 13, and Table 14.
Table 12. Results of Regression Analysis with Consideration of Multicollinearity for 5 Minute Service Area
| Result Group | Statistic or Variable | Without PR-05 | Without MI-05 and BR-05 |
|---|---|---|---|
| Regression Statistics | R | 0.46 | 0.46 |
| | R Square | 0.21 | 0.21 |
| | Adjusted R Square | 0.1 | 0.12 |
| t Statistics | PD | 1.96 | 2.21 |
| | MI | 0.23 | |
| | BR | -1.02 | |
| | PR | | -1.37 |
| | HS | -1.56 | -1.57 |
| | EX | 2.07 | 2.16 |
| | SS | 0.61 | 0.73 |
| Significance F | | 0.08 | 0.04 |
For the 5 minute service ring, the results show that the model does have statistical significance when variables of MI and BR are excluded. The model can be expressed in the following formula.
Store Size = 10.41 + 9.37*PD - 820*PR - 11.24*HS + 368*EX + 0.02*SS + 3603*EG
No other model has statistical significance. However, in both cases Population Density and # of Highway Exits show statistical significance. Statistical significance is shown in darker shades.
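To show how the 5 minute formula would be applied, the sketch below wraps it in a small function. The coefficients are copied from the formula above; the parameter names, the units implied by the comments, and the treatment of EG as a 0/1 indicator are assumptions, since the paper does not spell them out here. The input values are invented for illustration.

```python
def estimated_store_size(population_density, poverty_pct, housing_units,
                         highway_exits, competitor_area, on_edge):
    """Evaluate the 5 minute service ring formula reported in the text.

    Assumed inputs (not confirmed by the source): population density in
    persons per square mile, poverty as a percentage, housing units built
    in 1980-1990, number of highway exits, total area of competitive
    stores, and on_edge as 1 for an edge location and 0 otherwise.
    """
    return (10.41
            + 9.37 * population_density
            - 820 * poverty_pct
            - 11.24 * housing_units
            + 368 * highway_exits
            + 0.02 * competitor_area
            + 3603 * on_edge)


# Hypothetical site, for illustration only.
size = estimated_store_size(population_density=1500, poverty_pct=9,
                            housing_units=200, highway_exits=3,
                            competitor_area=100000, on_edge=0)
print(f"Estimated store size: {size:.0f}")
```

Given the low R Square of the underlying model, any such estimate should be read as a rough indication rather than a prediction.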
Table 13. Results of Regression Analysis with Consideration of Multicollinearity for 10 Minute Service Area
| Result Group | Statistic or Variable | Without BR-10 and PR-10 | Without MI-10 and PR-10 | Without MI-10, BR-10 and EX-10 |
|---|---|---|---|---|
| Regression Statistics | R | 0.27 | 0.26 | 0.26 |
| | R Square | 0.07 | 0.07 | 0.07 |
| | Adjusted R Square | 0 | 0 | 0 |
| t Statistics | PD | 1.3 | 1.28 | 1.34 |
| | MI | 0.5 | | |
| | BR | | -0.43 | |
| | PR | | | -0.37 |
| | HS | -0.48 | -0.44 | -0.39 |
| | EX | 0.25 | 0.26 | |
| | SS | 0.69 | 0.63 | 0.59 |
| Significance F | | 0.67 | 0.68 | 0.55 |
For the 10 minute service ring, the results indicate that neither the models nor the individual variables show statistical significance. Therefore, no single variable shows a dominant effect on store location.
Table 14. Results of Regression Analysis with Consideration of Multicollinearity for 15 Minute Service Area
| Result Group | Statistic or Variable | Without BR-15, PR-15, and SS-15 | Without BR-15, PR-15, and HS-15 | Without MI-15, PR-15, and SS-15 | Without MI-15, PR-15, and HS-15 | Without MI-15, BR-15, EX-15, and SS-15 | Without MI-15, BR-15, EX-15, and HS-15 |
|---|---|---|---|---|---|---|---|
| Regression Statistics | R | 0.32 | 0.20 | 0.34 | 0.22 | 0.29 | 0.20 |
| | R Square | 0.10 | 0.04 | 0.11 | 0.05 | 0.08 | 0.04 |
| | Adjusted R Square | 0.02 | 0 | 0.03 | 0 | 0.02 | 0 |
| t Statistics | PD | 0.85 | 1.13 | 0.69 | 0.83 | 0.68 | 0.92 |
| | MI | -0.56 | 0.02 | | | | |
| | BR | | | 0.98 | 0.70 | | |
| | PR | | | | | -0.09 | -0.23 |
| | HS | 2.07 | | 2.27 | | 1.81 | |
| | EX | -1.03 | 0.78 | -1.23 | 1.13 | | |
| | SS | | -0.41 | | -0.82 | | 0.69 |
| Significance F | | 0.31 | 0.80 | 0.25 | 0.72 | 0.29 | 0.70 |
For the 15 minute service ring, the results show that none of the models has statistical significance. However, the variable of Housing Units Built in 1980-1990 does show statistical significance. This indicates the importance of this variable for the 15 minute service ring. Since the 15 minute service ring covers a larger area and new housing units must be built in areas with available land, it makes sense that the Housing variable becomes significant in the larger service ring.
CONCLUSION
Several findings can be drawn from both the descriptive and the linear regression analyses, although some results are not as expected. Large discount stores have a dominant position in today's retail environment. Most people go to discount stores for a variety of goods and services. The locations of large discount stores are largely based on where those customers live, the economic status of those customers, and where the major road network extends.
According to the findings of this study, overall large discount stores in Metro Atlanta tend to locate in newly growing suburban areas. I feel that this is partly because of land availability in suburban areas.
Although discount stores need a certain amount of customer expenditure to sustain themselves, few of them are located in very high income communities. This may imply that richer people have more power to control discount store locations, or that the land there is simply too expensive or too secluded to be developed.
This finding also reflects the major customer base of large discount stores - lower and middle income people, since discount stores sell cheaper goods. However, it also shows the potential for stores to expand their customer base through changing some of their development strategies.
Generally, Kmart has the most stores located in higher density, higher black and poverty rates, lower income, and older central city areas. The four stores within the city limits of Atlanta are Kmart stores. This may be because Kmart started from the central city, which was once the center of prosperity prior to development of automobile-oriented suburban areas.
I find that Wal-Mart stores tend to locate in lower density, higher income, lower black rate and lower poverty rate suburban areas. Sam's Club has a similar situation. Target has the highest profile in merchandise, so it also tends to locate in lower density, higher income, lower black rate and lower poverty rate suburban areas.
One conclusion from my research is that many factors enter into the process of determining large discount store locations and none of them is dominant. Each factor plays its role with some degree of importance under certain circumstances. Among them, Population Density and Number of Highway Exits for the 5 minute service ring and Housing Units Built in 1980-1990 for the 15 minute service ring do show statistical significance. This indicates that customer base, highway accessibility, land availability and local economies are still major factors in store location.
In this study, the regression model for the 5 minute service ring does show statistical significance when the variables MI-05 and BR-05 are excluded, which yields a mathematical formula. This formula could be used to estimate the square footage of a proposed store under the same assumptions.
The degree of importance of the variables changes from site to site and from store to store. While demographics, transportation networks and competition have effects on the location decision, other factors excluded from this study obviously have their roles to play, given the low explanatory power of my model. These factors may be the basis of location decisions or only the supporting rationale to justify those decisions. After all, a store location may depend on the forces of supply and demand and on a store's strategy or ambition to dominate the market.
DISCUSSION
The above findings give retail developers a clear direction for their site selections. New and growing suburban areas are still the obvious choices for them. However, the potential of other sites and classes cannot be ignored. The central city still has a central position in the provision of political, economic and cultural amenities, and there is apparently also a demand for discount retailing in the central city. This may be the ultimate reason for a developer to go into this market. The diversity of population, race and culture in central cities is a big challenge for general retailing, since a homogeneous area makes it easier for retailers to meet the requirements of customers at less cost. However, this diversity is also a large opportunity which only large stores may be able to take.
This opportunity provides another direction for the expansion and further development of discount stores. It may not be transformed into a big success immediately. It needs a visionary strategy that looks beyond today's limitations. From the location of the stores to the merchandise profile, each aspect of a store's portfolio should be based on this vision.
To planners, large discount store locations are also important in realizing their professional ideology. Planners are educated to promote public safety, health, and welfare. While newly growing suburban areas are hot spots for all kinds of development, a big issue is how to avoid suburban sprawl and unorganized development. With the important position of large discount stores in suburban areas, their location can be applied as a focal point when developing plans. Discount stores' relation to other sectors of local economy can anchor a development strategy.
In the process of promoting the public interest, equality is a big issue. While some apparently good places such as growing suburban areas are easily developed, declining areas such as central cities are the places where planners need to pay more attention. The continuing decline of the city will ultimately harm the healthy development of suburban areas. Without a prosperous and functional central city, we are losing the connections which are necessary to maintain a society and a nation as a whole.
Given the city's position in our social structure, its revitalization benefits the whole society and nation. This kind of revitalization requires large investments and visionary collaboration from both the public and private sectors to support it. Large discount retailing has the potential to act as a leader in this process. Research and findings on discount store location will help planners understand which measures they might take to improve the investment environment of central cities.
Limitations
The findings from this study are limited by the way the study was conducted and the methodology applied. First, for the convenience of study, some factors were excluded from the study, especially subjective variables such as managerial judgments. This may be why some of the selected variables didn't show consistency among different store types.
The regression model I applied is based on linear assumptions. Apparently this is not true for the store location factors. For example, the descriptive analysis shows that most stores are located in the middle income tracts. This suggests that some other statistical models may fit the same set of data better.
In my data analysis, I treated each store as an individual one trying to find the best site for itself. In reality, as a store chain grows to a regional and national scale, the efficiency of the store network becomes important in location decisions. Individual stores may be required to give up the best site for themselves to accommodate the overall strategy of reaching the most efficient store network.
The limitations of this study point to future research areas. The opinions of retail developers and store managers are vital to location research; after all, they are the ones who make the final decisions on site selection. The relation between discount stores and other sectors of the economy is also worthy of examination. Study of discount store chains may also provide information on the changing strategies of retailers. Additionally, more advanced analytical tools, such as a logit model or a fine-tuned GIS model, may be explored to better explain the meaning of the data.
While statistical methods may produce an easy formula, they are certainly not the only research direction. As the work of many researchers and my own study suggest, descriptive analysis of store locations still has a bright future. Understanding all sorts of information about proposed sites and their surrounding areas, combined with limited calculations, could be the best tool for decision-makers to secure a good location.
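The point about linearity can be illustrated with a toy example. The sketch below uses invented numbers, not data from this study, in which store counts peak in middle income tracts. A straight line in income explains almost none of the variation, while a line in the squared distance from the middle of the income range explains most of it. The variable names and the choice of a squared term are assumptions made for illustration; the paper does not propose this particular model.

```python
def r_squared(x, y):
    """R^2 of a one-predictor least-squares line fitted to (x, y)."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return (sxy * sxy) / (sxx * syy)


# Hypothetical data: store counts peak in middle income tracts.
income = [20, 30, 40, 50, 60, 70, 80]   # median income, thousands of dollars
stores = [1, 4, 7, 8, 7, 4, 1]          # stores per income band

# Straight-line fit in income.
linear_fit = r_squared(income, stores)

# Same least-squares fit, but on the squared distance from the midpoint,
# which lets the fitted relationship rise and then fall.
midpoint = sum(income) / len(income)
squared_distance = [(v - midpoint) ** 2 for v in income]
curved_fit = r_squared(squared_distance, stores)

print(f"linear R^2: {linear_fit:.2f}")
print(f"curved R^2: {curved_fit:.2f}")
```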
BIBLIOGRAPHY
Applebaum, William and others. Guide to Store Location Research. Reading, Massachusetts: Addison-Wesley Publishing Company, 1968.
Farris, J. Terrence. Structural Determinants of Discount Department Store Locations in the Central Cities of the Top 50 Metropolitan Areas. Ann Arbor, Michigan: A Bell & Howell Company, 1996.
Ingene, Charles A. "Structural Determinants of Market Potential." Journal of Retailing, Spring 1984, v60, n1:37-64.
Roca, Ruben A., ed. Market Research for Shopping Centers. New York: International Council of Shopping Centers, 1980.
White, John K. and Kevin D. Gray. Shopping Centers and Other Retail Projects. New York: John Wiley & Sons, Inc., 1996.
Tong Zhou
GIS Planner
North Delta Planning & Development District, Inc.
PO Box 1496
Batesville, MS 38606
Tel (601) 561-4100 Fax (601) 561-4112