SAS Institute and Environmental Systems Research Institute (Esri) have many customers that use each company's software in loosely integrated applications and solutions. Usually, these applications rely on SAS software for data manipulation and statistical analysis and Esri software for mapping and spatial analysis. The 'loose' level of integration typically involves importing and exporting data from and to proprietary formats using a variety of non-standard processes. However, this type of integration limits the potential and scope-of-use of these combined applications.
Open access to SAS compute services and data sources is available beginning with SAS Version 8. SAS Integration Technologies is a new product that gives client applications access to a wide range of features that are part of the SAS System both on the desktop and over the Web. Object Linking and Embedding for Databases (OLE DB) consumers enable SAS users to directly create and manipulate external database management system (DBMS) tables, and OLE DB providers deliver SAS data sets to consumers. Coupled with the fine granularity of the Component Object Model (COM) objects available in Esri ArcInfo 8, the Java application program interfaces (APIs) available with Esri ArcIMS, and the geodatabase model in ArcInfo 8, a wide range of integrated applications can now be developed.
This paper discusses the tools available with SAS Integration Technologies and provides simple coding examples of how these tools may be used.
For many years, application developers in organizations around the world have loosely integrated SAS and Esri software to solve a variety of business, environmental, and scientific problems. Typically, the applications involve spatial analysis in ArcInfo and data manipulation and statistical analysis in SAS.
However, a big limitation of these applications is the lack of a common data store or standard access methods. For example, data from the results of spatial analysis performed in ArcInfo have to be transferred to SAS for further analysis, and vice versa. This is accomplished in a variety of ways that usually involve creating flat files and home-grown import and export procedures.
Without doubt, this level of integration has limited the effectiveness of some of these applications. In addition, development of more advanced applications that can take advantage of the strengths of software from SAS and Esri have not been undertaken due to this limitation. Fortunately, with the software industry trends moving toward conforming to open access, this problem might soon be solved.
Over the years, SAS introduced software that follows this trend, beginning with the release of Open Database Connectivity (ODBC) drivers that allow non-SAS clients to query SAS data sources. Open Web standards (Java, CGI, HTML) that give access to SAS data sources and compute services through Web clients were then introduced. Now, with Version 8 and support for application integration standards through the Integration Technologies software and OLE DB providers available through SAS/ACCESS, SAS computational and data servers are more openly available. These advancements make it much easier for application developers to create solutions that use the strengths of both SAS and Esri software in a seamless manner.
The SAS technology that enables these integrated solutions with ArcInfo 8 is introduced in the following section. This paper is not meant to be a guide for developing these applications – it is an overview of the technology that is available. For those who want more detailed information about this topic, a “Recommended Reading” section is provided in this paper. The reader is assumed to have a basic familiarity with both SAS and Esri software.
Open data access has evolved since the early 1990s when an industry consortium promoting portability and interoperability among DBMS vendors released the Open Database Connectivity (ODBC) API. Accordingly, SAS developed an ODBC driver that allows access to SAS data sources from ODBC-compliant applications.
Due to a number of problems with ODBC and the need to support the Web and multi-threaded applications, Microsoft developed the OLE DB specification. OLE DB evolved from ODBC. OLE DB interfaces provide a standard by which applications can uniformly access data located over an enterprise's entire network and stored in a variety of formats, such as SAS data sets, database files, and non-relational and multidimensional stores. OLE DB interfaces can provide much of the same functionality that is appropriate for the data in its native DBMS.
Providing standard interfaces is part of an overall industry trend toward universal data access. The goal of this approach is to develop software programs that conform to a widely used specification, thereby giving these programs the ability to use many different types of data files. As this trend continues, OLE DB is expected to become the dominant method of accessing new database products. Operationally, OLE DB interfaces work through data providers built to the OLE DB specification. Software manufacturers, who want to have their data sources consumed, build the data providers.
To make OLE DB technology more accessible to languages such as Visual Basic, Microsoft provides an abstraction layer called ActiveX Data Objects (ADO). ADO provides most of the power of OLE DB but requires less coding and is easier to learn. (The Visual Basic examples in this paper use ADO.)
OLE DB providers are any data sources that implement OLE DB interfaces. There are three SAS OLE DB data providers that conform to the OLE DB specification. They are:
Each of these providers can be used programmatically in external applications to access SAS data sets. Visual Basic coding samples that use the SAS Local Data Provider and the SAS IOM Data Provider are in the “Examples” section of this paper.
An OLE DB consumer is a piece of system or application code that implements an OLE DB interface. The SAS OLE DB interface implements OLE DB data sources from within a SAS session. This interface allows you to treat any OLE DB compliant data source as if it were a SAS data set. For example, you, can assign a SAS library to an MS Access database and manipulate tables in that database as if they were SAS data sets. Because MS Access is the database for personal geodatabases in ArcGIS, attribute data stored in those geodatabases can be directly manipulated in SAS.
The OLE DB interface is available with SAS/ACCESS software.
Open application access allows solutions to be developed by mixing and matching component software that has been designed to open standards. For SAS and Esri, the common standard to which these components have been developed is the Component Object Model (COM). COM is a language-independent, open architecture software design standard that promotes interoperability and reuse through object-oriented technology. Esri provides open application access through a very extensive set of COM objects available with ArcGIS. SAS provides application access through a set of COM objects in IOM that are distributed with the SAS Integration Technologies software.
IOM is a distributed object model. It gives application developers a set of objects that they can script in their client application. The distributed object then provides the requested services on the SAS server. In addition, clients can connect to SAS through IOM servers with both Common Object Request Broker Architecture (CORBA) methods and Java Database Connectivity (JDBC) methods.
IOM consists of a hierarchy of objects, which are depicted in Figure 1. The object at the top of the hierarchy is the SAS workspace. From the SAS workspace object, the application can create the following:
For additional information about these IOM objects, please refer to the papers listed in the “Recommended Reading” section of this paper.
Figure 1: IOM Object Hiearchy
Following are three examples of open data and application access using Visual Basic and ADO. The first example exhibits simple open data access using the SAS Local Data Provider, and the second example shows SAS as a consumer of OLE DB data. The third example uses the SAS IOM Data Provider for both data and application access. More detailed examples are included in the SAS white paper, Developing an Open Client in Visual Basic
The main advantage to using the local data provider as opposed to the IOM data provider is that the former can be used independently of a SAS session. This reduces overhead. However, if further processing in SAS is needed, then the IOM data provider should be used.
The following code opens a SAS data set using ADO connection and record set objects. To view the file you can add a DataGrid control, or you can manipulate the data further using ADO.
‘Declare ADO connection and recordset objects Dim obConnection as New ADODB.Connection Dim obRecordset as New ADODB.Recordset ‘Open the connection object using the Local Data Provider obConnection.Provider = “SAS.LocalProvider.1” obConnection.Properties(”Data source”) = c:\”SAS data set” obConnection.Open ‘Open the data set obRecordset.Open ‘Manipulate or display the data set ... ‘Close the objects when done obRecordset.Close obConnection.Close
This example shows SAS as a consumer of OLE DB data by opening an MS Access table then printing information about the table. A user that is comfortable with programming in SAS would find this a viable option for manipulating data from an ArcGIS geodatabase.
*Assign a SAS library reference Libname accessdb oledb Provider="Microsoft.JET.OLEDB.4.0" Properties=( "Data source"="c:\arcgis\arctutor\map\airport.mdb"); *List the data sets in the library Proc Datasets Lib=Accessdb; *Display the Rivers table Proc Contents Data=Accessdb.Rivers;
The SAS IOM Data Provider requires a running SAS session (SAS workspace). As in the previous example, this provider can be used to open and manipulate SAS data sets. Once a SAS workspace is established, the SAS code can be submitted with either the LanguageService.Submit method as inline code or as a stored procedure using the StoredProcessService object.
The data in this example is stored in an MS Access personal geodatabase, and the table (customers) contains the results of a drive-time analysis for each customer to a store location. This table is passed to the SAS Cluster procedure, and the results are stored back in the personal geodatabase for further analysis in ArcMap.
‘Declare SAS workspace objects Dim obWsMgr As New SASWorkspaceManager.WorkspaceManager Dim obSAS As SAS.Workspace Dim obSASProc as SAS.StoredProcessService Dim xmlInfo As String ‘Declare ADO Connection and Recordset objects Dim obConnection As New ADODB.Connection Dim obRecordset as New ADODB.Recordset ‘Open a connection to a local SAS server Set obSAS = obWsMgr.Workspaces.CreateWorkspaceByServer _ ("",VisibilityProcess,””,"","", xmlInfo) obConnection.Open "provider=sas.iomprovider.1;" & _ "SAS Workspace ID=" & obSAS.UniqueIdentifier 'Assign a SAS library reference to an ArcGIS personal geodatabase stored in ‘ MSAccess using the StoredProcessService object Set obSASProc = obSAS.LanguageService.StoredProcessService obSASProc.Repository = "file:c:\code\sas" obSASProc.Execute "MSAccess" MsgBox obSAS.LanguageService.FlushLog(100000) 'Submit SAS code to do a simple cluster analysis and save the output ‘ an MSAccess table where it can be displayed in ArcMap obSAS.LanguageService.Submit _ "proc cluster data=accessdb.customers method=centroid pseudo outtree=accessdb.tree;" & _ "id StoreName; var Distance; run;" MsgBox obSAS.LanguageService.FlushLog(100000) ‘Shut down the SAS server obWsMgr.Workspaces.RemoveWorkspaceByUUID obSAS.UniqueIdentifier obSAS.Close ‘Close the objects when done obConnection.Close
SAS Version 8.2 offers many methods to develop integrated solutions using open access technology. These range from OLE DB providers and consumers to COM and CORBA objects. This technology can be used now to develop truly integrated solutions with ArcInfo 8 software from Esri.
Although this paper gives a somewhat simplified overview of this technology, the information shows the potential for future applications. Subsequent papers will discuss more advanced integration issues and will provide more detailed examples. In the interim, it is hoped that this paper will stimulate the development of these integrated solutions.
SAS Institute Staff who contributed to the completion of this paper:Kathleen Walch