SQL Server Analysis Services ( SSAS ) – Introduction
Client tools (generic Reporting and OLAP clients as well as custom applications) access the data through the UDM. The client API is the public standard XML/A, a SOAP-based protocol for issuing commands and receiving responses, exposed as a web service. In addition, client object models are provided over XML/A, including both a managed provider (ADO MD.Net) and a native OLE DB provider. The UDM-server is Microsoft Analysis Services. The client tools can be any of the following:
- Office tools
- Numerous external analysis and reporting tools (such as ProClarity and Crystal Reports).
- SQL Server Reporting Services
Analysis Services Objects
A SSAS instance contains database objects and assemblies for use with online analytical processing (OLAP) and data mining.
- Databases contain OLAP and data mining objects, such as data sources, data source views, cubes, measures, measure groups, dimensions, attributes, hierarchies, mining structures, mining models and roles.
- Assemblies contain user-defined functions that extend the functionality of the intrinsic functions provided with the Multidimensional Expressions (MDX) and Data Mining Extensions (DMX) languages.
1) Data Sources
A data source in Microsoft represents a connection to a data source and contains the connection string that defines how Analysis Services connects to a physical data store using a managed Microsoft .NET Framework or native OLE DB provider. The connection string contains server name, database, security, timeout, and other connection-related information. Analysis Services directly supports many data sources. Supported data sources include Microsoft SQL Server databases and databases created by other products, including Oracle, DB2, and Teradata.
2) Data Source Views
A data source view contains the logical model of the schema used by Analysis Services database objects—namely cubes, dimensions, and mining structures. A data source view is the metadata definition, stored in an XML format, of these schema elements used by the Unified Dimensional Model (UDM) and by the mining structures.
A data source view:
- Contains the metadata that represents selected objects from one or more underlying data sources, or the metadata that will be used to generate an underlying relational data store.
- Can be built over one or more data sources, letting user define OLAP and data mining objects that integrate data from multiple sources.
- Can contain relationships, primary keys, object names, calculated columns, and queries that are not present in an underlying data source and which exist separate from the underlying data sources.
- Is not visible to or available to be queried by client applications.
A cube is a set of related measures and dimensions that is used to analyze data.
- A measure is a fact, which is a transactional value or measurement that a user may want to aggregate. Measures are sourced from columns in one or more source tables, and are grouped into measure groups.
- A dimension is a group of attributes that represent an area of interest related to the measures in the cube, and which are used to analyze the measures in the cube. For example, a Customer dimension might include the attributes Customer Name, Customer Gender, and Customer City, which would enable measures in the cube to be analyzed by Customer Name, Customer Gender, and Customer City. Attributes are sourced from columns in one or more source tables. The attributes within each dimension can be organized into hierarchies to provide paths for analysis.
A cube is then augmented with calculations, key performance indicators (KPIs), actions, partitions, perspectives, and translations. A cube is essentially synonymous with a Unified Dimensional Model (UDM).
A calculation is a Multidimensional Expressions (MDX) expression or script that is used to define a calculated member, a named set, or a scoped assignment in a cube in Microsoft SQL Server 2005 Analysis Services. Calculations lets the user to add objects that are defined not by the data of the cube, but by expressions that can reference other parts of the cube, other cubes, or even information outside the Analysis Services database. A calculation lets to extend the capabilities of a cube, adding flexibility and power to business intelligence applications.
B) KPI (Key performance indicator)
A Key Performance Indicator (KPI) is a quantifiable measurement for gauging business success. A KPI is frequently evaluated over time. For example, the sales department of an organization may use monthly gross profit as a KPI, but the human resources department of the same organization may use quarterly employee turnover.
An action is a stored MDX statement that can be presented to and employed by client applications. In other words, an action is a client command that is defined and stored on the server. An action also contains information that specifies when and how the MDX statement should be displayed and handled by the client application. The operation that is specified by the action can start an application, using the information in the action as a parameter, or can retrieve information based on criteria supplied by the action.
Partitions are used by Analysis Services to manage and store data and aggregations for a measure group in a cube. Partitions are a powerful and flexible means of managing cubes, especially large cubes. For example, a cube that contains sales information can contain a partition for the data of each past year and also partitions for each quarter of the current year. Only the current quarter partition needs to be processed when current information is added to the cube.Processing a smaller amount of data will improve processing performance by decreasing processing time. At the end of the year the four quarterly partitions can be merged into a single partition for the year and a new partition created for the first quarter of the new year. Further, this new partition creation process can be automated as part of your data warehouse loading and cube processing procedures.
Cubes can be very complex objects for users to explore in Analysis Services. A single cube can represent the contents of a complete data warehouse, with multiple measure groups in a cube representing multiple fact tables, and multiple dimensions based on multiple dimension tables. Such a cube can be very complex and powerful, but daunting to users who may only need to interact with a small part of the cube in order to satisfy their business intelligence and reporting requirements.
One can use a perspective to reduce the perceived complexity of a cube in Analysis Services. A perspective defines a viewable subset of a cube that provides focused, business-specific or application-specific viewpoints on the cube. The perspective controls the visibility of objects that are contained by a cube. The following objects can be displayed or hidden in a perspective:
- Measure groups
- Key Performance Indicators (KPIs)
- Calculations (calculated members, named sets, and script commands)
Objects in a cube that are not visible to the user through a perspective can still be directly referenced and retrieved using XML for Analysis (XMLA), Multidimensional Expressions (MDX), or Data Mining Extensions (DMX) statements.
A perspective is a read-only view of the cube. Objects in the cube cannot be renamed or changed by using a perspective. Similarly, the behavior or features of a cube, such as the use of visual totals, cannot be changed by using a perspective.
F) Cube translations
Cube translation is a language-specific representation of the name of a cube object, such as a caption or a display folder. Analysis Services also supports translations of dimension and member names.
Translations provide server support for client applications that can support multiple languages. Frequently, users from different countries view cube data. It is useful to be able to translate various elements of a cube into a different language so that these users can view and understand the cube’s metadata.
Dimensions are a fundamental component of cubes. Dimensions organize data with relation to an area of interest, such as customers, stores, or employees, to users. Dimensions are groups of attributes based on columns from tables or views in a data source view. Dimensions exist independent of a cube, can be used in multiple cubes, can be used multiple times in a single cube, and can be linked between Analysis Services instances. A dimension that exists independent of a cube is called a database dimension and an instance of a database dimension within a cube is called a cube dimension.
Dimensions in Analysis Services contain attributes that correspond to columns in dimension tables. These attributes appear as attribute hierarchies and can be organized into user-defined hierarchies, or can be defined as parent-child hierarchies based on columns in the underlying dimension table. Hierarchies are used to organize measures that are contained in a cube.
5) Mining structures
There are several objects involved in data mining in SSAS. Following are the two primary objects that are used:
- Data mining structure
- Data mining model
The mining structure is a data structure that defines the data domain from which mining models are built. A single mining structure can contain multiple mining models that share the same domain.
The building blocks of the mining structure are the mining structure columns, which describe the data that the data source contains. These columns contain information such as data type, content type, and how the data is distributed.
B) Data Mining Model
A data mining model applies a mining model algorithm to the data that is represented by a mining structure. Like the mining structure, the mining model contains columns. A mining model is contained within the mining structure, and inherits all the values of the properties that are defined by the mining structure. The model can use all the columns that the mining structure contains or a subset of the columns.
In addition to the parameters that are defined on the mining structure, the mining model contains two properties: Algorithm and Usage. The algorithm parameter is defined on the mining model, and the usage parameter is defined on the mining model column.
Roles are used to manage security for Analysis Services objects and data. In basic terms, a role associates the security identifiers (SIDs) of Microsoft Windows users and groups that have specific access rights and permissions defined for objects managed by an instance of Analysis Services. Two types of roles are provided in Analysis Services:
- The server role, a fixed role that provides administrator access to an instance of Analysis Services.
- Database roles, roles defined by administrators to control access to objects and data for non-administrator users.
Analysis Services lets the user add assemblies to an Analysis Services instance or database. Assemblies lets the user create external, user-defined functions using any common language runtime (CLR) language, such as Visual Basic .NET or Visual C#. Assemblies help to extend the business functionality of MDX and DMX. The user can build the functionality that the user wants into a library, such as a dynamic link library (DLL) and add the library as an assembly to an instance of Analysis Services or to an Analysis Services database. The public methods in the library are then exposed as user-defined functions to MDX and DMX expressions, procedures, calculations, actions, and client applications.