A Journey from OLTP to OLAP


Introduction

In this first Part  of the
series I will discuss the OLTP and OLAP and the transition from OLTP to OLAP in pursuit performance improvements as well as the
benefits of the each technology.

Note**: This article
is intended for the users who are new to the concepts of the data warehousing.

What is OLTP?

OLTP stands for the Online Transaction Processing 
which is the SQL Server database engine which has been optimized for the insert
updates and deletes. Any kind of the insert, update and delete is termed a
transaction – hence the term transaction processing. While designing this
kind of a database, relationships are defined between the master and
transaction tables to analyze the data and get the proper correlation between
the various entities and attributes of the database. The tables should be
highly normalized using various normal forms to achieve the query optimization.
This relational OLTP database forms the backend of most desktop/web applications.

Benefits:

  • Uses E-R Model for the Database design
  • Real time and very detailed data analysis
  • Used for simple reporting
  • Better performance when the database is very small

What is OLAP?

The OLAP stands for Online Analytical Processing.
The OLAP database is highly de-normalized and is mostly used for analysis and
reporting purposes. OLAP has the capability to enable business users to generate detailed reports without having a dependence on the IT department. OLAP comes with a multi-dimensional model to analyze data using N number of dimensions and measures
which are the part of the standard Data warehouse.

Benefits:

  • Uses the Star and Snowflake schema
    for the data warehouse design
  • Stores historical data for the
    trend analysis
  • Used for ‘slicing and dicing’
    data
  • Improved performance as the database
    is de-normalized
  • Data can be integrated from disparate
    data sources

What it BI?

Business Intelligence is the ability for an organization to take all
its capabilities and convert them into knowledge, ultimately, getting the right
information to the right people, at the right time, for
making strategic decisions.

What is Data Warehousing?

Data Warehousing is the process of
creating data marts, by identifying the dimensions and facts  which will cater
to the need  of the organization’s  various departments and the key performance
indicators (KPI) which are useful in determining the organization’s trend  in
the real time. Though there is more to Data warehousing than just
identifying the dimensions and facts and slicing and dicing of the data, the purpose
of this article series is to focus on how to built one.

The following are the properties of the
data warehouse:

  1. Subject-oriented:The data in the data warehouse
    is organized so that all the data elements relating to the same real-world
    event or object are linked
  2. Time-variant:The changes to the data in the
    database are tracked and recorded so that reports can be produced showing
    changes over time
  3. Non-volatile:Data in the data
    warehouse is never over-written or deleted. Once committed, the data is static,
    read-only, and retained for future reporting
  4. Integrated:The database contains
    data from most or all of an organization’s operational applications, and that
    this data is made consistent

Data Warehouse Development Approaches

There two primary models for the development of a
Data Warehouse :

Inmon Model:
Enterprise Data Warehouse (EDW)  – Top down Approach

The Inmon model focuses on the Big bang
approach while developing a Data warehouse. The top down approach means the
implementation of the data warehouse for an entire organization simultaneously.
The data marts are then derived from the data warehouse. The data in the Data
Warehouse should be in the 3rd Normal form.

Kimball Model: Data mart approach – Bottom
up Approach

The Kimball model is based on the bottom
up approach which means the implementation of the data warehouse is by first
developing the department specific data marts and then merging them to
get one big data warehouse.

Though both the methods have their own
pros and cons, There is no one-size-fits-all strategy to data warehousing.

Elements of BI

The elements of BI comprises of the
dimensions and the facts. The KPI’s are derived from the dimensions and the
facts based on the given business scenarios.

Dimensions

A dimension is any entity which can be used for
analyzing facts. Dimensions can be related with the master tables of the OLTP systems

Typical examples of dimensions are : Product, Location, Customers. These are the standard sets of dimensions which you may find across any
Industry.

Facts

Facts are the quantities and calculated field such
as Amounts, counts, quantity etc. which are to be analyzed and can be related with the transaction tables of
OLTP

Typicaly examples of Facts are Sales Amount by Product, by
Customer, By Year , By Region.

When developing the data warehouse
model  the following schemas are used to identify the relationship between the
facts and the dimensions:

Star Schema

In Star schema, the central fact table  is connected to the dimension tables surrounding
the fact table in a star-like shape.

Snowflake Schema

In the
Snowflake schema, the dimension table is further normalized to capture the
details of an entity. The location table is further normalized to detail the
state data as shown below:

 

 

Thus in this
first part of the series we have looked at the basics of the data warehousing, next we will look at how to go about
implementing a data warehouse.




Array

2 Responses to “A Journey from OLTP to OLAP”

  1. Thanks for sharing knowledge on this topic. Keep it up.

Trackbacks/Pingbacks

  1. A Journey from OLTP to OLAP- Part 2 - SQL Server Performance - September 17, 2013

    […] my last  article we saw the basics of the data warehousing in which we saw looked at how OLAP and OLTP technologies […]

Software Reviews | Book Reviews | FAQs | Tips | Articles | Performance Tuning | Audit | BI | Clustering | Developer | Reporting | DBA | ASP.NET Ado | Views tips | | Developer FAQs | Replication Tips | OS Tips | Misc Tips | Index Tuning Tips | Hints Tips | High Availability Tips | Hardware Tips | ETL Tips | Components Tips | Configuration Tips | App Dev Tips | OLAP Tips | Admin Tips | Software Reviews | Error | Clustering FAQs | Performance Tuning FAQs | DBA FAQs |