January 16, 2014

Dimensional Modelling

There isn’t any specific standard to model data warehouse. It can be built either using the “dimensional” model or the “normalised” model methodologies. Normalised model normalises the data into third normal form (3NF) whereas dimensional model collects the transactional data in the form of facts and dimensions. Normalised model is easy to use as we can add related topics without affecting the existing data. But one must have good knowledge of how data is associated before performing specific query, so it might be difficult to handle. Reporting queries may not execute as well because massive numbers of tables may be involved in each query. Dimensional model is very efficient to use for non experts and performs pretty well as data is classified in a logical way and similar types of data are stored together. But while adding new topics whole warehouse must be reprocessed (Jones and Johnson, 2010).
Dimensional model is designed to optimise decision support query function in relational databases, where as normalised model is designed to eliminate redundancy problem of the data model, retrieving data which contains identifiers and therefore optimise online transaction processing (OLTP) performance (Firestone, 1998). Therefore it can be said dimensional model is the best modelling method in data warehousing. 
Data Cube
OLAP stores data in arrays which are the logical presentation of business dimensions. Multidimensional array represents intelligence of data elements relationships because analyst’s views depend on it. This multidimensional array data model is called “Data Cube” (Kirkgiize et al., 1997). It consists of facts and dimensions instead of rows and columns as in relational data model. Facts are the accurate or numeric data of business activity and dimensions are the sets of attributes that put facts into context (Wang, Chen, Chiu, 2005). Dimensions are interconnected in hierarchies, for example, city, state, region, country and continent. Figure 1 shows three dimensional views data cube of sales data (Kirkgiize et al., 1997).
In data cube each cell contains one or several values called measures or metrics. It helps to analyse aggregated facts and the level of detail is directly proportional to number of dimensions in the cube. Here axes represents dimension and the space represents facts (Wang et al., 2005). Above shown figure of data cube contains three dimensions namely Time, Geography and Product. Each cell consists of (T, P, G) and business measure is the total sales. Here the amount of product P along with total sales sold in geography G in time period T is stored. Figure 2 shows hierarchy of dimensions. 

Features of data cube:

Slice and dice

The user can generate the exact view of the data required with the help of data cube. This process is called slice and dice because the data is evaluated according to a subset alignment. The subset can also be rotated and figured as a parent one. For example, to find out the sales in a particular place in the specific time, slice and dice operation can be operated in the cube. This will help to choose arrangement along each dimension, it may be time, geography or product according to user needs (Kirkgiize et al., 1997).

Drill-down and roll-up

Data cube may contain hierarchical dimensions. This hierarchical dimension provides data manipulation and detailed analysis for various levels of the dimensions. Moving up and down in a hierarchy is known as roll up and drill-down respectively. After choosing the level of dimension, it can be sliced and diced. According to user need, they can drill-down and roll-up to view the data either in region wise, in product wise or in city wise level (Kirkgiize et al., 1997).
References:
1) Jones, J. and Johnson, E. (2010). Data Modelling for Business Intelligence with Microsoft SQL Server: Modeling a Data Warehouse. CA Erwin, pp. 1-9. [Online] Available from: http://www.ca.com/files/whitepapers/ca-erwin-db-modeling-bus-intel-....
2) Jones, J. and Johnson, E. (2010). Beyond the data model: designing the data warehouse. CA Erwin, pp. 1-9. [Online] Available from:http://www.ca.com/files/whitepapers/part3_beyond_dmd_data_warehouse...
3) Firestone, J. M. (1998). Dimensional Modeling and E-R Modeling In The Data Warehouse. Dimensional Modeling and E-R Modeling In The Data Warehouse, White Paper No. Eight. pp. 1-9.
4) Kirkgiize, R., Katie, N., Stolba, M. Tjoa, A. M. (1997). A Security Concept for OLAP. Proceedings of the 8th International Workshop on Database and Expert Systems Applications (DEXA '97), Institute of Software Technology (IFS). pp. 620-626.
5) Wang, J., Chen, T., Chiu, S. (2005). Literature Review on Data Warehouse Development. IACIS Pacific 2005 Conference Program, National Chengchi University, Taiwan. pp. 987-994.

No comments: