Tuesday 12 February 2013

CHAPTER EIGHT -> Accessing Organizational Information (Data Warehouse)

1. ROLES AND PURPOSES OF DATA WAREHOUSES AND DATA MART IN ORGANIZATION



The purpose of the Data Warehouse in the overall Data Warehousing Architecture is to integrate corporate data.  It contains the "single version of truth" for the organization that has been carefully constructed from data stored in disparate internal and external operational databases.The amount of data in the Data Warehouse is massive.  Data is stored at a very granular level of detail.  For example, every "sale" that has ever occurred in the organization is recorded and related to dimensions of interest.  This allows data to be sliced and diced, summed and grouped in unimaginable ways. 
Typical Data Warehousing Environment
 Contrary to popular opinion, the Data Warehouses does not contain all the data in the organization.  It's purpose is to provide key business metrics that are needed by the organization for strategic and tactical decision making.Decision makers don't access the Data Warehouse directly.  This is done through various front-end Data Warehouse Tools that read data from subject specific Data Marts.The Data Warehouse can be either "relational" or "dimensional".  This depends on how the business intends to use the information.ETL (Extract Transform Load) jobs extract data from the Data Warehouse and populate one or more Data Marts for use by groups of decision makers in the organizations.  The Data Marts can be Dimensional (Star Schemas) or relational, depending on how the information is to be used and what "front end" Data Warehousing Tools will be used to present the information.Each Data Mart can contain different combinations of tables, columns and rows from the Enterprise Data Warehouse.  For example, an business unit or user group that doesn't require a lot of historical data might only need transactions from the current calendar year in the database. The Personnel Department might need to see all details about employees, whereas data such as "salary" or "home address" might not be appropriate for a Data Mart that focuses on Sales.
Typical Data Warehousing Environment

Some Data Mart might need to be refreshed from the Data Warehouse daily, whereas user groups might want refreshes only monthly.
 2. The relationship of business intelligence and data warehousing 



Many of the tool vendors who sell their products or softwares call it business Intelligence software rather than data warehousing software. Business Intelligence is a term commonly associated with data warehousing. Business Intelligence is a generalized term where a company initiates various activities to gather today's market information which also includes about their competitor. Today's business Intelligence systems are contrasted to more classical way of information gathering in mining and crunching the data in the most optimal manner. In short we can say BI simplifies information discovery and analysis. In this way the company will have a competitive advantage of business and intelligently using the available data in strategic and effective decision making. it has the ability to bring disparate data under one roof  with a meaningful information and ready for analysis.
Business intelligence usually refers to the information that is available for the enterprise to make decisions on. A data warehousing (or data mart) system is the backend, or the infrastructural, component for achieving business intelligence. Business intelligence also includes the insight gained from doing data mining analysis, as well as unstructured data (thus the need for content management systems). All the source data from disparate sources are used to load/Stage data. Different sources can be flat files, another database or some other process. The starting point of the Data warehouse should extract the data in order to load into its environment.This data may not be the expected format or size. your business demands are different or your organization business requirements are different. So the business process has to modify the data or better word is to transform the incoming data to meet requirements and objectives. This is called Transformation. Once every slicing and dicing of the data is done along with applied business rules, this data is ready for loading into the target tables. This process is called Loading. So overall till now we have done Extraction, Transformation and Loading. In short we call this ETL. There are lot of tools available in today's market which does help in achieving the ETL process. Once this data is loaded in to the database, this is ready for next processing. We call that database as Data warehouse database. The next process could be building of data marts or directly reporting from it. There are lot of tools/software available for reporting/analysis. Some call it business reporting or analysis tool.










No comments:

Post a Comment