Introduction to Data Warehouse


In the '90s, Bill Inmon has come up with the idea of the Data Warehouse.
"A data warehouse is a concept, how we can make multiple sources of data which is unstructured and in different format into structured data for analytical purposes."

In the above quote, I have given you a high-level overview of a Data Warehouse. In other words, the big picture of it. So, let's break it up to points wise.

Big Data and Data Warehouse are the same? 

NO! A big no for it. Data Warehouse is a concept and Big Data is a technology. Simple as that.



Data Warehouse uses for,

  • Gather analytical operations for future business purposes.
  • Avoid slowdown in day-to-day transaction response time.
  • Minimize the increased CPU cost.

Why Data Warehouse?  

  • Consolidation of information resources.

In an organization, there may be data in different physical locations, in different formats. Therefore, we can use Data Warehouse as a store to store those data in one place.
  • Improve query performance.

Because of a well designed Data Warehouse, we can easily query.  

  • Helps in the operational system.

We can search separately in the Data Warehouse which stores more than years data, by applying some operations to them and the results might be very productive in the future operational level in the organization. 
  • Foundation for Data Mining, Data Visualizing, Advanced Reporting, and OLAP Tools.
Data Warehouse in the foundation for the above areas. Machine Learning, Deep Learning, and Artificial Intelligence all these advanced topics are based on this. If we are into those, the Data Warehouse plays the key role. 

What is a Data Warehouse?

A Data Warehouse is a relational database which is designed for query analysis (OLAP) rather than for transaction processing (OLTP). 
Note: OLAP - Online Transaction Processing
          OLTP - Online Analytical Processing

Benefits of having a Data Warehouse
  1. It maintains historical records.
  2. Analyzing the data to gain a better understanding of the business and it will help to improve the business.
  3.  ETL solutions - Capability to read any kind of data from any source. (ETL stands for Extraction, Transformation, and Loading)
  4. Statistical analysis
  5. Reporting
  6. Data mining capabilities
  7. Client analysis tools - Eg: SAP, Power BI, Sisense, Dundas, etc.

Characteristics of a Data Warehouse

  • Subject-Oriented 
This means the data inside a Data Warehouse is related to a single subject. So, it depends on the domain. As an example, in a supermarket or a retail shop store, their daily transactions and a bank or an insurance company store their finance data related to their business. 
  • Integrated
The Data Warehouse extracts data from multiple systems, multiple places, and multiple formats to integrate. By combining all those data, we can make solutions for business questions.
  • Non-volatile
Data cannot be disappeared from the Data Warehouse. It should keep on storing data. The time duration which those data storing is depending on the organization and the business requirement. 
  • Time-variant
Data should always available for you. And again this also may depend on the organization.

Key features of a Data Warehouse
  • Structured data - How the organization wants to look at the data.
  • Better response time - Indexing, partitioning, etc.
  • Available historical data - Depends on the organization and the business requirement.
  • Options for Ad-Hoc queries.

Comments

Post a Comment

Popular Posts