Q1 : Define Data Warehouse ?
A1 : An Enterprise structured repository of subject-oriented, time-variant, historical data used for information retrieval and decision support. The data warehouse (DWH) stores atomic and summary data
- Subject Oriented
- Non Volatile
- Time Variant
Q2 : Difference between Data Warehouse vs OLTP
Nature of Data
Sub Seconds to Seconds
DML(INSERT, UPDATE, DELETE)
Small to Large
Operational, Internal Processes
Seconds to hours
Primarily read (SELECT)
Snapshots over time
Large to Very Large
Operational , Enternal Process
Q3 : Data Warehouses vs Data Mart
100 GB to > 1TB
Months to Years
Single-Subject, LOB (Line of Business)
< 100 GB
Q4 : Define Operational Data Store
Stores tactical data from production systems that are subject-oriented and integrated to address operational needs (OR)
An operational data store is used for operational reporting and as a source of data for the Enterprise Data Warehouse. (OR)
An operational data store (ODS) is an alternative to having operational decision support system (DSS) applications access data directly from the database that supports transaction processing (TP). It is kind of transaction processing system. (OR)
An operational data store (ODS) is a central database that provides a snapshot of the latest data from multiple transactional systems for operational reporting.
Q5 : Relational Database vs No SQL
Table based data structure
Requires predefined schema
Doesn’t require predefined schema
- NoSQL tends to be a better option for modern applications that have more complex, constantly changing data sets, requiring a flexible data model that doesn’t need to be immediately defined. Most developers or organizations that prefer NoSQL databases, are attracted to the agile features that allow them to go to market faster, make updates faster. Unlike traditional, SQL based, relational databases, NoSQL databases can store and process data in real-time.
- While SQL databases do still have some specific use cases, NoSQL databases have many features that SQL databases are not capable of handling without tremendous costs, and critical sacrifices of speed, agility, etc.
Ref: MongoDB (No SQL) vs Teradata (Relational)