Q1 : Define Data Warehouse ?
A1 : An Enterprise structured repository of subject-oriented, time-variant, historical data used for information retrieval and decision support. The data warehouse (DWH) stores atomic and summary data
- Subject Oriented
- Non Volatile
- Integrated
- Time Variant
Q2 : Difference between Data Warehouse vs OLTP
A2 :
Property
Response Time
Operation
Nature of Data
Data
Organization Size
Data Source
Activities
Operational (OLTP)
Sub Seconds to Seconds
DML(INSERT, UPDATE, DELETE)
30-60 Days
Applications
Small to Large
Operational, Internal Processes
Processes
Data Warehouse
Seconds to hours
Primarily read (SELECT)
Snapshots over time
Subject, Time
Large to Very Large
Operational , Enternal Process
Analysis
Q3 : Data Warehouses vs Data Mart
Property
Scope
Subject
Data Source
Size(typical)
Implementation Time
Data Warehouse
Enterprise
Multiple
Many
100 GB to > 1TB
Months to Years
Data Mart
Department
Single-Subject, LOB (Line of Business)
Few
< 100 GB
Months
Q4 : Define Operational Data Store
Stores tactical data from production systems that are subject-oriented and integrated to address operational needs (OR)
An operational data store is used for operational reporting and as a source of data for the Enterprise Data Warehouse. (OR)
An operational data store (ODS) is an alternative to having operational decision support system (DSS) applications access data directly from the database that supports transaction processing (TP). It is kind of transaction processing system. (OR)
An operational data store (ODS) is a central database that provides a snapshot of the latest data from multiple transactional systems for operational reporting.
Q5 : Relational Database vs No SQL
Relational Database
Table based data structure
Requires predefined schema
Vertically scalable
Mostly Expensive
No SQL
Non-relational database
Doesn’t require predefined schema
Horizontal Scaling
Cost Efficient
- NoSQL tends to be a better option for modern applications that have more complex, constantly changing data sets, requiring a flexible data model that doesn’t need to be immediately defined. Most developers or organizations that prefer NoSQL databases, are attracted to the agile features that allow them to go to market faster, make updates faster. Unlike traditional, SQL based, relational databases, NoSQL databases can store and process data in real-time.
- While SQL databases do still have some specific use cases, NoSQL databases have many features that SQL databases are not capable of handling without tremendous costs, and critical sacrifices of speed, agility, etc.
Ref: MongoDB (No SQL) vs Teradata (Relational)