SQLOLTP – Online Transaction Processing
Short transactions
Simple queries
Touch small portions of data
Frequent updates
OLAP – Online Analytical Processing
Long transactions
Complex queries
Touch large portions of the data
Infrequent updates
More terminology
Data warehousing: Bring data from operational (OLTP) sources into a single “warehouse” for (OLAP) analysis
Decision support system (DSS): Infrastructure for data analysis
E.g., data warehouse tuned for OLAP
“Star Schema” fact table references dimension tables
Fact table: Updated frequently, often append-only, very large
Dimension tables: Updated infrequently, not as large
OLAP Queries
Join -> Filter -> Group -> Aggregate
Performance
Inherently very slow: special indexes, query processing techniques
Extensive use of materialized views
Data Cube(a.k.a. multidimensional OLAP)
Dimension data forms axes of “cube”
Fact (dependent) data in cells
Aggregated data on sides, edges, corner
Fact table uniqueness for data cube
If dimension attributes not key, must aggregate
Date can be used to create key
Drill-down: Examining summary data, break out by dimension attribute
Roll-up: Examining data, summarize by dimension attribute
SQL Constructs
With Cube
Add to results: faces, edges, and corners using NULL values
With Rullup
For hierarchical dimensions, portion of With Cube
NoSQL Systems
Not every data management/analysit problem is best solved using a traditional DBMS
Not using traditional relation DBMS
!= "not use SQL language"
NoSQL = not only SQL
Alternative to tranditional relational DBMS
Pros: Flexible schema, Quicker/cheaper to set up, massive scalability, relaxed consistency -> higher performance and availability
Cons:
No declarative query language -> more programming;
Relaxed consistency -> fewer guarantees
Example #1: Web log analysis higher parallel
Example #2: Social-network graph
Example #3: Wikipedia pages
Several incarnations
MapReduce framework
Key-value stores
Document stores
Graph database systems