awesome-database-learning
  
  
    A list of learning materials for understanding database internals
    https://github.com/pingcap/awesome-database-learning
  
    
- 
            
Recommended Courses, Books and Talks
- 
                    
Courses
 - 
                    
Books
- Database Systems: The Complete Book
 - Designing Data-Intensive Applications
 - Database Internals
 - Foundations of Databases
 - Readings in Database Systems, 5th Edition
 - Database Design and Implementation: Second Edition (Data-Centric Systems and Applications)
 - Principles of Distributed Database Systems, 4th ed
 - Inside SQLite
 - Architecture of a Database System
 - Relational Database Index Design and the Optimizers
 - Transactional Information Systems: Theory, Algorithms, and the Practice of Concurrency Control
 
 - 
                    
Talks
 - 
                    
Blogs
 
 - 
                    
 - 
            
Query Optimizer
- 
                    
Subquery Optimization
 - 
                    
Planner Models
- Database Kernel Talk (9): The Open-Source Optimizer ORCA (in Chinese)
 - Principles of SQL Query Optimization and an Introduction to the Volcano Optimizer (in Chinese)
 - Cascades Optimizer
 - Access Path Selection in a Relational Database Management System
 - Query Processing in Main Memory Database Management Systems
 - Query Optimization by Simulated Annealing
 - Grammar-like Functional Rules for Representing Query Optimization Alternatives
 - The Volcano Optimizer Generator: Extensibility and Efficient Search
 - The Cascades Framework for Query Optimization
 - An Overview of Query Optimization in Relational Systems
 - LEO – DB2’s LEarning Optimizer
 - Robust Query Processing through Progressive Optimization
 - Orca: A Modular Query Optimizer Architecture for Big Data
 - Parallelizing Query Optimization on Shared-Nothing Architectures
 - The MemSQL Query Optimizer: A modern optimizer for real-time analytics in a distributed database
 - Database Kernel Talk (in Chinese)
 
 - 
                    
Blogs
 - 
                    
Join Order Optimization
 - 
                    
Functional Dependency & Physical Properties
 - 
                    
Cost Model
- Approximation Schemes for Many-Objective Query Optimization
 - Multi-Objective Parametric Query Optimization
 
 - 
                    
Statistics
- Accurate Estimation of the Number of Tuples Satisfying a Condition
 - Optimal Histograms for Limiting Worst-Case Error Propagation in the Size of Join Results
 - Universality of Serial Histograms
 - Balancing Histogram Optimality and Practicality for Query Result Size Estimation
 - Improved Histograms for Selectivity Estimation of Range Predicates
 - SEEKing the truth about ad hoc join costs
 - Towards Estimation Error Guarantees for Distinct Values
 - Distinct Sampling for Highly-Accurate Answers to Distinct Values Queries and Event Reports
 - The History of Histograms
 - An Improved Data Stream Summary: The Count-Min Sketch and its Applications
 - New Estimation Algorithms for Streaming Data: Count-min Can Do More
 - Preventing Bad Plans by Bounding the Impact of Cardinality Estimation Errors
 - Histograms Reloaded: The Merits of Bucket Diversity
 - Exploiting Ordered Dictionaries to Efficiently Construct Histograms with Q-Error Guarantees in SAP HANA
 - Adaptive Statistics in Oracle 12c
 - Pessimistic Cardinality Estimation: Tighter Upper Bounds for Intermediate Join Cardinalities
 - Deep Unsupervised Cardinality Estimation
 - NeuroCard: One Cardinality Estimator for All Tables
 - Synopses for Massive Data: Samples, Histograms, Wavelets, Sketches
 
 
 - 
                    
 - 
            
Query Execution
- 
                    
Execution Framework
 - 
                    
Vectorization vs Compilation
- Overhead of a Generalized Query Execution Engine
 - MonetDB/X100: Hyper-Pipelining Query Execution
 - Efficiently Compiling Efficient Query Plans for Modern Hardware
 - Relaxed Operator Fusion for In-Memory Databases: Making Compilation, Vectorization, and Prefetching Work Together At Last
 - Everything You Always Wanted to Know About Compiled and Vectorized Queries But Were Afraid to Ask
 - Adaptive Execution of Compiled Queries
 
 - 
                    
Join
 - 
                    
Hash Table
- Fibonacci Hashing: The Optimization that the World Forgot (or: a Better Alternative to Integer Modulo)
 - All hash table sizes you will ever need - Thomas Neumann (https://databasearchitects.blogspot.com/)
 
 - 
                    
Bloom Filter
 
 - 
                    
 - 
            
DDL
- 
                    
Bloom Filter
 
 - 
                    
 - 
            
Relational Model
- 
                    
Bloom Filter
 - 
                    
Relational Data Model
 - 
                    
Relational Algebra
 - 
                    
ER to Relational Model
 - 
                    
SQL - Overview
- An Overview of SQL Text Functions
 
 - 
                    
Codd's Rules
 
 - 
                    
 - 
            
Network
 - 
            
Storage
- 
                    
NoSQL Systems
 - 
                    
Buffer Management
- The 5 Minute Rule for Trading Memory for Disc Accesses and the 10 Byte Rule for Trading Memory for CPU Time
 - The Five Minute Rule 20 Years Later and How Flash Memory Changes the Rules
 - Managing Non-Volatile Memory in Database Systems
 - LeanStore: In-Memory Data Management Beyond Main Memory
 - Umbra: A Disk-Based System with In-Memory Performance
 
 - 
                    
Disk IO
- On Disk IO, Part 1: Flavors of IO
 - On Disk IO, Part 2: More Flavours of IO
 - On Disk IO, Part 3: LSM Trees
 - On Disk IO, Part 4: B-Trees and RUM Conjecture
 - On Disk IO, Part 5: Access Patterns in LSM Trees
 - Ensuring data reaches disk (LWN)
 - Read, write & space amplification - pick 2
 - Design Tradeoffs of Data Access Methods
 - Designing Access Methods: The RUM Conjecture
 
 - 
                    
B-Tree
 - 
                    
LSM-Tree
 - 
                    
Learned Indexes Structures
 
 - 
                    
 - 
            
Transaction
- 
                    
Concurrency Control
- The Notions of Consistency and Predicate Locks in a Database System
 - Concurrency Control in Distributed Database Systems
 - On Optimistic Methods for Concurrency Control
 - Multiversion Concurrency Control - Theory and Algorithms
 - Serializable Snapshot Isolation in PostgreSQL
 - Calvin: Fast Distributed Transactions for Partitioned Database Systems
 - MaaT: effective and scalable coordination of distributed transactions in the cloud
 - Staring into the Abyss: An Evaluation of Concurrency Control with One Thousand Cores
 - An Evaluation of the Advantages and Disadvantages of Deterministic Database Systems
 - Fast Serializable Multi-Version Concurrency Control for Main-Memory Database Systems
 - An Empirical Evaluation of In-Memory Multi-Version Concurrency Control
 - An Evaluation of Distributed Concurrency Control
 - Scalable Garbage Collection for In-Memory MVCC Systems
 
 - 
                    
Isolation Levels
 
 - 
                    
 - 
            
Serializing & RPC
- 
                    
Learned Indexes Structures
 
 - 
                    
 - 
            
Data Partitioning
- 
                    
Learned Indexes Structures
 
 - 
                    
 - 
            
Replication & Consistency
 - 
            
Consensus
- 
                    
Learned Indexes Structures
- Distributed consensus revised - Related algorithms, by Heidi Howard
 - Ark: A Real-World Consensus Implementation
 
 
 - 
                    
 - 
            
Scheduling
 - 
            
Benchmark & Testing
 - 
            
HTAP
- 
                    
Learned Indexes Structures
 
 - 
                    
 - 
            
TLA+
- 
                    
Learned Indexes Structures
 
 - 
                    
 