https://github.com/rikulauttia/database-fundamentals
Database performance engineering lab demonstrating 148x query optimization through advanced indexing strategies and ACID-compliant transaction patterns. Features production-ready implementations spanning both SQL and NoSQL paradigms with measured performance analytics. A practical showcase of database engineering principles.
https://github.com/rikulauttia/database-fundamentals
acid concurrency database database-design database-optimization indexing mongodb normalization nosql performance-testing python-database query-performance relational-algebra sql sqlite transactions
Last synced: 3 months ago
JSON representation
Database performance engineering lab demonstrating 148x query optimization through advanced indexing strategies and ACID-compliant transaction patterns. Features production-ready implementations spanning both SQL and NoSQL paradigms with measured performance analytics. A practical showcase of database engineering principles.
- Host: GitHub
- URL: https://github.com/rikulauttia/database-fundamentals
- Owner: rikulauttia
- Created: 2025-02-27T14:38:49.000Z (3 months ago)
- Default Branch: main
- Last Pushed: 2025-02-27T14:43:12.000Z (3 months ago)
- Last Synced: 2025-02-27T20:25:50.842Z (3 months ago)
- Topics: acid, concurrency, database, database-design, database-optimization, indexing, mongodb, normalization, nosql, performance-testing, python-database, query-performance, relational-algebra, sql, sqlite, transactions
- Language: Python
- Homepage:
- Size: 7.81 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# 🛢️ Database Fundamentals
[](https://github.com/rikulauttia/database-fundamentals)
[](https://github.com/rikulauttia/database-fundamentals)
[](https://github.com/rikulauttia/database-fundamentals)
[](https://github.com/rikulauttia/database-fundamentals)
[](https://github.com/rikulauttia/database-fundamentals)## 📋 Overview
This repository contains practical exercises and implementations from a database fundamentals course. It explores both theoretical concepts and hands-on applications of database systems, from SQL queries to NoSQL databases, indexing performance, and transaction management.
## 🧩 Exercise Collection
### 1️⃣ Database Programming with SQLite
Implementations of Python functions to query and analyze city bike trip data stored in SQLite. Features include:
- User distance and speed calculations
- City-based trip duration analysis
- User tracking across multiple cities
- Daily trip statistics
- Popular starting points identificationView Implementation Details
```python
# Sample function: Finding the total distance traveled by a user
def distance_of_user(user):
conn = get_db_connection()
cursor = conn.cursor()query = """
SELECT SUM(distance)
FROM Trips
JOIN Users ON Trips.user_id = Users.id
WHERE Users.name = ?
"""cursor.execute(query, (user,))
result = cursor.fetchone()[0]
conn.close()
return result
```### 2️⃣ Database Design & Schema Creation
Design and implementation of a university course management system with:
- Tables for teachers, courses, students, credits, and groups
- Advanced query capabilities for course-student-teacher relationships
- Grade distribution analysis
- Cross-relational statistics gatheringView Database Schema
```sql
CREATE TABLE teachers (
id INTEGER PRIMARY KEY,
name TEXT
);CREATE TABLE courses (
id INTEGER PRIMARY KEY,
name TEXT,
credits INTEGER
);CREATE TABLE course_teachers (
id INTEGER PRIMARY KEY,
course_id INTEGER,
teacher_id INTEGER,
FOREIGN KEY (course_id) REFERENCES courses(id),
FOREIGN KEY (teacher_id) REFERENCES teachers(id)
);/* Additional tables for students, credits, groups, etc. */
```### 3️⃣ Indexing Performance Analysis
Experimental evaluation of database indexing strategies, measuring:
- Query performance with and without indexes
- Effect of index creation timing (before vs. after data insertion)
- Database file size impacts
- Performance metrics across different strategiesView Test Results
| Test Scenario | Row Insertion Time | Query Execution Time | Database Size |
| ------------- | :----------------: | :------------------: | :-----------: |
| No Index | 1.50 sec | 25.25 sec | 19.19 MB |
| Index Before | 2.03 sec | 0.18 sec | 30.97 MB |
| Index After | 1.51 sec | 0.17 sec | 29.70 MB |### 4️⃣ Transaction Behavior Analysis
Investigation of how multiple concurrent transactions interact in SQLite, examining:
- Lock behavior between transactions
- Transaction isolation levels
- Conflict resolution strategies
- Success/failure patterns in concurrent operationsView Transaction Test Cases
**Test Case Example:**
```
Transaction 1 Transaction 2
-------------- --------------
BEGIN;
SELECT x FROM Test;
BEGIN;
SELECT x FROM Test;
UPDATE Test SET x=2;
COMMIT;
UPDATE Test SET x=3;
COMMIT;
```**Result Analysis:**
Transaction isolation demonstrates that Transaction 2's changes are not visible to Transaction 1 until committed, while Transaction 1's lock prevents Transaction 2 from modifying the same data concurrently.### 5️⃣ NoSQL Database Queries (MongoDB)
Implementation of MongoDB queries using Python to analyze real estate data:
- Filtering by zip code, construction year, and apartment size
- Transaction history analysis
- Complex aggregation pipelines
- Document-based data retrievalView MongoDB Query Examples
```python
# Finding apartments sold between 2010-2012
myynti_count = apartments.count_documents({
"transactions": {
"$elemMatch": {
"date": {
"$gte": "2010-01-01",
"$lte": "2012-12-31"
}
}
}
})# Finding maximum selling price using aggregation
pipeline = [
{"$unwind": "$transactions"},
{"$group": {"_id": None, "max_price": {"$max": "$transactions.selling_price"}}},
{"$project": {"_id": 0, "max_price": 1}}
]
result = list(apartments.aggregate(pipeline))
max_price = result[0]["max_price"]
```## 🧠 Key Database Concepts Covered
Relational Model
NoSQL
Performance
• SQL Queries
• Normalization
• Relationships
• Schema Design
• Document Databases
• MongoDB Queries
• Non-relational Modeling
• Aggregation Pipelines
• Indexing Strategies
• Query Optimization
• Execution Analysis
• Performance Testing
Transactions
Database Theory
Implementation
• ACID Properties
• Concurrency Control
• Lock Management
• Conflict Resolution
• Relational Algebra
• Functional Dependencies
• Normal Forms
• Constraints & Keys
• Python Integration
• SQLite Operations
• MongoDB Access
• Performance Measurement
## 📊 Performance Insights from Indexing Analysis
The implementation demonstrates significant query optimization through proper indexing:
```
Query execution without index: 25.25 seconds
Query execution with index: 0.17 seconds
Speedup factor: ~148x
```This highlights how critical indexing is for real-world database performance, especially for frequently queried fields.
## 🔍 Transaction Isolation Observations
Analysis of transaction behavior reveals:
1. **Transaction Locks** - Active transactions can block other transactions from modifying the same data
2. **Visibility Rules** - Changes are only visible to other transactions after being committed
3. **Concurrent Operations** - Read operations don't block other reads, but writes can block both reads and writes## 🌐 Future Work
- [ ] Implement more complex database scenarios
- [ ] Add distributed database examples
- [ ] Compare performance across different database engines
- [ ] Explore advanced indexing techniques## 📖 References
- Database Systems: The Complete Book
- MongoDB Documentation
- SQLite Documentation
- Transaction Processing: Concepts and Techniques---
Built with ❤️ by Riku Lauttia