https://github.com/davidpissarra/ddbs-project
Tsinghua University | Distributed Database Systems | Final Project
distributed-database distributed-systems hadoop hdfs mongodb redis tkinter
- Host: GitHub
- URL: https://github.com/davidpissarra/ddbs-project
- Owner: davidpissarra
- Created: 2021-12-11T01:29:38.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2022-03-05T11:35:09.000Z (almost 3 years ago)
- Last Synced: 2024-10-28T20:47:20.382Z (3 months ago)
- Topics: distributed-database, distributed-systems, hadoop, hdfs, mongodb, redis, tkinter
- Language: Python
- Homepage:
- Size: 6.55 MB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# THU DDBS Final Project
This repository contains our final project for the Distributed Database Systems 2021 course at Tsinghua University.
## Authors
[Armando Fortes](https://github.com/atfortes) & [David Pissarra](https://github.com/davidpissarra)
## Overview
Distributed database systems have become a dominant data management tool for Big Data. For this course project, we built a distributed database system that uses MongoDB to store the given structured data of a fictitious online library, and the Hadoop Distributed File System (HDFS) for the remaining, unstructured data. We also developed a simple Tkinter application that brings everything together in a single interactive UI, and a data cache using Redis, since similar requests to the database are likely to recur. Each component of the architecture runs in its own Docker container, with the intention of simulating a distributed environment.
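As an illustration of the caching role Redis plays here, the following is a minimal cache-aside sketch, not the project's actual code. It assumes a client exposing `get` and `setex`, which `redis.Redis` from redis-py provides. The `InMemoryCache` class, the `cached_query` helper, the `article:42` key and the `aid`/`title` fields are hypothetical names introduced only so the example runs without a Redis server.

```python
import json
from typing import Callable, Dict, Optional


class InMemoryCache:
    """Hypothetical stand-in exposing the two redis-py calls used below."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def get(self, name: str) -> Optional[str]:
        return self._store.get(name)

    def setex(self, name: str, time: int, value: str) -> None:
        # A real Redis server would expire the key after `time` seconds;
        # this stand-in ignores the TTL.
        self._store[name] = value


def cached_query(cache, key: str, fetch: Callable[[], dict],
                 ttl_seconds: int = 60) -> dict:
    """Cache-aside lookup: return the cached value, else fetch and store it."""
    hit = cache.get(key)
    if hit is not None:
        # json.loads accepts both str and bytes, so this also works with
        # the bytes a default redis-py client returns.
        return json.loads(hit)
    result = fetch()  # in a real system, e.g. a MongoDB query
    cache.setex(key, ttl_seconds, json.dumps(result))
    return result


calls = []


def fetch_article() -> dict:
    # Hypothetical database query; records each call so repeats are visible.
    calls.append("db")
    return {"aid": "42", "title": "Example"}


cache = InMemoryCache()
first = cached_query(cache, "article:42", fetch_article)
second = cached_query(cache, "article:42", fetch_article)
```

The second call is served from the cache, so `fetch_article` runs only once. Replacing `InMemoryCache()` with a `redis.Redis(...)` client should leave `cached_query` unchanged, under the assumption that cached results are JSON-serializable.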
The following figure describes the interaction of components within the architecture:
## Repository organization
The repository is organized in the following 6 main folders:
```
├── app # Tkinter app implementation
├── data-generation # Data generation files (MongoDB or MySQL)
├── docs # Project report, manual and useful figures
├── hadoop # HDFS containerized configuration
├── mongodb # MongoDB Cluster implementation
│ ├── configsvrs # MongoDB Configuration Servers implementation
│ ├── router # MongoDB Query Router implementation
│ └── shards # MongoDB Shards implementation
└── redis_cache # Redis cache implementation and commands
```
## Built With
- [Docker](https://docs.docker.com/) (Component Containerization)
- [MongoDB](https://docs.mongodb.com/manual/sharding/) (MongoDB Sharded Cluster)
- [Hadoop](https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html) (Hadoop Distributed File System)
- [Redis](https://redis.io/topics/client-side-caching) (Client side caching)
- [Tkinter](https://docs.python.org/3/library/tkinter.html) (Simple Python app for interaction)

Further details on our solution may be found in the [report](https://github.com/davidpissarra/ddbs-project/blob/main/docs/report.pdf).
Also, for configuration details please refer to our [manual](https://github.com/davidpissarra/ddbs-project/blob/main/docs/manual.md).