https://github.com/davidpissarra/ddbs-project

Tsinghua University | Distributed Database Systems | Final Project

distributed-database distributed-systems hadoop hdfs mongodb redis tkinter


# THU DDBS Final Project

This repository contains our final project for the Distributed Database Systems 2021 course at Tsinghua University.

## Authors
[Armando Fortes](https://github.com/atfortes) & [David Pissarra](https://github.com/davidpissarra)

## Overview

Distributed database systems have become the dominant data management tool for Big Data. For this course project, we built a distributed database system that uses MongoDB to store the given structured data, which comes from a fictitious online library, and the Hadoop Distributed File System (HDFS) to store the remaining, unstructured data. We also developed a simple Tkinter application that combines everything in a single interactive UI, and a Redis data cache, since similar requests to the database may recur. Each component of the architecture runs in its own Docker container in order to simulate a distributed environment.

The following figure describes the interaction of components within the architecture:

[Figure: interaction of components within the architecture]
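
To illustrate the role of the data cache, here is a minimal cache-aside sketch in Python. It is not taken from the project's code: `InMemoryCache`, `cached_query`, and the key format are hypothetical names chosen for illustration, and the in-memory class stands in for a Redis client so that the example runs without any server.

```python
import json
from typing import Any, Callable, Dict, Optional


class InMemoryCache:
    """Hypothetical stand-in for a Redis client, exposing only get/set."""

    def __init__(self) -> None:
        self._store: Dict[str, str] = {}

    def get(self, key: str) -> Optional[str]:
        # Returns None on a miss, as a key-value cache typically does.
        return self._store.get(key)

    def set(self, key: str, value: str) -> None:
        self._store[key] = value


def cached_query(cache: Any, key: str, fetch: Callable[[], Any]) -> Any:
    """Cache-aside lookup: return the cached result, or fetch and cache it.

    `fetch` is an assumed callable that runs the real database query
    (the role MongoDB plays in this project).
    """
    hit = cache.get(key)
    if hit is not None:
        return json.loads(hit)          # cache hit: skip the database
    result = fetch()                    # cache miss: query the database
    cache.set(key, json.dumps(result))  # store a serialized copy for next time
    return result
```

Calling `cached_query(cache, "article:1", fetch)` twice with the same key runs `fetch` only on the first call; the second call is served from the cache. In a real deployment the cache object would be a Redis client and cached entries would usually carry an expiry time, which this sketch omits.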

## Repository organization
The repository is organized into the following six main folders:

```
├── app # Tkinter app implementation
├── data-generation # Data generation files (MongoDB or MySQL)
├── docs # Project report, manual and useful figures
├── hadoop # HDFS containerized configuration
├── mongodb # MongoDB Cluster implementation
│ ├── configsvrs # MongoDB Configuration Servers implementation
│ ├── router # MongoDB Query Router implementation
│ └── shards # MongoDB Shards implementation
└── redis_cache # Redis cache implementation and commands
```

## Built With

- [Docker](https://docs.docker.com/) (Component Containerization)
- [MongoDB](https://docs.mongodb.com/manual/sharding/) (MongoDB Sharded Cluster)
- [Hadoop](https://hadoop.apache.org/docs/r1.2.1/hdfs_design.html) (Hadoop Distributed File System)
- [Redis](https://redis.io/topics/client-side-caching) (Client-side caching)
- [Tkinter](https://docs.python.org/3/library/tkinter.html) (Simple Python app for interaction)

Further details on our solution may be found in the [report](https://github.com/davidpissarra/ddbs-project/blob/main/docs/report.pdf).
Also, for configuration details please refer to our [manual](https://github.com/davidpissarra/ddbs-project/blob/main/docs/manual.md).