https://github.com/eirinimits/dbms


# DBMS
This project was created for the 'Database Technology' course at the Aristotle University of Thessaloniki.

This project implements basic DBMS operations (External Merge Sort, Duplicate Elimination, Merge Join, Hash Join) that are designed to work under both realistic and extreme conditions: very large input files and very little available memory.

## Examples

**Input:**
```
- N: number of blocks in the input file; N1, N2: number of blocks in file 1 and file 2 (used by the joins)
- B: number of blocks available in the memory buffer
- field: the field used for sorting: 0 = recid, 1 = num, 2 = str, 3 = both num and str
```
**Output:**
```
- nsorted_segs: number of sorted segments produced
- npasses: number of passes required for sorting
- nios: number of IOs performed
```
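
As a rough guide to how N and B drive these counters, the standard textbook cost model for external merge sort can be sketched as below. It assumes pass 0 sorts B blocks at a time and every later pass merges B-1 runs. The function name `sort_cost` and the key `initial_runs` are hypothetical, and the project's own `nsorted_segs`, `npasses`, and `nios` counters may be defined differently.

```python
import math


def sort_cost(n_blocks: int, buffer_blocks: int) -> dict:
    """Textbook cost estimate for external merge sort (an assumption, not the project's code)."""
    if buffer_blocks < 3:
        raise ValueError("need at least 3 buffer blocks (2 inputs + 1 output)")
    # Pass 0: read B blocks at a time, sort them in memory, write one run each.
    runs = math.ceil(n_blocks / buffer_blocks)
    initial_runs = runs
    npasses = 1
    # Each merge pass combines up to B-1 runs; one buffer block is kept for output.
    while runs > 1:
        runs = math.ceil(runs / (buffer_blocks - 1))
        npasses += 1
    # Every pass reads and writes each block once.
    return {"initial_runs": initial_runs, "npasses": npasses, "nios": 2 * n_blocks * npasses}


print(sort_cost(10, 3))  # {'initial_runs': 4, 'npasses': 3, 'nios': 60}
```

Under this model, N = 10 and B = 3 give 4 initial runs, 3 passes, and 60 block IOs.
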
## External Merge Sort
**-Input File-**

[Screenshot: input file]

```
1. N = 10, B = 3, MAX_RECORDS_PER_BLOCK = 10, field = 1
```
**-Output File-**

[Screenshot: output file sorted on num]

```
2. N = 10, B = 3, MAX_RECORDS_PER_BLOCK = 10, field = 2
```
**-Output File-**

[Screenshot: output file sorted on str]
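
The parameters in the examples above (N blocks, a buffer of B blocks, a sort field) map onto the textbook external merge sort roughly as in the following sketch. It is an in-memory simulation, not the project's implementation: blocks are plain Python lists of `(recid, num, str)` tuples, runs are kept in memory rather than written to disk, and the names `KEYS` and `external_merge_sort` are hypothetical.

```python
import heapq
from operator import itemgetter

# Field codes as listed in the README: 0 = recid, 1 = num, 2 = str, 3 = (num, str).
KEYS = {0: itemgetter(0), 1: itemgetter(1), 2: itemgetter(2), 3: itemgetter(1, 2)}


def external_merge_sort(blocks, buffer_blocks, field):
    """Sort a list of blocks; returns (sorted_records, number_of_passes)."""
    if buffer_blocks < 3:
        raise ValueError("need at least 3 buffer blocks (2 inputs + 1 output)")
    key = KEYS[field]
    # Pass 0: load up to B blocks, sort them in memory, emit one sorted run.
    runs = []
    for i in range(0, len(blocks), buffer_blocks):
        chunk = [rec for blk in blocks[i:i + buffer_blocks] for rec in blk]
        runs.append(sorted(chunk, key=key))
    npasses = 1
    # Merge passes: merge B-1 runs at a time until a single run remains.
    fan_in = buffer_blocks - 1
    while len(runs) > 1:
        runs = [list(heapq.merge(*runs[i:i + fan_in], key=key))
                for i in range(0, len(runs), fan_in)]
        npasses += 1
    return (runs[0] if runs else []), npasses


blocks = [
    [(1, 42, "pear"), (2, 7, "apple")],
    [(3, 19, "fig"), (4, 7, "kiwi")],
    [(5, 3, "plum")],
    [(6, 25, "lime")],
]
result, npasses = external_merge_sort(blocks, buffer_blocks=3, field=1)
print([rec[1] for rec in result], npasses)  # [3, 7, 7, 19, 25, 42] 2
```

A disk-based version would stream one block per run through the buffer during each merge instead of holding whole runs in memory.
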

## Duplicate Elimination
**-Input File-**

[Screenshot: input file]

```
1. N = 10, B = 4, MAX_RECORDS_PER_BLOCK = 10, field = 1
```
**-Output File-**

[Screenshot: output file with duplicates on num eliminated]

```
2. N = 10, B = 4, MAX_RECORDS_PER_BLOCK = 10, field = 2
```
**-Output File-**

[Screenshot: output file with duplicates on str eliminated]
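
One common way to eliminate duplicates is sort-based: once the records are sorted on the chosen field, equal keys sit next to each other and a single streaming pass can drop the repeats. The sketch below assumes that approach and assumes that "duplicate" means "equal on the selected field"; the project may define or implement it differently. The names `KEYS` and `eliminate_duplicates` are hypothetical.

```python
from itertools import groupby
from operator import itemgetter

# Field codes as listed in the README: 0 = recid, 1 = num, 2 = str, 3 = (num, str).
KEYS = {0: itemgetter(0), 1: itemgetter(1), 2: itemgetter(2), 3: itemgetter(1, 2)}


def eliminate_duplicates(sorted_records, field):
    """Yield the first record of each run of equal keys.

    The input must already be sorted on `field`, for example by an external merge sort.
    """
    key = KEYS[field]
    for _, group in groupby(sorted_records, key=key):
        yield next(group)


# Records are (recid, num, str), already sorted on num and on (num, str).
records = [(1, 7, "apple"), (2, 7, "kiwi"), (3, 19, "fig"), (4, 19, "fig"), (5, 42, "pear")]
print([r[0] for r in eliminate_duplicates(records, field=1)])  # [1, 3, 5]
print([r[0] for r in eliminate_duplicates(records, field=3)])  # [1, 2, 3, 5]
```

Because the pass only compares each record with its predecessor, it needs a constant amount of memory regardless of file size.
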

## Merge Join
**-Input File 1 & Input File 2-**

[Screenshots: input file 1 and input file 2]

```
N1 = 10, N2 = 10, B = 3, MAX_RECORDS_PER_BLOCK = 10, field = 1
```
**-Output File-**

[Screenshot: output file]
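
The merge phase of a sort-merge join can be sketched as follows, assuming both inputs have already been sorted on the join field. This is a minimal in-memory illustration of the textbook technique, not the project's code; `merge_join` and the sample records are hypothetical.

```python
from operator import itemgetter


def merge_join(left, right, key):
    """Join two lists sorted on `key`; yields (left_record, right_record) pairs."""
    i = j = 0
    while i < len(left) and j < len(right):
        kl, kr = key(left[i]), key(right[j])
        if kl < kr:
            i += 1          # left key too small, advance left
        elif kl > kr:
            j += 1          # right key too small, advance right
        else:
            # Find the end of the group of equal keys on the right side.
            j_end = j
            while j_end < len(right) and key(right[j_end]) == kl:
                j_end += 1
            # Pair every left record in the group with every right record in it.
            while i < len(left) and key(left[i]) == kl:
                for rec in right[j:j_end]:
                    yield left[i], rec
                i += 1
            j = j_end


# Records are (recid, num, str); the join field is num (field = 1).
left = [(1, 3, "a"), (2, 7, "b"), (3, 7, "c"), (4, 9, "d")]
right = [(10, 7, "x"), (11, 7, "y"), (12, 9, "z"), (13, 11, "w")]
pairs = [(l[0], r[0]) for l, r in merge_join(left, right, key=itemgetter(1))]
print(pairs)  # [(2, 10), (2, 11), (3, 10), (3, 11), (4, 12)]
```

Handling the equal-key group explicitly matters when the join field is not unique on either side, as with num = 7 in the sample data.
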

## Hash Join
**-Input File 1 & Input File 2-**

[Screenshots: input file 1 and input file 2]

```
N1 = 10, N2 = 10, B = 3, MAX_RECORDS_PER_BLOCK = 10, field = 1
```
**-Output File-**

[Screenshot: output file]
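
When neither input fits in memory, a hash join is usually done in two phases (a Grace-style hash join): partition both files with the same hash function, then join matching partitions with an in-memory hash table. The sketch below assumes that variant and uses B-1 partitions for a buffer of B blocks; whether the project follows this exact scheme is an assumption, and the name `hash_join` is hypothetical.

```python
from collections import defaultdict
from operator import itemgetter


def hash_join(left, right, key, buffer_blocks):
    """Partitioned hash join; yields (left_record, right_record) pairs in no fixed order."""
    if buffer_blocks < 2:
        raise ValueError("need at least 2 buffer blocks")
    n_parts = buffer_blocks - 1  # one block reads input, the rest buffer partitions

    def partition(records):
        parts = [[] for _ in range(n_parts)]
        for rec in records:
            parts[hash(key(rec)) % n_parts].append(rec)
        return parts

    # Records with equal keys always land in partitions with the same index.
    for lpart, rpart in zip(partition(left), partition(right)):
        # Build phase: hash table over the left partition.
        table = defaultdict(list)
        for rec in lpart:
            table[key(rec)].append(rec)
        # Probe phase: look up each right record.
        for rec in rpart:
            for match in table.get(key(rec), ()):
                yield match, rec


# Records are (recid, num, str); the join field is num (field = 1).
left = [(1, 3, "a"), (2, 7, "b"), (3, 7, "c"), (4, 9, "d")]
right = [(10, 7, "x"), (11, 7, "y"), (12, 9, "z"), (13, 11, "w")]
pairs = sorted((l[0], r[0]) for l, r in hash_join(left, right, itemgetter(1), buffer_blocks=3))
print(pairs)  # [(2, 10), (2, 11), (3, 10), (3, 11), (4, 12)]
```

Unlike the merge join, this approach does not need sorted inputs, but its output order depends on the partitioning.
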

## More examples on larger input data sets

### MergeSort:

[Screenshot: MergeSort example]

### EliminateDuplicates:

[Screenshot: EliminateDuplicates example]

### MergeJoin & HashJoin:

[Screenshots: MergeJoin and HashJoin examples]