https://github.com/lubien/ufpa-advanced-databases-2018-assignment-01

Implement some queries over self made db table with 1 billion entries of 8 bytes each for @gustavopinto class

Host: GitHub
URL: https://github.com/lubien/ufpa-advanced-databases-2018-assignment-01
Owner: lubien
License: mit
Created: 2018-10-04T03:20:19.000Z (about 7 years ago)
Default Branch: master
Last Pushed: 2019-05-04T19:23:42.000Z (over 6 years ago)
Last Synced: 2025-02-01T13:24:07.982Z (8 months ago)
Topics: college-assignment, dbms, elixir
Language: Elixir
Homepage: http://gustavopinto.org/teaching/bd2/exercise
Size: 7.56 MB
Stars: 2
Watchers: 3
Forks: 1
Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE.md

# Queries over binary files using the BEAM ecosystem

Code for the [first assignment](http://gustavopinto.org/teaching/bd2/exercise) of [@gustavopinto](https://github.com/gustavopinto)'s Advanced Databases class.

The [full report](https://github.com/lubien/ufpa-advanced-databases-2018-assignment-01-report/releases) is written in Portuguese.

Overview:

  * Create a single-file, single-table database that stores 1 billion person records.

  * Each person is a 64-bit tuple. The field widths in bits are: gender (1), age (7), monthly income (10), education level (2), language (12), country (8) and coordinates (24).

  * Implement 7 queries from the assignment and 3 of your choice.

  * Export the binary database to a relational DBMS.

  * Compare results.
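
The 64-bit record layout maps naturally onto Elixir's bitstring pattern matching. The following is a minimal sketch, not the repository's code: the module name `PersonRecord`, the field names (`education` and `language` stand for the 2-bit and 12-bit fields), the field order (taken from the order of the list above) and the reading of each field as an unsigned integer are all assumptions.

```elixir
defmodule PersonRecord do
  # Hypothetical helper: assumes fields are packed in the order listed in the
  # overview, each as an unsigned integer.

  # Decode one 8-byte record (1+7+10+2+12+8+24 = 64 bits) into a map.
  def decode(<<gender::1, age::7, income::10, education::2,
               language::12, country::8, coordinates::24>>) do
    %{
      gender: gender,
      age: age,
      income: income,
      education: education,
      language: language,
      country: country,
      coordinates: coordinates
    }
  end

  # Encode a map back into the same 8-byte layout.
  def encode(%{gender: g, age: a, income: i, education: e,
               language: l, country: c, coordinates: xy}) do
    <<g::1, a::7, i::10, e::2, l::12, c::8, xy::24>>
  end
end

person = %{gender: 1, age: 42, income: 1000, education: 3,
           language: 7, country: 55, coordinates: 123_456}

record = PersonRecord.encode(person)
IO.inspect(byte_size(record))            # 8
IO.inspect(PersonRecord.decode(record))  # the same map as `person`
```

Because every field has a fixed width, a record never needs a delimiter or a schema lookup: the pattern itself is the schema.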

## Setup

Generate the binary file.

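```sh
# 1 billion records x 8 bytes = 8,000,000,000 bytes.
# Assumption: the records are random bytes read from /dev/urandom.
head -c 8000000000 /dev/urandom > people.db
```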
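
For quick experiments, a much smaller file with the same record size can be handy. This is a hedged sketch and not part of the repository: `SampleDb`, the temporary file name and the record count are made up for illustration, and the records are plain random bytes.

```elixir
defmodule SampleDb do
  # Hypothetical helper for building small test files.
  @record_bytes 8

  # Write `count` random 8-byte records to `path`; returns the file size in bytes.
  def generate(path, count) when is_integer(count) and count >= 0 do
    File.write!(path, :crypto.strong_rand_bytes(count * @record_bytes))
    File.stat!(path).size
  end

  # Number of whole records stored in the file.
  def record_count(path), do: div(File.stat!(path).size, @record_bytes)
end

path = Path.join(System.tmp_dir!(), "people_sample.db")
IO.inspect(SampleDb.generate(path, 1_000))  # 8000
IO.inspect(SampleDb.record_count(path))     # 1000
File.rm!(path)
```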

Set up PostgreSQL. Changing the environment variables is optional; the values below are the defaults.

```sh
export DB_NAME=ufpa-databases-2
export DB_USER=postgres
make prepare
```

Export the binary database to CSV, then import it into PostgreSQL.

```sh
make dump-database
make import-dump
```

Prepare Elixir.

```sh
mix deps.get
mix compile
```

## Query

Elixir.

```sh
# query number: from 1 to 10
mix query --db people.db --query 1
```
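
To illustrate what a query over the binary file involves, here is a minimal sketch of a sequential scan. It is not the repository's implementation: the `ScanSketch` module, the chunk size and the example predicate (counting people older than a given age) are assumptions, as is the placement of the age field in bits 2 to 8 of each record. The file is read in chunks whose size is a multiple of 8 bytes, so no record straddles two chunks.

```elixir
defmodule ScanSketch do
  # Hypothetical scan over 8-byte records; field layout is assumed.
  @record_bytes 8
  # 1 MiB chunks, a multiple of the record size.
  @chunk_bytes @record_bytes * 131_072

  # Count the records in an in-memory binary whose age exceeds `min_age`.
  def count_older_than(bin, min_age) when is_binary(bin) do
    for <<_gender::1, age::7, _rest::56 <- bin>>, age > min_age, reduce: 0 do
      acc -> acc + 1
    end
  end

  # Run the same count over a file, one chunk at a time.
  def count_in_file(path, min_age) do
    {:ok, file} = :file.open(path, [:read, :binary, :raw])

    try do
      scan(file, min_age, 0)
    after
      :file.close(file)
    end
  end

  defp scan(file, min_age, acc) do
    case :file.read(file, @chunk_bytes) do
      {:ok, chunk} -> scan(file, min_age, acc + count_older_than(chunk, min_age))
      :eof -> acc
    end
  end
end

# Four records with ages 10, 30, 50 and 70; all other bits are zero.
records = for age <- [10, 30, 50, 70], into: <<>>, do: <<0::1, age::7, 0::56>>
IO.inspect(ScanSketch.count_older_than(records, 40))  # 2
```

A scan like this keeps memory use constant regardless of file size, since only one chunk and one counter are held at a time.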

PostgreSQL.

```sh
# replace 1 with any query number from 1 to 10
make query-1
```

## Results

The benchmarks were run on this machine:

```sh
λ neofetch
OS: Manjaro Linux x86_64
Host: B250M-D3H
Kernel: 4.14.77-1-MANJARO
Uptime: 12 hours, 58 mins
Packages: 1024 (pacman)
Shell: bash 4.4.23
Resolution: 1360x768, 2560x1080
DE: Xfce
Theme: Vertex-Maia [GTK2], Breath [GTK3]
Icons: Vertex-Maia [GTK2], hicolor [GTK3]
Terminal: xfce4-terminal
Terminal Font: Monospace 12
CPU: Intel i5-7400 (4) @ 3.500GHz
GPU: NVIDIA GeForce GT 610
Memory: 3204MiB / 7939MiB

λ lscpu
Arquitetura:                x86_64
Modo(s) operacional da CPU: 32-bit, 64-bit
Ordem dos bytes:            Little Endian
CPU(s):                     4
Lista de CPU(s) on-line:    0-3
Thread(s) per núcleo:       1
Núcleo(s) por soquete:      4
Soquete(s):                 1
Nó(s) de NUMA:              1
ID de fornecedor:           GenuineIntel
Família da CPU:             6
Modelo:                     158
Nome do modelo:             Intel(R) Core(TM) i5-7400 CPU @ 3.00GHz
Step:                       9
CPU MHz:                    800.050
CPU MHz máx.:               3500,0000
CPU MHz mín.:               800,0000
BogoMIPS:                   6002.00
Virtualização:              VT-x
cache de L1d:               32K
cache de L1i:               32K
cache de L2:                256K
cache de L3:                6144K
CPU(s) de nó0 NUMA:         0-3

λ sudo hdparm -Tt /dev/sda
/dev/sda:
 Timing cached reads:   26046 MB in  1.99 seconds = 13068.87 MB/sec
 Timing buffered disk reads: 392 MB in  3.00 seconds = 130.50 MB/sec
```

Query       | First (DBMS)    | First (Bin)     | Mean (DBMS)  | Mean (Bin)
------------|-----------------|-----------------|--------------|-----------
1           | 748s            | 134s            | 640s         | 134s
2           | -               | 456s            | -            | 453s
3           | 629s            | 160s            | 622s         | 161s
4           | 620s            | 173s            | 614s         | 163s
5           | 705s            | 59s             | 618s         | 59s
6           | 861s            | 63s             | 623s         | 61s
7           | 631s            | 67s             | 604s         | 62s
8           | 619s            | 60s             | 622s         | 59s
9           | 639s            | 59s             | 626s         | 60s
10          | 622s            | 61s             | 628s         | 60s

Query 2 never finished on PostgreSQL, so no DBMS timings are reported for it.
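
To make the comparison concrete, the ratio between the mean timings can be computed directly from the table. The snippet below is only an illustrative sketch: the numbers are copied from the "Mean" columns, the variable names are hypothetical, and query 2 is left out because it has no PostgreSQL timing.

```elixir
# Mean timings in seconds, copied from the results table:
# {query, mean_dbms, mean_bin}
means = [
  {1, 640, 134},
  {3, 622, 161},
  {4, 614, 163},
  {5, 618, 59},
  {6, 623, 61},
  {7, 604, 62},
  {8, 622, 59},
  {9, 626, 60},
  {10, 628, 60}
]

# Speedup of the binary scan over PostgreSQL, rounded to one decimal place.
speedups =
  for {query, dbms, bin} <- means do
    {query, Float.round(dbms / bin, 1)}
  end

Enum.each(speedups, fn {query, ratio} ->
  IO.puts("Query #{query}: binary scan is about #{ratio}x faster")
end)
```

On these figures the binary scan is between roughly 3.8x and 10.5x faster than PostgreSQL, depending on the query.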
