https://github.com/lubien/ufpa-advanced-databases-2018-assignment-01
Implement some queries over a self-made DB table with 1 billion entries of 8 bytes each, for @gustavopinto's class
- Host: GitHub
- URL: https://github.com/lubien/ufpa-advanced-databases-2018-assignment-01
- Owner: lubien
- License: mit
- Created: 2018-10-04T03:20:19.000Z (about 7 years ago)
- Default Branch: master
- Last Pushed: 2019-05-04T19:23:42.000Z (over 6 years ago)
- Last Synced: 2025-02-01T13:24:07.982Z (8 months ago)
- Topics: college-assignment, dbms, elixir
- Language: Elixir
- Homepage: http://gustavopinto.org/teaching/bd2/exercise
- Size: 7.56 MB
- Stars: 2
- Watchers: 3
- Forks: 1
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE.md
README
# Queries over binary files using the BEAM ecosystem
Code for the [first assignment](http://gustavopinto.org/teaching/bd2/exercise) of [@gustavopinto](https://github.com/gustavopinto)'s Advanced Databases class.
The [full report](https://github.com/lubien/ufpa-advanced-databases-2018-assignment-01-report/releases) is written in Portuguese.
Overview:
* Create a single-file, single-table database that stores 1 billion people.
* Each person is a 64-bit tuple packing gender (1 bit), age (7), monthly income (10), education level (2), language (12), country (8), and coordinates (24); see the decoding sketch after this list.
* Implement 7 queries from the assignment and 3 of your choice.
* Export the binary database to a relational DBMS.
* Compare results.
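As a reference for the record layout above, here is a minimal decoding sketch using Elixir binary pattern matching. The module name, the map keys, and the exact field order are assumptions made for illustration; the repository's actual encoder/decoder may differ.

```elixir
defmodule Person do
  # Decode one 8-byte record, assuming the bit widths from the overview
  # (1 + 7 + 10 + 2 + 12 + 8 + 24 = 64 bits).
  def decode(<<gender::1, age::7, income::10, education::2,
               language::12, country::8, coordinates::24>>) do
    %{
      gender: gender,
      age: age,
      income: income,
      education: education,
      language: language,
      country: country,
      coordinates: coordinates
    }
  end
end

# Example: read and decode the first record of the file.
# {:ok, file} = File.open("people.db", [:read, :binary])
# file |> IO.binread(8) |> Person.decode()
```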
## Setup

Generate the binary file:
```sh
# create people.db with 8 GB of data (10^9 records x 8 bytes);
# /dev/urandom is assumed here as the byte source
head -c 8000000000 /dev/urandom > people.db
```
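The same file could also be produced from within the BEAM. This is only a sketch under the assumption that records are arbitrary random 8-byte values; the chunk size and the use of `:crypto.strong_rand_bytes/1` are illustrative choices, not necessarily how this repository generates its data.

```elixir
# Write 10^9 random 8-byte records (8 GB total) to people.db in 8 MB chunks.
chunk_bytes = 8 * 1024 * 1024
total_bytes = 1_000_000_000 * 8

File.open!("people.db", [:write, :binary], fn file ->
  Stream.unfold(total_bytes, fn
    0 ->
      nil

    remaining ->
      size = min(chunk_bytes, remaining)
      {size, remaining - size}
  end)
  |> Enum.each(fn size -> IO.binwrite(file, :crypto.strong_rand_bytes(size)) end)
end)
```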
Set up PostgreSQL. You may change the environment variables below, but that's optional; these are the defaults:

```sh
export DB_NAME=ufpa-databases-2
export DB_USER=postgres

make prepare
```

Export the binary database to CSV, then import it into PostgreSQL:
```sh
make dump-database
make import-dump
```
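Conceptually, the dump step is one streaming pass that turns each 8-byte record into a CSV row, which a PostgreSQL `COPY ... FROM` style import can then load. The sketch below reuses the assumed field layout from the overview and a hypothetical `people.csv` output path; `make dump-database` may implement this differently.

```elixir
# Stream people.db in 8-byte records and write one CSV row per person.
File.open!("people.csv", [:write], fn out ->
  File.stream!("people.db", [], 8)
  |> Enum.each(fn <<gender::1, age::7, income::10, education::2,
                    language::12, country::8, coordinates::24>> ->
    IO.write(out, "#{gender},#{age},#{income},#{education},#{language},#{country},#{coordinates}\n")
  end)
end)
```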
Prepare Elixir:

```sh
mix deps.get
mix compile
```
## Query

Elixir:
```sh
# from 1 to 10
mix query --db people.db --query 1
```
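Internally, a query over the binary file boils down to a single streaming pass over its 8-byte records. A minimal sketch of that idea, using a made-up predicate (people older than 64) rather than one of the assignment's actual queries:

```elixir
# Stream people.db in 8-byte records and count matches for a sample predicate.
count =
  File.stream!("people.db", [], 8)
  |> Enum.count(fn <<_gender::1, age::7, _rest::56>> -> age > 64 end)

IO.puts("people older than 64: #{count}")
```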
PostgreSQL:

```sh
# from 1 to 10
make query-1
```

## Results
This machine was used:
```sh
λ neofetch
OS: Manjaro Linux x86_64
Host: B250M-D3H
Kernel: 4.14.77-1-MANJARO
Uptime: 12 hours, 58 mins
Packages: 1024 (pacman)
Shell: bash 4.4.23
Resolution: 1360x768, 2560x1080
DE: Xfce
Theme: Vertex-Maia [GTK2], Breath [GTK3]
Icons: Vertex-Maia [GTK2], hicolor [GTK3]
Terminal: xfce4-terminal
Terminal Font: Monospace 12
CPU: Intel i5-7400 (4) @ 3.500GHz
GPU: NVIDIA GeForce GT 610
Memory: 3204MiB / 7939MiB

λ lscpu
Architecture:        x86_64
CPU op-mode(s):      32-bit, 64-bit
Byte Order:          Little Endian
CPU(s):              4
On-line CPU(s) list: 0-3
Thread(s) per core:  1
Core(s) per socket:  4
Socket(s):           1
NUMA node(s):        1
Vendor ID:           GenuineIntel
CPU family:          6
Model:               158
Model name:          Intel(R) Core(TM) i5-7400 CPU @ 3.00GHz
Stepping:            9
CPU MHz:             800.050
CPU max MHz:         3500.0000
CPU min MHz:         800.0000
BogoMIPS:            6002.00
Virtualization:      VT-x
L1d cache:           32K
L1i cache:           32K
L2 cache:            256K
L3 cache:            6144K
NUMA node0 CPU(s):   0-3

λ sudo hdparm -Tt /dev/sda
/dev/sda:
Timing cached reads: 26046 MB in 1.99 seconds = 13068.87 MB/sec
Timing buffered disk reads: 392 MB in 3.00 seconds = 130.50 MB/sec
```

Query | First run (DBMS) | First run (Binary) | Mean (DBMS) | Mean (Binary)
------------|-----------------|-----------------|--------------|-----------
1 | 748s | 134s | 640s | 134s
2 | - | 456s | - | 453s
3 | 629s | 160s | 622s | 161s
4 | 620s | 173s | 614s | 163s
5 | 705s | 59s | 618s | 59s
6 | 861s | 63s | 623s | 61s
7 | 631s | 67s | 604s | 62s
8 | 619s | 60s | 622s | 59s
9 | 639s | 59s | 626s | 60s
10 | 622s | 61s | 628s | 60s

Query 2 never finished on PostgreSQL, so its DBMS times are not reported.