https://github.com/jmaupetit/imdb-sql
Load IMDB datasets in SQL databases
- Host: GitHub
- URL: https://github.com/jmaupetit/imdb-sql
- Owner: jmaupetit
- License: mit
- Created: 2024-07-11T09:36:13.000Z (6 months ago)
- Default Branch: main
- Last Pushed: 2024-10-14T15:18:34.000Z (3 months ago)
- Last Synced: 2024-10-17T03:54:49.201Z (3 months ago)
- Topics: database, imdb, load-testing, sql
- Language: Python
- Homepage:
- Size: 95.7 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
# IMDB SQL
Load IMDB datasets in a SQL database.
## 💡 The idea
**TL;DR** this project is a helper for generating massive databases for
performance-related studies. It is currently used in the
[data7 project](https://jmaupetit.github.io/data7/).

## Dependencies
- [Poetry](https://python-poetry.org)
- [Curl](https://curl.se/)
- [GNU Make](https://www.gnu.org/software/make/)

## Getting started
Clone the project then bootstrap it using:
```sh
make bootstrap
```

And you are now ready to push the dataset to your database:
```sh
poetry run python imdb-sql.py [DATABASE_URL]
```

With no argument, this creates an `im.db` SQLite database in the current
directory. Add the `DATABASE_URL` argument to use a PostgreSQL or
MariaDB instance instead. We support database URLs as defined in
[SQLAlchemy](https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls)
(note that you need to install the required database-specific driver).

## Testing with other DBMS
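The database URLs used throughout this README follow the general shape `dialect+driver://username:password@host:port/database` documented by SQLAlchemy. As a rough illustration of which part is which, the sketch below splits the example URLs using only Python's standard library. The `describe_database_url` helper is hypothetical and not part of this project; SQLAlchemy performs its own parsing, which this sketch does not reproduce.

```python
from urllib.parse import urlsplit


def describe_database_url(url: str) -> dict:
    """Split a database URL into its visible parts (illustration only).

    Hypothetical helper: it is not provided by imdb-sql or SQLAlchemy.
    """
    parts = urlsplit(url)
    # The scheme is either "dialect" or "dialect+driver",
    # e.g. "postgresql" or "postgresql+psycopg2".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,  # None when the URL names no driver
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        # Drop only the single slash that separates the host from the
        # database, so "sqlite:///im.db" yields the relative path "im.db".
        "database": parts.path[1:] or None,
    }


if __name__ == "__main__":
    for example in (
        "sqlite:///im.db",
        "postgresql://imdb:pass@localhost:5432/imdb",
        "mariadb://imdb:pass@localhost:3306/imdb",
    ):
        print(describe_database_url(example))
```

Which driver ends up being used when the URL names none is decided by SQLAlchemy, so check its documentation when choosing the package to install.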
We provide a Docker Compose configuration to test dataset loading in DBMS
other than SQLite.

> 💡 Feel free to substitute the provided Docker-based configuration with the
> database server instance used in performance tests.

### PostgreSQL
Boot the PostgreSQL server (skip this step if you already have your own server):
```sh
docker compose up -d postgresql
```

Install the Postgres driver:
```sh
poetry add psycopg2-binary
```

Load IMDB datasets:
```sh
poetry run python imdb-sql.py postgresql://imdb:pass@localhost:5432/imdb
```

### MariaDB
Boot the MariaDB server (skip this step if you already have your own server):
```sh
docker compose up -d mariadb
```

Install the MariaDB driver (you should first install MariaDB on your system):
```sh
poetry add mariadb
```

Load IMDB datasets:
```sh
poetry run python imdb-sql.py mariadb://imdb:pass@localhost:3306/imdb
```

## LICENSE
This work is released under the MIT License.
IMDB datasets are provided for
[non-commercial use only](https://developer.imdb.com/non-commercial-datasets/).