https://github.com/jmaupetit/imdb-sql
Load IMDB datasets in SQL databases
database imdb load-testing sql
- Host: GitHub
- URL: https://github.com/jmaupetit/imdb-sql
- Owner: jmaupetit
- License: mit
- Created: 2024-07-11T09:36:13.000Z (almost 2 years ago)
- Default Branch: main
- Last Pushed: 2025-05-02T09:25:25.000Z (about 1 year ago)
- Last Synced: 2025-09-26T10:42:37.072Z (7 months ago)
- Topics: database, imdb, load-testing, sql
- Language: Python
- Homepage:
- Size: 91.8 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- License: LICENSE
README
# IMDB SQL
Load IMDB datasets in a SQL database.
## 💡 The idea
**TL;DR** this project is a helper for generating massive databases for
performance-related studies. It is currently used in the
[data7 project](https://jmaupetit.github.io/data7/).
## Dependencies
- [Poetry](https://python-poetry.org)
- [Curl](https://curl.se/)
- [GNU Make](https://www.gnu.org/software/make/)
## Getting started
Clone the project, then bootstrap it using:
```sh
make bootstrap
```
You are now ready to load the datasets into your database:
```sh
poetry run python imdb-sql.py [DATABASE_URL]
```
With no argument, this creates an `im.db` SQLite database in the current
directory. Pass the optional `DATABASE_URL` argument to target a PostgreSQL or
MariaDB instance instead. We support database URLs as defined in
[SQLAlchemy](https://docs.sqlalchemy.org/en/20/core/engines.html#database-urls)
(note that you need to install the required database-specific driver).
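To check what a default run produced, the resulting SQLite file can be inspected with Python's standard `sqlite3` module. The sketch below is not part of this project: the `summarize_sqlite` helper is a hypothetical name, and it assumes nothing about the table names, which it reads from the database itself.

```python
import sqlite3


def summarize_sqlite(path="im.db"):
    """Return {table_name: row_count} for every user table in a SQLite file."""
    connection = sqlite3.connect(path)
    try:
        # sqlite_master lists all schema objects; keep user tables only.
        names = [
            row[0]
            for row in connection.execute(
                "SELECT name FROM sqlite_master "
                "WHERE type = 'table' AND name NOT LIKE 'sqlite_%' "
                "ORDER BY name"
            )
        ]
        # Names come from the database itself, so they are only quoted here.
        return {
            name: connection.execute(f'SELECT COUNT(*) FROM "{name}"').fetchone()[0]
            for name in names
        }
    finally:
        connection.close()
```

Calling `summarize_sqlite()` from the directory that holds `im.db` returns one entry per table. Note that `sqlite3.connect` creates an empty file when the path does not exist, and that counting rows can take a while on large tables.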
## Testing with other DBMS
We provide a Docker Compose configuration to test dataset loading with DBMSs
other than SQLite.
> 💡 Feel free to replace the provided Docker-based configuration with the
> database server instance used in your performance tests.
### PostgreSQL
Boot the Postgres server (if you need one; otherwise, use your own server):
```sh
docker compose up -d postgresql
```
Install the Postgres driver:
```sh
poetry add psycopg2-binary
```
Load IMDB datasets:
```sh
poetry run python imdb-sql.py postgresql://imdb:pass@localhost:5432/imdb
```
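The URL passed above follows the `dialect[+driver]://username:password@host:port/database` shape documented by SQLAlchemy. As an illustration only, the sketch below splits such a URL into its parts with the standard library. The `describe_database_url` helper is a hypothetical name, and this is not how SQLAlchemy or this project parses URLs (for instance, percent-encoded passwords are not decoded here).

```python
from urllib.parse import urlsplit


def describe_database_url(url):
    """Split a database URL into its named parts (illustration only)."""
    parts = urlsplit(url)
    # The scheme may carry an explicit driver, as in "dialect+driver".
    dialect, _, driver = parts.scheme.partition("+")
    return {
        "dialect": dialect,
        "driver": driver or None,
        "username": parts.username,
        "password": parts.password,
        "host": parts.hostname,
        "port": parts.port,
        "database": parts.path.lstrip("/") or None,
    }


print(describe_database_url("postgresql://imdb:pass@localhost:5432/imdb"))
```

When adapting the command to your own server, these are the parts to change: credentials, host, port, and database name.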
### MariaDB
Boot the MariaDB server (if you need one; otherwise, use your own server):
```sh
docker compose up -d mariadb
```
Install the MariaDB driver (MariaDB must already be installed on your system):
```sh
poetry add mariadb
```
Load IMDB datasets:
```sh
poetry run python imdb-sql.py mariadb://imdb:pass@localhost:3306/imdb
```
## LICENSE
This work is released under the MIT License.
IMDB datasets are provided for
[non-commercial use only](https://developer.imdb.com/non-commercial-datasets/).