{"id":15056654,"url":"https://github.com/bujowskis/put-bd-project","last_synced_at":"2026-01-29T21:32:29.912Z","repository":{"id":244219990,"uuid":"811878063","full_name":"bujowskis/put-bd-project","owner":"bujowskis","description":"Distributed system for library management","archived":false,"fork":false,"pushed_at":"2024-06-11T10:49:53.000Z","size":27389,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-06-10T08:40:12.239Z","etag":null,"topics":["cassandra","docker","docker-compose","flask"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bujowskis.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-06-07T13:42:23.000Z","updated_at":"2024-06-13T09:59:59.000Z","dependencies_parsed_at":"2024-06-13T13:57:18.386Z","dependency_job_id":null,"html_url":"https://github.com/bujowskis/put-bd-project","commit_stats":null,"previous_names":["bujowskis/put-bd-project"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/bujowskis/put-bd-project","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bujowskis%2Fput-bd-project","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bujowskis%2Fput-bd-project/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bujowskis%2Fput-bd-project/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bujowskis%2Fput-bd-project/manifests","owner_u
rl":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bujowskis","download_url":"https://codeload.github.com/bujowskis/put-bd-project/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bujowskis%2Fput-bd-project/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28885563,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-29T21:06:44.224Z","status":"ssl_error","status_checked_at":"2026-01-29T21:06:42.160Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["cassandra","docker","docker-compose","flask"],"created_at":"2024-09-24T21:54:41.326Z","updated_at":"2026-01-29T21:32:29.891Z","avatar_url":"https://github.com/bujowskis.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Bookkeeper - Distributed system for library\nThis README covers all the aspects required in the report for the project\n\n## Team members\n- Szymon Bujowski, 148050\n- Dominika Plewińska, 151929\n\n## What is Bookkeeper?\nIt's a distributed system intended for use by libraries.\nIt helps management and administration by exposing a web interface, used to perform actions on the books.\n\nThe system allows for:\n- Listing existing books in the system (library)\n- Adding new books\n- Deleting existing books\n- Fetching book info (ISBN, author, title, borrower, 
publisher, year of publication)\n- Borrowing book (by particular borrower)\n- Returning borrowed book\n\n## How to run Bookkeeper?\n### 1. Build docker image\nNote that the following command assumes you're in the directory of the project\n```shell\ndocker build -t my_flask_server:latest .\n```\n\n### 2. Start multi-container application\n```shell\ndocker compose up\n```\n\n### 3. Access webpage\nThe default config is for webpage to be exposed at [localhost:80](http://localhost:80)\n\n## Dataset\nhttps://www.kaggle.com/datasets/saurabhbagchi/books-dataset, preprocessed for usage in the project.\nNamely:\n- `\"Image-URL-S\";\"Image-URL-M\";\"Image-URL-L\"` columns are dropped\n- `borrower_id` field is added (initially at random) to specify if a given book is borrowed (and by whom)\n\n### Database schema\nData resides in keyspace `bookkeeper`, table `books`:\n```\nisbn                : text (PRIMARY KEY)\nbook_title:         : text\nbook_author:        : text\nyear_of_publication : bigint\npublisher           : text\nborrower_id         : bigint\n```\nNote that ISBN is of type `text` because of some instances such as:\n```\n188164961X, Feel Great, Be Beautiful over 40: Inside Tips on How to Look Better, Be Healthier and Slow the Aging Process\n```\nWhere `X` at the end is valid.\n\n## Distributed system setup\nThe system is a multi-container docker setup with five services,\nconnected through `cassandra-net` bridge network:\n- 3 Cassandra database nodes - `[c1, c2, c3]`\n  - each node has a healthcheck that tests if `describe keyspaces` command from Cassandra Query Language works within 5s timeout\n    - if so, the node is considered healthy\n    - if not, there's a total of 60 retries in 5s intervals\n  - each subsequent node depends on the healthcheck of the previous one\n    - `c1-\u003ec2`, `c2-\u003ec3`\n    - this ensures that by the time `c3` is up, all nodes are as well\n- Flask server\n  - 3 replicas for load balancing, exposed and mapped on one port `8089`\n 
 - on startup, the server:\n    - 1 - runs `init_db.py` script to initialize the database (populate it with data from `data/dataset.csv`)\n    - 2 - exposes Bookkeeper webpage\n  - depends on `c3` being healthy\n- Nginx web server reverse proxy (middleware orchestrating client-server flow)\n  - listens on port `80` (this is how user accesses [localhost:80](http://localhost:80) webpage)\n  - has default volume mount\n  - depends on Flask server being healthy\n\nThe whole sequence of dependencies on health checks ensures proper setup.\n\n## Stress tests\n`/stress_tests` directory contains a number of stress tests that can be run once the system is set up using:\n```shell\nbash stress_tests.sh\n```\n\nThe tests intend to simulate possible high-load situations the system may encounter:\n- `test1_many_add.py` - high load of adding new book\n- `test2_multiple_actions_and_clients.py` - high load of various actions coming from multiple clients at the same time\n- `test3_reserving_books.py` - high load of reserving books\n- `test4_borrow_and_return.py` - high load of subsequent borrow and return requests\n- `test5_conflicting_reservation.py` - high load of two clients trying to borrow the same book at the same time\n\n## Encountered problems\nWe encountered a number of various problems, concerning different parts and aspects of the project:\n- Flask requiring `2.2.2+` version of a library it depends on (`werkzeug`), pulling the most recent one, which happened to not work with it anymore\n  - A: specify specific version of `werkzeug`\n- database initializing before all Cassandra nodes were fully operational\n  - A: implement health checks and specify dependencies\n- hyphens are special characters in CQL\n  - A: special-handling (double quoting - `'-\u003e''` when inserting data)\n- various issues with ports due to improper nginx config\n  - A: a bit of trial and error, resolving issues one by one\n- huge RAM usage by Cassandra nodes\n  - A: `MAX_HEAP_SIZE` and `HEAP_NEWSIZE` 
limits\n- no validation of id led to \"sql injection\" of `-1`, turning a \"borrow\" action into a \"return\" action\n  - A: add validation\n- all attributes had to be specified for every operation\n  - A: rewrite code to only require what's actually needed\n- unable to use `action_scripts.js` or `bookkeeper.css`\n  - A: serve static files through Flask templating\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbujowskis%2Fput-bd-project","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbujowskis%2Fput-bd-project","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbujowskis%2Fput-bd-project/lists"}