{"id":28305922,"url":"https://github.com/adrianklessa/steamscout","last_synced_at":"2026-04-10T11:32:39.293Z","repository":{"id":250925426,"uuid":"828940854","full_name":"AdrianKlessa/SteamScout","owner":"AdrianKlessa","description":"Ensemble recommendation (recommender) system for finding similar games on Steam","archived":false,"fork":false,"pushed_at":"2025-04-09T11:20:38.000Z","size":385,"stargazers_count":1,"open_issues_count":6,"forks_count":0,"subscribers_count":1,"default_branch":"master","last_synced_at":"2025-07-30T15:57:34.014Z","etag":null,"topics":["doc2vec","docker","dockerfile","flask","fts5","react","recommender-system","recommender-systems","sqlite","steam","steam-games","vite"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/AdrianKlessa.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":"docs/roadmap.MD","authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-07-15T12:39:05.000Z","updated_at":"2025-04-09T11:20:41.000Z","dependencies_parsed_at":"2024-07-30T22:05:22.208Z","dependency_job_id":"43b10e92-f928-4606-a52b-2a034750a9d6","html_url":"https://github.com/AdrianKlessa/SteamScout","commit_stats":null,"previous_names":["adrianklessa/steamscout"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/AdrianKlessa/SteamScout","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdrianKlessa%2FSteamScout","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdrianKlessa%2FSteamScout/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/AdrianKlessa%2FSteamScout/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdrianKlessa%2FSteamScout/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/AdrianKlessa","download_url":"https://codeload.github.com/AdrianKlessa/SteamScout/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/AdrianKlessa%2FSteamScout/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31641114,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-10T07:40:12.752Z","status":"ssl_error","status_checked_at":"2026-04-10T07:40:11.664Z","response_time":98,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["doc2vec","docker","dockerfile","flask","fts5","react","recommender-system","recommender-systems","sqlite","steam","steam-games","vite"],"created_at":"2025-05-24T03:13:26.610Z","updated_at":"2026-04-10T11:32:39.288Z","avatar_url":"https://github.com/AdrianKlessa.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"### A similarity search and recommendation system for finding games on Steam\n![Test workflow](https://github.com/AdrianKlessa/SteamScout/actions/workflows/test_workflow.yml/badge.svg)\n![Docker compose](https://github.com/AdrianKlessa/SteamScout/actions/workflows/build_workflow.yml/badge.svg)\n\n***Find similar 
games to the ones you like - even if the genres you like are not very popular***\n\n![image](docs/screenshot_new.PNG)\n\nThe system is a weighted ensemble recommender composed of:\n- A recommender rating the games based on the similarity of their vectorized descriptions (\"About this game\" section)\n- A recommender rating the similarity of games based on their tags\n- A simple check for the ratio of positive to negative user reviews\n\nSince this ensemble does not take into account the number of reviews (only the fraction of positive ones), it tends to catch games with very few reviews if they appear to be a good match to the one the user searched for.\n\nBoth the tags and descriptions are compared based on the cosine similarity of their vectors.\n\nThe tags are binary vectors (consisting of 1s at the indices matching a particular tag the game has and 0s for tags it lacks). The description vector is generated by using doc2vec after training on the entire dataset.\n\nThe vectors are stored using [vectordb](https://github.com/jina-ai/vectordb/) for easy Approximate Nearest Neighbor search.\n\n### Build \u0026 installation (Docker)\n\n1. Clone this repository\n2. Download the [Steam Games dataset](https://www.kaggle.com/datasets/fronkongames/steam-games-dataset) from Kaggle and put `games.json` inside `data/raw`\n3. Modify the `users.json` file to inject temporary user login info (used during build time). For example:\n```\n{\n\"user1\": \"pass1\",\n\"user2\": \"pass2\"\n}\n```\nThis is a temporary solution to add auth to this application while using file-based SQLite DBs. The file is deleted (in-container) during the build process and doesn't exist in the final image.\n\nYou also need to define a `JWT_SECRET` env variable (as seen in compose) that will be passed to the docker image and used to sign the JWTs.\n4. 
Run `docker compose up` from the repo's directory.\n\nBuilding the app will take a while (possibly 15+ minutes) because it preprocesses the dataset and trains doc2vec on it.\n\n*(I did not want to include a pretrained model or the data as part of this repository)*\n\nRunning the app after it has been built uses the generated model and DBs so it should be much faster.\n\nDue to heavy reliance on a local database, the app greatly benefits from using an SSD. Full-text search and indexes on queried columns are used to reduce search delays.\n\nBy default, the frontend is available on the host machine at `http://localhost:8080`\n\n### Unit tests\n\nUnit tests can be executed from the `src` directory by running `python -m unittest discover ../tests`\n\n---\n\n*Utilizes the [Steam Games dataset](https://www.kaggle.com/datasets/fronkongames/steam-games-dataset) published on Kaggle by Martin Bustos Roman, originally generated with [Steam Games Scraper](https://github.com/FronkonGames/Steam-Games-Scraper) by Martin Bustos.*\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadrianklessa%2Fsteamscout","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fadrianklessa%2Fsteamscout","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fadrianklessa%2Fsteamscout/lists"}