{"id":16137417,"url":"https://github.com/purarue/old_forums","last_synced_at":"2026-04-29T10:02:49.763Z","repository":{"id":94610488,"uuid":"292716829","full_name":"purarue/old_forums","owner":"purarue","description":"Parses posts/achievements from random forums I used in the past","archived":false,"fork":false,"pushed_at":"2024-10-25T17:35:28.000Z","size":23,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-12T22:51:17.937Z","etag":null,"topics":["forum","minecraft","selenium","webscraping"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/purarue.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2020-09-04T01:17:30.000Z","updated_at":"2024-10-25T17:35:32.000Z","dependencies_parsed_at":"2024-11-01T14:51:25.058Z","dependency_job_id":null,"html_url":"https://github.com/purarue/old_forums","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/purarue%2Fold_forums","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/purarue%2Fold_forums/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/purarue%2Fold_forums/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/purarue%2Fold_forums/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/purarue","download_url":"https://code
load.github.com/purarue/old_forums/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247517919,"owners_count":20951719,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["forum","minecraft","selenium","webscraping"],"created_at":"2024-10-09T23:26:58.096Z","updated_at":"2026-04-29T10:02:44.730Z","avatar_url":"https://github.com/purarue.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# old_forums\n\nParses posts/achievements from random forums I used in the past. I don't use any of these anymore, but they contain random thoughts I had back then, so I'm parsing them to keep access to them\n\nThe bit of lib code here pulls CSS selectors from a config file to detect/parse achievement pages. I use this in my personal [HPI](https://github.com/purarue/HPI-personal) modules\n\nThe forum posts are loaded from JSON files created by `./selenium_scripts`, while forum achievements are parsed from the raw HTML pages (i.e., by right click and `save as`ing a page, so that it's possible to update)\n\nThis is quite a personal library, as generalizing this to any number of sites isn't trivial, though the [`achievements` portion](./old_forums/achievements.py) of the library could possibly be re-used, if you have some webscraping know-how\n\n## Installation\n\nRequires `python3.7+`\n\nTo install with pip, run:\n\n    pip install git+https://github.com/purarue/old_forums\n\n### selenium_scripts\n\nPutting these up here as reference. 
I have so few posts on some of these that I didn't have to worry about pagination.\n\nAll the posts get pulled out into a common schema:\n\n```\nforum_name: str\npost_title: str  (name/title of the post)\npost_url: str  (url to the post)\npost_contents: str  (what I actually said)\ndt: epoch datetime\n```\n\nBased on code from [`steamscraper`](https://github.com/purarue/steamscraper)\n\nAs an example, `minecraft_forum.py`:\n\n```\npython3 ./minecraft_forum.py \u003cusername\u003e --to-file ./minecraft_forum.json\nHit enter when the page is ready \u003e\n[D 200903 17:18:15 minecraft_forum:49] getting next page...\n[D 200903 17:18:24 minecraft_forum:49] getting next page...\n[D 200903 17:18:32 minecraft_forum:49] getting next page...\n[D 200903 17:18:39 minecraft_forum:49] getting next page...\n[D 200903 17:18:46 minecraft_forum:49] getting next page...\n[D 200903 17:18:54 minecraft_forum:49] getting next page...\n[D 200903 17:19:01 minecraft_forum:49] getting next page...\n[D 200903 17:19:08 minecraft_forum:49] getting next page...\n[D 200903 17:19:16 minecraft_forum:49] getting next page...\n[D 200903 17:19:23 minecraft_forum:49] getting next page...\n[D 200903 17:19:30 minecraft_forum:49] getting next page...\n[D 200903 17:19:39 minecraft_forum:52] done, writing to file...\n```\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpurarue%2Fold_forums","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fpurarue%2Fold_forums","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fpurarue%2Fold_forums/lists"}