{"id":22694607,"url":"https://github.com/kaspercools/tiktok-selenium-crawler","last_synced_at":"2025-08-23T02:17:27.778Z","repository":{"id":157605254,"uuid":"624844653","full_name":"kaspercools/tiktok-selenium-crawler","owner":"kaspercools","description":null,"archived":false,"fork":false,"pushed_at":"2023-05-04T18:36:22.000Z","size":13,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-07-28T21:22:26.243Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kaspercools.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-07T11:55:32.000Z","updated_at":"2023-04-07T11:57:50.000Z","dependencies_parsed_at":null,"dependency_job_id":"7aff16ce-dc8d-4035-84ae-019f6f87476b","html_url":"https://github.com/kaspercools/tiktok-selenium-crawler","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/kaspercools/tiktok-selenium-crawler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaspercools%2Ftiktok-selenium-crawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaspercools%2Ftiktok-selenium-crawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaspercools%2Ftiktok-selenium-crawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaspercools%2Ftiktok-selenium-crawler/manifests","owner_url":"https:
//repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kaspercools","download_url":"https://codeload.github.com/kaspercools/tiktok-selenium-crawler/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kaspercools%2Ftiktok-selenium-crawler/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271732362,"owners_count":24811309,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-23T02:00:09.327Z","response_time":69,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-12-10T03:08:51.832Z","updated_at":"2025-08-23T02:17:27.760Z","avatar_url":"https://github.com/kaspercools.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# tiktok-selenium-crawler\n\nThis quick and rather dirty script, along with several others, was written to help automatically scrape data from TikTok as part of my Master's thesis. Further details can be found at [github.com/kaspercools/tiktok-offensive-language-classifier](https://github.com/kaspercools/tiktok-offensive-language-classifier).\n\nThe `data-reader.py` file maps the results to individual files for further processing. The original data was obtained using our [Bright Data Collector script](https://github.com/kaspercools/bright-data-collector). 
Subsequently, the `crawler.py` file processes these files and adds comments gathered from TikTok to them.\nThese data files were later used to populate our MongoDB collections.\n\n### Developer discretion is advised\nNote that this script may not be all that well written or conform to Python conventions. We wrote this code quickly to meet our need for automated data collection. This script was one of several that contributed to the continuous, automated collection and processing of all the information, which is why it starts with an endless while loop.\n\n## License\n\nAll source code is made available under an MIT license. You can freely use and modify the code, without warranty, so long as you provide attribution to the authors. See LICENSE for the full license text.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkaspercools%2Ftiktok-selenium-crawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkaspercools%2Ftiktok-selenium-crawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkaspercools%2Ftiktok-selenium-crawler/lists"}