{"id":16802598,"url":"https://github.com/daviddavo/blogspot-crawler","last_synced_at":"2026-04-19T19:01:43.805Z","repository":{"id":115205342,"uuid":"354276085","full_name":"daviddavo/Blogspot-Crawler","owner":"daviddavo","description":"Crawler for blogspot and blogger with beautifulsoup","archived":false,"fork":false,"pushed_at":"2021-04-03T18:22:07.000Z","size":5,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-03-17T06:21:56.378Z","etag":null,"topics":["crawler","hacktoberfest","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/daviddavo.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-04-03T11:37:51.000Z","updated_at":"2021-10-02T09:18:28.000Z","dependencies_parsed_at":null,"dependency_job_id":"0d1bcb49-1bef-41f9-b6fb-c628fb1db7cc","html_url":"https://github.com/daviddavo/Blogspot-Crawler","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/daviddavo/Blogspot-Crawler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daviddavo%2FBlogspot-Crawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daviddavo%2FBlogspot-Crawler/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daviddavo%2FBlogspot-Crawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daviddavo%2FBlogspot-Crawler/manifests","owner_url":"ht
tps://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/daviddavo","download_url":"https://codeload.github.com/daviddavo/Blogspot-Crawler/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/daviddavo%2FBlogspot-Crawler/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":265889138,"owners_count":23844539,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["crawler","hacktoberfest","python"],"created_at":"2024-10-13T09:40:08.382Z","updated_at":"2026-04-19T19:01:38.758Z","avatar_url":"https://github.com/daviddavo.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Blogspot Crawler\n\nA simple crawler that uses Beautiful Soup 4 and requests to fetch every post\non a Blogger/Blogspot website by following the next-page link.\n\nIt downloads only the post body in HTML format and creates a Jekyll file\nwith the title and tags. As a result, the entire blog dump is very small.\n\nIn the future, it will download images and jekyllify the HTML output.\n\n## Usage\nJust pass the blog URL and a destination folder. 
Posts should be downloaded as the URL without the basename.\n\n```\nusage: ./blogspotCrawler.py [-h] [-o DESTINATION] url\n\nBlogspot crawler\n\npositional arguments:\n  url                   Blog url\n\noptional arguments:\n  -h, --help            show this help message and exit\n  -o DESTINATION, --output DESTINATION\n                        Output folder\n```\n\n## Ideas for the future\n- [ ] Quietly process ReadTimeout exceptions on future callback\n- [ ] Auto download images\n- [ ] Jekyllify output\n- [ ] Add WordPress support\n\n-----------------\nThis program is licensed under the MIT License\n\n(C) 2021 [David Davó](https://ddavo.me/en)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaviddavo%2Fblogspot-crawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdaviddavo%2Fblogspot-crawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdaviddavo%2Fblogspot-crawler/lists"}