{"id":20827642,"url":"https://github.com/bonifield/extractlinks","last_synced_at":"2026-04-25T09:38:07.936Z","repository":{"id":57427796,"uuid":"380555610","full_name":"bonifield/extractlinks","owner":"bonifield","description":null,"archived":false,"fork":false,"pushed_at":"2021-06-26T17:08:16.000Z","size":7,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-12-25T21:00:09.196Z","etag":null,"topics":["beautifulsoup4","html","json","python","python3","requests","urllib","web-scraping"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/bonifield.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-06-26T17:07:09.000Z","updated_at":"2021-06-26T17:16:38.000Z","dependencies_parsed_at":"2022-09-09T08:50:38.731Z","dependency_job_id":null,"html_url":"https://github.com/bonifield/extractlinks","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/bonifield/extractlinks","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bonifield%2Fextractlinks","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bonifield%2Fextractlinks/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bonifield%2Fextractlinks/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bonifield%2Fextractlinks/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/bonifield","download_url":"https://codeload.github.com/bonifield/extractlinks/tar.gz/r
efs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/bonifield%2Fextractlinks/sbom","scorecard":{"id":247620,"data":{"date":"2025-08-11","repo":{"name":"github.com/bonifield/extractlinks","commit":"2b7bdca0f42a8da70e95a7861aeff08e73d86b62"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":3,"checks":[{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Code-Review","score":0,"reason":"Found 0/1 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the 
project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Vulnerabilities","score":10,"reason":"0 existing vulnerabilities detected","details":null,"documentation":{"short":"Determines if the project has open, known unfixed vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security 
file to analyze"],"documentation":{"short":"Determines if the project has published a security policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'main'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection 
settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}}]},"last_synced_at":"2025-08-17T07:54:03.203Z","repository_id":57427796,"created_at":"2025-08-17T07:54:03.203Z","updated_at":"2025-08-17T07:54:03.203Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32257755,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-25T09:15:33.318Z","status":"ssl_error","status_checked_at":"2026-04-25T09:15:31.997Z","response_time":59,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["beautifulsoup4","html","json","python","python3","requests","urllib","web-scraping"],"created_at":"2024-11-17T23:12:35.924Z","updated_at":"2026-04-25T09:38:07.914Z","avatar_url":"https://github.com/bonifield.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# extractlinks\nextract and repair links from Requests objects, including redirects and final landing page\n\n### Installation\n```\npip install extractlinks\npython3 -m pip install extractlinks\n```\n\n### Usage\n```\nimport requests\nfrom extractlinks import ExtractLinks\nURL = \"http://cnn.com/\"\nr = requests.get(URL, allow_redirects=True)\ne = ExtractLinks(content=r)\nprint(e.json)\n```\n\n### Example Output\n```\n[\n\t{\n\t\t\"@timestamp\": \"2021-06-26T16:33:20.384Z\",\n\t\t\"url\": {\n\t\t\t\"full\": 
\"https://www.cnn.com/\",\n\t\t\t\"original\": \"https://www.cnn.com/\",\n\t\t\t\"scheme\": \"https\",\n\t\t\t\"domain\": \"www.cnn.com\",\n\t\t\t\"path\": \"/\"\n\t\t},\n\t\t\"http\": {\n\t\t\t\"response\": {\n\t\t\t\"status_code\": 200,\n\t\t\t\"status_code_reason\": \"OK\",\n\t\t\t\"body_bytes\": 1110460\n\t\t},\n\t\t\"chainitem\": 2,\n\t\t\"pguid\": \"1ff26fce-21a0-401a-9d53-1f863c6e3e31\",\n\t\t\"guid\": \"59dcfa56-b6d2-4924-bae1-70dbcd9d8309\",\n\t\t\"count\": 324,\n\t\t\"types\": [\n\t\t\t\"a-href\",\n\t\t\t\"form-action\",\n\t\t\t\"link-href\",\n\t\t\t\"meta-content\",\n\t\t\t\"script-src\"\n\t\t],\n\t\t\"tags\": [\n\t\t\t\"script\",\n\t\t\t\"meta\",\n\t\t\t\"a\",\n\t\t\t\"form\",\n\t\t\t\"link\"\n\t\t],\n\t\t\"attributes\": [\n\t\t\t\"action\",\n\t\t\t\"content\",\n\t\t\t\"src\",\n\t\t\t\"href\"\n\t\t],\n\t\t\"links\": [\n\t\t\t\"https://www.cnn.com/specials/cnn-investigates\",\n\t\t\t\"https://www.cnn.com/specials/tech/innovate\",\n\t\t\t\"https://www.cnn.com/travel/news\",\n\t\t\t\"https://www.i.cdn.cnn.com/.a/fonts/cnn/3.9.0/cnnsans-italic.woff2\"\n\t\t...\n```\n\n### Objects\n```\n# primary list-of-dictionaries / JSON dump\n# these contain the full link extractions, including items not recognized as URLs or mobile links\noutput # list of dictionaries\njson # JSON string\n\n# lists\nlinks_all # this only contains full links and any relative links \"repaired\" back to full-link format (ex. /images becomes https://www.cnn.com/images)\ntypes_all # ex. \"a-href\", \"img-src\", etc\ntags_all # ex. \"a\", \"img\"\nattributes_all # ex. 
\"href\", \"src\"\n\n# generators, if urlbreakdown module is installed; runs URLBreakdown on every link in links_all\nurlbreakdown_generator_dict()\nurlbreakdown_generator_json()\n```\n\n### Notes\n- select URL and HTTP output fields align to the Elastic Common Schema\n- links_count is not a unique count; it includes every object identified, including non-URLs found in otherwise link-related tag attributes\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbonifield%2Fextractlinks","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbonifield%2Fextractlinks","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbonifield%2Fextractlinks/lists"}