{"id":19316586,"url":"https://github.com/tokenmill/fast-url-access-checker","last_synced_at":"2025-04-22T17:30:24.501Z","repository":{"id":138584384,"uuid":"188852535","full_name":"tokenmill/fast-url-access-checker","owner":"tokenmill","description":"Easily run HTTP GET requests against a list of URLs to check their HTTP status.","archived":false,"fork":false,"pushed_at":"2019-09-04T09:36:26.000Z","size":43,"stargazers_count":12,"open_issues_count":4,"forks_count":4,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-02T02:02:04.326Z","etag":null,"topics":["clojure","http-redirect","http-status","java","url-checker","url-cleaning"],"latest_commit_sha":null,"homepage":"","language":"Clojure","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/tokenmill.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2019-05-27T13:52:43.000Z","updated_at":"2024-04-28T12:33:10.000Z","dependencies_parsed_at":"2023-04-15T12:01:22.465Z","dependency_job_id":null,"html_url":"https://github.com/tokenmill/fast-url-access-checker","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tokenmill%2Ffast-url-access-checker","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tokenmill%2Ffast-url-access-checker/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/tokenmill%2Ffast-url-access-checker/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositor
ies/tokenmill%2Ffast-url-access-checker/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/tokenmill","download_url":"https://codeload.github.com/tokenmill/fast-url-access-checker/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":250287337,"owners_count":21405588,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["clojure","http-redirect","http-status","java","url-checker","url-cleaning"],"created_at":"2024-11-10T01:11:57.319Z","updated_at":"2025-04-22T17:30:24.483Z","avatar_url":"https://github.com/tokenmill.png","language":"Clojure","funding_links":[],"categories":[],"sub_categories":[],"readme":"\u003ca href=\"http://www.tokenmill.lt\"\u003e\n      \u003cimg src=\".github/tokenmill-logo.svg\" width=\"125\" height=\"125\" align=\"right\" /\u003e\n\u003c/a\u003e\n\n# URL Access Checker\n\nThis tool takes a list of site URLs, identifies the correct form of each URL, and runs an HTTP GET request against it to check its HTTP status. When an address is not completely specified (the protocol is missing, or the 'www' part is omitted where it is needed), the correct form of the URL is identified. The library also validates the correctness of the URL and, in cases of redirection, returns the target URL.\n\nIt is a Clojure library. Additionally, an interface to call it from Java is provided. 
A native binary distribution is also available for use as a command-line tool.\n\n# Features\n\n* Provides the interface for a single URL check.\n* Provides the interface for bulk URL checks.\n* In the case of a bulk URL check, the library parallelizes checking to maximize the speed of the entire process.\n* For incompletely formed URLs, the correct protocol (http or https) is detected. Access with the 'www' part, if it is missing, is also tested.\n* Redirection is detected and the target URL returned.\n* A URL check returns the following data: HTTP status, target URL, response time.\n\n# How to Use\n\n## Command Line\n\nThe URL checker can be started from the command line with the following instruction:\n```\n./url-checker test/resources/bulk-test.txt\n```\n\nSee below for the output sample.\n\n\nThe URL checker can also be executed from the command line using installed [Clojure](https://clojure.org/) tools. Execution example with the project's test URL set:\n\n```\nclojure -m fast-url-check.core test/resources/bulk-test.txt\n```\n\nOr via the project's Makefile:\n\n```\nmake check-urls file-name=test/resources/bulk-test.txt\n```\n\nThis will result in CSV-formatted output of the URL checking results:\n\n```\ntimestamp,seed,url,status,status-type,response-time,exception\n2019-05-30T10:43:47.674Z,cameron.slb.com,https://www.products.slb.com,302,redirect,431,\n2019-05-30T10:43:47.691Z,co.williams.com,https://co.williams.com/,200,accessible,622,\n2019-05-30T10:43:47.691Z,company.ingersollrand.com,https://www.company.ingersollrand.com/,200,accessible,645,\n...\n2019-05-30T10:43:51.950Z,http://aes.com,https://aes.com/,200,accessible,3632,\n```\n\n\n## Clojure\n\nSingle URL check example.\n\n```\n(require '[fast-url-check.core :refer :all])\n\n(check-access \"tokenmill.lt\")\n=\u003e \n{:url \"http://www.tokenmill.lt/\",\n :seed \"tokenmill.lt\",\n :status 200,\n :response-time 7,\n :status-type :accessible}\n```\n\nBulk URL check example\n\n```\n\n(check-access-bulk [\"tokenmill.lt\" \"15min.lt\" 
\"https://news.ycombinator.com\"])\n=\u003e \n({:url \"http://www.tokenmill.lt/\",\n  :seed \"tokenmill.lt\",\n  :status 200,\n  :response-time 10,\n  :status-type :accessible}\n {:url \"https://www.15min.lt/\",\n  :seed \"15min.lt\",\n  :status 200,\n  :response-time 46,\n  :status-type :accessible}\n {:url \"https://news.ycombinator.com/\",\n  :seed \"https://news.ycombinator.com\",\n  :status 301,\n  :response-time 379,\n  :status-type :redirect})\n\n```\n\n## Java\n\nJava code example:\n\n```\nimport crawl.tools.URLCheck;\n\nimport java.util.Map;\nimport java.util.Arrays;\nimport java.util.Collection;\n\npublic class MyClass {\n\n    public static void main(String[] args) {\n        System.out.println(URLCheck.checkAccess(\"tokenmill.lt\"));\n\n        String[] urls = {\"15min.lt\", \"https://news.ycombinator.com\"};\n        Collection\u003cMap\u003e validatedUrls = URLCheck.checkAccessBulk(Arrays.asList(urls));\n        for(Map validatedUrl : validatedUrls) {\n            System.out.println(validatedUrl);\n        }\n    }\n}\n\n```\n\n# Benchmark\n\nThis tool aims to provide top performance in bulk URL checking. This repository includes a [reference set](https://github.com/tokenmill/fast-url-access-checker/blob/master/test/resources/bulk-test.txt) of 1000 URLs for consistent performance checking. \n\nBenchmark test executed against the reference URL set performs with average _0.3 seconds per URL_. 
Execution times depend on network conditions and the hardware the tests are executed on.\n\nThe benchmark test can be launched with `make benchmark`.\n\n## License\n\nCopyright \u0026copy; 2019 [TokenMill UAB](http://www.tokenmill.lt).\n\nDistributed under the Apache License, Version 2.0.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftokenmill%2Ffast-url-access-checker","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Ftokenmill%2Ffast-url-access-checker","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Ftokenmill%2Ffast-url-access-checker/lists"}