{"id":28242729,"url":"https://github.com/max00358/single_threaded_image_web_crawler","last_synced_at":"2025-10-06T17:30:06.453Z","repository":{"id":214426359,"uuid":"736501024","full_name":"Max00358/Single_Threaded_Image_Web_Crawler","owner":"Max00358","description":"ECE 252 Lab 5: Single threaded web crawler that uses asynchronous I/O and cURL multi-interface to enable multiple simultaneous transfers in the same thread to find up to 50 valid PNGs using a seed URL","archived":false,"fork":false,"pushed_at":"2023-12-28T04:54:10.000Z","size":13,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-05-19T06:09:43.519Z","etag":null,"topics":["asynchronous-io","curl-multi-api","webcrawler"],"latest_commit_sha":null,"homepage":"","language":"C","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Max00358.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null}},"created_at":"2023-12-28T04:43:00.000Z","updated_at":"2023-12-28T04:48:28.000Z","dependencies_parsed_at":"2023-12-28T06:04:02.532Z","dependency_job_id":null,"html_url":"https://github.com/Max00358/Single_Threaded_Image_Web_Crawler","commit_stats":null,"previous_names":["max00358/single_threaded_image_web_crawler"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/Max00358/Single_Threaded_Image_Web_Crawler","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Max00358%2FSingle_Threaded_Image_Web_Crawler","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Max00358%2FSingle_Threaded_Image_Web_Crawler/tags","releases_ur
l":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Max00358%2FSingle_Threaded_Image_Web_Crawler/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Max00358%2FSingle_Threaded_Image_Web_Crawler/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Max00358","download_url":"https://codeload.github.com/Max00358/Single_Threaded_Image_Web_Crawler/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Max00358%2FSingle_Threaded_Image_Web_Crawler/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":278648521,"owners_count":26021915,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-06T02:00:05.630Z","response_time":65,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["asynchronous-io","curl-multi-api","webcrawler"],"created_at":"2025-05-19T06:09:41.746Z","updated_at":"2025-10-06T17:30:06.443Z","avatar_url":"https://github.com/Max00358.png","language":"C","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Single_Threaded_Image_Web_Crawler\n## Objective\nThe findpng3 is a tiny, simplified, single-threaded concurrent web crawler that utilizes non-blocking I/O which enables simultaneous transfers. It searches the web starting from a seed URL. 
It then finds the number of valid PNG URLs that the user specifies in the input, or up to 50 PNG URLs, and logs them into a log.txt file. \n\nThis time, the solution should not use pthreads. However, it should keep multiple concurrent connections to servers open.\n\n## Sample Test Run\n-t: Keep up to t concurrent connections open while crawling the web (no threads are created)\n\u003cbr\u003e -m: Find up to m unique PNG URLs on the web\n\u003cbr\u003e -v: Name of the log file; the crawler logs every visited URL to it, one URL per line\n\n* **Input**: findpng3 -t 10 -m 20 -v log.txt http://ece252-1.uwaterloo.ca/lab4\n* **Output**: findpng3 execution time: 10.123456 seconds\n\nThe log file may look like the following:\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca/lab4\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca/~yqhuang/lab4/#top\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca/~yqhuang/lab4/imgs/cpp.jpg\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca:2540/image?q=gAAAAABdHkoqPlVGPQPMpFG8Vjvrodu_pZqXMC4jGRiLcYwY6MkhhFG1m5a3x5ZYDOiLGFmz8FTbq3sAva7QKiXY5YNIxrBdxg==\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca/~yqhuang/lab4/imgs/para.jpg\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca/~yqhuang/lab4/imgs/crawler_arch.jpg\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca/~yqhuang/lab4/imgs/Disguise.png\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca:2531/image?img=1\u0026part=0\n\u003cbr\u003ehttp://ece252-3.uwaterloo.ca:2540/image?q=gAAAAABdHkoqOKR-cFRCkiCUBEMAAAWfDvBFlRisL9ysLWHYHbcQbn1b28PV_uHBZ0gJf5bvzrnf1HNXxB6KRlAVETwTIqBH2Q==\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca/~yqhuang/lab4/sub1/index.html\n\u003cbr\u003ehttp://ece252-2.uwaterloo.ca:2540/image?q=gAAAAABdHkoqvYBSFk4VKGcRmB2TKYcl7hgNpTehsR0y7cd1ysTzvlgowzkgpWgwp42BumobA22jAEl0LJj_TcwR-X_F79vyuA==\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca:2540/image?q=gAAAAABdHkoq1GsX73XwnpU1qaO1laOtmzl_e3npHIX83pNGFgfOLJG2vwICDD5Gnb7iwdzV_ZKnckzZQfhsmEEeWbvO3-4Kug==\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca:2540/image?q=gAAAAABdHkoqXCN_I78zg0zDjOvOhA19H7tB5L2pSkuG0NmXg4E1
LCe1Ew8utkFxulC2ssJUGOYgguCCXcDlOraPK-tYzsMl-w==\n\u003cbr\u003ehttp://ece252-3.uwaterloo.ca:2540/image?q=gAAAAABdHkoq5CAThsWSs_snhLDy2vDhV97ea1remux8ERu-H5FIIqoG5by17Zt7ba3hdadbOLXoHKJEYVHWdlvkpZg6VxeyaQ==\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca:2540/image?q=gAAAAABdHkoqKjtpjpiM4WKtVvo33HojmbxU3Q_VNRl2bS5uYIta1fUv23182WN7nHtTa-qg_M33g_5CvFk-5i95lVgkaDOCYA==\n\u003cbr\u003ehttp://ece252-2.uwaterloo.ca:2540/image?q=gAAAAABdHkoqymQU_88e8PRLFAHlee_iqRtsEkeVQsUV1YzN2c7mQXV1Sg7hOuNp212fkLFH71-EtoQMS_7bivH4V_Q5rx2Cfg==\n\u003cbr\u003ehttp://ece252-3.uwaterloo.ca:2540/image?q=gAAAAABdHkoqvsHZzgOGhVZBuBxA8IkeZS6GevG48xd-O9--4La4GPLs4qMW8QWW63iNSc4bypjbMJK1tZZGvHELGeVTmaoyZw==\n\u003cbr\u003ehttp://ece252-2.uwaterloo.ca:2540/image?q=gAAAAABdHlHZgxhjnyyylvmstitqthsafcipehodfzmjbugcviblrqxfzjgucrcllazmyhznzdkcmymdknquttkinbgbtshojf==\n\u003cbr\u003ehttp://ece252-3.uwaterloo.ca:2540/image?q=gAAAAABdHkoqdRxJhdCk49ofJ1TCxsbV6DF4DIHKtdWsF-5fJM0BHF6b6oM5Y99T7T78SQQcii0YvDMqzGN4L8a4eFsQotIHLA==\n\u003cbr\u003ehttp://ece252-3.uwaterloo.ca:2540/image?q=gAAAAABdHkoqu2HgO66gfwYQjCHJninMtqjg83iaDpT6U54p8F9XGbboHeGjhalG9fE7CMNx7vkgZZMzN-9c6EqVE5qXlhdqnw==\n\u003cbr\u003ehttp://ece252-2.uwaterloo.ca:2540/image?q=gAAAAABdHkoqMW24vZRpDznRyQZuRsR5vjPThROSj7k52y1weIHxIKOvt9xOxAbpZNudfTkojMyk2ii5Ap5o7e4cRdq_5C0aKw==\n\u003cbr\u003ehttp://ece252-2.uwaterloo.ca:2540/image?q=gAAAAABdHkoqLkzOa89N8w0I4kdmfUtjcN1D6AZYEOJCzr6qpnWae4LfzfIhCRH_MOeQj8x8_ugiRcF79z5D723Ssnm7U-84og==\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca:2540/image?q=gAAAAABdHkoqBsj81dnixZFLtjKlW6lHD5o0y0IWPRDcP0L_uEzcHVKXT6L-32Cl4u-th38LksMevU6T3NRufm6yVtFON0tXHg==\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca:2540/image?q=gAAAAABdHkoqYBTxXgfsZAuOV3Nkm0obQ58hGNAnrTrEDwPseWO4OKq2JXTnwAhg1lufmc4OhP7Vtf4hMEgzNdxNdeAgjsscVg==\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca:2540/image?q=gAAAAABdHkoq-JvNAjwE7fa5r26FVJgLMEqvkrj6AlO2BKI5zQtKRThKL79uHBoP-uRzt9WzYyPeec6zE6bmdkzy5UXt5Br2Iw==\n\u003cbr\u003ehttp://ece252-2.uwaterloo.ca:2540/image?
q=gAAAAABdHkoq292-4eAqc4xZ-nwNibr1F5ZZCJhojZGV7hQMtNzryLwR8grsztNSurbqnucszcyMbtSE8BaFg049sI6WZ4aDfg==\n\u003cbr\u003ehttp://ece252-1.uwaterloo.ca:2540/image?q=gAAAAABdHkoqtuljtcVzSazkwT4SVEw_rlEYwknlCq6mKLqndncMei2Om8ccO2ieak5QQCVNJKuwG5FYmX8sgbhKht2xt4ZNlw==\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmax00358%2Fsingle_threaded_image_web_crawler","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmax00358%2Fsingle_threaded_image_web_crawler","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmax00358%2Fsingle_threaded_image_web_crawler/lists"}