Crawler
A Web crawler, sometimes called a spider or spiderbot and often shortened to crawler, is an Internet bot that systematically browses the World Wide Web and that is typically operated by search engines for the purpose of Web indexing (web spidering).
- GitHub: https://github.com/topics/crawler
- Wikipedia: https://en.wikipedia.org/wiki/Web_crawler
- Last updated: 2026-06-23 00:06:44 UTC
https://github.com/mustafadalga/website-crawler
A web crawler script that lists links by scanning the target website.
crawl crawler crawling-sites hacking hacking-tool web-crawler web-crawler-python web-crawling
Last synced: 20 Apr 2026
https://github.com/ymdarake/otenki-crawler
Yet another weather data scraper.
Last synced: 02 Feb 2026
https://github.com/ariefrahmansyah/crawler
Simple website crawler using Go programming language.
Last synced: 27 Mar 2025
https://github.com/filipsedivy/tachometer-check
🚘 MDČR - odometer check
Last synced: 11 Jan 2026
https://github.com/zfael/scrape-it-all
Modular web scraper for Node.JS
crawler scraper scraping scraping-websites web-scraping
Last synced: 04 Feb 2026
https://github.com/terminaldweller/crawley
A creepy crawler that runs as a sleepy daemon.
Last synced: 04 Jul 2025
https://github.com/tca166/ck2-history-extractor
A tool for creating an encyclopedia from your CK2 savefile
Last synced: 02 Apr 2025
https://github.com/manu-sh/http_normalizer
http url normalization for web crawlers
crawler http spider url-normalization
Last synced: 12 Jun 2025
https://github.com/mrrefactoring/types-supercrawler
Types for supercrawler nodejs lib
crawler crawlerjs nodejs supercrawler types typescript typescript-definitions
Last synced: 18 Apr 2026
https://github.com/tigercosmos/web-crawler
Web Crawler in Java Maven Project
Last synced: 12 Jun 2025
https://github.com/itechbear/robotstxt
A Java clone of Google's robots.txt parser: https://github.com/google/robotstxt
crawler google-robotst-parser java robotstxt
Last synced: 14 Jan 2026
https://github.com/vuchkov/forbes-billionairs-list
Forbes Billionaires List Crawler - PHP, MySQL, headless browser, etc.
crawler headless-chrome php scraper website
Last synced: 29 Apr 2026
https://github.com/engageintellect/scrapers
A repository of web scrapers using Python & Scrapy
Last synced: 31 Mar 2025
https://github.com/splorg/sage
A scraper to get every quote from a book off of Goodreads.
books crawler datamining goodreads goodreads-data python scraper scrapy webcrawling webscraping
Last synced: 12 Jun 2025
https://github.com/rcmilan/ex-web-scraping
Web scraping with F#
crawler f-sharp fsharp fsharp-data scraper web-scraping xplot
Last synced: 27 May 2026
https://github.com/eghuro/crawlcheck
Extensible web crawler
configuration crawler http plugin python robots-txt sitemap
Last synced: 12 Apr 2026
https://github.com/casatrick/solana-transaction-crawler
Crawl and parse Solana transactions
crawler parser rust solana transaction
Last synced: 20 Jun 2026
https://github.com/athulmurali/flickr-api-docs-crawler
A Python-based crawler that extracts API documentation and writes it to a file as JSON. A documentation page can then be built from the JSON file using Docusaurus.
api beautifulsoup4 crawler documentation python3
Last synced: 18 Jun 2026
https://github.com/huakunshen/cron-crawler-template
Web Crawler Cron Job Template running with GitHub Action. Capable of sending email notifications.
Last synced: 15 May 2026
https://github.com/amirsorouri00/crawler
Page-Rank: public Python 2 projects which have been ported to Python 3.
Last synced: 05 Sep 2025
https://github.com/c17an/grade-tracer
👨💻 Crawler that tracks grade changes at Korea Aerospace University 🏑
concurrently crawler es6 express nodejs nodemon puppeteer react
Last synced: 13 Apr 2026
https://github.com/gabrielolobo/crawley
This project is designed to run crawlers and process the results based on the specified output format. It takes command-line arguments to select the crawler and output format.
crawler poetry python scrapping
Last synced: 22 Jun 2025
https://github.com/ggteixeira/motorcycle-simulator
A toy project that fetches motorcycle prices from OLX and does some calculations for those who want to buy them.
crawler motorcycle olx scraper
Last synced: 28 Feb 2025
https://github.com/linjonh/videowebsidesparser
This project parses a video website to remove ads.
Last synced: 13 Jun 2025
https://github.com/danielemoraschi/go-sitemap-app
crawler golang sitemap sitemap-generator
Last synced: 29 Apr 2026
https://github.com/danielemoraschi/sitemap-common
Simple PHP Sitemap generator and crawler library.
crawler php php-library php-sitemap-generator sitemap
Last synced: 11 Mar 2026
https://github.com/raspi/scrapy-kuntavaalit2021-keskisuomalainen
Fetch Keskisuomalainen kuntavaalit 2021 data
crawler mirror python scrapy spider webcrawler
Last synced: 26 Apr 2025
https://github.com/raspi/scrapy-kuntavaalit2021-sanoma
Fetch Sanoma kuntavaalit 2021 data
crawler mirror python scrapy spider webcrawler
Last synced: 26 Apr 2025
https://github.com/raspi/scrapy-kuntavaalit2021-almamedia
Fetch Almamedia kuntavaalit 2021 data
crawler mirror python scrapy spider webcrawler
Last synced: 26 Apr 2025
https://github.com/radityaharya/sitesweeper
Sitesweeper is a Python package that helps you automate your web scraping process, outputting pages to a file.
crawler pdf python website-crawler
Last synced: 27 Mar 2025
https://github.com/basemax/crawler-news-currency-gold-coins
PHP crawler that collects Persian news related to currency, coins, and gold.
crawler crawler-php crawler-testing currency currency-exchange-rates gold php php-crawler
Last synced: 05 Jul 2025
https://github.com/der3318/daily-pixiv
Integrated Flow - Line Notification of Top Ranked Pixiv Illustrations
crawler line-notify pixiv workflow
Last synced: 03 Mar 2025
https://github.com/shentengtu/cht-yp-crawler
Simple Crawler of www.iyp.com.tw.
crawler node-js nodejs yellow-pages yellowpages
Last synced: 09 May 2026
https://github.com/hackthedev/botnet
Tool to find IPs on the Web, check SSH availability, and brute-force logins with a wordlist. For educational use only!
botnet bruteforce crawler education educational ip malicious proof-of-concept ssh testing web
Last synced: 17 Mar 2025
https://github.com/massongit/ibaraki-univ-circle-crawler
Crawls the official circles of Ibaraki University from the university's website.
Last synced: 25 Mar 2025
https://github.com/w3labkr/ipynb-scraper
A collection of frequently used Jupyter notebook code.
crawler ipynb jupyter jupyter-notebook python scrapper
Last synced: 19 Apr 2026
https://github.com/hvtuananh/twitter_crawler
Daemon to call and get tweets from Twitter Public Stream API
crawler java streaming-api tweets twitter twitter-crawler
Last synced: 11 Mar 2025
https://github.com/cls1991/gank.io-go
A simple crawler for fetching pictures from http://gank.io, implemented in golang.
crawler gankio goquery pictures
Last synced: 27 Feb 2025
https://github.com/ericc-ch/crawldown
Crawl websites and convert their pages into clean, readable Markdown content using Mozilla's Readability and Turndown.
Last synced: 05 Jul 2025
https://github.com/matheusfaustino/jazzmaster_crawler
A crawler that fetches the audio episodes of a radio program called Jazzmaster.
Last synced: 14 Jun 2025
https://github.com/jenting/compare-drugstore-price
Compares prices between cosmeceutical shops.
cosmed crawler golang poya side-project watsons
Last synced: 27 Mar 2025
https://github.com/marcosvbras/twitton
A simple Python library that makes the Twitter Search API easy to use.
crawler crawling python spider twitter twitter-api
Last synced: 27 Mar 2025
https://github.com/kasperomari/simplecrawlerapi
A simple RESTful API that takes a URL and returns all the links found down to a specified depth.
crawler flask-api flask-restful
Last synced: 02 Apr 2025
https://github.com/lesterrry/campfire
Shock-drop watching utility
crawler parser web-crawler web-parser
Last synced: 13 Jun 2026
https://github.com/moe131/webcrawler
Python web crawler designed to scrape websites
crawler crawling-python python python-crawler scraping simhash web-crawler
Last synced: 09 Apr 2025
https://github.com/dmarcosl/upshelf-technical-test
Technical test for Upshelf
crawler interview python scraping scrapy spider technical-test web-scraping
Last synced: 09 Apr 2025
https://github.com/ismoreirakt/spyder
The web is changing. Spyder sees it.
alerts automation crawler monitor
Last synced: 01 Mar 2025
https://github.com/mnemocron/VPNNetworkShareCrawler
ugly scripts to connect a Raspberry Pi to a VPN and attach network share to periodically crawl the documents on it
Last synced: 11 Mar 2025
https://github.com/appliedsoul/headless-screenshot
High-level library for taking screenshots of websites, based on headless Chrome (Puppeteer).
crawler headless-chromium javascript nodejs scrapper screenshot testing
Last synced: 21 Apr 2026
https://github.com/ggteixeira/corpus-cleaner
Linguistic tool (made by a linguist, for linguists) that scrapes corpora, automatically cleans them up, and generates n-grams.
beautifulsoup4 bs4 corpora corpus corpus-linguistics crawler linguistics nlp python scraper web-scraping
Last synced: 28 Feb 2025
https://github.com/yosh1/mio-crawler
A crawler that retrieves IIJmio data usage.
Last synced: 10 May 2026
https://github.com/robin98sun/structured-web-data-crawler
crawler multi-thread structured-web-data
Last synced: 16 Mar 2025
https://github.com/bramtenhove/issue-crawler
Crawls Drupal issues and keeps stats
Last synced: 09 Jan 2026
https://github.com/yangxuhui/requests-google
A simple Google-related parsing package.
Last synced: 14 Jan 2026
https://github.com/k0nxt3d/web-scrapers
Web scraping scripts in PHP and Bash
bash bot clone cloning crawler curl curlphp download mirroring scraping scraping-websites seo seo-optimization shell-script spider wget
Last synced: 31 Dec 2025
https://github.com/usethisname1419/connectioncrawler
Crawls a website and checks for connections.
connection crawler http-headers reporting website-analyzer
Last synced: 06 Jul 2025
https://github.com/mikiw/reactweb3
Ethereum transaction crawler in ReactJs.
Last synced: 14 May 2026
https://github.com/loko5ja/seed-gen
Seed-gen is an innovative tool designed to generate unique and creative seed phrases for cryptocurrency wallets. With a focus on security and usability, it ensures that users have robust, memorable keys for safeguarding their digital assets efficiently.
crawler crypto crypto-2025 crypto-bot crypto-finder crypto-recovery ethereum-bruteforce laravel lost-btc-wallet-finder mnemonic-generator seed-crypto seed-recovery seed-tool yeoman
Last synced: 03 Apr 2025
https://github.com/nowshad-sust/corona
A simple data endpoint for coronavirus updates
api corona coronavirus-updates crawler dcoker-compose excel nodejs
Last synced: 17 May 2026
https://github.com/sssshefer/web-crawler-http
Basic web crawler which represents the linking structure of the website
Last synced: 01 Mar 2025
https://github.com/allancapistrano/anime-sheets
Crawler that fetches anime information and saves it to a spreadsheet.
anime crawler google-sheets google-sheets-api
Last synced: 16 Mar 2025
https://github.com/roc41d/http-web-crawler
HTTP web crawler with Node.js + TDD
crawler http javascript jest jest-test nodejs webcrawler
Last synced: 13 Apr 2026
https://github.com/moojing/coinmarketcap-crypto-crawler
A Raycast plugin for getting the latest price of your favorite coins from CoinMarketCap.
Last synced: 01 Apr 2025
https://github.com/d-w-arnold/local-news-data-collection
Web crawler for local news sites - Generates HTML files of each webpage visited and a list of links found on the webpage, as a TXT file 🌎
crawler data-collection python
Last synced: 01 Apr 2025
https://github.com/keizerzilla/ssh-hunter
Script that hunts for vulnerable Raspberry Pis on the internet (open SSH port and unchanged default password).
Last synced: 10 Apr 2025
https://github.com/keizerzilla/search4dwango9
My attempt to help solve the DWANGO9 WAD mystery. More info: https://www.youtube.com/watch?v=RXGtCjdwwe8
Last synced: 10 Apr 2025
https://github.com/mevljas/gov.si-crawler-playwright
A standalone crawler that crawls only .gov.si web sites using Playwright.
crawler multithreading playwright sqlachemy
Last synced: 19 Jan 2026
https://github.com/tiennhm/crawl-sanfoundry-mcqs
Sanfoundry MCQs Crawler
beautifulsoup4 bs4 crawler csv flask python
Last synced: 13 Apr 2026
https://github.com/mehdieidi/offliner
Offliner is a tool that makes a website viewable offline. It is a concurrent web crawler that saves all the pages and static files in a directory.
concurrency concurrent concurrent-programming crawler go golang goroutine multiprocessing multithreading process scraper thread
Last synced: 14 Jan 2026
https://github.com/heitor57/astronomy-news
🔭📰 Astronomy News
crawler data-science news text-mining
Last synced: 06 Oct 2025
https://github.com/b3j4y/unidisk
A crawler that searches for keywords and compares the scores.
comparison crawler nlp solr-client
Last synced: 17 Jan 2026
https://github.com/semoal/pythoncrawler
Python crawler with XML-RPC & BeautifulSoup
beautifulsoup crawler python wordpress xmlrpc
Last synced: 15 Apr 2026
https://github.com/heyihuang826/ncku_course
Efficiently and reliably scrapes course information from National Cheng Kung University, on a regular basis if you choose to store the data on OneDrive. The collected data is organized into Excel files and can be automatically uploaded to OneDrive or saved locally (to your personal computer or a GitHub repo).
Last synced: 01 Mar 2026
https://github.com/constaf79/pycn
🔗 Simplify your cryptocurrency tasks with pycoin, a Python library providing essential utilities for Bitcoin and alt-coins, ensuring seamless transactions and operations.
cnc-machine cnc-milling-controller cnn cnn-model cnn-processors computer-vision crawler edge-detection fun image-classification image-processing library neural-network pillow pycnc python raspberry-pi web
Last synced: 14 May 2026
https://github.com/dasantonym/node-cesspoll
💩 Turd Miner Node Module
crawler news poopetry potty-humour
Last synced: 28 Oct 2025
https://github.com/nyarla/net-paranoid-go
(WIP) Paranoid helpers for crawlers of untrusted web content
crawler filtering golang helper
Last synced: 14 Jan 2026
https://github.com/huyduc1602/uniapp-crawler
Crawl and translate the Uni-app documentation
Last synced: 25 Jan 2026
https://github.com/viko16/hatcher
🐣[WIP] Provides APIs by simple configuration.
api api-server cli crawler koa-middleware nodejs spider
Last synced: 08 Oct 2025
https://github.com/romangw/lukki
Completely free code for a webcrawling bot.
crawler python web-scraping web-scraping-python
Last synced: 08 Oct 2025
https://github.com/killianmeersman/wander
Convenient scraping library for Gophers
crawler data-mining golang scraper spider
Last synced: 14 Jan 2026
https://github.com/bernieyangmh/check-link
Checks an entire website and identifies broken links.
Last synced: 14 Jan 2026