Crawler
A Web crawler, sometimes called a spider or spiderbot and often shortened to crawler, is an Internet bot that systematically browses the World Wide Web and that is typically operated by search engines for the purpose of Web indexing (web spidering).
- GitHub: https://github.com/topics/crawler
- Wikipedia: https://en.wikipedia.org/wiki/Web_crawler
- Last updated: 2026-07-01 00:06:39 UTC
- JSON Representation
https://github.com/basemax/okala-store-ids
A PHP script designed to systematically query the Okala API and extract a comprehensive list of valid store IDs. By automating the retrieval of store details, it enables users to efficiently compile and maintain an up-to-date dataset of active Okala stores for analysis, integration, or further processing.
crawler curl id ids ir iran okala okala-store okala-store-id php store store-okala
Last synced: 10 Jun 2025
https://github.com/sedrubal/webcrawler
Crawl sites and search for security issues.
crawler script security website-auditing
Last synced: 17 Mar 2025
https://github.com/dubniczky/webmap
Website mapping crawler implemented in python
crawler mapping mapping-tools package python scraping security
Last synced: 31 Mar 2025
https://github.com/dubniczky/bad-robot
This is a python crawler that disregards robots.txt rules and downloads disallowed resources
crawler osint-python osint-tool python robots-txt
Last synced: 31 Mar 2025
https://github.com/rsheremeta/web-crawler
A tiny web-crawler which looks for the links, extract and prints them concurrently to the Terminal output
crawler go golang web-crawler webcrawler
Last synced: 12 Jun 2026
https://github.com/jlenon7/sef_automation
📑 Crawler that automatically enrol in open vacancies in SEF website.
athenna crawler esm nodejs playwright portugal residence sef typescript
Last synced: 03 Mar 2026
https://github.com/ecklf/reddit-clawler
A command-line tool written in Rust that crawls Reddit posts from a user or subreddit
cli crawler downloader downloader-for-reddit reddit
Last synced: 31 Mar 2025
https://github.com/zaneh/ocw-crawler
Crawl MIT OpenCourseWare courses with Kimurai. Not affiliated.
crawler kimurai mit ocw opencourseware spider
Last synced: 28 May 2026
https://github.com/seanghay/wpget
⚡️wpget - A tool for downloading all posts from a WordPress website via public JSON API
Last synced: 08 Feb 2026
https://github.com/balintpethe/laravel-universal-scraper
Universal Scraper for Laravel
crawler laravel scraper web-scraper
Last synced: 13 Jan 2026
https://github.com/yuchenq/comp90055-project
This is the lastest version of my project belong to Comp90055.
couchdb crawler data-visualization python3 textblob tweepy
Last synced: 16 Jul 2025
https://github.com/phatpham9/scraper.fun
Building, using & sharing HTML scraper are way funnier!
Last synced: 24 Mar 2025
https://github.com/longluo/spider
My Python Spider / Crawler
crawler python spider twitter weibo weibo-crawler weibo-spider
Last synced: 11 Jun 2025
https://github.com/billy0402/python-application
A learning project from the book 'Python 技術者們'.
course crawler matplotlib opencv pandas python requests selenium sklearn
Last synced: 12 Apr 2026
https://github.com/ma-pony/playwright-spider-utils
Playwright Spider Utils is a utility library for engineers using the Playwright framework to build web crawlers. This project provides common web scraping functions, simplifying the process of crawler development and enhancing productivity.
crawl crawler playwright python scrapy selenium spider spiderman
Last synced: 06 Jan 2026
https://github.com/manikantasanjay/stackoverflow_tag_generator_webcrawler
StackOverFlow Tag Generator Using a WebCrawler.
Last synced: 08 Apr 2025
https://github.com/eklem/vinmonopolet-crawler
Crawling Vinmonopolet-data and indexing it to a norch search index
crawler dataset javascript norch search-engine
Last synced: 26 Mar 2025
https://github.com/allancapistrano/steam.py
An API wrapper for Steam written in Python.
Last synced: 16 Mar 2025
https://github.com/reineimi/va2crawl
Website crawler, validator and SEO optimizer
crawler seo-optimization seotools validator website-crawler
Last synced: 07 Jul 2025
https://github.com/bradsec/gomine
A Go CLI tool to quickly crawl and mine (download) specific file types from websites.
cli crawler golang terminal-based
Last synced: 09 Apr 2025
https://github.com/montenegrodr/letmecrawl
Curated free proxies
crawler proxy proxy-server proxypool scraper
Last synced: 18 Jan 2026
https://github.com/davelongdev/link-report-crawler
A web crawler using Node.js that crawls a site and returns a report showing all internal links.
crawler crawling javascript seo seo-tools
Last synced: 16 Jun 2025
https://github.com/ahsouza/iquizz-api
API RESTfull developed in Node.Js with MongoDB
animations cluster crawler docker docker-compose ejs-templates es8 font-awesome grunt-task helmet-detection heroku javascript jquery material-design mongodb nodejs passport-strategy passportjs pusher token-authetication
Last synced: 12 Apr 2026
https://github.com/mustafadalga/website-crawler
Hedef web sitesini tarayarak linklerini listeleyen bir web crawler scripti || A web crawler script that lists links by scanning the target website.
crawl crawler crawling-sites hacking hacking-tool web-crawler web-crawler-python web-crawling
Last synced: 20 Apr 2026
https://github.com/filipsedivy/tachometer-check
🚘 MDČR - kontrola tachometru
Last synced: 11 Jan 2026
https://github.com/zfael/scrape-it-all
Modular web scraper for Node.JS
crawler scraper scraping scraping-websites web-scraping
Last synced: 04 Feb 2026
https://github.com/terminaldweller/crawley
A creepy crawler that runs as a sleepy daemon.
Last synced: 04 Jul 2025
https://github.com/tca166/ck2-history-extractor
A tool for creating an encyclopedia from your CK2 savefile
Last synced: 02 Apr 2025
https://github.com/manu-sh/http_normalizer
http url normalization for web crawlers
crawler http spider url-normalization
Last synced: 12 Jun 2025
https://github.com/mrrefactoring/types-supercrawler
Types for supercrawler nodejs lib
crawler crawlerjs nodejs supercrawler types typescript typescript-definitions
Last synced: 18 Apr 2026
https://github.com/tigercosmos/web-crawler
Web Crawler in Java Maven Project
Last synced: 12 Jun 2025
https://github.com/itechbear/robotstxt
A java clone of Google's robotst.txt parser: https://github.com/google/robotstxt
crawler google-robotst-parser java robotstxt
Last synced: 14 Jan 2026
https://github.com/vuchkov/forbes-billionairs-list
Forbes Billionairs List Crawler - PHP, MySQL, Headless browser, etc.
crawler headless-chrome php scraper website
Last synced: 29 Apr 2026
https://github.com/engageintellect/scrapers
A repository of web scrapers using Python & Scrapy
Last synced: 31 Mar 2025
https://github.com/splorg/sage
A scraper to get every quote from a book off of Goodreads.
books crawler datamining goodreads goodreads-data python scraper scrapy webcrawling webscraping
Last synced: 12 Jun 2025
https://github.com/rcmilan/ex-web-scraping
Web Scraping com F#
crawler f-sharp fsharp fsharp-data scraper web-scraping xplot
Last synced: 27 May 2026
https://github.com/eghuro/crawlcheck
Extensible web crawler
configuration crawler http plugin python robots-txt sitemap
Last synced: 12 Apr 2026
https://github.com/casatrick/solana-transaction-crawler
crawl & parse solana transaction
crawler parser rust solana transaction
Last synced: 20 Jun 2026
https://github.com/athulmurali/flickr-api-docs-crawler
A python based crawler that extracts the documentation of apis and writes it into a file as JSON. A beautiful documentation page can be built from the JSON file using Docusaurus
api beautifulsoup4 crawler documentation python3
Last synced: 18 Jun 2026
https://github.com/amirsorouri00/crawler
Page-Rank Public python2 projects whice have been turned into python3.
Last synced: 05 Sep 2025
https://github.com/c17an/grade-tracer
👨💻 항공대 성적변동 추적 크롤러 🏑
concurrently crawler es6 express nodejs nodemon puppeteer react
Last synced: 13 Apr 2026
https://github.com/gabrielolobo/crawley
This project is designed to run crawlers and process the results based on the specified output format. It takes command-line arguments to select the crawler and output format.
crawler poetry python scrapping
Last synced: 22 Jun 2025
https://github.com/ggteixeira/motorcycle-simulator
A toy project that fetches prices from motorcycles from OLX and does some calculations for those who want to buy them..
crawler motorcycle olx scraper
Last synced: 28 Feb 2025
https://github.com/linjonh/videowebsidesparser
This Project is used to parse a video web side to remove ads.
Last synced: 13 Jun 2025
https://github.com/danielemoraschi/go-sitemap-app
crawler golang sitemap sitemap-generator
Last synced: 29 Apr 2026
https://github.com/danielemoraschi/sitemap-common
Simple PHP Sitemap generator and crawler library.
crawler php php-library php-sitemap-generator sitemap
Last synced: 11 Mar 2026
https://github.com/raspi/scrapy-kuntavaalit2021-keskisuomalainen
Fetch Keskisuomalainen kuntavaalit 2021 data
crawler mirror python scrapy spider webcrawler
Last synced: 26 Apr 2025
https://github.com/raspi/scrapy-kuntavaalit2021-sanoma
Fetch Sanoma kuntavaalit 2021 data
crawler mirror python scrapy spider webcrawler
Last synced: 26 Apr 2025
https://github.com/raspi/scrapy-kuntavaalit2021-almamedia
Fetch Almamedia kuntavaalit 2021 data
crawler mirror python scrapy spider webcrawler
Last synced: 26 Apr 2025
https://github.com/basemax/crawler-news-currency-gold-coins
PHP Crawler to get Persian news related to currency coin and gold.
crawler crawler-php crawler-testing currency currency-exchange-rates gold php php-crawler
Last synced: 05 Jul 2025
https://github.com/der3318/daily-pixiv
Integrated Flow - Line Notification of Top Ranked Pixiv Illustrations
crawler line-notify pixiv workflow
Last synced: 03 Mar 2025
https://github.com/shentengtu/cht-yp-crawler
Simple Crawler of www.iyp.com.tw.
crawler node-js nodejs yellow-pages yellowpages
Last synced: 09 May 2026
https://github.com/massongit/ibaraki-univ-circle-crawler
Crawls official circles in Ibaraki University from university's website
Last synced: 25 Mar 2025
https://github.com/w3labkr/ipynb-scraper
A collection of frequently used Jupiter notebook code.
crawler ipynb jupyter jupyter-notebook python scrapper
Last synced: 19 Apr 2026
https://github.com/hvtuananh/twitter_crawler
Daemon to call and get tweets from Twitter Public Stream API
crawler java streaming-api tweets twitter twitter-crawler
Last synced: 11 Mar 2025
https://github.com/cls1991/gank.io-go
A simple crawler for fetching pictures from http://gank.io, implemented in golang.
crawler gankio goquery pictures
Last synced: 27 Feb 2025
https://github.com/ericc-ch/crawldown
Crawl websites and convert their pages into clean, readable Markdown content using Mozilla's Readability and Turndown.
Last synced: 05 Jul 2025
https://github.com/matheusfaustino/jazzmaster_crawler
It is a crawling for getting the audio programs from a specific radio program called Jazzmaster
Last synced: 14 Jun 2025
https://github.com/kasperomari/simplecrawlerapi
A simple RESTful API that takes a URL and returns all the links in a specific depth.
crawler flask-api flask-restful
Last synced: 02 Apr 2025
https://github.com/lesterrry/campfire
Shock-drop watching utility
crawler parser web-crawler web-parser
Last synced: 13 Jun 2026
https://github.com/moe131/webcrawler
Python web crawler designed to scrape websites
crawler crawling-python python python-crawler scraping simhash web-crawler
Last synced: 09 Apr 2025
https://github.com/ismoreirakt/spyder
The web is changing. Spyder sees it.
alerts automation crawler monitor
Last synced: 01 Mar 2025
https://github.com/mnemocron/VPNNetworkShareCrawler
ugly scripts to connect a Raspberry Pi to a VPN and attach network share to periodically crawl the documents on it
Last synced: 11 Mar 2025
https://github.com/appliedsoul/headless-screenshot
High-level library for taking screenshot of websites based on headless chrome (puppeteer)
crawler headless-chromium javascript nodejs scrapper screenshot testing
Last synced: 21 Apr 2026
https://github.com/robin98sun/structured-web-data-crawler
crawler multi-thread structured-web-data
Last synced: 16 Mar 2025
https://github.com/bramtenhove/issue-crawler
Crawls Drupal issues and keeps stats
Last synced: 09 Jan 2026
https://github.com/yangxuhui/requests-google
A simple google related Parsing Package
Last synced: 14 Jan 2026
https://github.com/k0nxt3d/web-scrapers
Web Scraping Scripts in PhP and Bash
bash bot clone cloning crawler curl curlphp download mirroring scraping scraping-websites seo seo-optimization shell-script spider wget
Last synced: 31 Dec 2025
https://github.com/usethisname1419/connectioncrawler
crawls a website and checks for connections
connection crawler http-headers reporting website-analyzer
Last synced: 06 Jul 2025
https://github.com/mikiw/reactweb3
Ethereum transaction crawler in ReactJs.
Last synced: 14 May 2026
https://github.com/loko5ja/seed-gen
Seed-gen is an innovative tool designed to generate unique and creative seed phrases for cryptocurrency wallets. With a focus on security and usability, it ensures that users have robust, memorable keys for safeguarding their digital assets efficiently.
crawler crypto crypto-2025 crypto-bot crypto-finder crypto-recovery ethereum-bruteforce laravel lost-btc-wallet-finder mnemonic-generator seed-crypto seed-recovery seed-tool yeoman
Last synced: 03 Apr 2025
https://github.com/nowshad-sust/corona
A simple data endpoint for coronavirus updates
api corona coronavirus-updates crawler dcoker-compose excel nodejs
Last synced: 17 May 2026
https://github.com/sssshefer/web-crawler-http
Basic web crawler which represents the linking structure of the website
Last synced: 01 Mar 2025
https://github.com/allancapistrano/anime-sheets
Crawler que pega as informações dos animes e salva numa planilha.
anime crawler google-sheets google-sheets-api
Last synced: 16 Mar 2025
https://github.com/roc41d/http-web-crawler
Http web crawler with Nodejs + TDD
crawler http javascript jest jest-test nodejs webcrawler
Last synced: 13 Apr 2026
https://github.com/moojing/coinmarketcap-crypto-crawler
A Raycast plugin for getting the latest price of your favorite coins from CoinMarketCap.
Last synced: 01 Apr 2025