Crawler
A Web crawler, sometimes called a spider or spiderbot and often shortened to crawler, is an Internet bot that systematically browses the World Wide Web and that is typically operated by search engines for the purpose of Web indexing (web spidering).
- GitHub: https://github.com/topics/crawler
- Wikipedia: https://en.wikipedia.org/wiki/Web_crawler
- Last updated: 2026-06-22 00:06:47 UTC
https://github.com/appliedsoul/promise-crawler
Promise support for node-crawler (Web Crawler/Spider for NodeJS + server-side jQuery)
crawler node-crawler nodejs promise-node-crawler spider
Last synced: 28 Feb 2026
https://github.com/nakabonne/webcrawlerforserps
Web crawler that scrapes Google search results
Last synced: 22 Oct 2025
https://github.com/jacobsteves/crawlperl
A web crawler made with Perl. Great for grabbing or searching for data off the web, or ensuring that your own site files are secure and hidden.
crawler perl scripting web-crawler
Last synced: 14 Apr 2025
https://github.com/yaroslaff/bulk-http-check
Very fast and simple concurrent HTTP client (3500 HTTP req/s)
bulk check concurrent connections crawler header http https multiple parallel spider status
Last synced: 13 Apr 2025
https://github.com/windfarer/biu
biubiubiu~~ I'm a tiny web crawler framework
crawler python spider spider-framework web-crawler
Last synced: 23 Mar 2025
https://github.com/adambankz/tiktok-scraper
A simple, no-download scraper for social media platforms like TikTok. Just input parameters and parse useful data. Download TikTok videos with no watermark.
crawler no-watermark parse scraper scraper-site tiktok-no-watermark tiktok-scraper
Last synced: 19 Feb 2026
https://github.com/simin75simin/libgencrawl
crawl all books from a library genesis search
crawler free-software libgen python3 scraper
Last synced: 05 Apr 2025
https://github.com/box-archived/vlive-py
VLIVE(vlive.tv) parser for python
api-wrapper crawler kpop parser python vlive
Last synced: 14 Jan 2026
https://github.com/amirhoseinsb/Cloud_Player_V2
Use the cloudplayer tool to listen to music by the singer you want, quickly and without visiting a specific website.
cloud-player crawler crawling music music-player programming python url-player
Last synced: 08 Jul 2025
https://github.com/wenyalintw/job-scraper-bot
A fun Telegram bot made for a friend, deployed to Heroku
amazon-web-services aws-s3 boto3 crawler google-drive google-drive-api heroku heroku-deployment python-telegram-bot scraper scraping scrapy telegram telegram-bot telegram-bot-api web-scraping
Last synced: 13 Sep 2025
https://github.com/yerkopalma/bash-crawler
💻 Get a site's links with bash
Last synced: 05 Aug 2025
https://github.com/vmarcosp/supervise-crawler
🕵️‍♂️ Supervise crawler
crawler esy ocaml reasonml webcrawler
Last synced: 13 May 2025
https://github.com/bfwg/node-tinycrawler
Tiny web-crawler in a nutshell for Node.js
Last synced: 10 Nov 2025
https://github.com/markmelnic/mobile-de-crawler
A crawler for mobile.de to index all car listings on the website.
crawler requests scraper sqlite3
Last synced: 08 Oct 2025
https://github.com/florinutz/filme
Filme provides utilities for torrenting movies
crawler golang movies torrents
Last synced: 14 Jan 2026
https://github.com/dori-dev/quotes-crawler
Quotes crawler using scrapy and python.
crawler crawling python scraping-python scraping-websites scrapy scrapy-crawler scrapy-spider web-scraper
Last synced: 08 Oct 2025
https://github.com/xvc323/omnidocs
Automated documentation crawler that generates LLM-friendly Markdown from any docs site. Export as single or multi-file, ready for AI ingestion.
crawler documentation llm markdown
Last synced: 27 Jun 2025
https://github.com/alexqi/webphantom
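An open-source crawler framework for web data collection tasks. It supports core features such as API calls, task scheduling, and session management, and is suited to building automated collection systems with some ability to get past anti-crawling measures (Douyin | Xiaohongshu | Taobao | JD.com)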
crawler douyin qps scheduler taobao xiaohonghsu
Last synced: 22 Jun 2026
https://github.com/mashukui/dy_trans_tool
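A Douyin conversion GUI tool developed in Python. It supports converting between Douyin IDs and profile-link UIDs, converting app-side post links to PC-side links, and more. Douyin crawler | Douyin tool | Douyin collection tool | Douyin collection | Douyin collection software | Douyin productivity tool | Douyin data scraping | douyin | Douyin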
crawler douyin douyin-api gui gui-application python3
Last synced: 04 Apr 2026
https://github.com/rvegas/dota_crawler
Crawler for dotapedia. Fills a Mongo and a PG database with game data.
crawler dota dota2 flask mongodb postgresql python3 regex scrapy
Last synced: 05 Sep 2025
https://github.com/hybridx/webscraper
Web crawler made with Beautiful Soup
crawler flask google-dorks javascript python3 search-engine
Last synced: 07 May 2025
https://github.com/mrrfv/webarchive
Crawls websites and saves found URLs to a file.
archive archiveteam archiving crawler crawling ia internet-archive scraper web-archiving web-scraping
Last synced: 18 Mar 2025
https://github.com/AmirAref/Torobot
An inline Telegram bot for easily accessing and searching torob.com products from Telegram.
crawler python python-telegram-bot scraper telegtam-bot
Last synced: 13 Jul 2025
https://github.com/amirzenoozi/persian-news-crawler
Simple script to crawl data from Persian news agencies, including Fars and Mehr.
cli crawler database fars-news farsi-datasets kaggle-dataset mehr-news news news-agencies newspaper python python3 script shargh-news sqlite3 tensorflow tensorflow2
Last synced: 13 Apr 2025
https://github.com/akiosarkiz/manga-collector
The manga collector is a library designed to easily scrape manga content from various websites. This package is licensed under the MIT License and is fully test-covered
Last synced: 10 Jul 2025
https://github.com/twtrubiks/pttcrawlercontent
PTT Crawler Content in Python (PTT article crawler)
Last synced: 15 Apr 2025
https://github.com/jonasgeiler/Iconmonstr-API
An unofficial API to access icons from iconmonstr.com
api collection collections crawler eps font icon icon-font iconmonstr iconmonstr-api icons image images png psd scraper svg unofficial vector vector-graphics
Last synced: 10 Mar 2025
https://github.com/simsso/vision-based-page-rank-estimation
Student research project on pagerank estimation with deep graph networks
cnn crawler deep-learning graph-networks page-rank student-research-project
Last synced: 24 Apr 2025
https://github.com/doroudi/imdb-crawler
imdb.com movies crawler in scrapy
crawler data-mining python scrapy
Last synced: 22 Jun 2025
https://github.com/markelog/map
Simple sitemap generator; supports a couple of reporters, depth levels, etc.
Last synced: 11 Apr 2025
https://github.com/gabfl/sitecrawl
Simple Python module to crawl a website and extract URLs
crawl crawler crawler-python crawling-sites
Last synced: 10 Apr 2025
https://github.com/ivangrana/minerador-noticias-labsc
Keyword-based news scraper, using the BeautifulSoup library in Python
Last synced: 17 Oct 2025
https://github.com/integralist/go-web-crawler
A web crawler built in the Go programming language
concurrency crawler go golang web-crawler
Last synced: 26 Oct 2025
https://github.com/chusiang/crawler-book-info
A crawler for quickly parsing book information
Last synced: 12 Apr 2025
https://github.com/arshadkazmi42/blc
Broken link checker
blc broken-link-checker broken-link-finder bug-bounty bugbounty crawler python
Last synced: 30 Oct 2025
https://github.com/AmirAref/DivarCrawler
A script to crawl divar.ir and extract phone numbers
Last synced: 13 Jul 2025
https://github.com/basemax/googleplaydatabasemirror
A crawler script written in PHP that updates a mirror database from Google Play.
crawl crawl-pages crawler crawlers crawling database database-schema google-play mysql php
Last synced: 24 Sep 2025
https://github.com/hctilg/taaghche-dl
Save books purchased from taaghche.com!
crawler downloader pillow-library python3 selenium taaghche
Last synced: 12 May 2025
https://github.com/xlisp/ai-auto-crawler
ai-auto-crawler: puppeteer + autogen
Last synced: 31 Aug 2025
https://github.com/neilblaze/smapviw
Sitemap Visualizer built upon D3.js
crawler sitemap sitemap-generator
Last synced: 06 Oct 2025
https://github.com/bitscoper/bitscoper_cyber_toolbox
A Flutter application consisting of TCP Port Scanner, Route Tracer, Pinger, File Hash Calculator, String Hash Calculator, Base Encoder, Morse Code Translator, Open Graph Protocol Data Extractor, Series URI Crawler, DNS Record Retriever, and WHOIS Retriever.
android calculator crawler cybersecurity dart decoder docker encoder extractor flutter github-action ios mac retriever scanner tracer translator web windows
Last synced: 31 Jul 2025
https://github.com/bernabe9/render-it
Render any JavaScript content to create static sites ready for SEO
crawler javascript prerender prerenderio puppeteer render seo seo-tools server-side-rendering static-site static-site-generator
Last synced: 12 Jun 2025
https://github.com/prdx23/async-crawler
A recursive async crawler which creates a graph of connected webpages
Last synced: 17 Jan 2026
https://github.com/dotenorio/freeloader-of-data
A simple crawler or scraper to get open graph and other meta data from any website.
crawler graph hacktoberfest meta-data open-graph scraper
Last synced: 13 Mar 2025
https://github.com/idealchain/dhtcrawler-cluster
BitTorrent DHT crawling cluster
cluster crawler dht docker-images torrent
Last synced: 27 Sep 2025
https://github.com/btlmd/thuhole_crawler
A crawler to save holes from the now-defunct thuhole
Last synced: 16 Jun 2025
https://github.com/ajcerejeira/base.gov.pt
A crawler that fetches data from base.gov.pt
Last synced: 14 Jul 2025
https://github.com/brucewind/fear-and-greed-index-alarm
A notification reminder for indicating when the CNN Fear and Greed Index is out of range.
crawler fear-and-greed fear-greed-index investment sctock stock-market us-stock-market
Last synced: 21 Jul 2025
https://github.com/frectonz/rampilo
A telegram crawler
crawler rust telegram telegram-crawler
Last synced: 07 Sep 2025
https://github.com/basemax/twitterbotcrawler
A bot that logs in to Twitter and processes pages with Selenium, using Python.
crawler crawler-twitter crawlers selenium-crawler selenium-example selenium-sample selenium-twitter twitter twitter-bot twitter-crawler twitter-py twitter-python twitter-selenium
Last synced: 05 May 2025
https://github.com/nobodxbodon/chromecrawlerwildspider
Chrome extension that crawls web pages by loading them into browser tabs in parallel.
chrome-extension crawler localstorage spider
Last synced: 07 May 2025
https://github.com/hypervapor/bilibili-crawler
Backend application for crawling Bilibili video information based on a list of keywords.
bilibili crawler express nodejs
Last synced: 14 Apr 2025
https://github.com/jean-baptiste-camps/iiif-crawler
Interrogate IIIF servers and get images of manuscripts
crawler iiif iiif-image manuscripts
Last synced: 29 Oct 2025
https://github.com/mcstreetguy/crawler
An advanced web-crawler written in PHP.
composer composer-library crawler crawler-engine guzzle http-requests php php-7 php-library web-crawler webcrawler
Last synced: 09 Apr 2025
https://github.com/oldkingcone/pbandj
PasteBin Crawler, crawls the url https://pastebin.com/archive
crawler headless headless-chrome python python-crawler selenium-python selenium-webdriver
Last synced: 26 Sep 2025
https://github.com/lucasboscatti/mercado-livre-crawler
A beginner data engineering project that scrapes offers from https://www.mercadolivre.com.br/ofertas, stores them in a Postgres database, and analyzes the scraped data.
crawler docker docker-compose heroku mercado-livre postgresql python scrapy sqlalchemy
Last synced: 06 Mar 2025
https://github.com/systemfsoftware/youtube-autocomplete-scraper
YouTube AutoComplete Scraper - An Apify actor that scrapes YouTube's search suggestions with intelligent deduplication using pglite and trigram similarity matching. Perfect for content research, SEO, and trend analysis.
actor apify autocomplete crawler deduplication pglite scraper search similarity suggestions trigram youtube youtube-api
Last synced: 25 Jun 2025
https://github.com/surelle-ha/dogma
Dogma is a CLI tool that enables interaction with the GitHub API for the purpose of searching .env files with specified keywords. You can configure a GitHub token and use the crawler to search for keys in .env files across public repositories.
Last synced: 22 Jun 2025
https://github.com/beingvirus/jobminer
JobMiner – A Python-based web scraping toolkit for extracting and organizing job listings from multiple websites into structured data.
automation beautifulsoup career crawler data-collection data-mining hacktoberfest hacktoberfest-accepted hacktoberfest2025 job-scraper jobs open-source python selenium web-scraping
Last synced: 10 Oct 2025
https://github.com/lgraubner/node-w3c-validator-cli
Crawls a given site and checks for W3C validity.
Last synced: 13 Apr 2025
https://github.com/meysam81/scry
Your website has problems you can't see. Scry finds them. Crawl your entire website across SEO, security, performance, and accessibility. No browser, no subscription.
accessibility cli command-line-tool crawler devops golang hreflang lighthouse link-checker pagespeed sarif security-headers seo seo-tools site-audit structured-data technical-seo web-performance web-security website-audit
Last synced: 14 Jun 2026
https://github.com/poyea/coronaflight-hkg
😷 Crawler and history manager for dangerous, coronavirus-infected flights to Hong Kong (VHHH)
corona coronaflight-hkg coronavirus coronavirus-analysis coronavirus-info coronavirus-tracker coronavirus-tracking crawl crawler crawlers crawling hacktoberfest hong-kong hongkong javascript json json-api node node-js nodejs
Last synced: 24 Mar 2025
https://github.com/moehmeni/ezweb
Easy to use web page analyzer
analyzer crawler scraper text-analysis text-classification text-mining webcrawler webcrawling webpage webscraper webscraping www
Last synced: 06 Apr 2025
https://github.com/yjyoon-dev/nara-crawler
Crawler for National Archives Catalog
Last synced: 10 Jul 2025
https://github.com/searchformyusername/dark-net-websites-dataset
Dataset of Onion Websites
crawler darknet data-analysis dataset onion search-engine website
Last synced: 16 Jun 2025
https://github.com/trudi-group/mc-crawler
A MobileCoin network crawler. Corresponding preprint available on arXiv (https://arxiv.org/pdf/2111.12364.pdf).
Last synced: 25 Jun 2025
https://github.com/manuel-lang/autonomous-semantic-search-engine
Submission for HackDataKIBots 2018 - Web crawler combined with document analysis
crawler hackathon machine-learning mannheim microsoft natural-language-processing natural-language-understanding nextiteration rnv semantic-search textract
Last synced: 03 May 2025
https://github.com/pritom007/facebookparser
crawler facebook facebook-crawler mongodb public-page pytohn selenium-python
Last synced: 25 Jul 2025
https://github.com/lonsty/zcooldl
ZCool picture crawler. Download ZCool (https://www.zcool.com.cn/) designer's or user's pictures, photos and illustrations.
Last synced: 18 Jan 2026