Crawler | Ecosyste.ms: Awesome

https://github.com/appliedsoul/promise-crawler

Promise support for node-crawler (Web Crawler/Spider for NodeJS + server-side jQuery)

crawler node-crawler nodejs promise-node-crawler spider

Last synced: 28 Feb 2026

https://github.com/nakabonne/webcrawlerforserps

Web crawler that scrapes Google search results

cli crawler golang

Last synced: 22 Oct 2025

https://github.com/jacobsteves/crawlperl

A web crawler made with Perl. Great for grabbing or searching for data off the web, or ensuring that your own site files are secure and hidden.

crawler perl scripting web-crawler

Last synced: 14 Apr 2025

https://github.com/yaroslaff/bulk-http-check

Very fast and simple concurrent HTTP client (3500 HTTP req/s)

bulk check concurrent connections crawler header http https multiple parallel spider status

Last synced: 13 Apr 2025

https://github.com/jadbin/xpaw

Async web scraping framework

async crawler spider

Last synced: 16 Jan 2026

https://github.com/windfarer/biu

biubiubiu~~ I'm a tiny web crawler framework

crawler python spider spider-framework web-crawler

Last synced: 23 Mar 2025

https://github.com/xanke/node-crawler-server

A lightweight Node.js remote scraping server

crawler nodejs server

Last synced: 26 Jul 2025

https://github.com/adambankz/tiktok-scraper

A simple, no download scraper for social media platforms like TikTok. Just input parameters and parse useful data. Download TikTok videos with no watermark

crawler no-watermark parse scraper scraper-site tiktok-no-watermark tiktok-scraper

Last synced: 19 Feb 2026

https://github.com/simin75simin/libgencrawl

Crawl all books from a Library Genesis search

crawler free-software libgen python3 scraper

Last synced: 05 Apr 2025

https://github.com/box-archived/vlive-py

VLIVE(vlive.tv) parser for python

api-wrapper crawler kpop parser python vlive

Last synced: 14 Jan 2026

https://github.com/amirhoseinsb/Cloud_Player_V2

You can use the cloudplayer tool to listen to the music of the singer you want without going to a specific website and at a very high speed.

cloud-player crawler crawling music music-player programming python url-player

Last synced: 08 Jul 2025

https://github.com/wenyalintw/job-scraper-bot

A fun Telegram bot made for a friend, deployed to Heroku

amazon-web-services aws-s3 boto3 crawler google-drive google-drive-api heroku heroku-deployment python-telegram-bot scraper scraping scrapy telegram telegram-bot telegram-bot-api web-scraping

Last synced: 13 Sep 2025

https://github.com/yerkopalma/bash-crawler

:computer: Get a site's links with Bash

bash crawler

Last synced: 05 Aug 2025

https://github.com/tzxyz/webber

A lightweight crawler framework implemented in Go

crawler golang

Last synced: 12 Jan 2026

https://github.com/khaleddallah/LinkedinScraper

Python Scrapy project that parses people profiles from LinkedIn search and arranges the results in Excel and JSON files

crawler excel json linkedin python scraper scrapy spider

Last synced: 06 Apr 2025

https://github.com/vmarcosp/supervise-crawler

:male_detective: Supervise crawler

crawler esy ocaml reasonml webcrawler

Last synced: 13 May 2025

https://github.com/aurelius84/pycrawler

A flexible spider based on mysql

crawler etl mysql scrapy spider

Last synced: 10 Apr 2025

https://github.com/bfwg/node-tinycrawler

Tiny web crawler in a nutshell for Node.js

crawler nodejs redis

Last synced: 10 Nov 2025

https://github.com/markmelnic/mobile-de-crawler

A crawler for mobile.de to index all car listings on the website.

crawler requests scraper sqlite3

Last synced: 08 Oct 2025

https://github.com/florinutz/filme

Filme provides utilities for torrenting movies

crawler golang movies torrents

Last synced: 14 Jan 2026

https://github.com/dori-dev/quotes-crawler

Quotes crawler using scrapy and python.

crawler crawling python scraping-python scraping-websites scrapy scrapy-crawler scrapy-spider web-scraper

Last synced: 08 Oct 2025

https://github.com/xixu-me/library-data-assistant

Java-based client-server application for managing library book data with web crawling capabilities

crawler crawling database java mysql

Last synced: 08 Oct 2025

https://github.com/xvc323/omnidocs

Automated documentation crawler that generates LLM-friendly Markdown from any docs site. Export as single or multi-file, ready for AI ingestion.

crawler documentation llm markdown

Last synced: 27 Jun 2025

https://github.com/twtrubiks/dowload-image-ptt

PTT image downloader (C# WinForm) for Windows

crawler dowload image ptt winforms

Last synced: 15 Apr 2025

https://github.com/alexqi/webphantom

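An open-source crawler framework for web data collection tasks. It supports core features such as API calls, task scheduling, and session management, and is suited to building automated collection systems that can cope with some anti-crawling measures (Douyin | Xiaohongshu | Taobao | JD.com)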

crawler douyin qps scheduler taobao xiaohonghsu

Last synced: 22 Jun 2026

https://github.com/mashukui/dy_trans_tool

crawler douyin douyin-api gui gui-application python3

Last synced: 04 Apr 2026

https://github.com/thaddeusjiang/campcat

Campsite reservation monitoring bot

bot crawler telegram

Last synced: 01 Aug 2025

https://github.com/rvegas/dota_crawler

Crawler for dotapedia. Fills a Mongo and a PG database with game data.

crawler dota dota2 flask mongodb postgresql python3 regex scrapy

Last synced: 05 Sep 2025

https://github.com/eric2788/platformscrawler

Multi-platform crawler with modular management, for collecting data and publishing it via Redis pub/sub

bilibili crawler crawling pubsub redis twitter youtube

Last synced: 27 Oct 2025

https://github.com/hybridx/webscraper

Web crawler built with Beautiful Soup

crawler flask google-dorks javascript python3 search-engine

Last synced: 07 May 2025

https://github.com/mrrfv/webarchive

Crawls websites and saves found URLs to a file.

archive archiveteam archiving crawler crawling ia internet-archive scraper web-archiving web-scraping

Last synced: 18 Mar 2025

https://github.com/52cik/creeper

Simple crawler engine (Creeper)

crawler node-crawler

Last synced: 22 Apr 2025

https://github.com/AmirAref/Torobot

An inline Telegram bot for easily accessing and searching torob.com products from Telegram.

crawler python python-telegram-bot scraper telegtam-bot

Last synced: 13 Jul 2025

https://github.com/amirzenoozi/persian-news-crawler

Simple Script To Crawl Data From Persian News Agencies Including Fars, Mehr.

cli crawler database fars-news farsi-datasets kaggle-dataset mehr-news news news-agencies newspaper python python3 script shargh-news sqlite3 tensorflow tensorflow2

Last synced: 13 Apr 2025

https://github.com/akiosarkiz/manga-collector

The manga collector is a library designed to easily scrape manga content from various websites. This package is licensed under the MIT License and is fully test-covered

api crawler manga scraper

Last synced: 10 Jul 2025

https://github.com/twtrubiks/pttcrawlercontent

PTT article content crawler in Python

crawler gossiping ptt python

Last synced: 15 Apr 2025

https://github.com/jonasgeiler/Iconmonstr-API

An unofficial API to access icons from iconmonstr.com

api collection collections crawler eps font icon icon-font iconmonstr iconmonstr-api icons image images png psd scraper svg unofficial vector vector-graphics

Last synced: 10 Mar 2025

https://github.com/simsso/vision-based-page-rank-estimation

Student research project on pagerank estimation with deep graph networks

cnn crawler deep-learning graph-networks page-rank student-research-project

Last synced: 24 Apr 2025

https://github.com/doroudi/imdb-crawler

imdb.com movies crawler in scrapy

crawler data-mining python scrapy

Last synced: 22 Jun 2025

https://github.com/markelog/map

Simple sitemap generator; supports a couple of reporters, depth levels, etc.

crawler map sitemap spider

Last synced: 11 Apr 2025

https://github.com/gabfl/sitecrawl

Simple Python module to crawl a website and extract URLs

crawl crawler crawler-python crawling-sites

Last synced: 10 Apr 2025

https://github.com/ivangrana/minerador-noticias-labsc

Keyword-based news scraper, using the BeautifulSoup library in Python

crawler python

Last synced: 17 Oct 2025

https://github.com/integralist/go-web-crawler

A web crawler built in the Go programming language

concurrency crawler go golang web-crawler

Last synced: 26 Oct 2025

https://github.com/chusiang/crawler-book-info

A crawler for quickly parsing book information

book crawler python

Last synced: 12 Apr 2025

https://github.com/arshadkazmi42/blc

Broken link checker

blc broken-link-checker broken-link-finder bug-bounty bugbounty crawler python

Last synced: 30 Oct 2025

https://github.com/giscafer/ziroom-crawler

Crawler for Ziroom shared-rental listings that monitors listing status, with the aim of grabbing a room before others do

crawler nodejs

Last synced: 28 Apr 2025

https://github.com/AmirAref/DivarCrawler

A script to crawl divar.ir and extract phone numbers

crawler scraper selenium

Last synced: 13 Jul 2025

https://github.com/basemax/googleplaydatabasemirror

A crawler script, written in PHP, for updating a mirror database of Google Play.

crawl crawl-pages crawler crawlers crawling database database-schema google-play mysql php

Last synced: 24 Sep 2025

https://github.com/busterc/crwlr

🕷a minimal puppeteer crawler api

crawl crawler crawling puppeteer spider walker

Last synced: 23 Apr 2025

https://github.com/hctilg/taaghche-dl

Save books purchased from taaghche.com!

crawler downloader pillow-library python3 selenium taaghche

Last synced: 12 May 2025

https://github.com/leo9960/bilibili_live_danmu_crawler

Danmaku (bullet comment) scraper for Bilibili live streams

bilibili crawler danmu live

Last synced: 23 Apr 2025

https://github.com/xlisp/ai-auto-crawler

ai-auto-crawler: puppeteer + autogen

autogen crawler gpt puppeteer

Last synced: 31 Aug 2025

https://github.com/neilblaze/smapviw

Sitemap Visualizer built upon D3.js

crawler sitemap sitemap-generator

Last synced: 06 Oct 2025

https://github.com/antosser/web-crawler

Rust Web Crawler that finds every page, image, and script on a website (and downloads it)

crawler html rust seo web

Last synced: 04 Sep 2025

https://github.com/bitscoper/bitscoper_cyber_toolbox

A Flutter application consisting of TCP Port Scanner, Route Tracer, Pinger, File Hash Calculator, String Hash Calculator, Base Encoder, Morse Code Translator, Open Graph Protocol Data Extractor, Series URI Crawler, DNS Record Retriever, and WHOIS Retriever.

android calculator crawler cybersecurity dart decoder docker encoder extractor flutter github-action ios mac retriever scanner tracer translator web windows

Last synced: 31 Jul 2025

https://github.com/bernabe9/render-it

Render any JavaScript content to create static sites ready for SEO

crawler javascript prerender prerenderio puppeteer render seo seo-tools server-side-rendering static-site static-site-generator

Last synced: 12 Jun 2025

https://github.com/prdx23/async-crawler

A recursive async crawler which creates a graph of connected webpages

async crawler python3

Last synced: 17 Jan 2026

https://github.com/dotenorio/freeloader-of-data

A simple crawler or scraper to get open graph and other meta data from any website.

crawler graph hacktoberfest meta-data open-graph scraper

Last synced: 13 Mar 2025

https://github.com/idealchain/dhtcrawler-cluster

BitTorrent DHT crawling cluster

cluster crawler dht docker-images torrent

Last synced: 27 Sep 2025

https://github.com/dist1ll/hltv-rust

A client to fetch and parse data from HLTV.org

api crawler hltv parser rust

Last synced: 03 Oct 2025

https://github.com/oscarnevarezleal/ecommerce-crawler

Parallel ecommerce crawler using Docker and Puppeteer on GCP

crawler gcp nodejs pubnub puppeteer

Last synced: 16 Jan 2026

https://github.com/luckyzxl2016/go-spider

concurrent crawler golang spider

Last synced: 29 Jul 2025

https://github.com/btlmd/thuhole_crawler

A crawler to save holes on the deceased thuhole

crawler

Last synced: 16 Jun 2025

https://github.com/ajcerejeira/base.gov.pt

A crawler that fetches data from base.gov.pt

crawler csv python scrapy

Last synced: 14 Jul 2025

https://github.com/brucewind/fear-and-greed-index-alarm

A notification reminder for indicating when the CNN Fear and Greed Index is out of range.

crawler fear-and-greed fear-greed-index investment sctock stock-market us-stock-market

Last synced: 21 Jul 2025

https://github.com/frectonz/rampilo

A telegram crawler

crawler rust telegram telegram-crawler

Last synced: 07 Sep 2025

https://github.com/basemax/twitterbotcrawler

A bot that logs in to Twitter and processes pages with Selenium, using Python.

crawler crawler-twitter crawlers selenium-crawler selenium-example selenium-sample selenium-twitter twitter twitter-bot twitter-crawler twitter-py twitter-python twitter-selenium

Last synced: 05 May 2025

https://github.com/nobodxbodon/chromecrawlerwildspider

Chrome extension that crawls web pages by loading them into browser tabs in parallel.

chrome-extension crawler localstorage spider

Last synced: 07 May 2025

https://github.com/labic/ze-the-scraper

brazil crawler mongodb news newspaper portals scraper

Last synced: 09 Jul 2025

https://github.com/hypervapor/bilibili-crawler

Backend application for crawling Bilibili video information based on a list of keywords.

bilibili crawler express nodejs

Last synced: 14 Apr 2025

https://github.com/jean-baptiste-camps/iiif-crawler

Interrogate IIIF servers and get images of manuscripts

crawler iiif iiif-image manuscripts

Last synced: 29 Oct 2025

https://github.com/dynesshely/everydaynews

A repo that fetches most news and information, then stores and organizes it.

crawler data fetcher network news

Last synced: 22 Feb 2026

https://github.com/mcstreetguy/crawler

An advanced web-crawler written in PHP.

composer composer-library crawler crawler-engine guzzle http-requests php php-7 php-library web-crawler webcrawler

Last synced: 09 Apr 2025

https://github.com/oldkingcone/pbandj

PasteBin Crawler, crawls the url https://pastebin.com/archive

crawler headless headless-chrome python python-crawler selenium-python selenium-webdriver

Last synced: 26 Sep 2025

https://github.com/lucasboscatti/mercado-livre-crawler

A beginner data engineering project that scrapes offers from https://www.mercadolivre.com.br/ofertas, stores them in a Postgres database, and analyzes the scraped data.

crawler docker docker-compose heroku mercado-livre postgresql python scrapy sqlalchemy

Last synced: 06 Mar 2025

https://github.com/s045pd/magicworld

Crawler for "Magic World" (神奇世界看看看) on Huanqiu.com (环球网)

crawler python3 sanic telepot

Last synced: 09 Oct 2025

https://github.com/systemfsoftware/youtube-autocomplete-scraper

YouTube AutoComplete Scraper - An Apify actor that scrapes YouTube's search suggestions with intelligent deduplication using pglite and trigram similarity matching. Perfect for content research, SEO, and trend analysis.

actor apify autocomplete crawler deduplication pglite scraper search similarity suggestions trigram youtube youtube-api

Last synced: 25 Jun 2025

https://github.com/bingxyz/tg-earthquake-warning

Telegram broadcast channel for Taiwan earthquake reports

bash crawler telegram-bot-api

Last synced: 11 Jul 2025

https://github.com/lon9/arxiv-crawler

Crawler for arxiv.org

arxiv crawler golang

Last synced: 24 Jul 2025

https://github.com/Hound-fm/podcatcher

Audio media crawler for lbry.

crawler lbry python

Last synced: 12 May 2025

https://github.com/surelle-ha/dogma

Dogma is a CLI tool that enables interaction with the GitHub API for the purpose of searching .env files with specified keywords. You can configure a GitHub token and use the crawler to search for keys in .env files across public repositories.

cli crawler github nodejs

Last synced: 22 Jun 2025

https://github.com/dylanhogg/legaldata

Provides access to Australian legal data

crawler data law lawtech legal legaltech

Last synced: 21 Jul 2025

https://github.com/marshhu/ma-tools

a golang tool package

crawler go golang htmltopdf mapping

Last synced: 23 Jan 2026

https://github.com/expandergraph/crypto-crawler

An on-chain data crawler that supports collecting and analyzing data on the Bitcoin, Ethereum, and Polkadot chains.

bitcion crawler ethereum on-chain polkadot

Last synced: 11 Jan 2026

https://github.com/beingvirus/jobminer

JobMiner – A Python-based web scraping toolkit for extracting and organizing job listings from multiple websites into structured data.

automation beautifulsoup career crawler data-collection data-mining hacktoberfest hacktoberfest-accepted hacktoberfest2025 job-scraper jobs open-source python selenium web-scraping

Last synced: 10 Oct 2025

https://github.com/exp-codes/jzone-crawler

Qzone (QQ空间) crawler (Java version)

crawler programming

Last synced: 15 Jun 2025

https://github.com/Antosser/web-crawler

Rust Web Crawler that finds every page, image, and script on a website (and downloads it)

crawler html rust seo web

Last synced: 25 Sep 2025

https://github.com/lgraubner/node-w3c-validator-cli

Crawls a given site and checks for W3C validity.

cli crawler w3c w3c-validator

Last synced: 13 Apr 2025

https://github.com/meysam81/scry

Your website has problems you can't see. Scry finds them. Crawl your entire website across SEO, security, performance, and accessibility. No browser, no subscription.

accessibility cli command-line-tool crawler devops golang hreflang lighthouse link-checker pagespeed sarif security-headers seo seo-tools site-audit structured-data technical-seo web-performance web-security website-audit