Projects in Awesome Lists tagged with crawling

https://github.com/scrapy/scrapy

Scrapy, a fast high-level web crawling & scraping framework for Python.

crawler crawling framework hacktoberfest python scraping web-scraping web-scraping-python

Last synced: 16 Jan 2026

https://github.com/gocolly/colly

Elegant Scraper and Crawler Framework for Golang

crawler crawling framework go golang scraper scraping spider

Last synced: 12 May 2025

https://github.com/apifytech/apify-js

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

apify automation crawler crawling headless headless-chrome javascript nodejs npm playwright puppeteer scraper scraping typescript web-crawler web-crawling web-scraping

Last synced: 06 Jul 2025

https://github.com/apify/crawlee

Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.

apify automation crawler crawling headless headless-chrome javascript nodejs npm playwright puppeteer scraper scraping typescript web-crawler web-crawling web-scraping

Last synced: 03 Nov 2025

https://github.com/codelucas/newspaper

newspaper3k is a news, full-text, and article metadata extraction library for Python 3.

crawler crawling news news-aggregator python scraper

Last synced: 12 May 2025

https://github.com/go-rod/rod

A Chrome DevTools Protocol driver for web automation and scraping.

automation cdp chrome-devtools chrome-devtools-protocol chrome-headless crawling devtools devtools-protocol go golang gorod headless rod scraper testing web web-scraping

Last synced: 15 May 2025

https://github.com/montferret/ferret

Declarative web scraping

cdp chrome cli crawler crawling data-mining dsl go golang library query-language scraper scraping scraping-websites tool

Last synced: 01 May 2026

https://github.com/MontFerret/ferret

Declarative web scraping

cdp chrome cli crawler crawling data-mining dsl go golang hacktoberfest library query-language scraper scraping scraping-websites tool

Last synced: 13 Mar 2025

https://github.com/apify/crawlee-python

Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.

apify automation beautifulsoup crawler crawling hacktoberfest headless headless-chrome pip playwright python scraper scraping web-crawler web-crawling web-scraping

Last synced: 06 Mar 2026

https://github.com/yujiosaka/headless-chrome-crawler

Distributed crawler powered by Headless Chrome

chrome chromium crawler crawling headless-chrome jquery promise puppeteer scraper scraping

Last synced: 14 May 2025

https://github.com/D4Vinci/Scrapling

🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!

ai ai-scraping automation crawler crawling crawling-python data data-extraction hacktoberfest playwright python python3 scraping selectors stealth web-scraper web-scraping web-scraping-python webscraping xpath

Last synced: 13 May 2025

https://github.com/hakluke/hakrawler

Simple, fast web crawler designed for easy, quick discovery of endpoints and assets within a web application

bugbounty crawling hacking osint pentesting recon reconnaissance

Last synced: 14 May 2025

https://github.com/hardkoded/puppeteer-sharp

Headless Chrome .NET API

automation chrome chromium crawler crawling csharp e2e e2e-testing puppeteer webautomation

Last synced: 13 May 2025

https://github.com/apache/nutch

Apache Nutch is an extensible and scalable web crawler

apache crawling hadoop java nutch web-crawler

Last synced: 13 May 2025

https://github.com/d4vinci/scrapling

🕷️ An undetectable, powerful, flexible, high-performance Python library to make Web Scraping Easy and Effortless as it should be!

ai ai-scraping automation crawler crawling crawling-python data data-extraction hacktoberfest playwright python python3 scraping selectors stealth web-scraper web-scraping web-scraping-python webscraping xpath

Last synced: 15 Feb 2026

https://github.com/lorien/grab

Web Scraping Framework

asynchronous crawler crawling framework http-client network pycurl python python-library python3 scraping spider urllib3 web-scraping

Last synced: 14 May 2025

https://github.com/zorlan/skycaiji

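SkyCaiji (蓝天采集器) is a free, open-source crawler system. Data can be collected simply by pointing and clicking to edit rules. It runs locally, on virtual hosts, or on cloud servers, can collect almost any type of web page, integrates seamlessly with various CMS site builders, and publishes data in real time without login, fully automatically and with no manual intervention. Among web big-data collection software, it is a fully cross-platform cloud crawler system.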

crawler crawling php spider webcrawler

Last synced: 14 May 2025

https://github.com/edoardottt/cariddi

Take a list of domains, crawl urls and scan for endpoints, secrets, api keys, file extensions, tokens and more

bugbounty crawler crawling endpoint-discovery endpoints go golang hacktoberfest infosec osint penetration-testing pentesting recon reconnaissance redteam scraper secret-keys secrets-detection security security-tools

Last synced: 14 May 2025

https://github.com/natescarlet/holiday-cn

📅🇨🇳 Chinese statutory holiday data, automatically scraped daily from State Council announcements

china crawling data holiday natural-language-processing

Last synced: 14 May 2025

https://github.com/NateScarlet/holiday-cn

📅🇨🇳 Chinese statutory holiday data, automatically scraped daily from State Council announcements

china crawling data holiday natural-language-processing

Last synced: 26 Mar 2025

https://github.com/roach-php/core

The complete web scraping toolkit for PHP.

crawling php web-scraping

Last synced: 13 May 2025

https://github.com/lorey/mlscraper

🤖 Scrape data from HTML websites automatically by just providing examples

crawler crawler-python crawling extraction-engine html machine-learning scraper scraping

Last synced: 15 May 2025

https://github.com/elixir-crawly/crawly

Crawly, a high-level web crawling & scraping framework for Elixir.

crawler crawling elixir erlang extract-data scraper scraping scraping-websites spider

Last synced: 11 Dec 2025

https://github.com/webrecorder/browsertrix-crawler

Run a high-fidelity browser-based web archiving crawler in a single Docker container

crawler crawling wacz warc web-archiving web-crawler webrecorder

Last synced: 10 Feb 2026

https://github.com/clemfromspace/scrapy-selenium

Scrapy middleware to handle javascript pages using selenium

crawling scrapy selenium

Last synced: 14 May 2025

https://github.com/scrapinghub/scrapyrt

HTTP API for Scrapy spiders

crawler crawling hacktoberfest hacktoberfest2021 python scraper scrapy twisted webcrawler webcrawling

Last synced: 15 May 2025

https://github.com/iawia002/Lulu

[Unmaintained] A simple and clean video/music/image downloader 👾

crawler crawling downloader python python3 scraper scraping video

Last synced: 22 Jul 2025

https://github.com/morvanzhou/easy-scraping-tutorial

Simple but useful Python web scraping tutorial code.

asyncio beautifulsoup crawler crawling distributed-scraper regex requests scraping scrapy urllib

Last synced: 16 May 2025

https://github.com/MorvanZhou/easy-scraping-tutorial

Simple but useful Python web scraping tutorial code.

asyncio beautifulsoup crawler crawling distributed-scraper regex requests scraping scrapy urllib

Last synced: 07 Sep 2025

https://github.com/rebrowser/rebrowser-patches

Collection of patches for puppeteer and playwright to avoid automation detection and leaks. Helps to avoid Cloudflare and DataDome CAPTCHA pages. Easy to patch/unpatch, can be enabled/disabled on demand.

automation bot bot-detection chrome chromedriver cloudflare crawler crawling datadome headless headless-chrome playwright puppeteer puppeteer-extra rebrowser scraping selenium stealth web-scraping webdriver

Last synced: 14 May 2025

https://github.com/slotix/dataflowkit

Extract structured data from websites. Website scraping.

cdp chrome-fetcher crawling extract-data go golang golang-library headless scraper scraping scraping-websites

Last synced: 16 Jan 2026

https://github.com/josephlimtech/linkedin-profile-scraper-api

🕵️‍♂️ LinkedIn profile scraper returning structured profile data in JSON.

crawler crawling expressjs json linkedin linkedin-bot linkedin-crawler linkedin-profile linkedin-profile-scraper linkedin-scraper linkedin-scraping nodejs profile-data puppeteer scraper scrapers scraping scraping-websites spider website-scraper

Last synced: 04 Apr 2025

https://github.com/essandess/isp-data-pollution

ISP Data Pollution to Protect Private Browsing History with Obfuscation

crawling data data-analytics obfuscation privacy privacy-enhancing-technologies web

Last synced: 29 Dec 2025

https://github.com/mishakorzik/adminhack

Today we will hack the admin panel of the site.

admin-finder admin-hack admin-panel admin-website-hack allhackingtools cpanel cpanl-finder crawling directory-bruteforce hacking-tool kali-linux linux termux termux-hacking termux-tool website website-hacking website-hacking-methods websitehacking

Last synced: 16 May 2025

https://github.com/scrapinghub/spidermon

Scrapy Extension for monitoring spiders execution.

crawling hacktoberfest monitoring monitoring-tool scraping scrapinghub spiders testing

Last synced: 14 May 2025

https://github.com/zhuyingda/webster

A reliable high-level web crawling & scraping framework for Node.js.

automation-test automation-ui chromium crawler crawling headless-chrome javascript javascript-framework nodejs nodejs-framework puppeteer scraping-framework spider

Last synced: 15 May 2025

https://github.com/crawljax/crawljax

Crawljax

crawler crawling dom dynamic event-driven-crawling javascript test-generation web-analysis web-testing

Last synced: 16 May 2025

https://github.com/l4rm4nd/linkedindumper

Python 3 script to dump/scrape/extract company employees from LinkedIn API

crawling employees extracting linkedin osint python3 scraping spider

Last synced: 18 Apr 2026

https://github.com/scrapfly/scrapfly-scrapers

Scalable Python web scraping scripts for 40+ popular domains

antibot automation captcha-bypass crawler crawling crawling-python datascraping proxies python python-scraper scraper scraping scraping-python spider twitter-scraper web-crawler web-scraping web-scraping-python webscraper webscraping

Last synced: 11 Apr 2025

https://github.com/florents-tselai/warcdb

WarcDB: Web crawl data as SQLite databases.

cli crawling database sqlite warc web-archiving web-data

Last synced: 04 Apr 2025

https://github.com/Florents-Tselai/WarcDB

WarcDB: Web crawl data as SQLite databases.

cli crawling database sqlite warc web-archiving web-data

Last synced: 08 Apr 2025

https://github.com/mhmdiaa/second-order

Second-order subdomain takeover scanner

crawler crawling infosec mapping penetration-testing penetration-testing-tools pentesting recon reconnaissance security security-tools web-application-security wordlist wordlist-generator

Last synced: 05 Apr 2025

https://github.com/xorbit01/webpalm

🕸️ Crawl in the web network

crawler crawling data data-science datamining go golang hack mining osint redteam spider tool

Last synced: 15 Dec 2025

https://github.com/XORbit01/webpalm

🕸️ Crawl in the web network

crawler crawling data data-science datamining go golang hack mining osint redteam spider tool

Last synced: 14 Apr 2025

https://github.com/crwlrsoft/crawler

Library for Rapid (Web) Crawler and Scraper Development

crawler crawling hacktoberfest php scraper scraping scraping-websites web-crawler web-crawling web-scraper web-scraping

Last synced: 15 May 2025

https://github.com/rivermont/spidy

The simple, easy to use command line web crawler.

crawler crawling python python3 web-crawler web-spider

Last synced: 16 Jan 2026

https://github.com/StJudeWasHere/seonaut

Open source SEO audit tool.

audit crawler crawlergo crawlers crawling docker docker-compose go golang multiuser search-engine-optimization seo seo-audit seotools web

Last synced: 23 Apr 2025

https://github.com/alephdata/memorious

Lightweight web scraping toolkit for documents and structured data.

crawling scraping scraping-framework

Last synced: 12 Apr 2025

https://github.com/infinilabs/crawler

🕷️ An easy-to-use spider written in Golang (previously named GOPA).

crawler crawling elasticsearch lightweight scraping spider web-crawler web-scraping web-spider

Last synced: 11 Apr 2026

https://github.com/marshalx/telegram-crawler

🕷 Automatically detect changes made to the official Telegram sites, clients and servers.

crawler crawling crawling-python parser telegram telegram-org telegram-updates

Last synced: 16 May 2025

https://github.com/MarshalX/telegram-crawler

🕷 Automatically detect changes made to the official Telegram sites, clients and servers.

crawler crawling crawling-python parser telegram telegram-org telegram-updates

Last synced: 15 May 2025

https://github.com/mustafadalga/instagram-bot

An Instagram bot developed using the Selenium Framework

automation automation-selenium bot bulk-comments bulk-unfollow crawler crawling download-stories instagram instagram-api instagram-bot instagram-downloader instagram-without-api mass-liking python python3 selenium selenium-framework selenium-python selenium-webdriver

Last synced: 02 Oct 2025

https://github.com/ai-robots-txt/ai.robots.txt

A list of AI agents and robots to block.

ai crawlers crawling privacy

Last synced: 28 Mar 2025

https://github.com/roach-php/laravel

Laravel adapter for Roach, the complete web scraping toolkit for PHP.

crawling laravel php web-scraping

Last synced: 11 Apr 2025

https://github.com/antchfx/antch

Antch, a fast, powerful and extensible web crawling & scraping framework for Go

crawler crawling framework golang scraping web-crawler web-spider

Last synced: 14 Mar 2025

https://github.com/amerkurev/scrapper

Web scraper with a simple REST API living in Docker and using a Headless browser and Readability.js for parsing.

crawler crawler-python crawling headless readability scraper scraping web-parsers web-parsing web-scraping

Last synced: 08 May 2025

https://github.com/mishakorzik/infect

Create your virus in Termux!

allhackingtools crawling hacking-tool infect infection linux termux termux-hacking termux-tool virus virus-termux viruses

Last synced: 09 May 2025

https://github.com/a3h1nt/grawler

Grawler is a tool written in PHP which comes with a web interface that automates the task of using google dorks, scrapes the results, and stores them in a file.

algorithm-schema automation crawling curl google-dorks grawler osint osint-tool php proxy scraping xampp

Last synced: 09 Apr 2025

https://github.com/A3h1nt/Grawler

Grawler is a tool written in PHP which comes with a web interface that automates the task of using google dorks, scrapes the results, and stores them in a file.

algorithm-schema automation crawling curl google-dorks grawler osint osint-tool php proxy scraping xampp

Last synced: 11 Jul 2025

https://github.com/google/corpuscrawler

Crawler for linguistic corpora

corpus-builder corpus-linguistics crawling linguistics minority-language

Last synced: 14 Mar 2025

https://github.com/18520339/facebook-data-extraction

Experience in effectively fetching Facebook data by querying the Graph API with an account-based token and operating undetectable scraping bots to extract client/server-side rendered content

automation browser-fingerprinting crawling facebook facebook-graph-api proxy scraping selenium tor-network

Last synced: 03 Apr 2025

https://github.com/csharp-leaf/Leaf.xNet

HTTP library. Improved original xNet.

capsolver captcha-solving cookies crawling csharp http http-client https parser proxy-client scraping

Last synced: 12 Apr 2025

https://github.com/mehmetozkaya/dotnetcrawler

DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extendable to your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

crawler crawling csharp ddd-architecture dotnetcore entity-framework-core htmlagilitypack scraping scrapy scrapy-crawler webcrawler webcrawler-htmlagilitypack webcrawling webscraper webscraping

Last synced: 11 May 2025

https://github.com/mehmetozkaya/DotnetCrawler

DotnetCrawler is a straightforward, lightweight web crawling/scraping library with Entity Framework Core output, based on .NET Core. The library is designed like other strong crawler libraries such as WebMagic and Scrapy, but is extendable to your custom requirements. Medium link: https://medium.com/@mehmetozkaya/creating-custom-web-crawler-with-dotnet-core-using-entity-framework-core-ec8d23f0ca7c

crawler crawling csharp ddd-architecture dotnetcore entity-framework-core htmlagilitypack scraping scrapy scrapy-crawler webcrawler webcrawler-htmlagilitypack webcrawling webscraper webscraping

Last synced: 18 Apr 2025

https://github.com/N0taN3rd/Squidwarc

Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head

browser-automation chrome chrome-headless crawler crawling headless-chrome high-fidelity-preservation puppeteer webarchives webarchiving

Last synced: 06 Apr 2025

https://github.com/n0tan3rd/squidwarc

Squidwarc is a high fidelity, user scriptable, archival crawler that uses Chrome or Chromium with or without a head

browser-automation chrome chrome-headless crawler crawling headless-chrome high-fidelity-preservation puppeteer webarchives webarchiving

Last synced: 13 Sep 2025

https://github.com/dimkouv/massivedl

Download a large list of files concurrently

crawling download-manager downloader golang

Last synced: 15 Jan 2026

https://github.com/karthikuj/sasori

Sasori is a dynamic web crawler powered by Puppeteer, designed for lightning-fast endpoint discovery.

automation crawler crawling dast dynamic endpoint-discovery infosec puppeteer scraping security

Last synced: 15 Aug 2025

https://github.com/janreges/siteone-crawler

SiteOne Crawler is a website analyzer and exporter you'll ♥ as a Dev/DevOps, QA engineer, website owner or consultant. Works on all popular platforms - Windows, macOS and Linux (x64 and arm64 too).

analyzer crawler crawling performance qa quality-assessment security seo seotools stress-testing swoole testing website

Last synced: 18 Mar 2026

https://github.com/unblocked-web/double-agent

A test suite of common scraper detection techniques. See how detectable your scraper stack is.

crawling puppeteer scraping scrapy secret-agent

Last synced: 08 Apr 2025

https://github.com/alash3al/scraply

Scraply, a simple DOM scraper to fetch information from any HTML-based website

crawler crawling dom golang scraper scrapers scraping-websites scrapy server

Last synced: 28 Apr 2025

https://github.com/ihandmine/aioscpy

An asyncio + aiolibs crawler framework that imitates Scrapy

aiohttp asyncio crawling framework loguru python3 scrapy scrapy-redis

Last synced: 14 Jan 2026

https://github.com/archiveteam/wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

archiveteam archiving crawl crawler crawlers crawling downloader ftp lua scraper scraping spider warc webarchiving wget wget-lua zstd

Last synced: 04 Apr 2025

https://github.com/SimFin/pdf-crawler

SimFin's open source PDF crawler

crawler crawling geckodriver pdf pdf-crawler puppeteer python selenium-webdriver

Last synced: 07 Apr 2025

https://github.com/simfin/pdf-crawler

SimFin's open source PDF crawler

crawler crawling geckodriver pdf pdf-crawler puppeteer python selenium-webdriver

Last synced: 28 Oct 2025

https://github.com/antoinevastel/bots-zoo

bot crawler crawling playwright puppeteer scraper scraping selenium user-agent useragent

Last synced: 16 Aug 2025

https://github.com/maxcountryman/warc-parquet

🗄️ A simple CLI for converting WARC to Parquet.

crawling duckdb parquet warc web-archiving

Last synced: 16 May 2025

https://github.com/ArchiveTeam/wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

archiveteam archiving crawl crawler crawlers crawling downloader ftp lua scraper scraping spider warc webarchiving wget wget-lua zstd

Last synced: 18 Jul 2025

https://github.com/kreuzberg-dev/kreuzcrawl

High-performance web crawling engine with bindings for 11 languages

crawling csharp elixir ffi golang java mcp php python ruby rust typescript wasm web-crawler web-scraping

Last synced: 07 Jun 2026

https://github.com/fcavallarin/burp-dom-scanner

Burp Suite's extension to scan and crawl Single Page Applications

crawling dom scanning single-page-applications xss xss-detection

Last synced: 17 Mar 2025

https://github.com/usc-isi-i2/dig-etl-engine

Download DIG to run on your laptop or server.

crawling etl-framework etl-pipeline information-extraction information-visualization search-engine

Last synced: 04 Aug 2025

https://github.com/creekorful/bathyscaphe

Fast, highly configurable, cloud native dark web crawler.

architecture crawler crawling elasticsearch golang hidden-services kibana tor web-crawler

Last synced: 17 Mar 2025

https://github.com/alexfazio/devdocs-to-llm

Turn any developer documentation into a GPT

crawler crawling firecrawl scraper scraping

Last synced: 08 Mar 2026

https://github.com/datawizard1337/ARGUS

ARGUS is an easy-to-use web scraping tool. The program is based on the Scrapy Python framework and is able to crawl a broad range of different websites. On the websites, ARGUS is able to perform tasks like scraping texts or collecting hyperlinks between websites. See: https://link.springer.com/article/10.1007/s11192-020-03726-9

crawling python scraping scrapy scrapyd webcrawling webscraping

Last synced: 20 Mar 2025

https://github.com/carlosplanchon/spidercreator

Automated web scraping spider generation using Browser Use and LLMs. Streamline the creation of Playwright-based spiders with minimal manual coding. Ideal for large enterprises with recurring data extraction needs.

ai automation browser-use crawling llm low-code no-code python rpa scraping spider vibe-coding

Last synced: 15 Sep 2025

https://github.com/jroakes/tech-seo-crawler

Build a small, 3-domain internet using GitHub Pages and Wikipedia and construct a crawler to crawl, render, and index.

crawling github-pages rendering seo wikipedia

Last synced: 12 Apr 2025

https://github.com/TransparencyToolkit/Harvester

Web crawling and document processing through a usable interface.

api crawling document interface ocr osint web

Last synced: 13 May 2025

https://github.com/shurco/goClone

🌱 goClone - clone websites in seconds

cloner cloning crawler crawling go goclone golang hacktoberfest scraping scraping-websites scrapper website-cloner website-scraper wp2static

Last synced: 05 May 2025

https://github.com/archivebox/abx-dl

⬇️ A simple all-in-one CLI tool to download EVERYTHING from a URL (like youtube-dl/yt-dlp, forum-dl, gallery-dl, simpler ArchiveBox). 🎭 Uses headless Chrome to get HTML, JS, CSS, images/video/audio/subtitles, PDFs, screenshots, article text, git repos, and more...

ai-scraping archivebox chrome cli cli-tool crawling curl downloader gallery-dl headless http-client internet-archiving playwright puppeteer scraping wget youtube-dl yt-dlp

Last synced: 15 Mar 2026

https://github.com/afuntw/python-crawling-tutorial

Python crawling tutorial

crawling ipynb-jupyter-notebook python

Last synced: 28 Oct 2025

https://github.com/howie6879/talospider

talospider - A simple,lightweight scraping micro-framework

crawler crawling python spider web-spider

Last synced: 25 Oct 2025

https://github.com/scrapinghub/learn.scrapinghub.com

Scrapinghub Learning Center. Report issues in Jira: https://scrapinghub.atlassian.net/projects/WEB

crawling learning python scraping scrapy tutorial

Last synced: 08 Jul 2025

https://github.com/soulee-dev/vkeypad-bypass

Virtual keyboard (vKeypad) bypass tool

crawling korean python requests rpa scrapping security selenium vkeyboard vkeypad webautomation

Last synced: 22 Feb 2026

https://github.com/swader/diffbot-php-client

[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library

ai artificial-intelligence bot crawl crawling diffbot machine-learning nlp php scrape scraped-data scraper scraping

Last synced: 21 Aug 2025

https://github.com/soulee-dev/fuckvkeypad

Virtual keyboard (vKeypad) bypass tool

crawling korean python requests rpa scrapping security selenium vkeyboard vkeypad webautomation

Last synced: 23 Mar 2025

https://github.com/OlafZhang/bilib

A Python crawling library that integrates multiple native Bilibili APIs and combines them with crawling techniques

anime bilibili-api crawling danmaku

Last synced: 16 Mar 2025

https://github.com/pzaino/thecrowler

A Content Discovery and Development Platform. Empowering Cybersecurity, AI, Marketing, and Finance professionals and researchers to discover, analyze, and interact with the web in all its dimensions.

automation blue-team-tool content-detection content-discovery crawler crawling cyber-security cybersecurity cybersecurity-tools data-collection data-science distributed-systems golang indexer indexing reconnaissance red-team-tools scraping search-engine vulnerability-detection