Crawler | Ecosyste.ms: Awesome

https://github.com/vshulcz/youtube_crawler

A simple YouTube crawler that lets you quickly collect data from channels, view and sort it in a table, run SQL queries, and perform advanced searches by various parameters.

crawler database gui osint parser python requests reverse-engineering sql tkinter youtube

Last synced: 16 Jan 2026

https://github.com/ivan-alone/instastories-saver-cpp

Program for saving Instagram Stories, rewritten in C++

api backup crawler grambler gramblr insta instagram instagram-stories instastories-saver instastory stories

Last synced: 15 Mar 2026

https://github.com/wangshouh/sdufelib_seat_crawler

SDUFE Library Reservation Seat Monitoring Crawler

crawler python

Last synced: 12 Feb 2026

https://github.com/karambir/ugc-colleges

Python script to extract college names from the UGC (India) website.

college crawler extract html-parser python python-script ugc

Last synced: 19 Apr 2025

https://github.com/juliandavidmr/raptor

Lightweight tool for scanning websites that works as a spider. Once executed, it starts scanning pages looking for websites to visit, with automatic indexing.

crawler kotlin mysql spider

Last synced: 20 Oct 2025

https://github.com/supadata-ai/js

Official TypeScript/JavaScript SDK for the Supadata API.

ai crawler llm markdown scraper transcript web-crawler youtube

Last synced: 22 Mar 2025

https://github.com/typingmonk/mnd_adiz_news_crawler

Web crawler that targets mnd.gov.tw posts related to ADIZ (Air Defense Identification Zone) reports.

crawler

Last synced: 10 Jul 2025

https://github.com/pjt3591oo/news-crawler

crawler data python

Last synced: 10 Apr 2025

https://github.com/robmch/mindfactory_crawling

A Python 3 Crawler for Mindfactory.de

crawler crawling data webcrawler webcrawling

Last synced: 07 May 2025

https://github.com/xcrypt0r/hyacinth

🌸 Dcinside image crawler with a dead-simple structure

beautifulsoup4 crawler dcinside parsing pyqt5 pyside2

Last synced: 28 Apr 2025

https://github.com/floscha/genius-lyrics-crawler

A concurrent crawler to retrieve song lyrics from Genius

celery crawler fluentd genius lyrics mongodb python

Last synced: 30 Apr 2025

https://github.com/pyaesoneaungrgn/2d-crawler

2D crawler for set.or.th

2d 2d-crawler crawler myanmar php

Last synced: 28 Apr 2025

https://github.com/archan937/webhead

An easy-to-use Node web crawler storing cookies, following redirects, traversing pages and submitting forms.

api cookies crawler fetch file-uploads forms headless json node redirects scraper spider traversing

Last synced: 25 Apr 2025

https://github.com/leomaurodesenv/smm-course-search

A package for searching Super Mario Maker courses

bookmark-site crawler javascript json mario-game mario-maker nodejs

Last synced: 01 Apr 2025

https://github.com/lon9/arxiv

For scraping arxiv.org

arxiv crawler golang

Last synced: 24 Mar 2025

https://github.com/cr0hn/feed-to-exporter

Get an RSS feed and export it as a WordPress post

crawler feed rss wordpress

Last synced: 30 Oct 2025

https://github.com/choi-jiwoo/naver-place-scraper

Scrape reviews from Naver Place

crawler python scraper

Last synced: 14 Jan 2026

https://github.com/roccomuso/is-bing

Verify that a request is from Bing crawlers using Bing's DNS verification steps

bing bot check crawler dns ip js nodejs verify

Last synced: 27 Aug 2025

https://github.com/librecodecoop/querido-diario-php

Brazilian government gazettes, accessible to everyone.

civic-tech crawler data-science gazette-crawler governments-gazettes govtech hacktoberfest open-data php php7 politics spider

Last synced: 18 Oct 2025

https://github.com/sayakie/pixiv-crawler

Crawls images from Pixiv 🚀

crawler nodejs pixiv typescript

Last synced: 21 Mar 2025

https://github.com/spencerlepine/readme-crawler

A Node.js web crawler to download README files and follow contained links. Fetch repositories from a valid GitHub URL

crawler javascript node nodejs readme scraper web-crawler webcrawer

Last synced: 03 May 2025

https://github.com/hktalent/scrapysite

ScrapySite, Go web crawler (spider), scraping, intelligence gathering

crawler elasticsearch go scraping site spider web

Last synced: 14 May 2025

https://github.com/aminehsan/crawler-divar.ir

Analyzing and Extracting Insights from Ads on 'divar.ir'

crawler data-mining data-science divar-ir scarping

Last synced: 14 Oct 2025

https://github.com/leelow/nightmare-screenshot-selector

👻 📷 A Nightmare plugin to easily take screenshots.

crawler headless-browsers javascript js nightmare nightmarejs nodejs plugin webcrawler

Last synced: 12 Apr 2025

https://github.com/zebbern/dezcrwl

🕷️ | dezcrwl is a website history crawler that gathers hidden information, checks extracted .js endpoints for vulnerabilities, and much more!

crawl crawler crawler-python crawlers ctf-tools hacking historical-data information information-gathering information-retrieval information-security infosec osint osint-tool pentesting-tools python reconnaissance tool web website

Last synced: 14 Apr 2025

https://github.com/aprilnea/xjtlu

This is how to get all the network resources of XJTLU.

crawler gateway http-auth python spider web-crawler xjtlu

Last synced: 01 Aug 2025

https://github.com/natlee/myanimelist-comment-crawler

Crawl all reviews and information for anime works on MyAnimeList. ;)

anime crawler data-analysis data-mining data-science kaggle kaggle-dataset myanimelist python requests scrapy-crawler sqlite

Last synced: 14 Apr 2025

https://github.com/farkaskid/webcrawler

Simple and fast web crawler.

crawler go golang goroutines web webcrawler

Last synced: 14 Jan 2026

https://github.com/holmofy/spring-spider

Spring Spider App Utility Library.

crawler java spider spring spring-spider

Last synced: 17 Mar 2025

https://github.com/filipefilardi/wpp-broadcaster

Crawler made with Selenium and Python to constantly receive video/audio from a target and broadcast it to a list of contacts.

broadcast crawler python selenium

Last synced: 30 May 2026

https://github.com/basemax/firstselenium

Some sample code for using Selenium in Python, just for fun.

crawl crawler crawlers crawling python python-selenium python3 selenium selenium-example selenium-py selenium-python selenium-sample selenium-tests selenium-website

Last synced: 05 May 2025

https://github.com/xdk78/grabbi

grabbi: a simple web scraper/crawler

crawler html scraper web-scraper

Last synced: 14 Apr 2025

https://github.com/kstrassheim/datawarehouse-crawler

This is a content and schema crawler tool to receive, update and import various kinds of data into an on-premises or cloud-based SQL Server or Azure Synapse Analytics (Azure Data Warehouse SQL Server). As sources it supports SQL Server tables, OData endpoints, CSV files and Excel files. For multiple sources it can run in parallel mode, creating a thread for each connection. The specialty of this crawler is that it creates the target tables itself using the additional info from source.json. For Azure Synapse Analytics it estimates the distribution type and keys. Syncing works entirely without SQL transactions by using a consistency correction algorithm for frequently updated fact tables. There are 5 syncing algorithms (see Manual/Insert) to choose from, as well as one update algorithm.

azure-data-warehouse azure-synapse-analytics business-intelligence crawler csv data-import data-science datawarehouse datawarehousing docker dotnet-core-2 excel integration-testing odata parallel-computing sql

Last synced: 28 Apr 2026

https://github.com/oxylabs/web-crawler

Web Crawler is a tool used to discover target URLs, select the relevant content, and have it delivered in bulk. It crawls websites in real-time and at scale to quickly deliver all content or only the data you need based on your chosen criteria.

api crawler github-python scraper web-crawler web-crawler-python web-scraping web-scraping-api webscraping

Last synced: 01 Aug 2025

https://github.com/mrmarble/mineseek

Minecraft server scanner

crawler minecraft minecraft-server scanner slp

Last synced: 07 Apr 2026

https://github.com/crwlrsoft/laravel-crawler

Laravel adapter for the crwlr/crawler package.

crawler crawling crawling-framework hacktoberfest laravel laravel-package php scraper scraping web-crawler web-crawling web-scraping

Last synced: 28 Feb 2025

https://github.com/giscafer/airlevel-crawler

A demo crawler for air-level.com

crawler java nodejs

Last synced: 28 Apr 2025

https://github.com/vitorebatista/horoscopefree

Astrology REST API for daily horoscopes

crawler horoscope horoscope-crawler horoscopes-api

Last synced: 14 Oct 2025

https://github.com/inishchith/python-scripts

Some Scripts & Projects

crawler python-script python3 scripts youtube

Last synced: 19 Jul 2025

https://github.com/basemax/instagramseleniumhashtagimagepython

Instagram Selenium Python: A selenium-based crawler to extract images from special hashtags on Instagram.

crawler crawler-python crawlers instagram python python-selenium selenium selenium-python

Last synced: 15 May 2026

https://github.com/hctilg/pinterest-crawler

Downloads all images that match a search

crawler pinterest

Last synced: 12 Apr 2025

https://github.com/moqsien/scrapx

A customized and enhanced version of Scrapy for managing hundreds or even thousands of spiders.

crawler framework pymongo scrapy spider

Last synced: 10 Jul 2025

https://github.com/coghost/iparse

Extracts HTML/JSON content identified by CSS selectors (with bs4), with YAML config support

crawler parser parser-library python xkcd yaml

Last synced: 12 Oct 2025

https://github.com/fzdwx/go-pachong

Go web crawler that continuously crawls data starting from an entry URL

crawler go golang

Last synced: 28 Apr 2025

https://github.com/yanliu1111/dashboard-flask-echarts

📊 Pandemic Monitoring Realtime Dashboard

crawler echarts flask mysql selenium-webdriver

Last synced: 20 Jun 2025

https://github.com/zurdi15/nbz

Bot to automate internet browsing

automation bot browser-automation browsermob-proxy crawler selenium testing web

Last synced: 10 Jun 2025

https://github.com/x-tropy/docroll

Turn complex programming knowledge 📚 into engaging, AI-powered video 📺 lessons.

ai animation course crawler documentation slides tutorial video

Last synced: 11 Oct 2025

https://github.com/jimmylaurent/node-crawling-framework

✨ NodeJs crawling & scraping framework heavily inspired by Scrapy

crawler crawling crawling-framework elasticsearch headless-chrome middleware mongodb nodejs-framework scraper scraping scraping-framework scrapy spider

Last synced: 15 Mar 2025

https://github.com/developerdavi/meli-crawler

Basic web crawler API for getting products from MercadoLibre (BRL | MLB)

api crawler meli-crawler mercadolibre mercadolibre-sdk mercadolivre mercadolivre-sdk nextjs now products react zeit

Last synced: 12 Apr 2025

https://github.com/danielmorell/se_bot_checker

Validate search engine user agents and IP addresses.

crawler googlebot python search-engine spider

Last synced: 15 Apr 2025

https://github.com/itszeeshan/crawlinit

A web crawler written in python3

appsec bugbounty bugbounty-tool bugbountytips crawler crawler-python enumeration infosec python recon reconnaissance scanner url web

Last synced: 13 Jun 2025

https://github.com/sujinleeme/koreamarathonapi

APIs of Marathon Events in Korea

crawler korea marathon-events python3

Last synced: 23 Jun 2025

https://github.com/wentsingnee/covid-19_crawler

Crawler for COVID-19 outbreak updates

cplusplus crawler

Last synced: 23 Apr 2025

https://github.com/hxr16f/ss-grabber

Automation script for downloading user screenshots.

automation crawler downloader grabber lightshot screenshot script

Last synced: 20 Jul 2025

https://github.com/giant-stone/gmq

A simple message queue that supports custom consumption rates. Simple, reliable, lightweight and efficient task queue in Go

crawler message-queue redis task-manager

Last synced: 12 Jan 2026

https://github.com/feedeo/youtube-channel-crawler

YouTube Channel :tv: Crawler

crawler youtube youtube-channel

Last synced: 06 Feb 2026

https://github.com/mirocow/yii2-crawler

Concurrent HTTP crawler for Yii2

concurrency crawler guzzle yii2-extension

Last synced: 27 May 2026

https://github.com/iml1111/toonkor_collector

Toonkor comics collector

crawler python

Last synced: 13 Jul 2025

https://github.com/spire-rs/spire

🗼 A flexible async framework for building high-performance crawlers and scrapers, designed for developers who need extensible pipelines, strong concurrency, and robust middleware support.

crawler framework scraper webdriver

Last synced: 21 Jan 2026

https://github.com/hangyan/generate-cs-word-dict

Generate a word dict for CS from stackoverflow/github tags

crawler dict github python word

Last synced: 29 Oct 2025

https://github.com/gatenlp/wpextract

Create datasets from WordPress sites for research or archiving

corpus crawler nlp text-extraction text-mining web-scraping wordpress

Last synced: 25 Jun 2025

https://github.com/tikazyq/github-crawler

Github repositories crawler

crawler scrapy

Last synced: 04 Apr 2025

https://github.com/zain-ul-din/lgu-crawler

LGU timetable Crawler

contribute crawler lahore-garrison-university lahore-garrison-university-timetable open-source

Last synced: 08 Aug 2025

https://github.com/synacktraa/crawl

Web crawler designed to efficiently retrieve unique href, script and form links from a web application.

bash crawler regex shell web-spidering

Last synced: 06 Apr 2026

https://github.com/kernelerr/pixivsync

Pixiv image download and sync tool

crawler pixiv pixiv-crawler python

Last synced: 14 May 2025

https://github.com/eished/tujigu_crawler

Multi-threaded Node.js crawler for tujigu.com (图集谷)

crawler node nodejs

Last synced: 28 Apr 2026

https://github.com/vmdang/historycrawler

An OOP project that collects and displays historical data about Vietnam

crawler gson java javafx jsoup

Last synced: 17 Jun 2025

https://github.com/firesjoeng/bfo

Bilibili Followers Observer | Real-time Bilibili follower count monitor

bilibili crawler python

Last synced: 13 Apr 2025

https://github.com/elliotxx/readnewspaper

Automatically fetches electronic editions of newspapers for convenient daily reading

crawler lxml newspaper pypdf2 python requests

Last synced: 12 Apr 2025

https://github.com/samnoh/cliboards

⌨️ Surf your online communities on CLI

cli-application crawler javascript

Last synced: 17 Jan 2026

https://github.com/code-inside/sloader

Worker that loads and retrieves data from "slow" endpoints.

crawler drop json yml

Last synced: 03 Sep 2025

https://github.com/serkan-ozal/driflyte-mcp-server

The Driflyte MCP Server exposes tools that allow AI assistants to query and retrieve topic-specific knowledge from recursively crawled and indexed web pages.

ai crawler mcp model-context-protocol opentelemetry rag

Last synced: 05 Oct 2025

https://github.com/liyifeng1994/go-crawler

Distributed crawler project based on Golang

crawler elastic elasticsearch golang

Last synced: 01 May 2025

https://github.com/foolin/scrago

A simple, fast, extensible page-crawling framework for Golang

crawler go scrago scrapy

Last synced: 24 Feb 2025

https://github.com/1uc1f3r616/dark-net-websites-dataset

Dataset of Onion Websites

crawler darknet data-analysis dataset onion search-engine website

Last synced: 27 Feb 2025

https://github.com/kulkultech/asos-crawler

Asos Crawler for Apify

apify asos crawler made-in-indonesia scrapper

Last synced: 27 Jan 2026

https://github.com/pjt3591oo/rust-exchange-crawler

A crawler built while learning Rust

crawler rust

Last synced: 16 May 2025

https://github.com/cuerz/douban-top

Golang crawler that scrapes Douban rankings

crawler douban golang goquery

Last synced: 08 Feb 2026

https://github.com/feliz-szk/berserk

Berserk: Crawler to increase web traffic (based on Tor and Privoxy)

anonymizer anonymous-proxy command-line-tool crawler linux privoxy python scraping-websites tor webtraffic-increaser

Last synced: 21 Apr 2026

https://github.com/omerdogan3/kitapp-crawler

Web crawler application for KitApp: gets data from booksellers and inserts it into a database.

book bookseller crawler mysql nodejs puppeteer scrapper-script web-crawler

Last synced: 04 May 2026

https://github.com/birkhofflee/blizzard_forum.js

An unofficial Node.js API for Blizzard Forums. (works in 2019)

api crawler web

Last synced: 26 Apr 2026

https://github.com/haxzie-xx/crode.js-node-web-crawler

Node.js Crawler built for open FTP sites for movie link collection.

crawler nodejs

Last synced: 01 May 2026

https://github.com/bakhirev/assayo-crawler

📈 Visualization and analysis of your git repository data.

audit commit crawler data-visualization git report statistics

Last synced: 05 Mar 2026

https://github.com/laurybueno/monibus

Bus monitoring API for São Paulo

api crawler django docker mapping sptrans

Last synced: 08 May 2026

https://github.com/somnisomni/trawler-csharp

The successor of https://github.com/somnisomni/twitter-account-data-crawler, written in .NET C#

crawler crawling csharp dotnet follower-tracker selenium selenium-csharp twitter twitter-crawler twitter-crawling twitter-scraper

Last synced: 06 May 2026

https://github.com/ayusharma/rss-parser

A simple crawler in ReactJS

crawler reactjs rss-parser

Last synced: 16 Apr 2026

https://github.com/techguy-bhushan/web-spider

Multi-threaded web crawler

crawler python web-spider

Last synced: 27 Mar 2026

https://github.com/0000xffff/webgrab

web page: crawler / file scanner / downloader

crawler download downloader scrape scraper webcrawler

Last synced: 17 Apr 2026

https://github.com/zhaotianff/crawler-line

C# command-line crawler

command-line command-line-tool crawler csharp dotnet-core

Last synced: 07 Jun 2026

https://github.com/kkamara/php-scraper

:office: (Live Link) (2022) Use PHP technologies to crawl and click buttons on websites with a GUI. I highly recommend working with Linux (including virtual machines) or macOS. Laravel 11.

bot crawler laravel scraper spider

Last synced: 01 Apr 2026

https://github.com/zhaotianff/qzone

Remembering that run under the setting sun that day; that was my lost youth

crawler crawling-sites csharp qzone qzone-photos qzone-spider wpf

Last synced: 24 Apr 2026

https://github.com/maicss/1024img

1024 image nodejs crawler

1024 crawler nodejs

Last synced: 03 May 2026

https://github.com/achannarasappa/locust-cli

Developer tools to accelerate development of Locust jobs

cli crawler headless-chrome puppeteer scraper

Last synced: 26 Apr 2026

https://github.com/wujunchuan/xiamen-housing-data-collection

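Uses (planned) GitHub Actions to periodically collect online contract-signing data for new and second-hand homes from the Xiamen Municipal Housing Security and Administration Bureau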

crawler docker docker-image github-actions nodejs ocr spider tesseract-ocr xiamen

Last synced: 04 Apr 2026

https://github.com/bitlytwiser/tormonger

Recursive Tor network crawler

crawler go golang tor

Last synced: 18 Jun 2026

https://github.com/huzecong/film-spider

Spiders crawling for film listing websites.

crawler

Last synced: 09 Jun 2026

https://github.com/capturr/price-extract

Performant way to extract the price amount and metadata (currency, decimal and thousands separators) from any string.

amount crawler crawling currencies currency extract extractor javascript nodejs parser parsing price scraper scraping spider typescript

Last synced: 30 Apr 2026

https://github.com/tokenmill/crawling-framework-example

Demonstration of how to use the Crawling Framework to set up a simple science news crawler and store results in ElasticSearch. Use this configuration to set up your own crawler.

crawler crawling-framework elasticsearch storm-crawler

Last synced: 08 May 2026

https://github.com/idanhoro/nasa-heat-maps-prediction

In this project we research the correlations between different weather conditions and try to predict future scenarios by using image processing and traditional machine learning algorithms

beautifulsoup crawler machine-learning pillow prediction python sklearn

Last synced: 05 Apr 2026

https://github.com/crackcomm/go-google-search

Google search NSQ worker

crawler google google-search search

Last synced: 16 Feb 2026