Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with scraper

A curated list of projects in awesome lists tagged with scraper.

https://github.com/twiny/spidy

Domain names collector - Crawl websites and collect domain names along with their availability status.

backlinks crawler domain expired-domain golang scraper seotools spider

Last synced: 01 Aug 2024

https://github.com/HarryShomer/Hockey-Scraper

Python Package for scraping NHL Play-by-Play and Shift data

hockey nhl python scraper sports web-scraping

Last synced: 01 Aug 2024

https://github.com/zehina/webtoon-downloader

Webtoons Scraper able to download all chapters of any series wanted.

manhwa manhwa-scraper python python3 scraper webtoon-crawler webtoon-downloader webtoons webtoons-downloader

Last synced: 28 Sep 2024

https://github.com/Zehina/Webtoon-Downloader

Webtoons Scraper able to download all chapters of any series wanted.

manhwa manhwa-scraper python python3 scraper webtoon-crawler webtoon-downloader webtoons webtoons-downloader

Last synced: 01 Aug 2024

https://github.com/voliveirajr/seleniumcrawler

An example using Selenium webdrivers for python and Scrapy framework to create a web scraper to crawl an ASP site

asp-net python scraper scraping scraping-websites scrapper scrapy selenium selenium-webdriver webcrawler webcrawling

Last synced: 28 Sep 2024

https://github.com/alash3al/scraply

Scraply: a simple DOM scraper to fetch information from any HTML-based website

crawler crawling dom golang scraper scrapers scraping-websites scrapy server

Last synced: 01 Aug 2024

https://github.com/urbanadventurer/bing-ip2hosts

bingip2hosts is a Bing.com web scraper that discovers websites by IP address

bing discovery hostnames ipaddress kali kali-linux osint osint-reconnaissance osint-tool reconnaissance scraper search-engine webscraping

Last synced: 31 Jul 2024

https://github.com/sonnylazuardi/reactriot2017-dotamania

🌐 Web scraping made easy with the visual 🗺 mind map editor to JSON

demo hackathon mindmap reactjs scraper web

Last synced: 31 Jul 2024

https://github.com/integrii/headlessChrome

A Go package for working with headless Chrome. Run interactive JavaScript commands on web pages with Go and Chrome.

chrome cli go headless macos package scraper testing

Last synced: 02 Aug 2024

https://github.com/basilioss/obsidian-scrapers

Get information from link for Obsidian

obsidian obsidian-md parser scraper

Last synced: 01 Aug 2024

https://github.com/ohmybahgosh/YT-DLP-SCRIPTS

...Just a place for me to share my various YT-DLP & related bash scripts.

bash bash-script downloading ffmpeg ffmpeg-script parser scraper shell-script youtube-dl yt-dlp

Last synced: 31 Jul 2024

https://github.com/Linch1/WeChartWeb3

Build a poocoin clone: scrape all the prices from PancakeSwap or any similar DEX, build a historical record, and offer an API to your users.

blockchain bsc cryptocurrency dex dextools ethereum historical-data nodejs pancakeswap poocoin scraper uniswap

Last synced: 02 Aug 2024

https://github.com/lachlanjc/predictcovid

Visualize & track the 2020 COVID-19 pandemic by country.

coronavirus covid-19 covid19 dataviz prisma2 redwoodjs scraper

Last synced: 02 Oct 2024

https://github.com/cdimascio/essence

Automatically extract the main text content (and more) from an HTML document

extractor hacktoberfest html-extractor scraper web-content-extractor webpage-extractor website-extractor

Last synced: 02 Oct 2024

https://github.com/Crinibus/scraper

Web scraper for scraping, tracking and visualizing prices of products on various websites.

amazon avcables computersalg coolshop ebay elgiganten expert komplett mm-vision newegg prices products proshop python scrape-prices scraper sharkgaming shein tech-scraper web-scraping

Last synced: 01 Aug 2024

https://github.com/sedgwickz/jsonHunter

Online crawler, online web scraper

scraper

Last synced: 04 Aug 2024

https://github.com/codingforentrepreneurs/Web-Scraping

Learn how to leverage Python's amazing tools to scrape data from other websites. The end goal of this course is to scrape blogs to analyze trending keywords and phrases. We'll be using Python 3.6, Requests, BeautifulSoup, Asyncio, Pandas, Numpy, and more!

aysncio beautifulsoup beautifulsoup4 joincfe numpy pandas python python-requests python3 requests scraper sraping tutorial web-scraping

Last synced: 05 Aug 2024

https://github.com/situmorang-com/whatsapp-group-contacts-scraper

How to scrape WhatsApp group contacts from https://web.whatsapp.com/

javascript scraper whatsapp whatsapp-group whatsapp-parser whatsapp-web

Last synced: 30 Sep 2024

https://github.com/shailshouryya/yt-videos-list

Create and **automatically** update a list of all videos on a YouTube channel (in txt/csv/md form) via YouTube bot with end-to-end web scraping - no API tokens required. Multi-threaded support for YouTube videos list updates.

automation bravedriver chromedriver csv firefox-headless geckodriver operadriver safaridriver scraper selenium txt youtube youtube-api youtube-channel youtube-dl youtube-downloader youtube-playlist yt yt-downloader ytdl

Last synced: 28 Sep 2024

https://github.com/cerbero90/lazy-json-pages

📜 Framework-agnostic API scraper to load items from any paginated JSON API into a Laravel lazy collection via async HTTP requests.

api json laravel lazy pagination parser php scraper scraping stream

Last synced: 12 Sep 2024

https://github.com/christophebe/serp

Google Search SERP Scraper

google scraper seo serp serps

Last synced: 01 Aug 2024

https://github.com/pavlovtech/WebReaper

Web scraper, crawler and parser in C#. Designed as simple, declarative and scalable web scraping solution.

crawler datamining parser parsing scraper scraping scraping-api scraping-data scraping-tool scraping-web scraping-websites webcrawler webscraping

Last synced: 01 Aug 2024

https://github.com/scrapehero/zillow_real_estate

Zillow.com Web Scraper written in Python and LXML to extract real estate listings available based on a zip code.

html lxml parsing python-requests scraper web-scraping

Last synced: 01 Aug 2024

https://github.com/jadkins89/Recipe-Scraper

A JS package for scraping recipes from the web.

food-recipes recipe-scraper recipes scraper

Last synced: 01 Aug 2024

https://github.com/st1vms/unofficial-claude-api

Unofficial Claude API supporting direct HTTP chat creation/deletion/retrieval, messages with multiple file attachments and auto session gathering using Firefox with geckodriver.

api assistant chatbot claude claude-api claude3 documented easy-to-use file-attachment firefox free image-processing image-recognition large-file-upload long-text python scraper selenium summarizer unofficial-api

Last synced: 01 Aug 2024

https://github.com/entrepreneur-interet-general/OpenScraper

An open source webapp for scraping: towards a public service for webscraping

bulma entrepreneur-interet-general html mongodb python python2 scraper scrapy spider tornado xpath

Last synced: 01 Aug 2024

https://github.com/entrepreneur-interet-general/openscraper

An open source webapp for scraping: towards a public service for webscraping

bulma entrepreneur-interet-general html mongodb python python2 scraper scrapy spider tornado xpath

Last synced: 30 Sep 2024

https://github.com/mondeja/pymarketcap

Python3 API wrapper and web scraper for https://coinmarketcap.com

api asyncio c coinmarketcap cryptocurrencies cryptotrading cython graphs libcurl pypi python scraper trading urllib

Last synced: 01 Oct 2024

https://github.com/cowboy-bebug/app-store-scraper

Single API ☝ App Store Review Scraper 🧹

app-store appstore review-data scraper

Last synced: 01 Aug 2024

https://github.com/henson/Scraper

Tracking the most popular GitHub repos, updated daily.

go markdown scraper

Last synced: 01 Aug 2024

https://github.com/ArchiveTeam/wget-lua

Wget-AT is a modern Wget with Lua hooks, Zstandard (+dictionary) WARC compression and URL-agnostic deduplication.

archiveteam archiving crawl crawler crawlers crawling downloader ftp lua scraper scraping spider warc webarchiving wget wget-lua zstd

Last synced: 06 Aug 2024

https://github.com/html2rss/html2rss-web

🕸 Generates and delivers RSS feeds via HTTP. Docker image available! Create your own feeds or get started quickly with the included configs.

builder docker feed feed-configs html2rss html2rss-configs roda rolling-release rss rss-aggregator rss-feed rss-feed-scraper ruby scraper serves webfeed webfeeds website-scraper

Last synced: 30 Jul 2024

https://github.com/gurbaaz27/linkedin-comments-scraper

Script to scrape comments (including name, profile link, profile picture, designation, email if present, and comment text) from a LinkedIn post, given the URL of the post.

linkedin linkedin-comments-scraper linkedin-post python python3 scraper selenium selenium-python selenium-webdriver webscraping

Last synced: 28 Sep 2024

https://github.com/qeeqbox/osint

Build custom OSINT tools and APIs (Ping, Traceroute, Scans, Archives, DNS, Scrape, Whois, Metadata & built-in database for more info) with this python package

dns osint ping python scan scraper tool traceroute whois

Last synced: 01 Aug 2024

https://github.com/5agado/conversation-analyzer

Analyzer and statistics generator for text-based conversations. Includes Facebook scraper and parser

data-science facebook quantified-self scraper

Last synced: 01 Aug 2024

https://github.com/bellingcat/reddit-post-scraping-tool

Given a subreddit name and a keyword, this program returns all top (by default) posts that contain the specified keyword.

command-line gui open-source-research python reddit scraper visual-basic

Last synced: 02 Aug 2024

https://github.com/philshem/gmaps_popular_times_scraper

Scraper for Google Maps "Popular Times" for place entries

google-maps python3 scraper scrapers

Last synced: 01 Aug 2024

https://github.com/aofdev/instagram-get-images

Instagram get images 🌄 (hashtags, account, locations) with puppeteer

hacktoberfest images instagram instagram-scraper puppeteer scraper

Last synced: 03 Aug 2024

https://github.com/piquette/qtrn

A cli tool to streamline financial markets data analysis :wrench:

cli data data-science finance go golang options quotes scraper stock stock-analysis stock-market

Last synced: 01 Aug 2024

https://github.com/linkpreview/linkpreview

Open Graph, Twitter Card, Oembed preview. Shows visual cards that mimic link previews in Social Media like facebook, twitter, vk and other sites that support link preview.

cheeriojs linkpreview nodejs oembed opengraph react reactjs redux scraper scraping twittercard

Last synced: 30 Jul 2024

https://github.com/mahesh-hegde/rrip

Bulk image downloader for reddit.

downloader golang reddit scraper

Last synced: 01 Aug 2024

https://github.com/philipjkim/goreadability

Webpage summary extractor using Facebook Open Graph and arc90's readability

opengraph readability scraper

Last synced: 03 Aug 2024

https://github.com/openbytedev/sourcescraper

Simple library which helps you to retrieve the source of various video streaming sites.

extractor nodejs npm-package scraper scrapers scraping scraping-tool source-extraction

Last synced: 28 Sep 2024

https://github.com/scrapehero/yellowpages-scraper

Yellowpages.com Web Scraper written in Python and LXML to extract business details available based on a particular category and location.

business-directory extract html lxml parsing python scraper web-scraper yellow-pages yellow-pages-scraper

Last synced: 01 Aug 2024

https://github.com/OpenByteDev/SourceScraper

Simple library which helps you to retrieve the source of various video streaming sites.

extractor nodejs npm-package scraper scrapers scraping scraping-tool source-extraction

Last synced: 31 Jul 2024

https://github.com/lapwat/reCatchable

Turn a site into a book. Download a whole website and upload it to your reMarkable.

ebook epub remarkable remarkable-tablet remarkable-tablets scrape scraper

Last synced: 01 Aug 2024

https://github.com/rodolflying/GPT_scraper

This repository provides a way to scrape a user's full ChatGPT history (or to use ChatGPT) through two methods: a frontend "hidden" API or Selenium, each with its own advantages. It can help avoid spending API credits while still using ChatGPT programmatically.

automation chatgpt chrome gpt4 scraper selenium webdriver

Last synced: 05 Aug 2024

https://github.com/daijro/SearchifyX

Fast flashcard searcher study tool

education quizizz quizlet scraper webscraper webscraping

Last synced: 04 Aug 2024

https://github.com/absingh31/tor_spider

Python project to crawl and scrape the lesser-known deep web or, as some call it, the dark web. Just provide the onion link and get started.

crawler file-manager ioc python3 scraper scraping socks stem tor tor-config tor-spider

Last synced: 03 Aug 2024

https://github.com/fa0311/twitter-openapi-typescript

Implementation of Twitter internal API (Twitter graphql API) in TypeScript

graphql openapi scraper twitter typescript undocumented unofficial

Last synced: 31 Jul 2024

https://github.com/orsifrancesco/instagram-without-api-node

Simple Node.js code to get unlimited public Instagram pictures from any user, without the API and without credentials.

instagram instagram-api instagram-scraper node node-js nodejs scraper scraping scraping-api without-api

Last synced: 02 Aug 2024

https://github.com/Matthew17-21/Captcha-Tools

All-in-one Python (and now Go!) module to help solve captchas with the Capmonster, 2captcha, Anticaptcha, and Capsolver APIs!

2captcha 2captcha-api anticaptcha anticaptcha-client capsolver capsolvercom captcha hcaptcha recaptcha scraper scraping scraping-api sneakerbot sneakerbots sneakers

Last synced: 01 Aug 2024

https://github.com/SaulLawliet/watchdog

IF (an API or web page changes) THEN (notify you)

monitor scraper watchdog

Last synced: 04 Aug 2024

https://github.com/inverse/termin

Simple PHP script that notifies you of free appointments on the Berlin services website.

abmeldung berlin php scraper telegram

Last synced: 05 Sep 2024

https://github.com/fa0311/TwitterFrontendFlow

Unofficial Client for Twitter Internal API

scraper twitter twitter-bot unofficial

Last synced: 04 Aug 2024

https://github.com/ozencb/yts-scraper

Download .torrent files from YTS YIFY

downloader python scraper torrent-files yify yts yts-scraper

Last synced: 01 Aug 2024

https://github.com/nazliander/scrape-nr-of-deaths-istanbul

A scraper and simple time series analysis example with Selenium and Seaborn.

docker scraper selenium-python

Last synced: 12 Aug 2024

https://github.com/LexiestLeszek/scrapeGPT

ScrapeGPT is a RAG-based Telegram bot designed to scrape and analyze websites, then answer questions based on the scraped content. The bot utilizes Retrieval Augmented Generation and webscraping to return natural language answers to the user's queries.

crawler huggingface large-language-models llm ollama proxy rag retrieval-augmented-generation robots-txt scraper telegram-bot website-scraper

Last synced: 01 Aug 2024

https://github.com/hfreire/browser-as-a-service

A web browser :earth_americas: hosted as a service, to render your JavaScript web pages as HTML

browser browser-as-a-service crawler docker github-actions javascript puppeteer rest-api scraper server webcrawler

Last synced: 02 Aug 2024

https://github.com/a11ywatch/crawler

gRPC web crawler turbo charged for performance

a11ywatch crawler grpc scraper

Last synced: 29 Sep 2024

https://github.com/ReedD/crawler

Chromium / Puppeteer site crawler

bot chromium crawler puppeteer redis scraper

Last synced: 30 Jul 2024

https://github.com/bjesus/pipet

a swiss-army tool for scraping and extracting data from online assets, made for hackers

css curl gjson json playwright scraper scraping

Last synced: 01 Oct 2024

https://github.com/oscarmorrison/nightmare-heroku

😱 a setup for nightmarejs on heroku

heroku nightmare nightmarejs node scary scraper

Last synced: 06 Aug 2024

https://github.com/orsifrancesco/instagram-without-api

Simple PHP code to get unlimited public Instagram pictures from any user, without the API and without credentials.

instagram instagram-api instagram-scraper php scraper scraping without-api

Last synced: 02 Aug 2024

https://github.com/crackernutter/EsriRESTScraper

A Python class that scrapes ESRI Rest Endpoints and exports data to a geodatabase

arcgis-server esri featureclass geodatabase geometry ijson polygon python rest-api schema scraper

Last synced: 13 Aug 2024

https://github.com/tibobrc/Blinkist-to-Readwise

Extract highlights from your Blinkist account and upload them to your Readwise account, or download them to a CSV file.

blinkist blinkist-highlights blinkist-to-readwise highlights python readwise readwise-highlights scraper

Last synced: 02 Aug 2024

https://github.com/lorepozo/magnet

Search for a torrent from the command-line and start streaming

magnet-link scraper stream torrent

Last synced: 31 Jul 2024

https://github.com/greenpeace/gpes-check-my-pages

Scraping script used to test the Spanish web archive and redirect system, covering more than 10,000 pages. It checks redirections, HTTP responses, analytics, files hosted on soon-to-die servers, canonical URLs and more.

command-line-tool csv golang scraper

Last synced: 03 Aug 2024

https://github.com/openzim/warc2zim

Command line tool to convert a file in the WARC format to a file in the ZIM format

scraper warc zim

Last synced: 09 Aug 2024

https://github.com/dobizz/TikTok

Download public videos on TikTok using Python with Selenium

chromedriver concurrency downloader javascript python3 reverse-engineering robots scraper selenium tiktok tiktok-api

Last synced: 29 Jul 2024

https://github.com/Tatsh/youtube-unofficial

Access parts of your account unavailable through normal YouTube API access.

command-line python scraper utilities utility youtube

Last synced: 31 Jul 2024

https://github.com/sanghviharshit/pocket-tagger

📖👓🏷Tag your getpocket.com articles automatically using natural language processing

articles getpocket google-cloud natural-language-processing nlp pocket scraper tag

Last synced: 31 Jul 2024

https://github.com/bellingcat/vk-url-scraper

Scrape VK URLs to fetch info and media - python API or command line tool.

command-line media-downloader open-source-research python scraper vk

Last synced: 26 Sep 2024

https://github.com/tamarasaurus/immo-feed

An extensible app for scraping property listings

api immobilier real-estate scraper

Last synced: 12 Aug 2024

https://github.com/donderjoekel/Mangarr

An *arr inspired approach to downloading manga using individual sources

manga manga-scraper manhua manhua-scraper manhwa manhwa-scraper scraper

Last synced: 01 Aug 2024

https://github.com/kalbhor/Image-Scraper

Fast concurrent image scraper

golang image-scraper multithreading scraper

Last synced: 04 Aug 2024

https://github.com/mattmoony/d4v1d

Social-Media OSINT tool - gather info on users across multiple platforms; easily extensible by design. 📷

graph information-gathering instagram network osint py python recon reconnaissance scraper social-network web

Last synced: 01 Oct 2024

https://github.com/RandomNinjaAtk/docker-raromprocessor

RA ROM Processor is a Docker container that automatically acquires, organizes, processes, verifies, dedupes and scrapes a ROM library by matching ROMs to the RetroAchievement.org website hash database.

bash emulationstation rahasher retroachievements retrogaming roms scraper script

Last synced: 01 Aug 2024

https://github.com/xiaoluoboding/vercel-metafy

Easily scrape metadata from websites as a service using Vercel.

metadata scraper serverless-functions vercel

Last synced: 10 Aug 2024

https://github.com/yjl9903/AnimeGarden

動漫花園 third-party mirror site and anime torrent aggregation site

animation anime anime-tracker animegarden animelist animespace anitomy bangumi dmhy scraper torrent

Last synced: 06 Aug 2024