Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with scraper
A curated list of projects in awesome lists tagged with scraper.
https://github.com/cantino/huginn
Create agents that monitor and act on your behalf. Your agents are standing by!
agent automation feed feedgenerator huginn monitoring notifications rss scraper twitter twitter-streaming webscraping
Last synced: 31 Jul 2024
https://github.com/huginn/huginn
Create agents that monitor and act on your behalf. Your agents are standing by!
agent automation feed feedgenerator huginn monitoring notifications rss scraper twitter twitter-streaming webscraping
Last synced: 29 Sep 2024
https://github.com/NaiboWang/EasySpider
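A visual no-code/code-free web crawler/spider. EasySpider: a visual browser automation testing, data collection, and crawler tool that lets you design and run crawling tasks graphically without writing code. Also known as ServiceWrapper, an intelligent service wrapping system for web applications.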
batch-processing batch-script code-free crawler data-collection frontend gui html input-parameters layman parameters robotics rpa scraper spider visual visualization visualprogramming web www
Last synced: 31 Jul 2024
https://github.com/cheeriojs/cheerio
The fast, flexible, and elegant library for parsing and manipulating HTML and XML.
cheerio dom hacktoberfest html htmlparser htmlparser2 jquery parser scraper selector
Last synced: 29 Sep 2024
https://github.com/apify/crawlee
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
apify automation crawler crawling headless headless-chrome javascript nodejs npm playwright puppeteer scraper scraping typescript web-crawler web-crawling web-scraping
Last synced: 29 Sep 2024
https://github.com/codelucas/newspaper
newspaper3k is a library for news, full-text, and article metadata extraction in Python 3. Advanced docs:
crawler crawling news news-aggregator python scraper
Last synced: 29 Sep 2024
https://github.com/feder-cr/auto_jobs_applier_aihawk
Auto_Jobs_Applier_AIHawk is a tool that automates the jobs application process. Utilizing artificial intelligence, it enables users to apply for multiple job offers in an automated and personalized way.
application-resume automate automation bot challenge chatgpt chrome gpt human-resources job jobs jobsearch jobseeker opeai python python3 resume scraper scraping selenium
Last synced: 02 Oct 2024
https://github.com/apifytech/apify-js
Crawlee—A web scraping and browser automation library for Node.js to build reliable crawlers. In JavaScript and TypeScript. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with Puppeteer, Playwright, Cheerio, JSDOM, and raw HTTP. Both headful and headless mode. With proxy rotation.
apify automation crawler crawling headless headless-chrome javascript nodejs npm playwright puppeteer scraper scraping typescript web-crawler web-crawling web-scraping
Last synced: 05 Aug 2024
https://github.com/pwxcoo/chinese-xinhua
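:orange_book: Chinese Xinhua dictionary database, including xiehouyu (two-part allegorical sayings), idioms, words, and Chinese characters.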
chinese chinese-characters chinese-language chinese-nlp chinese-simplified chinese-traditional data json json-data json-dataset python3 scraper
Last synced: 02 Oct 2024
https://github.com/guyueyingmu/avbook
AV movie management system; crawler for avmoo, javbus, and javlibrary; online AV video library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database
adult adult-video avmoo crawler database guzzlehttp javbus javlibrary laravel magnet magnet-link scraper spider
Last synced: 30 Sep 2024
https://github.com/evil0ctal/douyin_tiktok_download_api
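🚀 Douyin_TikTok_Download_API is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, with support for API calls and online batch parsing and downloading.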
api asgi async asyncio crawler douyin douyin-scraper douyin-tiktok-api douyin-tiktok-download fastapi httpx no-watermark online-parsing python pywebio scraper spider tiktok tiktok-scraper web-scraping
Last synced: 29 Sep 2024
https://github.com/mendableai/firecrawl
🔥 Turn entire websites into LLM-ready markdown or structured data. Scrape, crawl and extract with a single API.
ai ai-scraping crawler data html-to-markdown llm markdown rag scraper scraping web-crawler
Last synced: 29 Sep 2024
https://github.com/Evil0ctal/Douyin_TikTok_Download_API
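🚀 Douyin_TikTok_Download_API is an out-of-the-box, high-performance asynchronous data scraping tool for Douyin, Kuaishou, TikTok, and Bilibili, with support for API calls and online batch parsing and downloading.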
api asgi async asyncio crawler douyin douyin-scraper douyin-tiktok-api douyin-tiktok-download fastapi httpx no-watermark online-parsing python pywebio scraper spider tiktok tiktok-scraper web-scraping
Last synced: 31 Jul 2024
https://github.com/alirezamika/autoscraper
A Smart, Automatic, Fast and Lightweight Web Scraper for Python
ai artificial-intelligence automation crawler machine-learning python scrape scraper scraping web-scraping webautomation webscraping
Last synced: 30 Sep 2024
https://github.com/MontFerret/ferret
Declarative web scraping
cdp chrome cli crawler crawling data-mining dsl go golang hacktoberfest library query-language scraper scraping scraping-websites tool
Last synced: 30 Jul 2024
https://github.com/montferret/ferret
Declarative web scraping
cdp chrome cli crawler crawling data-mining dsl go golang hacktoberfest library query-language scraper scraping scraping-websites tool
Last synced: 29 Sep 2024
https://github.com/go-rod/rod
A Chrome DevTools Protocol driver for web automation and scraping.
automation cdp chrome-devtools chrome-devtools-protocol chrome-headless crawling devtools devtools-protocol go golang gorod headless rod scraper testing web web-scraping
Last synced: 30 Jul 2024
https://github.com/fent/node-ytdl-core
YouTube video downloader in javascript.
node scraper video-downloader youtube youtube-downloader
Last synced: 29 Sep 2024
https://github.com/madawei2699/mygptreader
A community-driven way to read and chat with AI bots - powered by chatGPT.
ai chatgpt crawler daily-news embedding gpt-35-turbo hot-news openai prompt reader scraper slack-bot
Last synced: 30 Sep 2024
https://github.com/madawei2699/myGPTReader
A community-driven way to read and chat with AI bots - powered by chatGPT.
ai chatgpt crawler daily-news embedding gpt-35-turbo hot-news openai prompt reader scraper slack-bot
Last synced: 31 Jul 2024
https://github.com/justanotherarchivist/snscrape
A social networking service scraper in Python
python scraper social-media social-network
Last synced: 30 Sep 2024
https://github.com/JustAnotherArchivist/snscrape
A social networking service scraper in Python
python scraper social-media social-network
Last synced: 31 Jul 2024
https://github.com/apify/crawlee-python
Crawlee—A web scraping and browser automation library for Python to build reliable crawlers. Extract data for AI, LLMs, RAG, or GPTs. Download HTML, PDF, JPG, PNG, and other files from websites. Works with BeautifulSoup, Playwright, and raw HTTP. Both headful and headless mode. With proxy rotation.
apify automation beautifulsoup crawler crawling headless headless-chrome pip playwright python scraper scraping web-crawler web-crawling web-scraping
Last synced: 30 Sep 2024
https://github.com/IonicaBizau/scrape-it
🔮 A Node.js scraper for humans.
hacktoberfest node-scraper scraper
Last synced: 31 Jul 2024
https://github.com/ionicabizau/scrape-it
🔮 A Node.js scraper for humans.
hacktoberfest node-scraper scraper
Last synced: 30 Sep 2024
https://github.com/UltimaHoarder/UltimaScraper
Scrape all the media from an OnlyFans account - Updated regularly
archive datascraping onlyfans scraper
Last synced: 31 Jul 2024
https://github.com/niespodd/browser-fingerprinting
Analysis of Bot Protection systems with available countermeasures 🚿. How to defeat anti-bot system 👻 and get around browser fingerprinting scripts 🕵️‍♂️ when scraping the web?
automation bot bot-detection browser-fingerprinting chromedriver chromium chromium-browser crawler detection fingerprinting puppeteer recaptcha scraper spider stealth web webscraping
Last synced: 27 Sep 2024
https://github.com/javscraper/emby.plugins.javscraper
A Japanese movie scraper plugin for Emby/Jellyfin that can fetch movie information from certain websites.
adult emby fanart-poster fc2 japanese jav jav-scraper javbus jellyfin jsproxy metadata plugin scraper synology
Last synced: 30 Sep 2024
https://github.com/JavScraper/Emby.Plugins.JavScraper
A Japanese movie scraper plugin for Emby/Jellyfin that can fetch movie information from certain websites.
adult emby fanart-poster fc2 japanese jav jav-scraper javbus jellyfin jsproxy metadata plugin scraper synology
Last synced: 31 Jul 2024
https://github.com/aapatre/Automatic-Udemy-Course-Enroller-GET-PAID-UDEMY-COURSES-for-FREE
Do you want to LEARN NEW STUFF for FREE? Don't worry, with the power of web-scraping and automation, this script will find the necessary Udemy coupons & enroll you for PAID UDEMY COURSES, ABSOLUTELY FREE!
python python3 scraper scraping selenium
Last synced: 01 Aug 2024
https://github.com/jae-jae/querylist
:spider: The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.
crawler querylist scraper spider
Last synced: 30 Sep 2024
https://github.com/jae-jae/QueryList
:spider: The progressive PHP crawler framework! An elegant, progressive PHP scraping framework.
crawler querylist scraper spider
Last synced: 30 Jul 2024
https://github.com/meetDeveloper/freeDictionaryAPI
There was no free Dictionary API on the web when I wanted one for my friend, so I created one.
api dictionary-api dictonary free-api google google-dictionary scraper
Last synced: 31 Jul 2024
https://github.com/facundoolano/google-play-scraper
Node.js scraper to get data from Google Play
api crawler google-play nodejs scraper
Last synced: 01 Oct 2024
https://github.com/serene-arc/bulk-downloader-for-reddit
Downloads and archives content from reddit
archive downloader gfycat imgur python reddit scraper
Last synced: 30 Sep 2024
https://github.com/aliparlakci/bulk-downloader-for-reddit
Downloads and archives content from reddit
archive downloader gfycat imgur python reddit scraper
Last synced: 31 Jul 2024
https://github.com/mishushakov/llm-scraper
Turn any webpage into structured data using LLMs
ai artificial-intelligence browser browser-automation gpt gpt-4 langchain llama llm openai playwright puppeteer scraper
Last synced: 27 Sep 2024
https://github.com/joeyism/linkedin_scraper
A library that scrapes Linkedin for user data
chrome company driver firefox linkedin linkedin-profile linkedin-scraper linkedin-url profile scraper scrapes-linkedin users
Last synced: 29 Sep 2024
https://github.com/PaulMcInnis/JobFunnel
Scrape job websites into a single spreadsheet with no duplicates.
automated beautifulsoup beautifulsoup4 csv glassdoor indeed international job jobs monster python scraper search tfidf waterloo yaml
Last synced: 31 Jul 2024
https://github.com/paulmcinnis/jobfunnel
Scrape job websites into a single spreadsheet with no duplicates.
automated beautifulsoup beautifulsoup4 csv glassdoor indeed international job jobs monster python scraper search tfidf waterloo yaml
Last synced: 29 Sep 2024
https://github.com/feder-cr/linkedIn_auto_jobs_applier_with_AI
LinkedIn_AIHawk is a tool that automates the jobs application process on LinkedIn. Utilizing artificial intelligence, it enables users to apply for multiple job offers in an automated and personalized way.
application-resume automate automation bot challenge chatgpt chrome gpt job jobsearch jobseeker linkedin-api linkedin-scraper opeai python python3 resume scraper scraping selenium
Last synced: 12 Aug 2024
https://github.com/extractus/article-extractor
Extract the main article from a given URL with Node.js
article article-extractor article-parser crawler extract nodejs readability scraper
Last synced: 01 Oct 2024
https://github.com/website-scraper/node-website-scraper
Download website to local directory (including all css, images, js, etc.)
hacktoberfest javascript nodejs scraper website-scraper
Last synced: 30 Sep 2024
https://github.com/ahmadibrahiim/website-downloader
💡 Download the complete source code of any website (including all assets). [ Javascripts, Stylesheets, Images ] using Node.js
assets downloader offline-web-pages scraper
Last synced: 30 Sep 2024
https://github.com/edoardottt/cariddi
Take a list of domains, crawl urls and scan for endpoints, secrets, api keys, file extensions, tokens and more
bugbounty crawler crawling endpoint-discovery endpoints go golang hacktoberfest infosec osint penetration-testing pentesting recon reconnaissance redteam scraper secret-keys secrets-detection security security-tools
Last synced: 30 Sep 2024
https://github.com/sqzw-x/mdcx
Movie metadata scraper
crawler emby jav-scraper jellyfin metadata movie-crawler movie-metadata movie-scrapper movies python scraper
Last synced: 30 Sep 2024
https://github.com/AhmadIbrahiim/Website-downloader
💡 Download the complete source code of any website (including all assets). [ Javascripts, Stylesheets, Images ] using Node.js
assets downloader offline-web-pages scraper
Last synced: 01 Aug 2024
https://github.com/paulpierre/informer
A Telegram Mass Surveillance Bot in Python
bot cryptocurrency listen listener monitor python scraper spy surveillance telegram telegram-api telegram-bot telegram-bot-api telegrambot telethon telethon-userbot
Last synced: 30 Sep 2024
https://github.com/teamnewpipe/newpipeextractor
NewPipe's core library for extracting data from streaming sites
bandcamp crawler extractor mediaccc newpipe peertube scraper soundcloud youtube
Last synced: 26 Sep 2024
https://github.com/felipecsl/wombat
Lightweight Ruby web crawler/scraper with an elegant DSL which extracts structured data from pages.
Last synced: 01 Oct 2024
https://github.com/Adyzng/jd-autobuy
Python crawler for JD.com (Jingdong): automatic login and online flash purchasing of products
crawler jingdong python scraper
Last synced: 04 Aug 2024
https://github.com/justfoolingaround/animdl
A highly efficient, fast, powerful and light-weight anime downloader and streamer for your favorite anime.
Last synced: 30 Sep 2024
https://github.com/lorey/mlscraper
🤖 Scrape data from HTML websites automatically by just providing examples
crawler crawler-python crawling extraction-engine html machine-learning scraper scraping
Last synced: 30 Sep 2024
https://github.com/TeamNewPipe/NewPipeExtractor
NewPipe's core library for extracting data from streaming sites
bandcamp crawler extractor mediaccc newpipe peertube scraper soundcloud youtube
Last synced: 07 Aug 2024
https://github.com/th3unkn0n/osi.ig
Information Gathering Instagram.
information-gathering information-retrieval instagram instagram-scraper linux osint python python3 scraper termux termux-tool
Last synced: 30 Sep 2024
https://github.com/avnsx/fansly-downloader
Easy to use fansly.com content downloading tool. Written in python, but ships as a standalone Executable App for Windows too. Enjoy your Fansly content offline anytime, anywhere in the highest possible content resolution! Fully customizable to download in bulk or single: photos, videos & audio from timeline, messages, collection & specific posts 👍
cross-platform database datascraping downloader fansly fansly-download fansly-downloader fansly-scraper gui image-download linux macos open-source portable python reddit scraper video video-download windows
Last synced: 25 Sep 2024
https://github.com/Avnsx/fansly-downloader
Easy to use fansly.com content downloading tool. Written in python, but ships as a standalone Executable App for Windows too. Enjoy your Fansly content offline anytime, anywhere in the highest possible content resolution! Fully customizable to download in bulk or single: photos, videos & audio from timeline, messages, collection & specific posts 👍
cross-platform database datascraping downloader fansly fansly-download fansly-downloader fansly-scraper gui image-download linux macos open-source portable python reddit scraper video video-download windows
Last synced: 04 Aug 2024
https://github.com/cinemagoer/cinemagoer
Cinemagoer is a Python package for retrieving and managing data from the IMDb movie database (with which we are not affiliated in any way) about movies, people, characters, and companies
actors cast character cinema cinemagoer company database db imdb internet-movie-database movie movie-database movies parser python scraper sql
Last synced: 01 Oct 2024
https://github.com/huaying/instagram-crawler
Get Instagram posts/profile/hashtag data without using Instagram API
auto autoliker instagram instagram-bot instagram-crawler instagram-liker instagram-scraper likers python scraper webdriver
Last synced: 30 Sep 2024
https://github.com/holgerd77/django-dynamic-scraper
Creating Scrapy scrapers via the Django admin interface
django python scraper scraping scrapy spider webscraping
Last synced: 03 Oct 2024
https://github.com/vesche/scanless
online port scan scraper
command-line pentesting port-scanner scanning scraper
Last synced: 26 Sep 2024
https://github.com/shadowmoose/RedditDownloader
Scrapes Reddit to download media of your choice.
archival backup downloader media python3 reactjs reddit scraper
Last synced: 01 Aug 2024
https://github.com/shadowmoose/redditdownloader
Scrapes Reddit to download media of your choice.
archival backup downloader media python3 reactjs reddit scraper
Last synced: 30 Sep 2024
https://github.com/zerodytrash/TikTok-Live-Connector
Node.js library to receive live stream events (comments, gifts, etc.) in realtime from TikTok LIVE.
api api-wrapper bot broadcast chat chat-reader connector hacktoberfest javascript live livestream nodejs package scraper stream tiktok tiktok-api tiktok-live webcast websocket
Last synced: 01 Aug 2024
https://github.com/metafates/mangal
📖 The most advanced (yet simple) cli manga downloader in the entire universe! Lua scrapers, export formats, anilist integration, fancy TUI and more!
anilist anime cli comic-downloader command-line go golang linux lua macos manga manga-downloader manga-reader mangadex mangal pdf scraper terminal tui windows
Last synced: 01 Oct 2024
https://github.com/altimis/scweet
A simple and unlimited Twitter scraper: scrape tweets, likes, retweets, following, followers, user info, images...
dowload-images followers following python save-image scrape scrape-followers scrape-following scrape-images scrape-likes scrape-tweets scraper scraping selenium-webdriver tweets twitter twitter-scraper
Last synced: 28 Sep 2024
https://github.com/consumet/api.consumet.org
A Modern Search Engine API for Anime, Movies/TVShows, Books, Light Novels, Manga, etc.
anilist anime anime-api api books books-api comics comics-api light-novels lightnovel-api manga manga-api movies movies-api rest-api scraper search-engine streaming streaming-api typescript
Last synced: 31 Jul 2024
https://github.com/vifreefly/kimuraframework
Kimurai is a modern web scraping framework written in Ruby that works out of the box with Headless Chromium/Firefox, PhantomJS, or simple HTTP requests, and lets you scrape and interact with JavaScript-rendered websites
crawler headless-chrome kimurai scraper scrapy
Last synced: 30 Sep 2024
https://github.com/mariostoev/finviz
Unofficial API for finviz.com
analysis api chart csv database finviz finviz-api finviz-csv finviz-scraper pypi scraper screener sql
Last synced: 30 Sep 2024
https://github.com/fredwu/crawler
A high performance web crawler / scraper in Elixir.
crawler elixir files offline scraper scraper-engine spider
Last synced: 31 Jul 2024
https://github.com/vladkens/twscrape
2024! X / Twitter API scraper with authorization support. Allows you to scrape search results, user profiles (followers/following), tweets (favoriters/retweeters), and more.
api async automation elonmusk httpx python scraper snscrape twitter twitter-api twitter-bot twitter-scraper x-api
Last synced: 01 Oct 2024
https://github.com/Nriver/Episode-ReName
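Automated renaming tool for TV series and anime, with one-click batch renaming. Can be paired with qBittorrent to rename files automatically after download, which makes automatic scraping by Emby easier. Runs on Windows, Linux, macOS, Docker, and as a Synology package.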
automation command-line command-line-tool docker linux macos python python3 qbittorrent rename rename-script scraper synology windows
Last synced: 02 Aug 2024
https://github.com/oltarasenko/crawly
Crawly, a high-level web crawling & scraping framework for Elixir.
crawler crawling elixir erlang extract-data scraper scraping scraping-websites spider
Last synced: 01 Aug 2024
https://github.com/elixir-crawly/crawly
Crawly, a high-level web crawling & scraping framework for Elixir.
crawler crawling elixir erlang extract-data scraper scraping scraping-websites spider
Last synced: 29 Sep 2024
https://github.com/jikan-me/jikan
Unofficial MyAnimeList PHP+REST API which provides functions other than the official API
anime api json library manga myanimelist myanimelist-api parsing php psr-2 psr-4 rest rest-php scraper
Last synced: 04 Aug 2024
https://github.com/scrapinghub/scrapyrt
HTTP API for Scrapy spiders
crawler crawling hacktoberfest hacktoberfest2021 python scraper scrapy twisted webcrawler webcrawling
Last synced: 01 Aug 2024
https://github.com/iawia002/Lulu
[Unmaintained] A simple and clean video/music/image downloader 👾
crawler crawling downloader python python3 scraper scraping video
Last synced: 09 Aug 2024
https://github.com/DataHenHQ/till
DataHen Till is a companion tool to your existing web scraper that instantly makes it scalable, maintainable, and more unblockable, with minimal code changes on your scraper. Integrates with any scraper in 5 minutes.
crawler man-in-the-middle mitm proxy-server scraper scraping web-scraping
Last synced: 31 Jul 2024
https://github.com/postmodern/spidr
A versatile Ruby web spidering library that can spider a site, multiple domains, certain links or infinitely. Spidr is designed to be fast and easy to use.
crawler ruby scraper spider spider-links web web-crawler web-scraper web-scraping web-spider
Last synced: 31 Jul 2024
https://github.com/feder-cr/Auto_Jobs_Applier_AIHawk
LinkedIn_AIHawk is a tool that automates the jobs application process on LinkedIn. Utilizing artificial intelligence, it enables users to apply for multiple job offers in an automated and personalized way.
application-resume automate automation bot challenge chatgpt chrome gpt job jobsearch jobseeker linkedin-api linkedin-scraper opeai python python3 resume scraper scraping selenium
Last synced: 24 Sep 2024
https://github.com/sananth12/ImageScraper
:scissors: High performance, multi-threaded image scraper
command-line commandline-tool pypi python scraper scraping terminal
Last synced: 31 Jul 2024
https://github.com/JosephLai241/URS
Universal Reddit Scraper - A comprehensive Reddit scraping/archival command-line tool.
archiving command-line comments csv data-analysis data-science json livestream osint-tool praw pyo3 python reddit reddit-scraper redditor rust scraper subreddit trees wordcloud
Last synced: 31 Jul 2024
https://github.com/ruippeixotog/scala-scraper
A Scala library for scraping content from HTML pages
dsl hacktoberfest html-parsing scala scraper
Last synced: 31 Jul 2024
https://github.com/fanyong920/jvppeteer
Headless Chrome For Java (Java crawler)
chrome chrome-headless crawler java jvppeteer puppeteer scraper
Last synced: 27 Sep 2024
https://github.com/gajus/surgeon
Declarative DOM extraction expression evaluator. 👨‍⚕️
css-selector parser scraper subroutines
Last synced: 01 Oct 2024
https://github.com/piquette/finance-go
:bar_chart: Financial markets data library implemented in go.
cryptocurrency data finance financial-data financial-markets go-library golang options pandas scraper stock-data stock-market stock-trading
Last synced: 31 Jul 2024
https://github.com/graniet/operative-framework
operative framework is a Rust OSINT investigation framework. You can interact with multiple targets, execute multiple modules, create links between targets, export reports to PDF, add notes to targets or results, interact with a RESTful API, and write your own modules.
enterprise fingerprint forensics framework gathering geoint investigation linkedin osint phone rust rust-lang scraper societe whatsapp whatsapp-api whatsapp-web
Last synced: 11 Aug 2024
https://github.com/gaulliath/operative-framework
operative framework is a Rust OSINT investigation framework. You can interact with multiple targets, execute multiple modules, create links between targets, export reports to PDF, add notes to targets or results, interact with a RESTful API, and write your own modules.
enterprise fingerprint forensics framework gathering geoint investigation linkedin osint phone rust rust-lang scraper societe whatsapp whatsapp-api whatsapp-web
Last synced: 30 Sep 2024
https://github.com/d60/twikit
Twitter API Scraper | Without an API key | Twitter Internal API | Free | Twitter scraper | Twitter Bot
api-wrapper bot client free python python3 scrape scraper scraping search twitter twitter-api twitter-bot twitter-client twitter-internal-api twitter-scraper twitter-scraper-2023 wrapper x x-api
Last synced: 31 Jul 2024
https://github.com/slotix/dataflowkit
Extract structured data from web sites. Web sites scraping.
cdp chrome-fetcher crawling extract-data go golang golang-library headless scraper scraping scraping-websites
Last synced: 30 Jul 2024
https://github.com/benibela/xidel
Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.
cli command-line css-selector curl data-processing datascraping html http httpie json rest scraper web webscraper webscraping wget xml xmlstarlet xpath xquery
Last synced: 30 Sep 2024