Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

https://github.com/glaucocustodio/tanakai

Tanakai is a modern web scraping framework written in Ruby. A fork of Kimurai.

chrome-headless crawler kimurai scraper scrapy webscraping

Last synced: 03 Jul 2024

https://github.com/navchandar/Wifi_Usage_Tracker

Tracks WiFi usage from an ISP website and measures network speed using fast.com

data fast isp isp-website python scraper speedtest task-scheduler track-wifi tracker wifi

Last synced: 03 Jul 2024

https://github.com/lorepozo/magnet

Search for a torrent from the command-line and start streaming

magnet-link scraper stream torrent

Last synced: 02 Jul 2024

https://github.com/CodeDotJS/allstars

:sparkles: A tiny tool to export all your starred repositories.

cli exporter fetcher github json python repository scraper stars

Last synced: 01 Jul 2024

https://github.com/masterT/bandcamp-scraper

A scraper for https://bandcamp.com

album api artist bandcamp hacktoberfest product scraper

Last synced: 01 Jul 2024

https://github.com/raystack/meteor

Meteor is an easy-to-use, plugin-driven metadata collection framework to extract data from different sources and sink to any data catalog.

bigdata collector data-catalog data-management dataops extractors metadata scraper sinks

Last synced: 01 Jul 2024

https://github.com/nberlette/dql

Web Scraping with Deno: DOM + GraphQL

deno deno-deploy denoland dom dom-parser dql graphql graphql-scraper scraper webscraping

Last synced: 29 Jun 2024

https://github.com/kauderk/youtube-browser-api

Retrieve YouTube data from your Browser

endpoint javascript scraper transcript video-data youtube youtube-api

Last synced: 29 Jun 2024

https://github.com/jacktuck/unfurl

Metadata scraper with support for oEmbed, Twitter Cards and Open Graph Protocol for Node.js :zap:

embed meta-tags metadata micro microservice nodejs oembed ogp open-graph scraper slack twitter-cards unfurl

Last synced: 29 Jun 2024
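
As a rough illustration of the Open Graph metadata scraping mentioned in the unfurl entry above, the sketch below collects `og:*` meta tags from an HTML string using only Python's standard library. It is not unfurl's implementation (unfurl is a Node.js library), and it ignores oEmbed and Twitter Cards. The names `OpenGraphParser` and `extract_open_graph` are hypothetical, chosen for this example.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Hypothetical helper: collects <meta property="og:..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.tags = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attr = dict(attrs)
        prop = attr.get("property") or ""
        content = attr.get("content")
        if prop.startswith("og:") and content is not None:
            # Keep the first value seen for each property.
            self.tags.setdefault(prop, content)


def extract_open_graph(html):
    """Return a dict of Open Graph properties found in the given HTML text."""
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.tags
```

Fetching the page is left out on purpose; any HTTP client can supply the HTML string passed to `extract_open_graph`.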

https://github.com/DIGITALCRIMINAL/ArchivedUltimaScraper

Scrape content from OnlyFans and Fansly

downloader fansly onlyfans scraper

Last synced: 27 Jun 2024

https://github.com/Avnsx/fansly-downloader

Easy-to-use fansly.com content downloading tool. Written in Python, but also ships as a standalone executable app for Windows. Enjoy your Fansly content offline anytime, anywhere in the highest possible content resolution! Fully customizable to download in bulk or single: photos, videos & audio from timeline, messages, collection & specific posts 👍

cross-platform database datascraping downloader fansly fansly-download fansly-downloader fansly-scraper gui image-download linux macos open-source portable python reddit scraper video video-download windows

Last synced: 27 Jun 2024

https://github.com/DenizShabani/telegramscraper

Scraper and adder for Telegram supporting multiple accounts at the same time. Adds via the Telegram API and only by username. To add via ID without needing the Telegram API, contact me.

adder osint programming python scraper script telegram

Last synced: 27 Jun 2024

https://github.com/SepehrRasouli/DigikalaWebScraper

Digikala web scraper connected to Excel

digikala digikala-crawler excel python python3 scrape scraper

Last synced: 27 Jun 2024

https://github.com/roniemartinez/dude

dude uncomplicated data extraction: A simple framework for writing web scrapers using Python decorators

async beautifulsoup4 crawler css framework lxml parsel playwright python scraper scraping selenium sync web-scraping webscraping xpath

Last synced: 27 Jun 2024

https://github.com/hemin1003/java-spider

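A hands-on Java crawler framework built as a secondary development on top of the webmagic framework. It can crawl news content from Tencent, Sohu, Toutiao (integrated as a separate feature), and other sources. Combined with Elasticsearch, it implements automated crawling and is already in use in production.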

elasticsearch scraper spider webmagic

Last synced: 27 Jun 2024

https://github.com/DataHenHQ/till

DataHen Till is a companion tool to your existing web scraper that instantly makes it scalable, maintainable, and more unblockable, with minimal code changes on your scraper. Integrates with any scraper in 5 minutes.

crawler man-in-the-middle mitm proxy-server scraper scraping web-scraping

Last synced: 27 Jun 2024

https://github.com/daijro/SearchifyX

Fast flashcard searcher study tool

education quizizz quizlet scraper webscraper webscraping

Last synced: 27 Jun 2024

https://github.com/jikan-me/jikan

Unofficial MyAnimeList PHP+REST API which provides functions other than the official API

anime api json library manga myanimelist myanimelist-api parsing php psr-2 psr-4 rest rest-php scraper

Last synced: 27 Jun 2024

https://github.com/cornelk/goscrape

Web scraper that can create an offline readable version of a website

go golang scraper webscraping

Last synced: 27 Jun 2024

https://github.com/kalbhor/Image-Scraper

Fast concurrent image scraper

golang image-scraper multithreading scraper

Last synced: 27 Jun 2024

https://github.com/Adyzng/jd-autobuy

Python crawler: automatic login to JD.com (Jingdong) and online flash purchasing of products

crawler jingdong python scraper

Last synced: 26 Jun 2024

https://github.com/fa0311/TwitterFrontendFlow

Unofficial Client for Twitter Internal API

scraper twitter twitter-bot unofficial

Last synced: 26 Jun 2024

https://github.com/marcopompili/django-instagram

Instagram application for Django.

django instagram scraper

Last synced: 26 Jun 2024

https://github.com/fa0311/twitter-openapi-typescript

Implementation of Twitter internal API (Twitter graphql API) in TypeScript

graphql openapi scraper twitter typescript undocumented unofficial

Last synced: 25 Jun 2024

https://github.com/cinemagoer/cinemagoer

Cinemagoer is a Python package for retrieving and managing the data of the IMDb movie database (with which we are not affiliated in any way) about movies, people, characters and companies

actors cast character cinema cinemagoer company database db imdb internet-movie-database movie movie-database movies parser python scraper sql

Last synced: 24 Jun 2024

https://github.com/AlexandreGazagnes/awdible

Awdible - Just the best free version of Audible. Awdible is free and open-source software that allows you to download music from YouTube and convert it to MP3. The idea is to provide a free version of Audible.

api audio beginner-friendly downloader learning learning-by-doing mp3 python scraper scraping streamlit tool video youtube youtube-api youtube-dl youtube-downloader

Last synced: 24 Jun 2024

https://github.com/alash3al/scraply

Scraply: a simple DOM scraper to fetch information from any HTML-based website

crawler crawling dom golang scraper scrapers scraping-websites scrapy server

Last synced: 21 Jun 2024

https://github.com/entrepreneur-interet-general/OpenScraper

An open source webapp for scraping: towards a public service for webscraping

bulma entrepreneur-interet-general html mongodb python python2 scraper scrapy spider tornado xpath

Last synced: 20 Jun 2024

https://github.com/hfreire/browser-as-a-service

A web browser :earth_americas: hosted as a service, to render your JavaScript web pages as HTML

browser browser-as-a-service crawler docker github-actions javascript puppeteer rest-api scraper server webcrawler

Last synced: 19 Jun 2024

https://github.com/website-scraper/node-website-scraper

Download website to local directory (including all css, images, js, etc.)

hacktoberfest javascript nodejs scraper website-scraper

Last synced: 19 Jun 2024

https://github.com/nigeld3v/Tumblr_Image_scrape

Download ALL the images (JPEG/GIF/PNG) from any Tumblr website! This project employs Python3 and BeautifulSoup4 to scrape a Tumblr site (with the URL provided by the user) to download, page by page, all the images from the Tumblr site's posts. Ideal for archiving other people's Tumblrs <3

archive art beautifulsoup beautifulsoup4 blog blogging comics design fashion gif gifs graphics graphics-library image images scraper tumblr tumblr-image-scrape webcomics website-scraper

Last synced: 17 Jun 2024

https://github.com/lapwat/papeer

Scrape the web in the eink era. Convert websites into ebooks and markdown.

cli command-line command-line-tool ebook eink epub ereader kindle markdown mobi remarkable remarkable-tablet scraper

Last synced: 17 Jun 2024

https://github.com/Nriver/Episode-ReName

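Automated renaming tool for TV series and anime, with one-click batch renaming. Can be paired with QBittorrent to rename files automatically after download, making automatic scraping by Emby easier. Runs on Windows, Linux, MacOS, Docker, and as a Synology package.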

automation command-line command-line-tool docker linux macos python python3 qbittorrent rename rename-script scraper synology windows

Last synced: 16 Jun 2024

https://github.com/dbeley/rymscraper

Python library to extract data from rateyourmusic.com.

python rateyourmusic scraper web-scraping

Last synced: 16 Jun 2024

https://github.com/dbeley/senscritiquescraper

Python library to extract data from senscritique.com.

python scraper senscritique web-scraping

Last synced: 16 Jun 2024

https://github.com/pavlovtech/WebReaper

Web scraper, crawler and parser in C#. Designed as a simple, declarative and scalable web scraping solution.

crawler datamining parser parsing scraper scraping scraping-api scraping-data scraping-tool scraping-web scraping-websites webcrawler webscraping

Last synced: 15 Jun 2024

https://github.com/Himatric/SEMID

SEMID is an OSINT module with lots of Discord functions.

chat discord doxbin github ip osint playstation scraper socials tiktok tiktok-scraper twitter

Last synced: 14 Jun 2024

https://github.com/facundoolano/google-play-scraper

Node.js scraper to get data from Google Play

api crawler google-play nodejs scraper

Last synced: 14 Jun 2024

https://github.com/Kr0ff/Pasta

A PasteBin scraper that doesn't rely on the PasteBin scrape API

crawler osint python scraper scraping

Last synced: 14 Jun 2024

https://github.com/tasos-py/Search-Engines-Scraper

Search Google, Bing, Yahoo, and other search engines with Python

bing crawler google python scraper search-engine yahoo

Last synced: 14 Jun 2024

https://github.com/riz4d/WaGpScraper

A Python-based tool to scrape WhatsApp group links using Google dorks. It scrapes WhatsApp group links from Google results and returns working links.

cli google python remoded scraper whatsapp

Last synced: 14 Jun 2024

https://github.com/jakecreps/ruby

A Rumble, BitChute, and YouTube scraper

bitchute osint python rumble scraper smat youtube

Last synced: 14 Jun 2024

https://github.com/bellingcat/reddit-post-scraping-tool

Given a subreddit name and a keyword, this program returns all top (by default) posts that contain the specified keyword.

command-line gui open-source-research python reddit scraper visual-basic

Last synced: 14 Jun 2024

https://github.com/achyuthjoism/tweeds

An advanced Twitter scraping & OSINT tool written in Python that doesn't use Twitter's API, allowing you to scrape Tweets and more while evading most API limitations.

hacking-tool osint osint-python python scraper tweets twitter twitter-api twitter-osint

Last synced: 14 Jun 2024

https://github.com/markowanga/stweet

Advanced Python library to scrape Twitter (tweets, users) via the unofficial API

api crawl python scrap scrap-tweet scrape scraper scrapper search searchrunner tweet tweets twint twitter twitter-api unofficial user users

Last synced: 14 Jun 2024

https://github.com/edoardottt/cariddi

Take a list of domains, crawl URLs and scan for endpoints, secrets, API keys, file extensions, tokens and more

bugbounty crawler crawling endpoint-discovery endpoints go golang hacktoberfest infosec osint penetration-testing pentesting recon reconnaissance redteam scraper secret-keys secrets-detection security security-tools

Last synced: 14 Jun 2024

https://github.com/ohmybahgosh/YT-DLP-SCRIPTS

...Just a place for me to share my various YT-DLP & related bash scripts.

bash bash-script downloading ffmpeg ffmpeg-script parser scraper shell-script youtube-dl yt-dlp

Last synced: 13 Jun 2024

https://github.com/ohmybahgosh/FONTS_DOT_COM_RIPPER

Script to extract entire font families from Fonts.com. It rips them as woff2, and the final output includes woff2 and ttf files

bash bash-script curl datamining download-fonts font fonts scrape scrape-websites scraper sed shell-script typography woff2 woff2-files xidel

Last synced: 13 Jun 2024

https://github.com/shailshouryya/yt-videos-list

Create and **automatically** update a list of all videos on a YouTube channel (in txt/csv/md form) via YouTube bot with end-to-end web scraping - no API tokens required. Multi-threaded support for YouTube videos list updates.

automation bravedriver chromedriver csv firefox-headless geckodriver operadriver safaridriver scraper selenium txt youtube youtube-api youtube-channel youtube-dl youtube-downloader youtube-playlist yt yt-downloader ytdl

Last synced: 13 Jun 2024

https://github.com/soxoj/marple

📖 Collect links to profiles by username through search engines and analyze with various plugins

osint scraper search-engine username-checker username-search

Last synced: 13 Jun 2024

https://github.com/withspectrum/micro-open-graph

A tiny Node.js microservice to scrape open graph data with joy.

micro microservices nodejs opengraph scraper

Last synced: 13 Jun 2024

https://github.com/Tatsh/youtube-unofficial

Access parts of your account unavailable through normal YouTube API access.

command-line python scraper utilities utility youtube

Last synced: 13 Jun 2024

https://github.com/sanghviharshit/pocket-tagger

📖👓🏷Tag your getpocket.com articles automatically using natural language processing

articles getpocket google-cloud natural-language-processing nlp pocket scraper tag

Last synced: 13 Jun 2024

https://github.com/lachlanjc/predictcovid

Visualize & track the 2020 COVID-19 pandemic by country.

coronavirus covid-19 covid19 dataviz prisma2 redwoodjs scraper

Last synced: 13 Jun 2024

https://github.com/christophebe/serp

Google Search SERP Scraper

google scraper seo serp serps

Last synced: 13 Jun 2024

https://github.com/RandomNinjaAtk/docker-raromprocessor

RA ROM Processor is a Docker container that is used to acquire/organize/process/verify/dedupe/scrape a ROM library automatically by matching ROMs to the RetroAchievement.org website hash database.

bash emulationstation rahasher retroachievements retrogaming roms scraper script

Last synced: 12 Jun 2024

https://github.com/mahesh-hegde/rrip

Bulk image downloader for reddit.

downloader golang reddit scraper

Last synced: 11 Jun 2024

https://github.com/Berke-Alp/kandilli-rasathanesi

An application that fetches earthquake data from the Kandilli Observatory (Kandilli Rasathanesi).

api earthquake kandilli kandilli-api scraper

Last synced: 11 Jun 2024

https://github.com/zerodytrash/TikTok-Live-Connector

Node.js library to receive live stream events (comments, gifts, etc.) in real time from TikTok LIVE.

api api-wrapper bot broadcast chat chat-reader connector hacktoberfest javascript live livestream nodejs package scraper stream tiktok tiktok-api tiktok-live webcast websocket

Last synced: 09 Jun 2024

https://github.com/cowboy-bebug/app-store-scraper

Single API ☝ App Store Review Scraper 🧹

app-store appstore review-data scraper

Last synced: 09 Jun 2024

https://github.com/SMerrony/aghast

AGHAST is A Go Home Automation SysTem closely coupled with MQTT.

automation csv-export daikin go golang home-automation influxdb lidl logging mqtt node-red postgresql scraper zigbee zigbee2mqtt

Last synced: 09 Jun 2024

https://github.com/jmrchelani/scrap-this-web

API to scrape the HTML, CSS and JS of a website

javascript scraper web web-scraping

Last synced: 09 Jun 2024

https://github.com/Dentrax/GMDB

GMDB is the ultra-simple, cross-platform Movie Library with Features (Search, Take Note, Watch Later, Like, Import, Learn, Instantly Torrent Magnet Watch)

goquery hacktoberfest imdb imdb-cli imdb-webscrapping magnet magnet-link movie movie-database movies netflix note-taking notebook peerflix rotten-tomatoes rottentomatoes scraper search-engine torrent torrent-search

Last synced: 09 Jun 2024

https://github.com/extractus/article-extractor

Extract the main article from a given URL with Node.js

article article-extractor article-parser crawler extract nodejs readability scraper

Last synced: 09 Jun 2024

https://github.com/lorey/mlscraper

🤖 Scrape data from HTML websites automatically by just providing examples

crawler crawler-python crawling extraction-engine html machine-learning scraper scraping

Last synced: 09 Jun 2024

https://github.com/aapatre/Automatic-Udemy-Course-Enroller-GET-PAID-UDEMY-COURSES-for-FREE

Do you want to LEARN NEW STUFF for FREE? Don't worry, with the power of web-scraping and automation, this script will find the necessary Udemy coupons & enroll you for PAID UDEMY COURSES, ABSOLUTELY FREE!

python python3 scraper scraping selenium

Last synced: 08 Jun 2024

https://github.com/Matthew17-21/Captcha-Tools

All-in-one Python (and now Go!) module to help solve captchas with the Capmonster, 2captcha, Anticaptcha, and Capsolver APIs!

2captcha 2captcha-api anticaptcha anticaptcha-client capsolver capsolvercom captcha hcaptcha recaptcha scraper scraping scraping-api sneakerbot sneakerbots sneakers

Last synced: 08 Jun 2024

https://github.com/JosephLai241/URS

Universal Reddit Scraper - A comprehensive Reddit scraping/archival command-line tool.

archiving command-line comments csv data-analysis data-science json livestream osint-tool praw pyo3 python reddit reddit-scraper redditor rust scraper subreddit trees wordcloud

Last synced: 08 Jun 2024

https://github.com/PaulMcInnis/JobFunnel

Scrape job websites into a single spreadsheet with no duplicates.

automated beautifulsoup beautifulsoup4 csv glassdoor indeed international job jobs monster python scraper search tfidf waterloo yaml

Last synced: 08 Jun 2024

https://github.com/JavScraper/Emby.Plugins.JavScraper

A Japanese movie scraper plugin for Emby/Jellyfin that can fetch film information from certain websites.

adult emby fanart-poster fc2 japanese jav jav-scraper javbus jellyfin jsproxy metadata plugin scraper synology

Last synced: 08 Jun 2024

https://github.com/platonai/PulsarRPA

Automate webpages at scale, scrape web data completely and accurately with high performance, distributed RPA.

crawler data-mining data-science rpa scraper scraping web-automation web-crawler web-mining web-scraping web-sql

Last synced: 07 Jun 2024

https://github.com/guyueyingmu/avbook

AV movie management system; crawler for avmoo, javbus, and javlibrary; online AV film library; AV magnet link database. Japanese Adult Video Library, Adult Video Magnet Links - Japanese Adult Video Database

adult adult-video avmoo crawler database guzzlehttp javbus javlibrary laravel magnet magnet-link scraper spider

Last synced: 06 Jun 2024

https://github.com/robotshell/robotScraper

RobotScraper is a simple tool written in Python to check each of the paths found in the robots.txt file and what HTTP response code they return.

bounty-hunting-tools bugbounty hacking infosec python robots scraper tool

Last synced: 05 Jun 2024
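
The robotScraper entry above describes a two-step idea: read the paths listed in robots.txt, then see which HTTP status code each one returns. The following is a minimal sketch of that idea using only Python's standard library, not robotScraper's actual code. The function names `robots_paths` and `status_code` are hypothetical, and treating both `Allow` and `Disallow` rules as paths to check is an assumption of this example.

```python
import urllib.error
import urllib.request
from urllib.parse import urljoin


def robots_paths(robots_txt):
    """Return the unique Allow/Disallow paths from robots.txt text, in file order."""
    paths = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        if ":" not in line:
            continue
        field, value = line.split(":", 1)
        value = value.strip()
        if field.strip().lower() in ("allow", "disallow") and value and value not in paths:
            paths.append(value)
    return paths


def status_code(base_url, path, timeout=10):
    """Request base_url + path and return the final HTTP status code.

    urlopen follows redirects, so 3xx codes are not reported by this sketch.
    """
    try:
        with urllib.request.urlopen(urljoin(base_url, path), timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        return err.code
```

A caller would fetch `/robots.txt` from a site it is permitted to test, pass the text to `robots_paths`, and call `status_code` for each returned path.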

https://github.com/h33tlit/Parameter-Reflect-Finder

Parameter-Reflect-Finder is a Python-based tool that helps you find reflected parameters that may be vulnerable to XSS or open redirects.

bug-bounty bugbounty open-redirect open-redirect-detection parameter-search reflector scanner scraper xss xss-detection xss-scanner

Last synced: 05 Jun 2024

https://github.com/consumet/consumet.ts

Node.js library that provides high-level APIs for obtaining information on various entertainment media such as books, movies, comic books, anime, manga, and so on.

anilist anime anime-list api books light-novels manga manga-api movies movies-api npm npm-package reading scraper streaming streaming-api typescript

Last synced: 05 Jun 2024

https://github.com/geziyor/geziyor

Geziyor, blazing fast web crawling & scraping framework for Go. Supports JS rendering.

crawler go scraper scraping spider

Last synced: 05 Jun 2024

https://github.com/benibela/xidel

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.

cli command-line css-selector curl data-processing datascraping html http httpie json rest scraper web webscraper webscraping wget xml xmlstarlet xpath xquery

Last synced: 03 Jun 2024

https://github.com/ManiMozaffar/linkedIn-scraper

A Playwright bot that scrapes LinkedIn and stores advertisement data in a database and a Telegram channel

bot browser-fingerprint browser-fingerprinting chatgpt chatgpt-api cralwer fastapi linkedin linkedin-bot playwright python scraper scraping spider sqlalchemy

Last synced: 03 Jun 2024

https://github.com/mendableai/firecrawl-py

Crawl and convert any website into clean markdown

ai crawler llm python scraper

Last synced: 02 Jun 2024

https://github.com/tibobrc/Blinkist-to-Readwise

Extract highlights from your Blinkist account and upload them to your Readwise account, or download them to a CSV file.

blinkist blinkist-highlights blinkist-to-readwise highlights python readwise readwise-highlights scraper

Last synced: 02 Jun 2024

https://github.com/gajus/surgeon

Declarative DOM extraction expression evaluator. 👨‍⚕️

css-selector parser scraper subroutines

Last synced: 02 Jun 2024

https://github.com/redco/goose-parser

Universal scraping tool, which allows you to extract data using multiple environments

browser crawler docker goose jsdom nodejs parser parsing phantomjs scraper scraping

Last synced: 02 Jun 2024

https://github.com/jadkins89/Recipe-Scraper

A JS package for scraping recipes from the web.

food-recipes recipe-scraper recipes scraper

Last synced: 02 Jun 2024

https://github.com/nickysemenza/message-analyzer

💬 📊 Facebook Messenger history scraper + analyzer

chats facebook scraper

Last synced: 01 Jun 2024