An open API service indexing awesome lists of open source software.

Projects in Awesome Lists tagged with datascraping

A curated list of projects in awesome lists tagged with datascraping.

https://github.com/UltimaHoarder/UltimaScraper

Scrape all the media from an OnlyFans account - Updated regularly

archive datascraping onlyfans scraper

Last synced: 26 Mar 2025

https://github.com/avnsx/fansly-downloader

Easy to use fansly.com content downloading tool. Written in python, but ships as a standalone Executable App for Windows too. Enjoy your Fansly content offline anytime, anywhere in the highest possible content resolution! Fully customizable to download in bulk or single: photos, videos & audio from timeline, messages, collection & specific posts

cross-platform database datascraping downloader fansly fansly-download fansly-downloader fansly-scraper gui image-download linux macos open-source portable python reddit scraper video video-download windows

Last synced: 27 Sep 2025

https://github.com/Avnsx/fansly-downloader

Easy to use fansly.com content downloading tool. Written in python, but ships as a standalone Executable App for Windows too. Enjoy your Fansly content offline anytime, anywhere in the highest possible content resolution! Fully customizable to download in bulk or single: photos, videos & audio from timeline, messages, collection & specific posts

cross-platform database datascraping downloader fansly fansly-download fansly-downloader fansly-scraper gui image-download linux macos open-source portable python reddit scraper video video-download windows

Last synced: 09 Jul 2025

https://github.com/datawhores/OF-Scraper

A completely revamped and redesigned fork, reimagined from scratch based on the original onlyfans-scraper

datascraping downloader fansite onlyfans scraping

Last synced: 01 May 2025

https://github.com/benibela/xidel

Command line tool to download and extract data from HTML/XML pages or JSON-APIs, using CSS, XPath 3.0, XQuery 3.0, JSONiq or pattern matching. It can also create new or transformed XML/HTML/JSON documents.

cli command-line css-selector curl data-processing datascraping html http httpie json rest scraper web webscraper webscraping wget xml xmlstarlet xpath xquery

Last synced: 15 May 2025

https://github.com/Gertje823/Vinted-Scraper

This is a tool that scrapes/downloads images and data from Vinted & Depop using the API and stores the data in a SQLite database.

database datascraping depop downloader python python3 scraper sqlite sqlite3 vinted

Last synced: 17 Jun 2025

https://github.com/castlelemongrab/parlance

A minimum-dependency ECMAScript client library and CLI tool for Parler, a "free speech" social network that accepts real money to buy "influence" points to boost organic non-advertising content

data-science datascience datascraping disinformation es7 hatespeech javascript law-enforcement misinformation node nodejs osint parlance parler social-media social-networks speech twitter

Last synced: 15 Apr 2025

https://github.com/dimitryzub/hotels-scraper-js

Scrape Airbnb, Booking, Hotels.com from a single JavaScript module. ❗ No longer maintained.

airbnb booking data datascraping hotels hotels-api playwright puppeteer puppeteer-extra webscraping

Last synced: 07 Sep 2025

https://github.com/agenty/scrapingai

Build web scraping agents using AI to auto-extract data from websites, capture screenshots, generate PDFs from URLs, and crawl the web with Agenty

crawler crawling datascraping extract-data scraping webscraper webscraping

Last synced: 12 Apr 2025

https://github.com/easonlai/playstore_reviews_scraping_and_text_analytics

This is a demo repo demonstrating how to scrape app review data from the Google Play Store in Python with the Google-Play-Scraper library, and then use Azure Text Analytics to perform sentiment analysis on the review content (aka comments).

azure azure-text-analysis azure-text-analytics data-scraping datascraping google-play-store google-play-store-data-analysis google-play-store-scraper microsoft-azure microsoft-cognitive-services pandas python python3 seaborn sentiment-analysis text-analytics

Last synced: 08 Aug 2025

https://github.com/ice-wzl/datareaper

DataReaper is a powerful Python tool designed to harvest data from publicly accessible HTTP servers. It combines the capabilities of Shodan search with web scraping techniques to efficiently gather information from targeted websites.

data-visualization datascience datascraping osint osint-python osint-tool python3 redteam vulnerability

Last synced: 14 Aug 2025

https://github.com/dimitryzub/py-google-scholar-organic-cite-to-csv-sqlite

Scrape historic Google Scholar Organic and Cite results to CSV and SQLite using Python and SerpApi.

csv data dataextraction datamining datascience datascraping dataset google googlescholar python scraper serpapi sqlite webscraper webscraping

Last synced: 14 Aug 2025

https://github.com/george-mountain/data-extraction-integration-and-analysis---clustering-operations

This repository contains a project detailing a step-by-step approach to scraping data, integrating data from various sources, and analyzing the combined data. It also shows how APIs can be harnessed for data engineering operations. In this project, the Foursquare API was used for the location data.

clustering-algorithm dataingestion dataintegration dataproject datascience datascraping foursquare-api machine-learning

Last synced: 14 Mar 2025

https://github.com/pavankethavath/redbus-project

A Streamlit web app using Selenium for RedBus data scraping, MySQL for storage, and pandas for preprocessing, enabling dynamic filtering of buses by route, seat type, fare, ratings, and departure time. Offers an intuitive interface with robust data handling, supporting travel planning, customer insights, and market analysis for data-driven decision

datamanagement datascraping feature-engineering mysql mysqlworkbench pandas python regex selenium streamlit

Last synced: 14 Jul 2025

https://github.com/auvroislam/olympic_vs_gdp

Analyzing GDP vs Olympic Performance using Selenium and Tableau-public

beautifulsoup colab-notebook datascraping jupyter-notebook pandas python selenium table tableau-public

Last synced: 09 Mar 2025

https://github.com/mominurr/cars.com

Cars.com Scraper: Extracts car listings (make, model, year, price, seller details) from cars.com using Selenium and BeautifulSoup, saving data in CSV format.

datascraping pandas python scraper scraping webcrawler webcrawling webscraping

Last synced: 25 Mar 2025

https://github.com/mominurr/social-media-scraping

Social Media Scraping: Scrapes data from TikTok, LinkedIn, Facebook, and Twitter (X.com), including user profiles, posts, engagement metrics, and comments.

datascraping pandas python scraper scraping selenium webcrawler webcrawling webscraping

Last synced: 30 Jun 2025

https://github.com/mominurr/realself.com_scraper

realself.com data scraper that scrapes all information from the website and bypasses IP blocking and the press & hold CAPTCHA.

datascraper datascraping python security-bypass webcrawler webcrawling webscraper webscraping

Last synced: 25 Mar 2025

https://github.com/mominurr/stackoverflow.com

A web scraper collecting Stack Overflow questions for NLP, using threading and user-agent rotation

datascraping pandas python requests stackoverflow stackoverflowscraper webcrawler webcrawling webscraper webscraping

Last synced: 17 Mar 2025

https://github.com/abu14/web-scraping-for-hotel-booking

Scraping bookings.com using Python for automated data collection. Extracts multiple variables such as available hotels, distances, prices, ratings, and others.

datascraping webscraping

Last synced: 29 Mar 2025

https://github.com/udhaya2823/red_bus_project

🚌 Red Bus Project Overview: The Red Bus Project is a web scraping and visualization tool built with Selenium to extract bus information from the RedBus website. It stores the data in a MySQL database and provides an interactive visualization interface using Streamlit. The goal is to deliver insights into bus schedules, prices, ratings, etc...

data-science database-management datascraping pandas python selenium-python sql streamlit-webapp

Last synced: 10 Nov 2025

https://github.com/devv712/savvy-track

An app called Savvy Track. It's designed to help users track product prices across various e-commerce sites, ensuring they never miss a deal.

cronjob datascraping nextjs14 perl replit shell typescript webscraping

Last synced: 01 Mar 2025

https://github.com/dineshh912/movie-dataextractor

This is a small script for extracting movie data from different sites like IMDb

beautifulsoup4 datascience datascraping python3

Last synced: 01 Mar 2025

https://github.com/rahul-404/linkedin-data-scraper

Welcome to the LinkedIn Data Scraping Project! We scrape LinkedIn profiles for insights in recruitment, market analysis, and networking. Using web scraping, we gather professional info like experiences, skills, and education.

analytics automation beautifulsoup4 datascience datascraping linkedin marketanalysis networking python recruitment techprojects webscarpping

Last synced: 08 Oct 2025

https://github.com/galal-pic/glassdoor_datascraping

Scrape data and job description from Glassdoor website

datascraping glassdoor-scraper jobpos pandas python selenium time

Last synced: 19 Oct 2025

https://github.com/mominurr/google-map-scraping

Google Maps scraper that collects all available Google Maps data and collects emails from business websites.

datascraping python scraping selenium webcrawler webcrawling webscraper webscraping

Last synced: 26 Oct 2025

https://github.com/tamk-kol/mutual_fund_data_scrapper

This Python script fetches and stores mutual fund data from the MFAPI as CSV files. The script retrieves a list of mutual funds and then downloads their individual data, saving each to a CSV file in a specified folder.

datascraper datascraping mutual-funds pandas pandas-library pandas-python python3 tqdm

Last synced: 24 Feb 2025

https://github.com/yacineouhrouche/datascaping

Code that scrapes data from all four major US sports and displays their best players

datascraping network python

Last synced: 04 Sep 2025

https://github.com/shoaib-akther-asif/country-wise-quality-of-life-overview

Data scraping with Selenium & visualizing the results through interactive dashboards in Tableau Public.

datanalytics datapreprocessing datascraping datavisualization python selenium-webdriver tableau-public

Last synced: 26 Feb 2025

https://github.com/codeofrahul/flipkart-laptop-data-scraping

This project tackles the common challenge of data acquisition from dynamic websites, specifically Flipkart's laptop listings. Facing the hurdles of complex HTML structures and potential JavaScript rendering, this scraper uses Python and Selenium to automate the extraction of crucial product data.

automation data-science dataanalysisusingpython datascraping laptop python3 selenium selenium-webdriver seleniumautomation webscraping

Last synced: 26 Dec 2025

https://github.com/agenty/agenty.testdata

This project contains the public test data set for trying out and learning how to use cloud-based agents in Agenty.

bigdata datascraping htmlparser machine-intelligence ocr webdata webscraping

Last synced: 04 Jan 2026