Projects in Awesome Lists tagged with scraped-data
A curated list of projects in awesome lists tagged with scraped-data.
https://github.com/CUNY-CL/wikipron
Massively multilingual pronunciation mining
computational-linguistics g2p language linguistics nlp phonetics phonology pronunciation python-api scraped-data speech
Last synced: 03 Apr 2025
https://github.com/warifp/shopee-scrape
Shopee Scrape is a tool for collecting the data you need, such as photos, prices, names, store locations, and more.
curl curl-functions curl-library curlphp indonesia marketplace php php-library scrape scrape-images scrape-websites scraped-data scraper scraper-engine shopee shopee-api
Last synced: 22 Mar 2025
https://github.com/swader/diffbot-php-client
[Deprecated - Maintenance mode - use APIs directly please!] The official Diffbot client library
ai artificial-intelligence bot crawl crawling diffbot machine-learning nlp php scrape scraped-data scraper scraping
Last synced: 20 Dec 2024
https://github.com/tangible-idea/bitutils
Systematic coin price notifier, Telegram public channel history parser, and trading tool written in Python
bittrex-api python scraped-data telegram-channel
Last synced: 10 Apr 2025
https://github.com/racinmat/mal-analysis
GitHub repo for MyAnimeList analysis. Also links to the MAL dataset.
analysis anime crawling data-science kaggle-dataset mal scraped-data
Last synced: 07 Apr 2025
https://github.com/benjaminvdb/DBRD
110k Dutch Book Reviews Dataset for Sentiment Analysis
dataset dataset-creation dutch nlp nlp-machine-learning python python3 scraped-data scraper
Last synced: 10 May 2025
https://github.com/fernandod1/producthunt-scraper
Scraper script for the well-known Producthunt.com website. Scrapes all offers and saves them to an Excel spreadsheet file.
crawler crawling crawling-sites data-mining datamining producthunt producthunt-api producthunt-users python python-script python3 scrape scraped-data scraper scraper-engine scraping scraping-bot scraping-python scraping-tool scraping-websites
Last synced: 03 May 2025
https://github.com/palahsu/youtubescraper
Scrapes YouTube video descriptions, likes, comments, times, and replies, automatically extracting the data from a video.
scraped-data scraper scraping youtube youtube-api youtube-api-v3 youtube-data-analysis youtube-data-api-v3 youtube-data-scraping youtube-dl youtube-downloader youtube-scraper
Last synced: 24 Apr 2025
https://github.com/faheel/file-extensions
JSON collection of scraped file extensions, along with their description and type, from FileInfo.com
file-extensions fileinfo json python3 scraped-data scraper website-scraper
Last synced: 09 Apr 2025
https://github.com/superkogito/coinmarketcapscraper
A small Python scraper that collects historical data from the CoinMarketCap website and converts it to CSV files. This is an initial step in a data mining process to develop a predictive model of cryptocurrency prices.
big-data coinmarketcap coinmarketcap-website cryptocurrencies-prices cryptocurrency csv csv-files python-scraper scraped-data scraper
Last synced: 04 Apr 2025
https://github.com/harshcasper/blind-app-reviews
Scraped reviews of over 25 companies from the Blind App ⚡️
blind-app company-reviews dataset nlp scrape scraped-data text-mining webscraping
Last synced: 20 Feb 2025
https://github.com/malina/metascraper
Metascraper is a Crystal library for web scraping.
Last synced: 15 Mar 2025
https://github.com/davidbellamy/visa_dates
Web scraper for US visa bulletins
china india mexico philippines scraped-data us-government visa-application
Last synced: 14 Apr 2025
https://github.com/ephellon/game-store-catalog
Catalog of PlayStation, Xbox, Nintendo, and Steam games
catalog game games nintendo playstation playstore psn scraped-data steam steam-games store xbox
Last synced: 19 Feb 2025
https://github.com/nomansiddiqui0000/rozee.pk-jobs-scrapper
This scraper, built in Node.js using Puppeteer and Cheerio, is designed to extract job listings from the Rozee.pk website. It can scrape multiple pages and gather detailed information, including job titles, company names, skills, and more. The output is saved in structured CSV files, with sample datasets for cities like Lahore, Karachi, etc.
automation cheerio javascript jobs jobs-scraping jobs-search jobscraper nodejs puppetter scra scraped-data scraper-api scraping scraping-websites scrapy-crawler
Last synced: 14 Apr 2025
https://github.com/dbritto-dev/udacity-cloud-devops-engineer-capstone
Capstone Project for Cloud DevOps Engineer on Udacity
capstone covid19 eks eksctl flask jenkins kubernetes pipeline scraped-data udacity-devops-nanodegree
Last synced: 02 Apr 2025
https://github.com/nmaties/google-emails-scraper
Node.js scraper for Google email addresses, built with Puppeteer.
google-scrape google-scraper google-scrapers google-scraping scrape-google scrape-jobs scrape-websites scraped-data scrapeddata scraper scraperjs scrapper-bot
Last synced: 03 Jan 2025
https://github.com/emibcn/covid-data
Stores and serves daily collected data from https://dadescovid.org for the sibling app at https://emibcn.github.io/covid/
backend cache charts covid-data generalitat-de-catalunya github-actions github-page github-pages hacktoberfest json json-objects scraped-data scraper storage workflow
Last synced: 03 Mar 2025
https://github.com/fitzwilliammuseum/thresholds
An archive repository for thresholds.org.uk
cambridge-museums poetry scraped-data wordpress
Last synced: 04 Mar 2025
https://github.com/mrgeislinger/2018-olympics-pyeongchang-data
Scraped data from the 2018 Winter Olympic Games in Pyeongchang, taken from www.olympic.org in an effort to make a tidy data set of all competitors (not just winners).
data-science data-scraping data-wrangling olympics pyeongchang scraped-data winter-olympics
Last synced: 26 Feb 2025
https://github.com/derrmru/whats-in-the-news
Data Visualisation of News Content
data-visualization nlp react scraped-data
Last synced: 09 Apr 2025
https://github.com/bhavyac16/flairifyme
FlairifyMe is a Reddit flair detector for the r/india subreddit that takes a post's URL as user input and predicts the post's flair using a model trained with logistic regression.
flair-prediction flask hacktoberfest linear-svm logistic-regression naive-bayes-classifier nltk praw-reddit reddit-flair-detector scikit-learn scraped-data subreddit text-classification
Last synced: 26 Feb 2025
https://github.com/truongbo17/bookstory
Scrape PDF data and re-upload it (full UI/UX/Admin/Actions)
laravel pdf scraped-data tailwindcss
Last synced: 26 Mar 2025
https://github.com/wurstbroteater/hometemp
Fetches apartment data and online data, visualizes and analyses it, and sends it via email.
apartment-management-system data-visualization raspberry-pi scraped-data temperature temperature-monitoring temperature-sensor
Last synced: 03 Apr 2025