Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
Projects in Awesome Lists tagged with web-scraping
A curated list of projects in awesome lists tagged with web-scraping.
https://github.com/chrismuir/zillow
Zillow Scraper for Python using Selenium
chromedriver python scraper selenium web-scraping zillow
Last synced: 19 Dec 2024
https://github.com/NLPatVCU/PaperScraper
A web scraping tool to systematically extract the text of scientific papers and corresponding metadata from university accessible journals.
journal-web-scraper natural-language-processing pubmed-articles-grabber scientific-publications selenium-webdriver web-scraping
Last synced: 06 Nov 2024
https://github.com/nlpatvcu/paperscraper
A web scraping tool to systematically extract the text of scientific papers and corresponding metadata from university accessible journals.
journal-web-scraper natural-language-processing pubmed-articles-grabber scientific-publications selenium-webdriver web-scraping
Last synced: 19 Dec 2024
https://github.com/apify/actor-page-analyzer
Apify actor that opens a web page in headless Chrome and analyzes the HTML and JavaScript objects, looks for schema.org microdata and JSON-LD metadata, analyzes AJAX requests, etc.
headless-chrome javascript web-scraping
Last synced: 07 Nov 2024
https://github.com/nuhmanpk/webscrapper
Simple and powerful all-in-one Telegram bot to scrape/crawl webpages using Requests, html5lib and BeautifulSoup
beautifulsoup4 crawler crawler-engine crawler-python hacktoberfest hacktoberfest-accepted hacktoberfest2023 pyrogram pyrogram-bot requests scraper scraping selenium telegram telegram-bot web-scraping webscraping webscrapper webscrapping webscrapping-python
Last synced: 27 Dec 2024
https://github.com/HarryShomer/Hockey-Scraper
Python Package for scraping NHL Play-by-Play and Shift data
hockey nhl python scraper sports web-scraping
Last synced: 06 Nov 2024
https://github.com/suntong/cascadia
Command-line CSS selector built on the Go cascadia package
cascadia command-line command-line-tool css-selector csv-table curl extract html-source html-text tsv web-scraper web-scraping
Last synced: 19 Nov 2024
https://github.com/nuhmanpk/WebScrapper
Simple and powerful all-in-one Telegram bot to scrape/crawl webpages using Requests, html5lib and BeautifulSoup
beautifulsoup4 crawler crawler-engine crawler-python hacktoberfest hacktoberfest-accepted hacktoberfest2023 pyrogram pyrogram-bot requests scraper scraping selenium telegram telegram-bot web-scraping webscraping webscrapper webscrapping webscrapping-python
Last synced: 29 Nov 2024
https://github.com/programminghistorian/ph-submissions
The repository and website hosting the peer review process for new Programming Historian lessons
api data-management dh digital-history digital-humanities distant-reading linked-open-data mapping multi-lingual network-analysis open-educational-resources open-source pedagogy programming-historian python r-studio web-archiving web-scraping
Last synced: 15 Nov 2024
https://github.com/jonascz/save-for-offline
Android app for saving webpages for offline reading.
android android-application html-files html-parser java offline offline-storage parser viewer web-scraping
Last synced: 07 Nov 2024
https://github.com/hominee/dyer
Dyer is designed for reliable, flexible and fast web crawling, providing some high-level, comprehensive features without compromising speed.
crawler rust rust-programming-language spider web-crawler web-framework web-scraping
Last synced: 06 Nov 2024
https://github.com/bertrandmartel/tableau-scraping
Tableau scraper python library. R and Python scripts to scrape data from Tableau viz
dataframe pandas python r tableau web-scraping
Last synced: 30 Dec 2024
https://github.com/trainingbypackt/data-wrangling-with-python
Simplify your ETL processes with these hands-on data sanitation tips, tricks, and best practices
analytics beautifulsoup data-analytics data-munging data-science data-wrangling database numpy pandas python regular-expression web-scraping
Last synced: 01 Jan 2025
https://github.com/my8100/scrapyd-cluster-on-heroku
Set up a free and scalable Scrapyd cluster for distributed web crawling with just a few clicks.
cluster heroku logparser python scrapy scrapyd scrapydweb web-crawling web-scraping
Last synced: 20 Dec 2024
https://github.com/apify/actor-scraper
House of Apify Scrapers. Generic scraping actors with a simple UI to handle complex web crawling and scraping use cases.
Last synced: 07 Nov 2024
https://github.com/codingforentrepreneurs/Web-Scraping
Learn how to leverage Python's amazing tools to scrape data from other websites. The end goal of this course is to scrape blogs to analyze trending keywords and phrases. We'll be using Python 3.6, Requests, BeautifulSoup, Asyncio, Pandas, Numpy, and more!
aysncio beautifulsoup beautifulsoup4 joincfe numpy pandas python python-requests python3 requests scraper sraping tutorial web-scraping
Last synced: 22 Nov 2024
https://github.com/maxmindlin/scout-lang
A web crawling programming language
dsl programming-language scraper scraping scraping-websites web-crawling web-scraping
Last synced: 30 Dec 2024
https://github.com/sangaline/scrapy-wayback-machine
A Scrapy middleware for scraping time series data from Archive.org's Wayback Machine.
archive-dot-org middleware python scrapy scrapy-extension wayback-machine web-scraping
Last synced: 01 Jan 2025
https://github.com/Crinibus/scraper
Web scraper for scraping, tracking and visualizing prices of products on various websites.
amazon avcables computersalg coolshop ebay elgiganten expert komplett mm-vision newegg prices products proshop python scrape-prices scraper sharkgaming shein tech-scraper web-scraping
Last synced: 06 Nov 2024
https://github.com/vindarel/cl-torrents
Searching torrents on popular trackers - CLI, readline, GUI, web client. Tutorial and binaries (issue tracker on https://gitlab.com/vindarel/cl-torrents/)
1337 1337x common-lisp pirate-bay torrents tutorial web-scraping
Last synced: 19 Dec 2024
https://github.com/siongui/instago
Download/access photos, videos, stories, story highlights, postlives, following and followers of Instagram
downloader go golang gopherjs instagram web-scraping webscraping
Last synced: 29 Oct 2024
https://github.com/0xZDH/BridgeKeeper
Scrape, Hunt, and Transform names and usernames
linkedin-scraper name-generation osint python3 username username-generator usernames web-scraping
Last synced: 12 Nov 2024
https://github.com/minhlucvan/n8n-nodes-browserless
n8n node to interact with browserless instance
browser-automation browserless n8n n8n-community-node-package n8n-nodes web-scraping
Last synced: 28 Dec 2024
https://github.com/passivebot/facebook-marketplace-scraper
This repository contains a script to scrape Facebook Marketplace data using Playwright, BeautifulSoup and Streamlit.
database facebook facebook-marketing-automation facebook-marketplace playwright playwright-python python sqlite3 web-automation web-scraper web-scraping
Last synced: 19 Nov 2024
https://github.com/king04aman/all-in-one-python-projects
A huge collection of awesome beginner-friendly Python projects, from the very basics to advanced. A perfect repository for learning Python and enhancing your Python programming skills.
artificial-intelligence automate-task automation beginner-friendly hacktoberfest hacktoberfest2024 machine-learning open-source-project python python-projects python-projects-basic-to-advanced python-tools web-scraping
Last synced: 29 Dec 2024
https://github.com/hrbrmstr/splashr
💦 Tools to Work with the 'Splash' JavaScript Rendering Service in R
har phantomjs r r-cyber rstats selenium splash web-scraping
Last synced: 27 Oct 2024
https://github.com/scrapehero/zillow_real_estate
Zillow.com Web Scraper written in Python and LXML to extract real estate listings available based on a zip code.
html lxml parsing python-requests scraper web-scraping
Last synced: 04 Nov 2024
https://github.com/scrapinghub/web-poet
Web scraping Page Objects core library
hacktoberfest page-objects python web-scraping
Last synced: 30 Dec 2024
https://github.com/apify/browser-pool
A Node.js library to easily manage and rotate a pool of web browsers, using any of the popular browser automation libraries like Puppeteer, Playwright, or SecretAgent.
browser-automation headless-browsers playwright puppeteer rpa scraping web-scraping
Last synced: 01 Jan 2025
https://github.com/zoranpandovski/bookingscraper
🌎 🏨 Scrape Booking.com 🏨 🌎
beautifulsoup booking python3 request scraper web-scraping webscraper webscraping
Last synced: 29 Dec 2024
https://github.com/seanfhear/tab-scraper
Interface for downloading guitar tabs from Ultimate Guitar
chords guitar guitar-chords guitar-tablature guitar-tabs ultimate-guitar web-scraping
Last synced: 11 Nov 2024
https://github.com/mostlypanda/node-js-functionalities
This repository contains useful RESTful APIs and functionality in Node.js, including tutorial code for mastering Node.js. All tutorials have been published on medium.com.
2-way-authentication crudapi express html login logout mongodb multer-storage node-js nodejs-tutorials npm packages payment-gateway rest-api signup sms-services smtp twilio web-scraping
Last synced: 05 Nov 2024
https://github.com/khuyentran1401/top-github-scraper
Scrape top GitHub repositories and users based on keywords
github github-api python scraping web-scraper web-scraping
Last synced: 19 Dec 2024
https://github.com/umesh-01/python-assistant
Python Assistant (PA) is a voice-command-based assistant service written in Python 3.9+. It can recognize human speech, talk to the user, and execute basic commands.
ai-assistants google-recognition nlp openweathermap-api pycharm-ide python python-assistant python-automation python39 pyttsx3 speech-recognition text-to-speech virtual-assistant voice-assistant voice-commands voice-recognition web-scraping wikipedia-search wolfram-alpha
Last synced: 09 Oct 2024
https://github.com/crawlzone/crawlzone
Crawlzone is a fast asynchronous internet crawling framework for PHP.
automated-testing crawler crawling-framework middleware php web-scraping web-search
Last synced: 29 Oct 2024
https://github.com/umarbutler/open-australian-legal-corpus-creator
The code used to create and update the Open Australian Legal Corpus, the first and only multijurisdictional open corpus of Australian legislative and judicial documents.
australia corpus dataset datasets law legal open-data scraping web-scraping
Last synced: 01 Jan 2025
https://github.com/mohamedhmini/iww
AI based web-wrapper for web-content-extraction
ai data-mining information-extraction library python web-content-extractor web-data-extraction web-mining web-scraping
Last synced: 15 Nov 2024
https://github.com/dddat1017/Scraping-Youtube-Comments
Scrape comments from any YouTube video
data-scraping python selenium web-scraping
Last synced: 02 Dec 2024
https://github.com/utkuufuk/ping-sm
Receive an email or Telegram message as soon as Migros Sanalmarket is available for delivery in your neighborhood.
Last synced: 07 Nov 2024
https://github.com/hrbrmstr/decapitated
Headless 'Chrome' Orchestration in R
headless-chrome javascript r r-cyber rstats web-scraping
Last synced: 22 Nov 2024
https://github.com/yusuftaufiq/laravel-books-api
Fully documented & tested Laravel 9 RESTful books API scraped from Gramedia.
docker laravel9 php81 restful-api web-scraping
Last synced: 11 Oct 2024
https://github.com/oxylabs/playwright-web-scraping
A tutorial for web scraping using Playwright headless browser
playwright web-scraper web-scraping
Last synced: 17 Nov 2024
https://github.com/mooseburger1/springboard-data-science-immersive
convolutional-neural-networks data-science deep-learning deep-neural-networks eda h5 hadoop nlp opencv pyspark python sql statistical-analysis statistical-inference statistical-modeling tensorboard tensorflow time-series-analysis time-series-prediction web-scraping
Last synced: 24 Nov 2024
https://github.com/b0o/apple-autofill-domains
Apple's allowed autofill domains
apple data-analysis github-actions web-scraping
Last synced: 29 Oct 2024
https://github.com/scrapehero/selectorlib
A library to read a YML file with XPath or CSS selectors and extract data from HTML pages using them
python scraping selectors web-scraping xpath
Last synced: 04 Nov 2024
https://github.com/techfanetechnologies/qtsapp
The Python library for QtsApp, which displays the option chain in near real time. This program retrieves this data from the QtsApp site and then generates useful analysis of the option chain for the specified index or stock. It also continuously refreshes the option chain along with Implied Volatility (IV), Open Interest (OI), Delta, Theta, Vega, Gamma, Vanna, Charm, Speed, Zomma, Color, Volga, and Veta at an interval of one second and visually displays the trend in various indicators useful for technical analysis.
analysis app banknifty derivatives drmoonejune equity moonedrjune nifty nifty50 nse option-chain option-greeks option-pricing option-trading options options-trading python script strike-price web-scraping
Last synced: 10 Nov 2024
https://jaeyk.github.io/comp_thinking_social_science/
Computational Thinking for Social Scientists book project
computational-social-science data-science digital-humanities machine-learning python r social-sciences visualization web-scraping
Last synced: 27 Oct 2024
https://github.com/edjopato/website-stalker
Track changes on websites via git
change-alert change-detection git monitoring scraper self-hosted url-monitor web-scraping website-change-detector website-change-monitor website-change-tracker website-monitor website-monitoring
Last synced: 02 Jan 2025
https://github.com/alex000kim/slack-gpt-bot
GPT4-powered Slack bot that can scrape URL contents
chatbot gpt-4 gpt4 slack slack-bot web-scraping webscraping
Last synced: 07 Nov 2024
https://github.com/dojutsu-user/imdb-scraper
Scrapy project for scraping data from IMDb, with a movie dataset covering 58,623 movies.
imdb-webscrapping movie-dataset python3 scrapy scrapy-crawler scrapy-framework web-scraping
Last synced: 28 Oct 2024
https://github.com/hrbrmstr/wayback
⏪ Tools to Work with the Various Internet Archive Wayback Machine APIs
internet-archive memento r r-cyber rstats wayback wayback-machine web-scraping
Last synced: 28 Oct 2024
https://github.com/gadingnst/kampus-scraper
Scraper and GraphQL API for data on higher-education institutions in Indonesia, based on the website of the Ministry of Research, Technology and Higher Education (RISTEKDIKTI).
api graphql puppeteer scraper serverless web web-scraping
Last synced: 20 Dec 2024
https://github.com/dc-aichara/DS-ML-Public
Python Scripts and Jupyter Notebooks
bayesian-optimization beautifulsoup bitcoin catboost dash dashboard data-analysis data-mining data-science data-visualisation hyperparameter-tuning hyperparameters-optimization lightgbm machine-learning news plotly python telegram web-scraping xgboost
Last synced: 15 Nov 2024
https://github.com/ibnesayeed/linkextractor
A Docker tutorial using a link extraction application example
docker hacktoberfest interactive link-extraction php python ruby tutorial web-scraping
Last synced: 26 Dec 2024
https://github.com/D4Vinci/Scrapling
Lightning-Fast, Adaptive Web Scraping for Python
automation crawler crawling crawling-python css dom-manipulation hacktoberfest lxml playwright python python3 scraping selectors selenium stealth web-scraper web-scraping web-scraping-python webscraping xpath
Last synced: 18 Nov 2024
https://github.com/kaliiiiiiiiii-vinyzu/patchright-nodejs
Undetected NodeJS version of the Playwright testing and automation library.
automation bot bots botting browser chrome chromedriver chromium cloudflare cloudflare-bypass playwright stealth undetectable undetected web-auto web-scraping webautomation webdriver webscraping
Last synced: 01 Jan 2025
https://github.com/mkearney/rreddit
𝐫⟋ Get Reddit data
mkearney-r-package pushshift r r-package reddit reddit-api rstats social-media web-scraping
Last synced: 15 Nov 2024
https://github.com/deedy5/primp
🪞PRIMP (Python Requests IMPersonate). The fastest python HTTP client that can impersonate web browsers
akamai-fingerprint fingerprint http http-client http2-fingerprint https ja3-fingerprint ja4-fingerprint tls-fingerprint web-scraping
Last synced: 17 Dec 2024
https://github.com/avilum/smart-url-fuzzer
Explore URLs of domains fast and efficiently using fuzzing techniques
fuzzers http pentest-scripts pentest-tool pentesting python python-script python3 script scripts security security-tools urls web-crawler web-scraping website whitehat
Last synced: 28 Oct 2024
https://github.com/serpapi/public-roadmap
Public Roadmap for SerpApi, LLC (https://serpapi.com)
baidu-scraper google-image-scraper google-maps-scraping google-search-scraper scraper scraping serp-api serpapi web-scraper web-scraping webscraping yahoo-scraper
Last synced: 20 Nov 2024
https://github.com/sanjaysunil/email-scraper
Generate thousands of temporary emails within seconds!
automation email email-generator email-scraper email-scrapping email-service python scrape scraper temp-email temporary web-scraper web-scraping
Last synced: 10 Nov 2024
https://github.com/goodbyteco/letterboxd-watchlist-picker
A simple website that gives you a random film off your Letterboxd watchlist (or any list).
film go letterboxd movies watchlist web-scraping webapp
Last synced: 05 Nov 2024
https://github.com/ahmedshahriar/bd-medicine-scraper
Scrapy-Django PostgreSQL integrated API with Proxy IP configuration that scrapes all medicine data (meds, prices, generics, companies, indications) from Bangladesh (30k+ pages)
django django-rest-framework drug manufacturer medicine medicine-database postgresql proxy-ip python python3 rest-api scrapy web-scraping
Last synced: 19 Dec 2024
https://github.com/gugarosa/viviner
🍷 Scrapes data from Vivino and collects outstanding wine-based metadata.
data-mining requests vivino web-scraping wine
Last synced: 01 Oct 2024
https://github.com/danmorse314/hockeyR
Collect and Clean Hockey Stats
hockey nhl nhl-data web-scraping
Last synced: 04 Dec 2024
https://github.com/ramonpaolo/api-b3
Simple API that returns data about a given stock/company listed on B3
api flask heroku opensource python web-scraping
Last synced: 22 Oct 2024
https://github.com/rebrowser/rebrowser-playwright-python
A drop-in replacement for playwright-python patched with rebrowser-patches. It makes it possible to pass modern automation detection tests.
automation bot bot-detection captcha headless playwright playwright-python rebrowser rebrowser-patches scraping web-scraping
Last synced: 01 Jan 2025
https://github.com/oxylabs/web-scraping-tutorials
Web scraping, data parsing and automation tutorials. Suited for both beginners and intermediate/advanced programmers.
csharp curl github-python golang javascript python r-language ruby web-proxies web-scraping wikipedia-scraper
Last synced: 17 Nov 2024
https://github.com/mike-gee/webtranspose
Web scraping API for building AI applications.
chatbots crawling crawling-python python scraping scraping-python web-crawling web-scraping web-scraping-python
Last synced: 11 Nov 2024
https://github.com/florents-tselai/greek-wines-analysis
Scraper, Data and Analysis for "Analyzing 1000+ Greek Wines with Python"
beautifulsoup data-science pandas python seaborn web-scraping
Last synced: 31 Oct 2024
https://github.com/oxylabs/scraping-dynamic-javascript-ajax-websites-with-beautifulsoup
A guide on how to scrape JavaScript rendered websites with Python and BeautifulSoup.
ajax beautiful-soup github-python javascript python scraping web-scraping
Last synced: 17 Nov 2024
https://github.com/hrbrmstr/htmlunit
🕸🧰☕️Tools to Scrape Dynamic Web Content via the 'HtmlUnit' Java Library
htmlunit javascript r r-cyber rstats web-scraping
Last synced: 28 Oct 2024
https://github.com/carpentries-incubator/lc-webscraping
Introduction to web scraping
alpha carpentries english lesson programming python scraping web-scraping webscraping
Last synced: 29 Dec 2024
https://github.com/wenyalintw/google-patents-scraper
Automatically download all PDF files of search results and their patent families found on Google Patents.
crawler google-patents patent patents pdf scraper scraping scrapy web-scraping
Last synced: 11 Nov 2024
https://github.com/soumyajit4419/ai_for_social_good
Using natural language processing to analyze the sentiments of people and detect suicidal ideation on online social content.
lstm natural-language-processing random-forest tfidf-vectorizer web-scraping
Last synced: 22 Oct 2024
https://github.com/drewcarlson/ktsoup
A Kotlin multiplatform HTML5 parsing library
jsoup kotlin kotlin-multiplatform lexbor web-scraping
Last synced: 25 Dec 2024
https://github.com/serpapi/google-search-results-java
Google Search Results JAVA API via SerpApi
data-extraction data-scraping java java-api json serp-api serpapi web-scraping webscraping
Last synced: 20 Nov 2024
https://github.com/reljicd/spring-boot-web-scraper
Simple web scraping app made using Spring Boot + Thymeleaf + Jsoup + Java 8 Lambdas & Streams
docker docker-compose functional-programming h2 h2-database java java-8 java-lambda java-streams jsoup lambda scraper spring spring-boot spring-data-jpa spring-mvc spring-security stream thymeleaf web-scraping
Last synced: 07 Nov 2024
https://github.com/ekarton/uoft-timetable-generator
A web application that generates timetables for university students at the University of Toronto
genetic-algorithm productivity timetable-generator uoft web-application web-scraping
Last synced: 24 Nov 2024
https://github.com/City-Bureau/city-scrapers-template
Template for creating a City Scrapers project in your area
city-scrapers open-data python web-scraping
Last synced: 04 Dec 2024
https://github.com/ayaka14732/lihkg-scraper
A Python script for scraping LIHKG
Last synced: 28 Oct 2024
https://github.com/zytedata/spidyquotes
Example site for web scraping tutorials
crawling playground scraping tutorials web-crawling web-scraping web-scraping-tutorials
Last synced: 11 Nov 2024
https://github.com/oxylabs/rotating-proxies-with-python
Learn how to rotate proxies using Python.
json-database-python proxies proxy proxy-list proxy-list-github proxy-rotator python python-image-scraper python-web-crawler rotating-proxy scraper-python scraping socks5-proxy socks5-proxy-list socks5-server web-proxies web-scraping
Last synced: 17 Nov 2024
https://github.com/dojutsu-user/gsoc-data-analyser
Simple search for organisations that are participating or have participated in GSoC
django django-rest-framework gsoc gsoc-2017 gsoc-2018 gsoc-2019 gsoc-search javasript python3 reactjs requests-module web-scraping
Last synced: 13 Oct 2024
https://github.com/smahesh29/web-scraping-python
It contains some web scraping examples implemented using Python.
beautifulsoup beautifulsoup4 flipkart-scraper-python flipkart-selenium google-images-crawler google-images-downloader internshala internships pandas pandas-dataframe python selenium selenium-python web-scapping web-scraping webscraping webscraping-search webscrapper youtube-scraper youtube-video
Last synced: 11 Oct 2024
https://github.com/vaasudevans/google-podcast-downloader
Command-line tool to download the entire Google Podcasts library for the provided URL 🎵
google-podcasts podcast-downloader python web-scraping
Last synced: 13 Dec 2024
https://github.com/0x0be/scrapeadvisor
A user-friendly python-based GUI which provides sentiment analysis of users' reviews toward a specific TripAdvisor facility
data-mining data-science python3 r scraping sentiment-analysis sentiment-classification text-mining tripadvisor tripadvisor-scraper web-scraping
Last synced: 04 Nov 2024
https://github.com/Smartproxy/Python-scraper-tutorial
A short introduction to scraping with Python with given steps and an example scraper script.
beautifulsoup crawler data-mining data-science github-python json-database-python learning python python-projects python-web-crawler python-web-scraper scraper-python scraping web-crawler-python web-scraping web-scraping-api web-scraping-python webscraping
Last synced: 20 Nov 2024
https://github.com/jetkai/proxy-scraper
This is an application that scrapes various Proxy API Endpoints, then compiles the proxies into files within the "/proxies/" directory.
exe gradle httpclient jackson-json jar java jdk11 kotlin launch4j proxies proxy proxy-scrape proxy-scraper scraper scraping selenium-java web-scraper web-scraping
Last synced: 30 Dec 2024
https://github.com/mainakrepositor/py-automation
Automating social media, mailing, and kernel processes using Python.
automated-tests automation modules os python3 security-tools selenium testing-tools web-scraping webdriver
Last synced: 12 Nov 2024
https://github.com/gabrieldim/a1on-webscraping-pandas-data-science
Learning web scraping using Pandas in Python (data science).
data data-science pandas sciecne web-scraping
Last synced: 20 Nov 2024
https://github.com/Granitosaurus/parsel-cli
CLI for evaluating CSS and XPath selectors
cli css lxml parsel web-scraping xpath
Last synced: 06 Nov 2024
https://github.com/OpenJarbas/audiobooker
Audiobook scraper
audiobooks librivox python web-scraping
Last synced: 28 Nov 2024
https://github.com/websemantics/codepen-puppeteer
Use Puppeteer to download pens from Codepen.io as single html pages
codepen headless-chrome puppeteer web-scraping
Last synced: 06 Nov 2024
https://github.com/papagorgio23/vegaslines
Historical Vegas betting lines for the NBA and NFL
nba nfl sports-betting sportsbetting vegas-lines web-scraping
Last synced: 01 Dec 2024
https://github.com/davidsvy/Neural-Scam-Artist
Web Scraping, Document Deduplication & GPT-2 Fine-tuning with a newly created scam dataset.
dataset deduplication fine-tuning fraud gpt2 huggingface lsh minhash nlp pytorch readability scam transformer web-scraping
Last synced: 22 Nov 2024
https://github.com/d4n3436/gscraper
A collection of search engine image scrapers (Google Images, DuckDuckGo and Brave)
brave duckduckgo google google-images gscraper web-scraping
Last synced: 08 Nov 2024
https://github.com/ahmedshahriar/youtube-comment-scraper
This script dumps YouTube video comments to a CSV file from YouTube video links. Video links can be placed in a variable, a list, or a CSV file.
comment-parser csv data-mining-python data-science lxml pandas python python3 requests-library-python requests-module scraper scraping social-media web-crawler web-crawler-python web-scraping youtube youtube-crawler youtube-downloader youtube-scraper
Last synced: 16 Nov 2024
https://github.com/milahu/aiohttp_chromium
aiohttp-like interface to Chromium, based on selenium_driverless to bypass Cloudflare
aiohttp asyncio bypass-cloudflare chromium gui-scripting headful-chromium headful-scraper headful-web-scraper headful-webscraper selenium-driverless web-scraper web-scraping
Last synced: 12 Oct 2024
https://github.com/wiringbits/simple-http-proxy
A very simple http proxy that runs on a Raspberry Pi
http-proxy playframework raspberry-pi scala web-scraping
Last synced: 21 Nov 2024