https://github.com/prankshaw/beware-web-scraper
Web scraping project including a C projects scraper for GitHub, an ICC rankings scraper, a YouTube trending scraper, a LinkedIn profile scraper, and a Wikipedia image scraper.
- Host: GitHub
- URL: https://github.com/prankshaw/beware-web-scraper
- Owner: prankshaw
- License: mit
- Created: 2018-08-11T17:09:48.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2022-06-22T01:47:28.000Z (over 2 years ago)
- Last Synced: 2024-05-01T17:16:47.522Z (9 months ago)
- Topics: batting, c, chrome-webdriver, chromedriver, cricket, github, icc, icc-rankings-scraper, pandas, python, python-3, rankings, scraper, selenium, selenium-webdriver, web-scraping, wikipedia-image-scraper
- Language: Python
- Homepage: https://prankshaw.github.io/Beware-web-scraper/
- Size: 132 KB
- Stars: 13
- Watchers: 2
- Forks: 2
- Open Issues: 2
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
## Visit the project here
https://prankshaw.github.io/Beware-web-scraper/

[![Build Status](https://travis-ci.com/prankshaw/Beware-web-scraper.svg?branch=master)](https://travis-ci.com/prankshaw/Beware-web-scraper)
[![Documentation Status](https://readthedocs.org/projects/beware-web-scraper/badge/?version=latest)](https://beware-web-scraper.readthedocs.io/en/latest/?badge=latest)
[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)
[![codecov](https://codecov.io/gh/prankshaw/Beware-web-scraper/branch/master/graph/badge.svg)](https://codecov.io/gh/prankshaw/Beware-web-scraper)
[![License: MIT](https://img.shields.io/badge/License-MIT-orange.svg)](https://opensource.org/licenses/MIT)
[![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/fold_left.svg?style=social&label=Follow%20%40mepranjal31)](https://twitter.com/mepranjal31)

# Scrapers available
### C-project-scraper
Scrapes the top projects for the C language from GitHub. It can be extended to fetch projects in any language present on GitHub.
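
Extending the scraper to another language mostly comes down to changing the search query. A minimal sketch, assuming the scraper starts from GitHub's repository search page sorted by stars; the helper name `build_search_url` and the query parameters are illustrative assumptions, not taken from the project's code:

```python
from urllib.parse import urlencode


def build_search_url(language: str) -> str:
    """Return a GitHub search URL for repositories in `language`, most-starred first.

    Hypothetical helper: the query parameters are an assumption about how
    the search page is addressed, not the project's actual code.
    """
    query = {
        "q": f"language:{language}",  # restrict results to one language
        "type": "repositories",
        "s": "stars",                 # sort key
        "o": "desc",                  # highest first
    }
    # urlencode percent-encodes characters such as ':' and '+'
    return "https://github.com/search?" + urlencode(query)
```

Calling `build_search_url("Python")` instead of `build_search_url("C")` would point the same scraping logic at a different language.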
### ICC Rankings-Scraper
Lists the top 100 ranked batsmen in the world for all three formats: Test cricket, One Day International, and T20 International.
### YouTube Trending-Scraper
Scrapes the information in the trending section of YouTube, including each video's name, available description, and video links.
### LinkedIn-Scraper
Automatically logs in to the profile and scrapes the relevant information from it, including name, location, title, connections, and more.
### Wikipedia Image-Scraper
Scrapes the links of all the images present on a given Wikipedia page and prints them.
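
The core of an image-link scraper can be sketched without a browser. The example below is not the project's Selenium code: it swaps in Python's standard-library `html.parser` to pull every `<img>` `src` out of an HTML string, and the names `ImageLinkParser` and `extract_image_links` are hypothetical:

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ImageLinkParser(HTMLParser):
    """Collect the `src` of every <img> tag, resolved against a base URL."""

    def __init__(self, base_url: str):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag != "img":
            return
        src = dict(attrs).get("src")
        if src:
            # urljoin resolves relative and protocol-relative ("//host/...") URLs
            self.links.append(urljoin(self.base_url, src))


def extract_image_links(html: str, base_url: str) -> list:
    """Return absolute image URLs found in `html`, in document order."""
    parser = ImageLinkParser(base_url)
    parser.feed(html)
    return parser.links
```

With Selenium, the HTML would typically come from `driver.page_source` and the base URL from the page that was loaded.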
## These projects use the Selenium driver.
#### To use the project
> Just fork the project and install the prerequisites.
> If a scraper is provided as a Jupyter notebook, simply run it; otherwise follow the steps below.
> Python (I am using Python 3.x). After installing Python, use pip to install the requirements (if any).
> Selenium WebDriver for Google Chrome (ChromeDriver): download it and place it anywhere on your machine.
> pip install selenium
> pip install pandas
> Change the path of 'chromedriver' in the code to your own path.
> Then run the script in IDLE and see the output.
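
The "change the path" step can be sketched as below. This assumes the Selenium 4 style of passing the driver location through a `Service` object (older Selenium releases took the path directly); the helper name `make_chrome_driver` is hypothetical and not part of the project:

```python
from pathlib import Path


def make_chrome_driver(chromedriver_path: str):
    """Start Chrome through the chromedriver binary at `chromedriver_path`."""
    path = Path(chromedriver_path).expanduser()
    if not path.is_file():
        # Fail early with a clear message instead of a Selenium stack trace
        raise FileNotFoundError(f"chromedriver not found at {path}")

    # Imported here so the path check above runs even before Selenium is installed
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(executable_path=str(path)))
```

A script would call `driver = make_chrome_driver("/your/path/to/chromedriver")`, load a page with `driver.get(url)`, and finish with `driver.quit()`.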
# License
Licensed under the MIT License.
https://prankshaw.mit-license.org/