https://github.com/prankshaw/beware-web-scraper

Web scraping project including: a C projects scraper for GitHub, an ICC rankings scraper, a YouTube trending scraper, a LinkedIn profile scraper, and a Wikipedia image scraper.

batting c chrome-webdriver chromedriver cricket github icc icc-rankings-scraper pandas python python-3 rankings scraper selenium selenium-webdriver web-scraping wikipedia-image-scraper

Host: GitHub
URL: https://github.com/prankshaw/beware-web-scraper
Owner: prankshaw
License: mit
Created: 2018-08-11T17:09:48.000Z (over 6 years ago)
Default Branch: master
Last Pushed: 2022-06-22T01:47:28.000Z (over 2 years ago)
Last Synced: 2024-05-01T17:16:47.522Z (9 months ago)
Topics: batting, c, chrome-webdriver, chromedriver, cricket, github, icc, icc-rankings-scraper, pandas, python, python-3, rankings, scraper, selenium, selenium-webdriver, web-scraping, wikipedia-image-scraper
Language: Python
Homepage: https://prankshaw.github.io/Beware-web-scraper/
Size: 132 KB
Stars: 13
Watchers: 2
Forks: 2
Open Issues: 2
Metadata Files:
- Readme: README.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md

README

## Visit the project here

https://prankshaw.github.io/Beware-web-scraper/

[![Build Status](https://travis-ci.com/prankshaw/Beware-web-scraper.svg?branch=master)](https://travis-ci.com/prankshaw/Beware-web-scraper)

[![Documentation Status](https://readthedocs.org/projects/beware-web-scraper/badge/?version=latest)](https://beware-web-scraper.readthedocs.io/en/latest/?badge=latest)

[![Code style: black](https://img.shields.io/badge/code%20style-black-000000.svg)](https://github.com/ambv/black)

[![codecov](https://codecov.io/gh/prankshaw/Beware-web-scraper/branch/master/graph/badge.svg)](https://codecov.io/gh/prankshaw/Beware-web-scraper)

[![License: MIT](https://img.shields.io/badge/License-MIT-orange.svg)](https://opensource.org/licenses/MIT)







[![Twitter URL](https://img.shields.io/twitter/url/https/twitter.com/fold_left.svg?style=social&label=Follow%20%40mepranjal31)](https://twitter.com/mepranjal31)

# Scrapers available



  

### C-project-scraper

Scrapes the top projects written in C from GitHub. It can be extended to fetch projects in any language present on GitHub.
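The "any language" extension mentioned above can be sketched by making the language a parameter of the page URL. This is a minimal sketch, not the project's actual code: the function names, the use of GitHub's search page sorted by stars, and the `link_selector` argument are all assumptions made for illustration.

```python
from urllib.parse import urlencode


def build_search_url(language: str) -> str:
    """Return a GitHub search URL for the most-starred repositories in `language`.

    Assumption: the search page is used as the source of "top projects".
    """
    query = urlencode(
        {"q": f"language:{language}", "type": "repositories", "s": "stars", "o": "desc"}
    )
    return f"https://github.com/search?{query}"


def scrape_top_projects(driver, language: str, link_selector: str):
    """Open the search page and return (text, href) pairs for the matched links.

    `driver` is a Selenium WebDriver. `link_selector` is a hypothetical CSS
    selector that you have to find by inspecting the page yourself.
    """
    # Imported lazily so build_search_url works without Selenium installed.
    from selenium.webdriver.common.by import By

    driver.get(build_search_url(language))
    links = driver.find_elements(By.CSS_SELECTOR, link_selector)
    return [(link.text, link.get_attribute("href")) for link in links]
```

Switching from C to another language is then a matter of calling `build_search_url("Python")` or `build_search_url("C++")`; special characters such as `+` are percent-encoded by `urlencode`.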


### ICC Rankings-Scraper

Lists the top 100 ranked batsmen in the world for all three formats: Test cricket, One Day International (ODI), and T20 International.
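Once the table rows have been read from the page, turning them into records is plain Python. The sketch below assumes each row arrives as four cell strings (position, player, team, rating); that column layout, the function names, and the sample values in the usage note are illustrative assumptions, not taken from the project.

```python
from typing import Dict, List, Sequence


def parse_row(cells: Sequence[str]) -> Dict[str, object]:
    """Convert one table row [position, player, team, rating] into a record.

    Assumption: the ranking table has exactly these four columns.
    """
    position, player, team, rating = (cell.strip() for cell in cells)
    return {
        "position": int(position),
        "player": player,
        "team": team,
        "rating": int(rating),
    }


def top_n(rows: Sequence[Sequence[str]], n: int = 100) -> List[Dict[str, object]]:
    """Parse all rows, sort them by position, and keep the first `n`."""
    records = [parse_row(row) for row in rows]
    records.sort(key=lambda record: record["position"])
    return records[:n]
```

For example, `top_n([["2", "Player B", "BBB", "880"], ["1", "Player A", "AAA", "900"]], n=1)` keeps only the record for position 1. The resulting list of dictionaries can be passed directly to `pandas.DataFrame(...)` to get one table per format.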


### Youtube Trending-Scraper

Scrapes information from the trending section of YouTube, including each video's name, available description, and link.


### LinkedIn-Scraper

Automatically logs in to LinkedIn and scrapes the relevant information from a profile, including name, location, title, connections, and more.


### Wikipedia Image-Scraper

Scrapes the links of all images present on a given Wikipedia page and prints them.
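With Selenium, collecting image links comes down to finding every `img` element and reading its `src` attribute. The following is a rough sketch under that assumption; the function names are hypothetical and the de-duplication step is an addition for illustration, not something the README describes.

```python
from typing import Iterable, List


def collect_image_links(elements: Iterable) -> List[str]:
    """Return the unique, non-empty `src` values of the elements, in page order.

    Each element only needs a `get_attribute(name)` method, which Selenium
    WebElement objects provide.
    """
    seen = set()
    links = []
    for element in elements:
        src = element.get_attribute("src")
        if src and src not in seen:
            seen.add(src)
            links.append(src)
    return links


def scrape_image_links(driver, page_url: str) -> List[str]:
    """Load `page_url` in a Selenium WebDriver and return its image links."""
    # Imported lazily so collect_image_links works without Selenium installed.
    from selenium.webdriver.common.by import By

    driver.get(page_url)
    return collect_image_links(driver.find_elements(By.TAG_NAME, "img"))
```

Printing the result is then a simple loop such as `for link in scrape_image_links(driver, url): print(link)`.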





  

## These projects use the Selenium WebDriver

#### To use the project

> Fork the project and install the prerequisites.

> If the scraper is a Jupyter notebook, simply run it; otherwise follow the steps below.

> Install Python (I am using Python 3.x). After installing Python, use pip to install the requirements (if any).

> Download ChromeDriver, the Selenium WebDriver for Google Chrome, and place it anywhere on your machine.

> pip install selenium

> pip install pandas

> Replace the 'chromedriver' path in the script with your own path.

> Run the script in IDLE and see the output.
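One way to handle the 'chromedriver' path step without editing each script is to read the path from an environment variable. This is a sketch, not the project's code: the `CHROMEDRIVER_PATH` variable and both function names are hypothetical, and the driver construction assumes Selenium 4, where the path is passed through a `Service` object (older Selenium 3 code passed `executable_path` to `webdriver.Chrome` directly).

```python
import os
from pathlib import Path
from typing import Mapping, Optional


def resolve_chromedriver_path(
    default: str, env: Optional[Mapping[str, str]] = None
) -> Path:
    """Return the chromedriver path from CHROMEDRIVER_PATH, else `default`.

    CHROMEDRIVER_PATH is a hypothetical variable name chosen for this sketch.
    """
    env = os.environ if env is None else env
    return Path(env.get("CHROMEDRIVER_PATH", default)).expanduser()


def make_driver(chromedriver_path: Path):
    """Start Chrome through the chromedriver binary at `chromedriver_path`."""
    if not chromedriver_path.is_file():
        raise FileNotFoundError(f"chromedriver not found at {chromedriver_path}")

    # Imported lazily so the path helper works without Selenium installed.
    from selenium import webdriver
    from selenium.webdriver.chrome.service import Service

    return webdriver.Chrome(service=Service(executable_path=str(chromedriver_path)))
```

A script would then start with something like `driver = make_driver(resolve_chromedriver_path("/path/to/chromedriver"))` and call `driver.quit()` when finished.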


# License

Licensed under the MIT License.

https://prankshaw.mit-license.org/