https://github.com/shibam120302/youtube-channel-videos-scraper
- Host: GitHub
- URL: https://github.com/shibam120302/youtube-channel-videos-scraper
- Owner: shibam120302
- License: apache-2.0
- Created: 2022-11-12T18:01:20.000Z (almost 3 years ago)
- Default Branch: main
- Last Pushed: 2022-11-12T18:12:30.000Z (almost 3 years ago)
- Last Synced: 2025-01-21T17:50:40.072Z (9 months ago)
- Language: Jupyter Notebook
- Size: 11.7 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Youtube Channel Videos Scraper BY-SHIBAM ❤

## Table of Contents
* [About the Project](#about-the-project)
* [Tasks](#tasks)
* [Built With](#built-with)
* [Fork the Repo and Contribute](#fork-the-repo-and-contribute)
* [Contact](#contact)

## About the Project
In this [`web scraping project`](https://github.com/shibam120302/Youtube-Channel-Videos-Scraper) Jupyter notebook, we scrape the Wikipedia pages of Disney movies to build a Disney movies dataset. From each movie's infobox we collect fields such as `Title`, `Directed by`, `Produced by`, `Written by`, `Narrated by`, `Music by`, `Cinematography`, `Edited by`, `Production company`, `Distributed by`, `Release date`, `Running time`, `Country`, and `Language`. We also use the OMDb API to fetch `imdb`, `metascore`, and `rotten_tomatoes` ratings. The data is stored as JSON and CSV, with intermediate results checkpointed using Python's pickle library.
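As a sketch of the infobox scraping described above: the function below parses a Wikipedia-style infobox table into a Python dictionary. The inline HTML snippet stands in for a real page fetched with `requests.get(url).text`, and the `table.infobox` selector is an assumption about Wikipedia's markup, not a guarantee.

```python
# Minimal sketch of Task 1: parse a Wikipedia-style infobox into a dict.
# The HTML below is a stand-in for a fetched movie page; in the notebook
# the markup would come from requests.get(url).text.
from bs4 import BeautifulSoup

def parse_info_box(html):
    """Return {header: value} for each <th>/<td> row of the first table.infobox."""
    soup = BeautifulSoup(html, "html.parser")
    info = {}
    for row in soup.select("table.infobox tr"):
        header = row.find("th")
        value = row.find("td")
        if header and value:
            info[header.get_text(" ", strip=True)] = value.get_text(" ", strip=True)
    return info

sample = """
<table class="infobox">
  <tr><th>Directed by</th><td>Lee Unkrich</td></tr>
  <tr><th>Running time</th><td>103 minutes</td></tr>
</table>
"""
print(parse_info_box(sample))
```

Looping this over every movie link on Wikipedia's list-of-Disney-films page gives the list of dictionaries described in Task 2 below.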

### Tasks
* Task 1: Scrape the info box from the Toy Story 3 Wikipedia page and save it in a Python dictionary.
* Task 2: Scrape the info boxes for all Disney movies and save them in a list of Python dictionaries.
* Task 3: Clean the data!
- Strip out all references ([1], [2], etc.)
- Split up long strings
- Convert 'Running time' field to integer
- Convert 'Budget' and 'Box office' fields to floats
- Convert dates to datetime objects
- Save data using Pickle
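The cleaning steps above can be sketched as small helper functions. The input formats (e.g. `'103 minutes'`, `'$200 million'`, `'June 18, 2010'`) are assumptions about typical Wikipedia infobox values, and the conversions are deliberately simplified:

```python
# Sketch of Task 3's cleaning helpers; input formats are assumed from
# typical Wikipedia infobox strings and are illustrative only.
import re
from datetime import datetime

def strip_references(text):
    """Remove citation markers such as [1] or [note 2]."""
    return re.sub(r"\[[^\]]*\]", "", text).strip()

def minutes_to_int(running_time):
    """'103 minutes' -> 103; None if no number is found."""
    match = re.search(r"\d+", running_time)
    return int(match.group()) if match else None

def money_to_float(amount):
    """'$200 million' -> 200000000.0 (a simplified conversion)."""
    match = re.search(r"([\d.]+)\s*(million|billion)?", amount.replace(",", ""))
    if not match:
        return None
    value = float(match.group(1))
    scale = {"million": 1e6, "billion": 1e9}.get(match.group(2), 1)
    return value * scale

def to_datetime(date_str):
    """'June 18, 2010' -> datetime(2010, 6, 18); assumes one common format."""
    return datetime.strptime(strip_references(date_str), "%B %d, %Y")

# Intermediate results can then be checkpointed with pickle, e.g.:
# with open("movies.pickle", "wb") as f:
#     pickle.dump(movies, f)
```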
* Task 4: Attach IMDb, Rotten Tomatoes, and Metascore ratings to the dataset using the OMDb API.
* Task 5: Save the final dataset as JSON and CSV files.

### Built With
* Jupyter Notebook
* Beautiful Soup
* Requests
* Pickle
* Pandas

## Fork the Repo and Contribute
Contributions are what make the open source community such an amazing place to learn, inspire, and create. Any contributions you make are **greatly appreciated**.
1. Fork the Project (click on `Fork` in the top-right corner)
2. Create your Feature Branch (`git checkout -b feature`)
3. Commit your Changes (`git commit -m 'Add some AmazingFeature'`)
4. Push to the Branch (`git push origin feature`)
5. Open a Pull Request

## Contact
### SHIBAM NATH ❤❤
* [LinkedIn](https://www.linkedin.com/in/shibam-nath-0a23a6227/)