https://github.com/comsavvy/punch-scraping-engine

Scraping the top Punch news

news newsfeed punch python3 scrapy web-scraping


Host: GitHub
URL: https://github.com/comsavvy/punch-scraping-engine
Owner: comsavvy
Created: 2020-12-02T09:01:08.000Z (almost 5 years ago)
Default Branch: main
Last Pushed: 2021-01-15T23:27:26.000Z (over 4 years ago)
Last Synced: 2025-01-25T15:29:14.683Z (8 months ago)
Topics: news, newsfeed, punch, python3, scrapy, web-scraping
Language: Jupyter Notebook
Homepage:
Size: 86.9 KB
Stars: 1
Watchers: 2
Forks: 2
Open Issues: 0
Metadata Files:
- Readme: README.md

README

# News
This code scrapes the latest Punch news by crawling through the different news URLs.

End product:
- The URL of each news article
- The title of the article
- The article content

*All in one file!*

This project has three branches:
1. main: stores the news in a text file.
2. CSV: stores the news in a CSV file.
3. deployment: can be deployed on the Scrapinghub platform.
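
The difference between the text and CSV outputs can be illustrated with a minimal sketch that uses only the Python standard library. The `NewsItem` class and the `save_as_text` and `save_as_csv` functions are hypothetical names chosen for this illustration; they are not taken from the repository, and the real branches may lay out their files differently.

```python
import csv
from dataclasses import asdict, dataclass, fields


@dataclass
class NewsItem:
    """One scraped article (hypothetical container, not from the repository)."""
    url: str
    title: str
    content: str


def save_as_text(items, path):
    """Write every item into one plain-text file, one block per article."""
    with open(path, "w", encoding="utf-8") as handle:
        for item in items:
            handle.write(f"URL: {item.url}\n")
            handle.write(f"Title: {item.title}\n")
            handle.write(f"{item.content}\n\n")


def save_as_csv(items, path):
    """Write every item into one CSV file with a header row."""
    column_names = [field.name for field in fields(NewsItem)]
    # newline="" lets the csv module control line endings itself.
    with open(path, "w", encoding="utf-8", newline="") as handle:
        writer = csv.DictWriter(handle, fieldnames=column_names)
        writer.writeheader()
        for item in items:
            writer.writerow(asdict(item))
```

Either function takes a list of `NewsItem` objects and a file path, so the same scraped data could be written in both formats.
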

# Requirement
The *scrapy_engine.py* module handles the installation of the necessary libraries. Worried that there might be too many of them?

Don't be! Only one library is installed: **Scrapy**.

To install it manually, copy and paste this command into your console: **pip install scrapy**
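
For the curious, here is a rough sketch of how a module could install a missing library on its own. It is an assumption about the general approach, not a copy of what *scrapy_engine.py* does; `pip_install` and `ensure_installed` are hypothetical helper names.

```python
import importlib.util
import subprocess
import sys


def pip_install(package):
    """Install a package with the pip that belongs to the running interpreter."""
    subprocess.check_call([sys.executable, "-m", "pip", "install", package])


def ensure_installed(module_name, package=None, installer=pip_install):
    """Install `package` only if `module_name` cannot be imported.

    Returns True when an installation was triggered, False otherwise.
    """
    if importlib.util.find_spec(module_name) is not None:
        return False  # already importable, nothing to do
    # Fall back to the module name when no separate package name is given.
    installer(package or module_name)
    return True
```

Calling `ensure_installed("scrapy")` would run the pip command only when Scrapy is not already importable. The `installer` parameter exists so the check can be tried out without touching the network.
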

You can visit the **Scrapy** documentation if you are curious about how it works.
