https://github.com/niftycode/fetch_hackernews
A simple program to fetch Hackernews from news.ycombinator.com written in Python.
beautifulsoup4 hackernews python requests webscraping
- Host: GitHub
- URL: https://github.com/niftycode/fetch_hackernews
- Owner: niftycode
- License: mit
- Created: 2022-01-29T11:36:20.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2025-03-04T09:44:38.000Z (about 1 year ago)
- Last Synced: 2025-05-19T00:49:57.871Z (11 months ago)
- Topics: beautifulsoup4, hackernews, python, requests, webscraping
- Language: Python
- Homepage:
- Size: 70.3 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- Changelog: Changelog.md
- Contributing: CONTRIBUTING.md
- License: LICENSE
README
# fetch-hackernews






A simple program, written in Python, that fetches [Hacker News](https://news.ycombinator.com) stories from *news.ycombinator.com*.
I know there are already some similar projects available on [PyPI](https://pypi.org), but I said to myself, why not add one more app :wink: It gave me the opportunity to finally dig into web scraping (with [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/bs4/doc/)).
## Operating System
macOS, Linux and Windows
## Requirements
* Python >= 3.12
* `requests`
* `beautifulsoup4`
## Install
    $ pip3 install fetch_hackernews
## Usage
Start the program with the following command:

    $ fetch_hackernews

This shows the 30 most recent news items. The output looks similar to this:

    Found no local index.html file.
    Fetch data from https://news.ycombinator.com…
    ##############################
    #                            #
    #      Fetch Hacker News     #
    #       Version: 1.0.7       #
    #                            #
    ##############################
    1 - Red Light Green Light
    Link: https://jamessevedge.com/articles/red-light-green-light/
    2 - You can now send replies from your Duck Addresses
    Link: https://duckduckgo.com/email/faq
    …

The first time the program runs, it creates a local `index.html` file. This reduces the number of requests sent to the [news.ycombinator.com](https://news.ycombinator.com) server.
All news is then read from this local file. On startup, the program looks for `index.html`; if no such file exists yet, it downloads the page (using `requests`) and saves it. After that, the saved content is parsed using `BeautifulSoup`.
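
To give an idea of the parsing step, here is a minimal sketch. It is not the code shipped in this package: `parse_stories` is a hypothetical helper, and the `span.titleline > a` selector and the sample markup are assumptions about what the Hacker News HTML looks like.

```python
from bs4 import BeautifulSoup

# Tiny stand-in for a saved index.html. The markup is an assumption
# modelled on the Hacker News page, not a verbatim copy of it.
SAMPLE_HTML = """
<table>
  <tr class="athing"><td class="title">
    <span class="titleline"><a href="https://example.com/a">First story</a></span>
  </td></tr>
  <tr class="athing"><td class="title">
    <span class="titleline"><a href="https://example.com/b">Second story</a></span>
  </td></tr>
</table>
"""


def parse_stories(html: str) -> list[tuple[str, str]]:
    """Return (title, link) pairs found in the HTML (hypothetical helper)."""
    soup = BeautifulSoup(html, "html.parser")
    stories = []
    # Assumed selector: each story title is a link inside span.titleline.
    for anchor in soup.select("span.titleline > a"):
        stories.append((anchor.get_text(strip=True), anchor.get("href", "")))
    return stories


if __name__ == "__main__":
    # Print the stories in the same shape as the program output shown above.
    for number, (title, link) in enumerate(parse_stories(SAMPLE_HTML), start=1):
        print(f"{number} - {title}")
        print(f"Link: {link}")
```

If the site changes its markup, only the selector needs to be adjusted.
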
By default, the `index.html` file is only updated every six hours.
The `index.html` file is stored in the following directory (macOS):

    ~/.config/hackernews

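
The caching behaviour described above (use the local file, download it if it is missing, refresh it after six hours) could be sketched roughly as follows. This is an illustration, not the package's actual code: the function names are hypothetical, and judging the file's age by its modification time is an assumption.

```python
import time
from pathlib import Path
from typing import Callable

# Directory named in this README (macOS); other platforms may differ.
CACHE_FILE = Path.home() / ".config" / "hackernews" / "index.html"
MAX_AGE_SECONDS = 6 * 60 * 60  # six hours, the default update interval
URL = "https://news.ycombinator.com"


def download(url: str) -> str:
    """Fetch the page with requests (imported lazily, only when needed)."""
    import requests

    response = requests.get(url, timeout=10)
    response.raise_for_status()
    return response.text


def is_stale(cache_file: Path, max_age: float = MAX_AGE_SECONDS) -> bool:
    """Return True if the cache file is missing or older than max_age seconds."""
    if not cache_file.exists():
        return True
    # Assumption: the file's modification time marks the last download.
    return time.time() - cache_file.stat().st_mtime > max_age


def load_page(cache_file: Path = CACHE_FILE,
              fetch: Callable[[str], str] = download) -> str:
    """Return the cached HTML, refreshing it first when it is stale."""
    if is_stale(cache_file):
        cache_file.parent.mkdir(parents=True, exist_ok=True)
        cache_file.write_text(fetch(URL), encoding="utf-8")
    return cache_file.read_text(encoding="utf-8")
```

Passing the download function as a parameter keeps the cache logic testable without touching the network.
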
## Changelog
See [Changelog.md](https://github.com/niftycode/fetch_hackernews/blob/main/Changelog.md).