https://github.com/edsu/whrss
scrape White House Blog to generate RSS until it starts working again
- Host: GitHub
- URL: https://github.com/edsu/whrss
- Owner: edsu
- Created: 2017-01-23T17:17:02.000Z (over 9 years ago)
- Default Branch: master
- Last Pushed: 2021-01-21T18:04:41.000Z (over 5 years ago)
- Last Synced: 2025-03-31T12:58:14.564Z (about 1 year ago)
- Language: Python
- Homepage: https://inkdroid.org/rss/whitehouse.xml
- Size: 6.84 KB
- Stars: 9
- Watchers: 2
- Forks: 3
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
During the Trump administration the [White House Website] disabled the RSS feed for its WordPress site. For those four long years I ran this script (`whrss.py`) from cron every 30 minutes to scrape the White House website and create an RSS feed that was available at:
[https://inkdroid.org/rss/whitehouse.xml](https://inkdroid.org/rss/whitehouse.xml)
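The feed-building half of that job can be sketched roughly as follows. This is a minimal illustration and not the actual contents of `whrss.py`: the `build_rss` function, the post dictionary keys (`title`, `url`, `published`), and the example post are all assumptions, and the scraping step is left out on the assumption that the posts have already been collected.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone
from email.utils import format_datetime


def build_rss(title, link, description, posts):
    """Return an RSS 2.0 document as a string for already-scraped posts.

    Each post is a dict with the hypothetical keys "title", "url" and
    "published" (a timezone-aware datetime).
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = description

    for post in posts:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = post["title"]
        ET.SubElement(item, "link").text = post["url"]
        # The post URL doubles as the item's unique identifier.
        ET.SubElement(item, "guid").text = post["url"]
        # RSS 2.0 expects RFC 822 style dates.
        ET.SubElement(item, "pubDate").text = format_datetime(post["published"])

    return ET.tostring(rss, encoding="unicode")


# Made-up post standing in for whatever the scraper found.
example_posts = [
    {
        "title": "Example post",
        "url": "https://www.whitehouse.gov/example-post/",
        "published": datetime(2017, 1, 23, 12, 0, tzinfo=timezone.utc),
    }
]

feed_xml = build_rss(
    "White House Blog (scraped)",
    "https://www.whitehouse.gov/news/",
    "Unofficial feed generated by scraping",
    example_posts,
)
```

Writing `feed_xml` to a file served by the web server, and running the script from a crontab entry with a `*/30 * * * *` schedule, would give the 30-minute refresh described above. The exact command and output path are not stated here, so treat those details as assumptions.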
On January 20, 2021 whitehouse.gov began providing an RSS feed again, so the script was retired and the old RSS URL was permanently redirected to the new location:
[https://whitehouse.gov/feed/](https://whitehouse.gov/feed/)
A quick look at the Apache logs showed that the old feed was being requested about 1,500 times a day by close to 1,000 different clients (IP addresses).
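A tally like the one from the Apache logs could be produced with something along these lines. It is only a sketch: it assumes the logs use the common or combined log format, the `summarize_feed_hits` function is hypothetical, and the sample lines use documentation IP addresses and are not real log data.

```python
def summarize_feed_hits(log_lines, path="/rss/whitehouse.xml"):
    """Count requests for `path` and the distinct client IPs making them.

    Assumes each line looks like:
    IP - - [timestamp] "METHOD /path HTTP/1.1" status size
    """
    hits = 0
    clients = set()
    for line in log_lines:
        parts = line.split('"')
        if len(parts) < 3:
            continue  # no quoted request field, skip the line
        request = parts[1].split()
        if len(request) < 2 or request[1] != path:
            continue  # malformed request or a different URL
        hits += 1
        # The client address is the first whitespace-separated field.
        clients.add(line.split(maxsplit=1)[0])
    return hits, len(clients)


# Made-up sample lines for illustration only.
sample_log = [
    '203.0.113.5 - - [20/Jan/2021:10:00:00 +0000] "GET /rss/whitehouse.xml HTTP/1.1" 200 5120',
    '203.0.113.5 - - [20/Jan/2021:10:30:00 +0000] "GET /rss/whitehouse.xml HTTP/1.1" 304 0',
    '198.51.100.7 - - [20/Jan/2021:10:31:00 +0000] "GET /rss/whitehouse.xml HTTP/1.1" 200 5120',
    '198.51.100.7 - - [20/Jan/2021:10:32:00 +0000] "GET /index.html HTTP/1.1" 200 100',
]

hits, unique_clients = summarize_feed_hits(sample_log)
```

Counting distinct IP addresses only approximates the number of clients, since several readers can share one address and one reader can appear under several.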
The main impetus for doing this was to use [diffengine] to publish [whitehouse_diff]. But maybe you'll find the RSS useful for other things too?
[White House Website]: https://www.whitehouse.gov/news/
[diffengine]: https://github.com/docnow/diffengine
[whitehouse_diff]: https://twitter.com/whitehouse_diff