Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/basemax/jadi-net-blog
This Python script is used to extract posts from a WordPress blog (https://jadi.net/) and save them in HTML format. The script fetches the RSS feed, parses the posts, and saves each post as an individual HTML file.
- Host: GitHub
- URL: https://github.com/basemax/jadi-net-blog
- Owner: BaseMax
- License: MIT
- Created: 2024-12-01T20:34:00.000Z (about 1 month ago)
- Default Branch: main
- Last Pushed: 2024-12-17T00:08:10.000Z (23 days ago)
- Last Synced: 2024-12-17T01:26:54.559Z (23 days ago)
- Topics: blog-copier, copier, crawler, crawler-python, crawlers, jadi-blog, jadi-clone, jadi-net-blog, jadi-net-clone, jadinet-blog, py, python, python-crawler, wordpress, wp
- Language: HTML
- Homepage: https://basemax.github.io/jadi-net-blog
- Size: 5.11 MB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Jadi Net Blog Extractor
This Python script extracts posts from the WordPress blog at https://jadi.net/ and saves them in HTML format. It fetches the RSS feed, parses the posts, and saves each post as an individual HTML file.
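
The fetch, parse, and save steps described above can be sketched with only the Python standard library. This is a minimal illustration, not the repository's actual `extract_posts.py`: the feed URL (`https://jadi.net/feed/`, the usual WordPress default), the function names `fetch_feed`, `parse_posts`, `slugify`, and `save_posts`, and the file-naming scheme are all assumptions made for the example.

```python
import html
import re
import urllib.request
import xml.etree.ElementTree as ET
from pathlib import Path

# Assumption: WordPress usually serves its RSS feed at /feed/.
FEED_URL = "https://jadi.net/feed/"
# Namespace of the <content:encoded> element that holds the full post body.
NS = {"content": "http://purl.org/rss/1.0/modules/content/"}


def fetch_feed(url=FEED_URL):
    """Download the raw RSS XML as bytes."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def parse_posts(rss_xml):
    """Return a list of (title, link, html_body) tuples from RSS 2.0 XML."""
    root = ET.fromstring(rss_xml)
    posts = []
    for item in root.iter("item"):
        title = item.findtext("title", default="untitled")
        link = item.findtext("link", default="")
        # Prefer the full body; fall back to the short description.
        body = item.findtext("content:encoded", namespaces=NS)
        if body is None:
            body = item.findtext("description", default="")
        posts.append((title, link, body))
    return posts


def slugify(title):
    """Turn a title into a file-name stem, keeping Unicode letters and digits."""
    slug = re.sub(r"[^\w]+", "-", title).strip("-").lower()
    return slug or "post"


def save_posts(posts, out_dir="."):
    """Write each post to <out_dir>/<slug>.html and return the written paths."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for title, link, body in posts:
        page = (
            '<!DOCTYPE html>\n<html><head><meta charset="utf-8">'
            f"<title>{html.escape(title)}</title></head>\n"
            f"<body><h1>{html.escape(title)}</h1>\n{body}\n"
            f'<p><a href="{html.escape(link)}">Original post</a></p>'
            "</body></html>\n"
        )
        path = out / f"{slugify(title)}.html"
        path.write_text(page, encoding="utf-8")
        paths.append(path)
    return paths
```

Under these assumptions, calling `save_posts(parse_posts(fetch_feed()))` would write one HTML file per feed item into the current directory. Posts whose titles produce the same slug would overwrite each other in this sketch, so a real script would need a way to disambiguate file names.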
## Requirements
Before running the script, install the dependencies listed in `requirements.txt`:

```
pip install -r requirements.txt
```

## Why This Project?
The website **jadi.net** is blocked in Iran, so users there can only reach it through a VPN. To address this, I created this GitHub repository to automatically fetch the blog posts from **jadi.net** and save them here.
This repository serves as a **clone blog** of **jadi.net**, where all blog posts are stored in an unfiltered, easily accessible form. You can read the content directly on GitHub, which is not blocked in Iran.
## How to Use
1. Clone this repository:

   ```
   git clone https://github.com/BaseMax/jadi-net-blog.git
   cd jadi-net-blog
   ```

2. Run the script:

   ```
   python extract_posts.py
   ```

The script will fetch posts from the RSS feed and save them as HTML files in the current directory.
## License
MIT License
© 2024 Max Base