https://github.com/rmax/rho-blogs-crawler
A Scrapy project to export my legacy blogs
- Host: GitHub
- URL: https://github.com/rmax/rho-blogs-crawler
- Owner: rmax
- License: bsd-3-clause
- Created: 2010-07-04T19:36:21.000Z (almost 16 years ago)
- Default Branch: master
- Last Pushed: 2010-07-05T01:12:57.000Z (almost 16 years ago)
- Last Synced: 2025-05-15T09:14:58.001Z (about 1 year ago)
- Language: Python
- Homepage:
- Size: 94.7 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.rst
- License: LICENSE
README
===================
Rho's Blogs Crawler
===================
:Author: Rolando Espinoza La fuente
About
=====
This is a Scrapy project to export my legacy blogs:
- http://ajayu.memi.umss.edu.bo/rho/weblog/
- http://www.softwarelibre.org.bo/rolando/weblog/
It also aims to be an example of using Scrapy's CrawlSpider,
Item Loaders, Processors and Pipelines.
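As a rough illustration of the pipeline idea mentioned above, here is a minimal sketch that is not taken from this repository. It relies only on the long-standing Scrapy convention that an item pipeline is a plain class with a ``process_item(item, spider)`` method. The class name ``CleanPostPipeline``, the helper functions, and the ``title``/``body`` field names are all assumptions made for the example, and a plain dict stands in for a Scrapy Item so the snippet runs without Scrapy installed.

```python
import re


def strip_tags(value):
    """Remove anything that looks like an HTML tag (a deliberately naive regex)."""
    return re.sub(r"<[^>]+>", "", value)


def collapse_whitespace(value):
    """Trim the value and squeeze runs of whitespace into single spaces."""
    return " ".join(value.split())


class CleanPostPipeline:
    """Hypothetical pipeline that normalises the text fields of a blog post.

    Scrapy calls process_item(item, spider) on each enabled pipeline and
    hands the returned item to the next one.
    """

    # Assumed field names; the real project's item fields may differ.
    text_fields = ("title", "body")

    def process_item(self, item, spider):
        for field in self.text_fields:
            if field in item:
                item[field] = collapse_whitespace(strip_tags(item[field]))
        return item


# Stand-alone usage: a plain dict in place of a Scrapy Item, no spider needed.
post = {"title": "  <h1>Hello   world</h1> ", "body": "<p>First\n post</p>"}
cleaned = CleanPostPipeline().process_item(post, spider=None)
```

In a real Scrapy project a class like this would be enabled through the ``ITEM_PIPELINES`` setting; the two helper functions are ordinary callables, which is also the shape Scrapy expects of Item Loader processors.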
Requirements
============
- Scrapy
- lxml