https://github.com/scrapinghub/portia

Visual scraping for Scrapy



Host: GitHub
URL: https://github.com/scrapinghub/portia
Owner: scrapinghub
License: bsd-3-clause
Created: 2014-03-21T14:24:31.000Z (over 11 years ago)
Default Branch: master
Last Pushed: 2024-06-26T19:43:46.000Z (12 months ago)
Last Synced: 2024-10-29T10:55:05.325Z (8 months ago)
Language: Python
Size: 24.4 MB
Stars: 9,296
Watchers: 504
Forks: 1,408
Open Issues: 130
Metadata Files:
- Readme: README.md
- Changelog: CHANGES
- License: LICENSE

Awesome Lists containing this project

my-awesome-starred - portia - Visual scraping for Scrapy (JavaScript)
my-awesome-github-stars - scrapinghub/portia - Visual scraping for Scrapy (Python)
awesome-scrapy - Portia
starred-awesome - portia - Visual scraping for Scrapy (Python)
awesome-python-resources - GitHub - 24% open · ⏱️ 10.07.2019): (HTML processing)

README

Portia
======

Portia is a tool that lets you visually scrape websites without any programming knowledge. With Portia you can annotate a web page to identify the data you wish to extract, and Portia uses these annotations to work out how to scrape data from similar pages.

# Running Portia

The easiest way to run Portia is with [Docker]. Using the official Portia image, run:

    docker run -v ~/portia_projects:/app/data/projects:rw -p 9001:9001 scrapinghub/portia
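
The command above mounts a host directory into the container (`-v host:container:rw`) and publishes container port 9001 on the host (`-p host:container`). If you want a different projects directory or host port, the following is a minimal sketch of assembling the same command from Python. The function name `build_portia_command` and its parameters are hypothetical helpers for illustration, not part of Portia; only the image name, container path, and container port come from the command above.

```python
import os
import shlex

# Values taken from the documented command.
PORTIA_IMAGE = "scrapinghub/portia"
CONTAINER_PROJECTS_DIR = "/app/data/projects"
CONTAINER_PORT = 9001


def build_portia_command(projects_dir="~/portia_projects", host_port=9001):
    """Return the `docker run` argument list for Portia (hypothetical helper).

    projects_dir: host directory where projects are stored (assumed name).
    host_port: host port mapped to the container's port 9001 (assumed name).
    """
    # Expand "~" ourselves because no shell is involved when passing a list.
    host_dir = os.path.expanduser(projects_dir)
    return [
        "docker", "run",
        "-v", f"{host_dir}:{CONTAINER_PROJECTS_DIR}:rw",  # read-write bind mount
        "-p", f"{host_port}:{CONTAINER_PORT}",            # host:container port
        PORTIA_IMAGE,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so it can be inspected first.
    print(shlex.join(build_portia_command()))
```

The returned list can be passed to `subprocess.run` once you are happy with the printed command.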

You can also set up a local instance with [Docker-compose] by cloning this repo and running the following from the repository root:

    docker-compose up

For more detailed instructions, and alternatives to using Docker, see the [Installation] docs.

# Documentation

Documentation is available on [Read the docs]. Source files are in the ``docs`` directory.

[Docker]: https://www.docker.com/
[Docker-compose]: https://docs.docker.com/compose
[Installation]: http://portia.readthedocs.org/en/latest/installation.html
[Read the docs]: http://portia.readthedocs.org/en/latest/index.html
[Scrapinghub]: https://portia.scrapinghub.com/
