Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/theobrigitte/octocat
Content scraper for https://budgetparticipatif.paris.fr/
- Host: GitHub
- URL: https://github.com/theobrigitte/octocat
- Owner: TheoBrigitte
- Created: 2018-04-26T20:04:06.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2019-05-07T18:04:07.000Z (over 5 years ago)
- Last Synced: 2024-10-10T19:39:38.804Z (28 days ago)
- Language: Go
- Homepage:
- Size: 1.48 MB
- Stars: 0
- Watchers: 2
- Forks: 1
- Open Issues: 1
Metadata Files:
- Readme: README.md
README
[![GoDoc](https://godoc.org/github.com/TheoBrigitte/octocat?status.svg)](https://godoc.org/github.com/TheoBrigitte/octocat)
[![Go Report Card](https://goreportcard.com/badge/github.com/TheoBrigitte/octocat)](https://goreportcard.com/report/github.com/TheoBrigitte/octocat)

# Octocat
Octocat is a content scraper for [budgetparticipatif.paris.fr](https://budgetparticipatif.paris.fr/).
It fetches information on every project found on the [search page](https://budgetparticipatif.paris.fr/bp/jsp/site/Portal.jsp?page=search-solr&conf=list_idees) and stores it as JSON or CSV.

It uses:
* [goquery](https://github.com/PuerkitoBio/goquery) for HTML parsing.
* [logrus](https://github.com/sirupsen/logrus) for logging.

[https://godoc.org/github.com/TheoBrigitte/octocat/pkg](https://godoc.org/github.com/TheoBrigitte/octocat/pkg)
### Scraped data
For every project found, it scrapes the following data:
* Title
* URL
* Description
* Localisation
* Theme
* Like
* Follower
* IsPopular
* Year
* Status
* Cost
* Author
* Creation date
* Creation author
* Attachments
* Comments

For a visual description of these elements on a project page, see [example.png](https://github.com/TheoBrigitte/octocat/project-example.png).
### Download
`go get -v github.com/TheoBrigitte/octocat/cmd/scrape`
### Run
`scrape -o result.json`
### Dependencies
Dependencies are managed using [dep](https://github.com/golang/dep).