Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/dav009/congresovisible
Data dumps of Colombian Senate votes
colombia open-data scraper
Last synced: about 2 months ago
- Host: GitHub
- URL: https://github.com/dav009/congresovisible
- Owner: dav009
- Created: 2014-11-05T23:47:38.000Z (about 10 years ago)
- Default Branch: master
- Last Pushed: 2014-11-15T20:48:10.000Z (about 10 years ago)
- Last Synced: 2024-10-25T06:49:21.553Z (3 months ago)
- Topics: colombia, open-data, scraper
- Language: Python
- Homepage:
- Size: 5.62 MB
- Stars: 3
- Watchers: 4
- Forks: 2
- Open Issues: 1
Metadata Files:
- Readme: README.md
# Dumps congresovisible.org
[Congresovisible.org](http://www.congresovisible.org) is a great project that provides information about:

- Colombian bills (proyectos de ley)
- How those bills were voted on
- Votes cast by Senators and Congressmen

Sadly, they don't provide an API for this valuable information, so this repo provides:

- code to scrape their website and extract that information
- data dumps (in JSON format)

## How is the data structured
### Json Dump
Each line of the JSON dump is a JSON dictionary representing one voting event. Every event contains the following data:
```json
[
  {
    "camara": "Cámara de Representantes",
    "estado": "aprobado",
    "id": 3014,
    "ano": "2014",
    "mes_dia": "Sep 03",
    "desacuerdo": "1%",
    "comisiones": "",
    "acuerdo": "99%",
    "procedimiento": "Descripcion proyecto de ley",
    "detailed": {
      "Álvaro Uribe": {"party": "Centro Democratico", "vote": "Aprobado"},
      ...
    }
  }
]
```

- `camara`: the legislative chamber that voted
- `id`: Congresovisible.org database identifier
- `ano`: year in which the vote took place
- `mes_dia`: month and day on which the vote took place
- `detailed`: dictionary with politicians' names as keys and, as values, a JSON object describing each politician's party and vote

Each line of the file should be a parsable JSON object.
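
As a rough illustration of the one-object-per-line layout described above, the following Python sketch parses such lines and tallies the votes in each event. The helper names `iter_events` and `tally_votes` are hypothetical and not part of this repo, and the sample line is modelled on the example record rather than taken from a real dump.

```python
import json
from collections import Counter


def iter_events(lines):
    """Yield one dict per non-empty line of a line-delimited JSON dump."""
    for line in lines:
        line = line.strip()
        if line:
            yield json.loads(line)


def tally_votes(event):
    """Count how many politicians cast each vote value in one voting event."""
    return Counter(info["vote"] for info in event["detailed"].values())


# Sample line modelled on the README example (not real dump content).
sample_lines = [
    json.dumps({
        "camara": "Cámara de Representantes",
        "estado": "aprobado",
        "id": 3014,
        "ano": "2014",
        "mes_dia": "Sep 03",
        "detailed": {
            "Álvaro Uribe": {"party": "Centro Democratico", "vote": "Aprobado"},
        },
    }),
]

for event in iter_events(sample_lines):
    print(event["id"], event["camara"], dict(tally_votes(event)))
```

With a real dump, an open file object can be passed in place of `sample_lines`, since iterating over a text file yields its lines; the file name depends on which dump you pick from the `dumps` folder.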
### TSV Data
The TSV data is split into two files:

- `votes.tsv`: contains the votes of politicians in sessions; each session is an identifier referencing a session description in `sessions.tsv`
- `sessions.tsv`: contains a session description, date, and legislature.

# How to use it?
- If you just want to use the data, clone this repo, go to the `dumps` folder, and pick your file ^^.
- If you want to generate a new dump:
  1. Create a virtualenv with Python 3.4
  2. `pip install -r requirements.txt`
  3. `python main.py`

## Examples
### Clustering Senators
![](https://d262ilb51hltx0.cloudfront.net/max/2000/1*EMhjnbqtFA5Qjf8wBWo54w.png)
To run `clustering.r`:

- Set your working folder to the clustering sample:

```
setwd("path...to..repo/congresovisible/samples/senators_clustering/")
```

- Run the clustering with `source("clustering.r")`.
- Note: please install the needed R packages first.
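
This is not a port of `clustering.r`. As a hedged Python sketch of the kind of similarity measure a clustering step could start from, the snippet below computes how often two politicians cast the same vote across the events they both took part in. It assumes events shaped like the JSON dump's `detailed` field. The function name `agreement`, the politician names, and the vote value `"Rechazado"` are made up for illustration.

```python
def agreement(events, name_a, name_b):
    """Fraction of shared events in which both politicians cast the same vote.

    Returns None when the two never voted in the same event.
    """
    shared = 0
    same = 0
    for event in events:
        votes = event["detailed"]
        if name_a in votes and name_b in votes:
            shared += 1
            if votes[name_a]["vote"] == votes[name_b]["vote"]:
                same += 1
    return same / shared if shared else None


# Toy events with made-up names and vote values (assumptions, not real data).
toy_events = [
    {"id": 1, "detailed": {
        "A": {"party": "X", "vote": "Aprobado"},
        "B": {"party": "Y", "vote": "Aprobado"},
    }},
    {"id": 2, "detailed": {
        "A": {"party": "X", "vote": "Aprobado"},
        "B": {"party": "Y", "vote": "Rechazado"},
        "C": {"party": "X", "vote": "Aprobado"},
    }},
]

print(agreement(toy_events, "A", "B"))
print(agreement(toy_events, "A", "C"))
```

Pairwise scores like these can be arranged into a similarity matrix and handed to a clustering routine of your choice.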
## Contact