{"id":17790426,"url":"https://github.com/briatte/swd","last_synced_at":"2025-04-02T01:18:51.828Z","repository":{"id":144778545,"uuid":"158514679","full_name":"briatte/swd","owner":"briatte","description":"Two-day workshop on scraping legislative data, organised by URFIST Bordeaux in 2018.","archived":false,"fork":false,"pushed_at":"2018-11-21T08:28:41.000Z","size":996,"stargazers_count":3,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-02-07T16:34:27.871Z","etag":null,"topics":["legislative-bill-analysis","legislative-data","r","web-scraping"],"latest_commit_sha":null,"homepage":"https://politbistro.hypotheses.org/6828","language":"R","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/briatte.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-11-21T08:23:32.000Z","updated_at":"2023-04-20T19:19:37.000Z","dependencies_parsed_at":null,"dependency_job_id":"16a07190-4d56-42a4-b057-c6f2654e9533","html_url":"https://github.com/briatte/swd","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/briatte%2Fswd","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/briatte%2Fswd/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/briatte%2Fswd/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/briatte%2Fswd/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/briatte",
"download_url":"https://codeload.github.com/briatte/swd/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":246735325,"owners_count":20825223,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["legislative-bill-analysis","legislative-data","r","web-scraping"],"created_at":"2024-10-27T10:43:38.475Z","updated_at":"2025-04-02T01:18:51.812Z","avatar_url":"https://github.com/briatte.png","language":"R","funding_links":[],"categories":[],"sub_categories":[],"readme":"R code examples to teach basic Web scraping with [`rvest`](https://cran.r-project.org/package=rvest) and related packages.\n\nUsed at a [two-day workshop][a] in November 2018: refer to the [introductory slides][b], in French, for details.\n\n[a]: https://sygefor.reseau-urfist.fr/#/training/7432/8123/?from=true\n[b]: https://frama.link/urfist-2018-slides\n\n\u003e Please report any __bugs or errors__ in the [issues](/briatte/urfist2018) of this repository, or [email me](mailto:f.briatte@gmail.com).\n\n# DEMOS\n\n1. `lagasafn` · legal cross-references in [Icelandic law](http://www.althingi.is/lagasafn/zip-skra-af-lagasafni/)\n2. `jorf` · XML field extraction from the [French _Official Journal_](https://echanges.dila.gouv.fr/OPENDATA/JORFSIMPLE/)\n3. `cop21` · word extraction from the [UNFCCC Paris Agreement](https://unfccc.int/resource/docs/2015/cop21/eng/l09r01.pdf)\n4. 
`qosd` · keyword co-occurrence in [French parliamentary questions](http://questions.assemblee-nationale.fr/)\n\nProjects mentioned but not included in the repository:\n\n- [`marsad`](https://github.com/briatte/marsad) · voting behaviour in the [Tunisian parliament](https://majles.marsad.tn/fr/assemblee)\n- [`parlnet`](https://github.com/briatte/parlnet) · bill cosponsorship in European parliaments\n- [`parlviz`](https://github.com/briatte/parlviz) · interactive visualizations of the above\n\nSlides shown but not included in the repository (available on request):\n\n- \"Large-scale legislative data collection from online sources\" (2016)\n- \"_Web scraping_ et APIs avec R\" (2017)\n\n# HOWTO\n\n1. Run the [`dependencies.r`](dependencies.r) script to install all required packages.\n2. Run each code folder separately. Each has its own `.Rproj` file.\n\n# THANKS\n\n- Sabrina Granger and Isabelle Scarpat-Bouvet for excellent logistics.\n- Thomas J. Leeper for his [`word_count` function](https://gist.github.com/leeper/0d0c1ee2c671e03db21bbc45acf6b351), used in the `cop21` example.\n- Emiliano Grossman for inspiring the `qosd` example.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbriatte%2Fswd","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fbriatte%2Fswd","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fbriatte%2Fswd/lists"}