Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/Leagify/colly-draft-prospects
Source code for web scraping football draft prospects
- Host: GitHub
- URL: https://github.com/Leagify/colly-draft-prospects
- Owner: Leagify
- Created: 2018-10-18T02:16:31.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2020-10-14T05:38:10.000Z (over 4 years ago)
- Last Synced: 2024-08-03T01:14:42.453Z (6 months ago)
- Topics: colly, golang
- Language: Go
- Size: 13.9 MB
- Stars: 9
- Watchers: 2
- Forks: 5
- Open Issues: 3
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- awesome-rainmana - Leagify/colly-draft-prospects - Source code for web scraping football draft prospects (Go)
README
# colly-draft-prospects
Source code for web scraping football draft prospects from the DraftTek NFL Big Board. The scraper is written in Golang and uses the [Colly](http://go-colly.org/) scraping framework. The binary file in the repo is compiled for Linux, but the source can be compiled for a different operating system if needed.
Once the ranks have been scraped, I use [csvkit](https://csvkit.readthedocs.io) to merge all of the files and join them together with information about the locations of the schools. The csvkit commands are in the csvkitcommands.txt files.
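For readers who want to see what the join step amounts to, here is a minimal sketch in Go using only the standard library. It is not the csvkit workflow from this repo (those commands are in csvkitcommands.txt and are not reproduced here). It is a stand-in that does a left join of a ranks table against a school-locations table. The column names (`Rank`, `Player`, `School`, `City`, `State`) and the sample rows are hypothetical placeholders, not the real file layout.

```go
package main

import (
	"encoding/csv"
	"fmt"
	"os"
	"strings"
)

// columnIndex returns the position of name in header, or -1 if it is absent.
func columnIndex(header []string, name string) int {
	for i, h := range header {
		if h == name {
			return i
		}
	}
	return -1
}

// leftJoin keeps every row of left and appends the matching columns from
// right, matched on the column called key. Left rows without a match get
// empty cells. Both tables must start with a header row and have rows of
// uniform length (csv.Reader.ReadAll enforces this by default). If a key
// appears more than once in right, the last occurrence wins.
func leftJoin(left, right [][]string, key string) ([][]string, error) {
	if len(left) == 0 || len(right) == 0 {
		return nil, fmt.Errorf("both tables need a header row")
	}
	li, ri := columnIndex(left[0], key), columnIndex(right[0], key)
	if li < 0 || ri < 0 {
		return nil, fmt.Errorf("column %q missing from one of the tables", key)
	}

	// dropKey copies a right-hand row without its key column, so the key
	// is not repeated in the joined output.
	dropKey := func(row []string) []string {
		out := make([]string, 0, len(row)-1)
		out = append(out, row[:ri]...)
		return append(out, row[ri+1:]...)
	}

	// Index the right-hand rows by key for constant-time lookups.
	lookup := make(map[string][]string, len(right))
	for _, row := range right[1:] {
		lookup[row[ri]] = dropKey(row)
	}

	extraHeader := dropKey(right[0])
	blank := make([]string, len(extraHeader))

	out := [][]string{append(append([]string{}, left[0]...), extraHeader...)}
	for _, row := range left[1:] {
		extra, ok := lookup[row[li]]
		if !ok {
			extra = blank
		}
		out = append(out, append(append([]string{}, row...), extra...))
	}
	return out, nil
}

func main() {
	// Hypothetical sample data standing in for the scraped ranks and the
	// school-location file.
	ranks := "Rank,Player,School\n1,Player A,Example State\n2,Player B,Sample Tech\n"
	schools := "School,City,State\nExample State,Springfield,IL\n"

	left, err := csv.NewReader(strings.NewReader(ranks)).ReadAll()
	if err != nil {
		panic(err)
	}
	right, err := csv.NewReader(strings.NewReader(schools)).ReadAll()
	if err != nil {
		panic(err)
	}

	joined, err := leftJoin(left, right, "School")
	if err != nil {
		panic(err)
	}

	// WriteAll flushes the writer before returning.
	if err := csv.NewWriter(os.Stdout).WriteAll(joined); err != nil {
		panic(err)
	}
}
```

A left join is used here so that a prospect whose school is missing from the location file still appears in the output with blank location cells, which makes gaps easy to spot during later cleaning.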
Once the ranks have been assembled, I use [OpenRefine](http://openrefine.org/) to clean the data for consistency. The data cleaning steps are contained in openRefineDataMerge.json.