https://github.com/ranman/angellist-scraper
- Host: GitHub
- URL: https://github.com/ranman/angellist-scraper
- Owner: ranman
- Created: 2013-09-30T15:27:58.000Z (over 12 years ago)
- Default Branch: master
- Last Pushed: 2014-03-24T19:33:50.000Z (about 12 years ago)
- Last Synced: 2025-08-03T23:37:28.485Z (11 months ago)
- Language: Python
- Size: 176 KB
- Stars: 6
- Watchers: 0
- Forks: 11
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
## Synopsis
Put AngelList startups into MongoDB using an Iron.io queue and Python.
## Code Example
nope.
## Motivation
Sometimes Randall programs things and I try to fix them.
## Installation
Set the following environment variables:
* MONGO_URL - URL of your MongoDB instance
* MONGO_USER - user for the MongoDB instance
* MONGO_PASSWORD - password for the above user
* SCHEMA_NAME - name of the MongoDB schema
* IRON_PROJECT_ID - IronMQ project ID
* IRON_TOKEN - IronMQ project token
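If it helps to fail fast when one of these is missing, a check along the following lines could run before anything else. This is only a sketch: the variable names come from the list above, but `REQUIRED_VARS` and `load_settings` are hypothetical names and are not part of this repo.

```python
import os

# Names taken from the list above; the helper itself is illustrative only.
REQUIRED_VARS = (
    "MONGO_URL",
    "MONGO_USER",
    "MONGO_PASSWORD",
    "SCHEMA_NAME",
    "IRON_PROJECT_ID",
    "IRON_TOKEN",
)


def load_settings(environ=os.environ):
    """Return the required settings as a dict; raise if any are unset or empty."""
    missing = [name for name in REQUIRED_VARS if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in REQUIRED_VARS}
```

Calling `load_settings()` with no arguments reads from the real environment; passing a plain dict makes it easy to try out without touching your shell.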
Set up the Python environment:
```shell
virtualenv venv
. venv/bin/activate
pip install -r requirements.txt
```
Then do this:
```shell
$ gem install iron_worker_ng
$ iron_worker upload startup.worker
$ python pyq.py
$ repeat 500 python enqueue.py
```
That last line starts up 500 workers, each of which will keep asking for 1000 IDs at a time for the next few hours.
Apparently you don't have to do this anymore; you can just do:
```shell
$ repeat 500 iron_worker queue startup.worker --timeout 3600
```
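To give a rough idea of what "1000 IDs at a time" could look like, here is a minimal sketch of splitting an ID range into batches. It assumes the IDs are consecutive integers, which may not match how the workers in this repo actually pick them; `id_batches` and its parameters are hypothetical names used only for illustration.

```python
def id_batches(first_id, last_id, batch_size=1000):
    """Yield inclusive (start, end) ID ranges covering first_id..last_id.

    Assumption: IDs are consecutive integers. The last batch may be shorter.
    """
    start = first_id
    while start <= last_id:
        end = min(start + batch_size - 1, last_id)
        yield (start, end)
        start = end + 1
```

For example, `list(id_batches(1, 2500))` gives `[(1, 1000), (1001, 2000), (2001, 2500)]`, so each worker request would map to one of those ranges.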
## Contributors
I welcome you.
## License
I have no idea what licensing is but feel free to use this code without any warranty or liability.