{"id":20234960,"url":"https://github.com/datamade/car-scraper","last_synced_at":"2025-07-12T04:39:04.457Z","repository":{"id":53302185,"uuid":"88547221","full_name":"datamade/car-scraper","owner":"datamade","description":"💲Make spreadsheets out of Chicago Association of REALTORS® reports","archived":false,"fork":false,"pushed_at":"2023-04-28T13:36:23.000Z","size":71,"stargazers_count":3,"open_issues_count":2,"forks_count":2,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-04-04T15:50:46.463Z","etag":null,"topics":["makefile","pdf-converter","web-scraping"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/datamade.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2017-04-17T20:21:31.000Z","updated_at":"2024-12-26T06:18:45.000Z","dependencies_parsed_at":"2022-09-02T03:23:00.472Z","dependency_job_id":null,"html_url":"https://github.com/datamade/car-scraper","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/datamade/car-scraper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datamade%2Fcar-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datamade%2Fcar-scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datamade%2Fcar-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datamade%2Fcar-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/datamade","download_url":"https://codeload.github.com/datamade/car-scraper/tar.gz/refs/
heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/datamade%2Fcar-scraper/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":264939777,"owners_count":23686216,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["makefile","pdf-converter","web-scraping"],"created_at":"2024-11-14T08:13:56.178Z","updated_at":"2025-07-12T04:39:04.440Z","avatar_url":"https://github.com/datamade.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CAR Scraper\n\nGrab Chicagoland real estate reports from the CAR website and convert them all to spreadsheets.\n\n### Requirements\n\nMake sure you have OS-level requirements installed:\n\n* Python 3.3+ (standard DataMade tool)\n* Java (or any JRE)\n* pdfinfo (built-in on Ubuntu, available for other Linux distros as part of [Xpdf](http://www.foolabs.com/xpdf/download.html) - mac users can also use the [Poppler](https://poppler.freedesktop.org/) fork via homebrew: `brew install poppler`)\n\nThen, make a virtualenv and install Python requirements:\n\n```\nmkvirtualenv car-scraper\npip install -U -r requirements.txt\n```\n\nFinally, build tabula-java 0.9.1 from source:\n\n```\nmake tabula-java\n```\n\n### Running the scraper\n\nYou'll need to decrypt the CAR login credentials before you can scrape the PDFs.\nIf you're on the keyring for this repo, you can decrypt the secrets file:\n\n```\nblackbox_cat configs/secrets.py.gpg \u003e scripts/secrets.py\n```\n\nOtherwise, copy over the example secrets file:\n\n```\ncp configs/secrets.example.py 
scripts/secrets.py\n```\n\nThen, adjust the variables to reflect your CAR username and password:\n\n```\nCAR_USER = '\u003cyour_username\u003e'\nCAR_PASS = '\u003cyour_password\u003e'\n```\n\nSet the desired month and year for the reports in `config.mk`:\n\n```bash\n# follow this format:\nyear = 2016\nmonth = 02\n```\n\nUse the DataMade Make standard operating procedure to get your files. `make all` produces the final output for the year/month you selected, and `make clean` removes all generated files from your repo.\n\n### Output\n\nOutput files land in the `final/` directory. Files with `monthly` in the name catalogue month-over-month statistics, while files with `yearly` in the name catalogue year-to-date totals.\n\nIf you're interested in year-end statistics, just run the scraper for December of a given year (`month = 12`) and grab the `yearly` files. **These are the files we use in Where to Buy.**\n\n### Errors\n\nIn the process of cleaning the CSVs, the scraper will double-check to make sure that table values look plausible. It will print any errors it finds to the console while making the target `cleaned_csvs`, but you can also examine the output file `conversion_errors.csv` if you want to inspect further. Error messages look something like this:\n\n```\nPercentage error in raw/csvs/suburbs/clean/DuPage_County_4.csv\nCommunity: Carol Stream\nColumn: months_supply_change\nRow value: -35.8\nCalculated delta: -34.5\n(Note: calculated deltas should be within +-1 of the row value.)\n```\n\nCAR often slightly miscalculates changes in values between years, as you can see above. 
This is the most frequent error I've encountered, and you can safely ignore it as long as the delta is within a reasonable range.\n\n### Team\n\n* Jean Cochrane - code\n* Forest Gregg - mentorship\n\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatamade%2Fcar-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fdatamade%2Fcar-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fdatamade%2Fcar-scraper/lists"}