Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/clhenrick/nyc_property_tax_docs
Scraping data from NYC dept of finance tax documents
- Host: GitHub
- URL: https://github.com/clhenrick/nyc_property_tax_docs
- Owner: clhenrick
- Created: 2015-02-04T18:07:28.000Z (almost 10 years ago)
- Default Branch: master
- Last Pushed: 2016-10-01T17:36:01.000Z (about 8 years ago)
- Last Synced: 2024-11-05T13:56:51.533Z (15 days ago)
- Language: Python
- Size: 3.91 KB
- Stars: 1
- Watchers: 5
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# nyc_property_tax_docs
Scraping data from NYC Dept Of Finance building tax document PDFs

## DEPRECATED: PROJECT MOVED TO [talos/nyc-stabilization-unit-counts](https://github.com/talos/nyc-stabilization-unit-counts)
## To Do:
- [ ] script to download PDFs by BBL (borough, block, lot) & date
- [ ] script to scrape data from tax documents
- [ ] determine which properties have rent stabilized units
- [ ] determine rent stabilized unit increase or decrease from 2009 - 2014

## URL Endpoints:
* **Example URL for a PDF Document:** For a property in Staten Island (borough 5) with block 881 and lot 161, the BBL is `5008810161` (the borough digit, then the block zero-padded to 5 digits, then the lot zero-padded to 4 digits):
http://nycprop.nyc.gov/nycproperty/StatementSearch?bbl=5008810161&stmtDate=20141121&stmtType=SOA
* **Note:** Statements don't all use the same date, so it might be better to send a POST request to http://nycprop.nyc.gov/nycproperty/nynav/jsp/selectbbl.jsp and then search for the links to the quarterly statements.

### Form Data
```
FFUNC:C
q49_boro:1
q49_block_id:01221
q49_lot:0007
q49_prp_ad_street_no:109
q49_prp_nm_street:WEST 90 STREET
q49_prp_id_apt_num:
q49_prp_ad_city:New york
q49_prp_cd_state:NY
q49_prp_cd_addr_zip:10024
bblAcctKeyIn1:1
bblAcctKeyIn2:01221
bblAcctKeyIn3:0007
bblAcctKeyIn4:
ownerName:NYC HOUSING AUTHORITY
ownerName1:NYC HOUSING AUTHORITY
ownerName2:
ownerName3:
ownerName4:
```
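
As a rough sketch of how the pieces above fit together, the Python below builds a BBL, assembles a statement URL in the same shape as the example, and turns `key:value` form lines into an encoded POST body. The function names (`make_bbl`, `statement_url`, `parse_form_data`) are hypothetical and not part of this repository, and nothing here has been checked against the live site, which may no longer respond at these addresses.

```python
from urllib.parse import urlencode

# Base URL copied from the example above; assumed, not verified to still work.
STATEMENT_SEARCH = "http://nycprop.nyc.gov/nycproperty/StatementSearch"


def make_bbl(boro: int, block: int, lot: int) -> str:
    """Join borough (1 digit), block (5 digits), and lot (4 digits) into a BBL."""
    return f"{boro}{block:05d}{lot:04d}"


def statement_url(bbl: str, stmt_date: str, stmt_type: str = "SOA") -> str:
    """Build a StatementSearch URL; stmt_date is a YYYYMMDD string."""
    query = urlencode({"bbl": bbl, "stmtDate": stmt_date, "stmtType": stmt_type})
    return f"{STATEMENT_SEARCH}?{query}"


def parse_form_data(text: str) -> dict:
    """Turn 'key:value' lines (the layout shown above) into a dict.

    Only the first colon splits key from value, so empty values are kept
    as empty strings.
    """
    fields = {}
    for line in text.splitlines():
        if not line.strip():
            continue
        key, _, value = line.partition(":")
        fields[key.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    bbl = make_bbl(5, 881, 161)
    print(statement_url(bbl, "20141121"))

    form = parse_form_data("q49_boro:1\nq49_block_id:01221\nq49_lot:0007\n")
    # The encoded bytes could be sent as the body of a POST request.
    print(urlencode(form).encode("ascii"))
```

Passing the encoded bytes as the `data` argument of `urllib.request.urlopen` would send them as a POST. Whether the `selectbbl.jsp` endpoint accepts exactly these fields is an open question.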
## PDF scraping options
* [pdf.js](http://mozilla.github.io/pdf.js/)
* [pdf extract](https://github.com/nisaacson/pdf-extract)

## Resources
These things might be helpful:

- [Rent Stabilized Building List](https://github.com/clhenrick/dhcr-rent-stabilized-data) (via DHCR/RGB)
- [NYC Geo Client API](https://developer.cityofnewyork.us/api/geoclient-api)
- [NYC MapPluto data sets archive](http://www.nyc.gov/html/dcp/html/bytes/archive_pluto_mappluto.shtml)