{"id":28882680,"url":"https://github.com/caesarw0/lancaster-property-tax-scraper","last_synced_at":"2026-04-28T00:32:31.957Z","repository":{"id":299515501,"uuid":"996591586","full_name":"caesarw0/lancaster-property-tax-scraper","owner":"caesarw0","description":"Automated Python scraper for extracting delinquent tax data from Lancaster County, PA's public parcel viewer. Accepts parcel list input, extracts only relevant data, and outputs clean CSV files. Built for large-scale use with request throttling and error handling.","archived":false,"fork":false,"pushed_at":"2025-06-16T23:19:49.000Z","size":20829,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-17T00:25:01.269Z","etag":null,"topics":["csv-export","playwright","python","python-scraping","real-estate-data","scrapping-python","web-scraper","webscraping"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/caesarw0.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-05T07:04:55.000Z","updated_at":"2025-06-16T23:19:49.000Z","dependencies_parsed_at":"2025-06-17T00:35:31.416Z","dependency_job_id":null,"html_url":"https://github.com/caesarw0/lancaster-property-tax-scraper","commit_stats":null,"previous_names":["caesarw0/lancaster-property-tax-scraper"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/caesarw0/lancaster-property-tax-scraper","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/caesarw0%2Flancaster-property-tax-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caesarw0%2Flancaster-property-tax-scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caesarw0%2Flancaster-property-tax-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caesarw0%2Flancaster-property-tax-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/caesarw0","download_url":"https://codeload.github.com/caesarw0/lancaster-property-tax-scraper/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/caesarw0%2Flancaster-property-tax-scraper/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32361477,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-27T20:07:02.737Z","status":"ssl_error","status_checked_at":"2026-04-27T20:07:00.910Z","response_time":128,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while 
reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv-export","playwright","python","python-scraping","real-estate-data","scrapping-python","web-scraper","webscraping"],"created_at":"2025-06-20T21:01:46.013Z","updated_at":"2026-04-28T00:32:31.952Z","avatar_url":"https://github.com/caesarw0.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Lancaster County Tax Delinquency Scraper\n\n[![Python 3.7+](https://img.shields.io/badge/python-3.7+-blue.svg)](https://www.python.org/downloads/)\n[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)\n[![Playwright](https://img.shields.io/badge/playwright-1.41+-green.svg)](https://playwright.dev/)\n\n## Demo\n\nWatch the scraper in action:\n\n![Scraper Demo](/img/property_scraper_demo.gif)\n*Automated scraping process demonstration*\n\n## Overview\n\nAn automated Python script that extracts delinquent tax information from Lancaster County, PA's public parcel viewer.\nThe scraper reads this data from the Lancaster County Property Tax portal, which serves each parcel at a URL of this form:\n\n```text\nhttps://lancasterpa.devnetwedge.com/parcel/view/{parcel_number}/{tax_year}\n```\n\nExample URL:\n\n[https://lancasterpa.devnetwedge.com/parcel/view/5408465600000/2025](https://lancasterpa.devnetwedge.com/parcel/view/5408465600000/2025)\n\n### Web Interface\n\n![Parcel View](/img/lancaster_landing.png)\n\n![Tax Img](/img/delinquent_taxes_screenshot.png)\n*Screenshot of the Lancaster County Parcel Viewer interface where data is 
extracted from*\n\n## Sample Output\n\n### CSV Output Format\n\n![CSV Output](/img/sample_output.png)\n\nThe script generates a structured CSV file containing delinquent tax information:\n\n```csv\nparcel_number,address,owner,scrape_date,tax_year,amount_due,amount_paid,total_due\n5408465600000,123 MAIN ST LANCASTER PA,JOHN DOE,2024-03-20,2023,1500.00,0.00,1500.00\n1200794700000,456 ELM ST LANCASTER PA,JANE SMITH,2024-03-20,2022,2000.00,500.00,1500.00\n```\n\n## Project Structure\n\n```text\nlancaster-property-tax-scraper/\n├── src/\n│   └── property_scraper.py  # Main scraper implementation\n├── output/\n│   └── delinquent_taxes.csv # Generated output file\n├── img/                     # Documentation images\n├── requirements.txt         # Python dependencies\n└── README.md               # Documentation\n```\n\n## How It Works\n\n### Overall Workflow\n\n```mermaid\ngraph LR\n    A[\"Input Parcel List\"] --\u003e B[\"Initialize Scraper\"]\n    B --\u003e C[\"Process Each Parcel\"]\n    C --\u003e D[\"Check for\u003cbr/\u003eDelinquent Taxes\"]\n    D --\u003e E{\"Has Delinquent\u003cbr/\u003eTaxes?\"}\n    E --\u003e|\"Yes\"| F[\"Extract Data\"]\n    E --\u003e|\"No\"| G[\"Skip Parcel\"]\n    F --\u003e H[\"Add to Results\"]\n    G --\u003e C\n    H --\u003e C\n    C --\u003e I[\"Export to CSV\"]\n```\n\n### Data Extraction Process\n\n```mermaid\ngraph TD\n    A[\"Parcel Page\"] --\u003e B[\"Basic Info\"]\n    A --\u003e C[\"Tax Info\"]\n    B --\u003e D[\"Parcel Number\"]\n    B --\u003e E[\"Property Address\"]\n    B --\u003e F[\"Owner Details\"]\n    C --\u003e G[\"Tax Year\u003cbr/\u003e2022-2024\"]\n    C --\u003e H[\"Amount Due\"]\n    C --\u003e I[\"Amount Paid\"]\n    C --\u003e J[\"Total Due\"]\n    G \u0026 H \u0026 I \u0026 J --\u003e K[\"CSV Record\"]\n```\n\n### Error Handling \u0026 Rate Limiting\n\n```mermaid\nsequenceDiagram\n    participant S as Scraper\n    participant W as Web Server\n    participant D as Database\n    S-\u003e\u003eW: Request 
Parcel Page\n    Note over S,W: 2-5 second delay\n    W-\u003e\u003eS: Return Page\n    S-\u003e\u003eS: Extract Data\n    alt Success\n        S-\u003e\u003eD: Store Results\n    else Network Timeout\n        S-\u003e\u003eS: Retry Request\n    else No Data Found\n        S-\u003e\u003eS: Log \u0026 Skip\n    end\n```\n\n## Features\n\n- Automated scraping of delinquent tax data from Lancaster County's parcel viewer\n- Handles multiple parcel numbers in batch\n- Extracts data for tax years 2022-2024\n- Collects property address and owner information\n- Outputs results to CSV format\n- Built-in rate limiting to prevent server overload\n- Only captures parcels with actual delinquent taxes\n\n## Data Extracted\n\nFor each parcel with delinquent taxes, the script collects:\n\n- Parcel number\n- Property address\n- Owner information\n- Tax year (2022-2024)\n- Amount due\n- Amount paid\n- Total due\n- Scrape date\n\n## Prerequisites\n\n- Python 3.7+\n- Playwright\n- Pandas\n\n## Installation\n\n1. Clone this repository:\n\n```bash\ngit clone https://github.com/caesarw0/lancaster-property-tax-scraper.git\ncd lancaster-property-tax-scraper\n```\n\n2. Install required packages:\n\n```bash\npip install -r requirements.txt\n```\n\n3. Install Playwright browsers:\n\n```bash\nplaywright install\n```\n\n## Usage\n\n1. Prepare a list of parcel numbers in the script or import them from a file.\n\n2. Run the script:\n\n```bash\npython src/property_scraper.py\n```\n\nThe script will:\n\n- Process each parcel number\n- Extract delinquent tax information if available\n- Save results to `output/delinquent_taxes.csv`\n\n### Example Code\n\n```python\nfrom property_scraper import scrape_multiple_parcels\n\nparcel_numbers = [\n    \"5408465600000\",\n    \"1200794700000\",\n]\n\ndf = scrape_multiple_parcels(parcel_numbers)\n```\n\n## Rate Limiting\n\nThe script includes built-in delays between requests (2-5 seconds) to avoid overwhelming the server. 
This helps ensure:\n\n- Ethical scraping practices\n- Reduced likelihood of IP blocking\n- Server resource conservation\n\n## Output Format\n\nThe script generates a CSV file with the following columns:\n\n- parcel_number\n- address\n- owner\n- scrape_date\n- tax_year\n- amount_due\n- amount_paid\n- total_due\n\n## Error Handling\n\nThe script includes robust error handling for:\n\n- Network timeouts\n- Missing data\n- Invalid parcel numbers\n- Server errors\n\n## Legal Notice\n\nThis tool is designed for legitimate data collection from publicly available information. Users should:\n\n- Review and comply with Lancaster County's terms of service\n- Use reasonable request rates\n- Respect the public resource\n\n## Contributing\n\nContributions are welcome! Please feel free to submit a Pull Request.\n\n## License\n\n[MIT License](LICENSE)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcaesarw0%2Flancaster-property-tax-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcaesarw0%2Flancaster-property-tax-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcaesarw0%2Flancaster-property-tax-scraper/lists"}