{"id":24318073,"url":"https://github.com/redjax/ohio_utility_scraper","last_synced_at":"2025-09-16T15:44:53.913Z","repository":{"id":168532215,"uuid":"634756178","full_name":"redjax/ohio_utility_scraper","owner":"redjax","description":"Find rates for Ohio gas \u0026 electric utilities. Pulls from https://energychoice.ohio.gov/ApplesToApplesComparision.aspx for utility prices.","archived":false,"fork":false,"pushed_at":"2024-04-25T05:40:27.000Z","size":538,"stargazers_count":0,"open_issues_count":10,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-01-17T14:42:59.878Z","etag":null,"topics":["pdm","python","python3","ruff","scraper","scrapy","sqlalchemy"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/redjax.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null}},"created_at":"2023-05-01T05:19:08.000Z","updated_at":"2023-05-01T05:21:09.000Z","dependencies_parsed_at":"2023-09-22T05:32:40.957Z","dependency_job_id":null,"html_url":"https://github.com/redjax/ohio_utility_scraper","commit_stats":null,"previous_names":["redjax/ohio_utility_scraper"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/redjax%2Fohio_utility_scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/redjax%2Fohio_utility_scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/redjax%2Fohio_utility_scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/re
djax%2Fohio_utility_scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/redjax","download_url":"https://codeload.github.com/redjax/ohio_utility_scraper/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":242903019,"owners_count":20204203,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["pdm","python","python3","ruff","scraper","scrapy","sqlalchemy"],"created_at":"2025-01-17T14:37:02.197Z","updated_at":"2025-09-16T15:44:48.854Z","avatar_url":"https://github.com/redjax.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Ohio Energy Provider Comparison\n\n[![pdm-managed](https://img.shields.io/badge/pdm-managed-blueviolet)](https://pdm.fming.dev)\n\nScrapes [Energy Choice Ohio](https://energychoice.ohio.gov/ApplesToApplesComparision.aspx)'s provider comparison tables.\n\n- [ELECTRIC](https://energychoice.ohio.gov/ApplesToApplesComparision.aspx?Category=Electric\u0026TerritoryId=6\u0026RateCode=1)\n- [GAS](https://energychoice.ohio.gov/ApplesToApplesComparision.aspx?Category=NaturalGas\u0026TerritoryId=8\u0026RateCode=1)\n\n## Usage\n\n### With PDM\n\n- Setup environment\n  - `$ pdm install`\n- Run start script\n  - `$ pdm start`\n\n### Without PDM\n\n- Create `venv`\n  - `$ virtualenv .venv`\n- Activate `venv`\n  - Linux: `$ . 
.venv/bin/activate`\n- Install requirements\n  - `$ pip install -r requirements.txt`\n- `cd` to app directory\n  - `$ cd ohioenergy`\n- Run crawler(s)\n  - `$ python main.py`\n\n\n## Notes\n\n### Run Scrapy spiders from a Python script\n\n#### Scrapy's CrawlerRunner, for running multiple crawlers\n\nUses `twisted` for async crawls.\n\nExample single crawler, using the `ohioenergy.spiders.ohioenergyproviders.OhioenergyprovidersSpider` spider:\n\n```\nfrom twisted.internet import reactor\nfrom scrapy.crawler import CrawlerRunner\nfrom scrapy.utils.log import configure_logging\nfrom scrapy.utils.project import get_project_settings\n\nfrom ohioenergy.spiders.ohioenergyproviders import OhioenergyprovidersSpider\n\n## Placeholder log format (not from the project); adjust as needed\ndefault_fmt = \"%(levelname)s: %(message)s\"\n\nif __name__ == \"__main__\":\n    \n    configure_logging({\"LOG_FORMAT\": default_fmt})\n    settings = get_project_settings()\n    \n    runner = CrawlerRunner(settings=settings)\n    \n    electric_providers = runner.crawl(OhioenergyprovidersSpider)\n    \n    ## Stop the Twisted reactor once the crawl finishes\n    electric_providers.addBoth(lambda _: reactor.stop())\n    \n    ## Run crawlers\n    reactor.run()\n\n```\n\nExample multiple crawlers, using hypothetical `Spider1` and `Spider2`:\n\n```\nimport scrapy\nfrom twisted.internet import reactor\nfrom scrapy.crawler import CrawlerRunner\nfrom scrapy.utils.log import configure_logging\nfrom scrapy.utils.project import get_project_settings\n\n\nclass Spider1(scrapy.Spider):\n    ...\n\nclass Spider2(scrapy.Spider):\n    ...\n\nif __name__ == \"__main__\":\n    settings = get_project_settings()\n    runner = CrawlerRunner(settings)\n\n    ## Add spiders to runner\n    runner.crawl(Spider1)\n    runner.crawl(Spider2)\n\n    ## Join crawlers\n    crawl = runner.join()\n\n    ## Stop Twisted's reactor once all crawls finish\n    crawl.addBoth(lambda _: reactor.stop())\n\n    ## Run crawlers\n    reactor.run()\n\n```\n\n#### Scrapy's CrawlerProcess\n\nUse `scrapy.crawler.CrawlerProcess` to run spiders. 
Make sure to import spiders into the script.\n\nExample using the `ohioenergy.spiders.ohioenergyproviders.OhioenergyprovidersSpider` spider:\n\n```\n## main.py\n\nimport scrapy\n## Import CrawlerProcess\nfrom scrapy.crawler import CrawlerProcess\n## Import scrapy project's settings\nfrom scrapy.utils.project import get_project_settings\n\n## Import OhioenergyprovidersSpider\nfrom ohioenergy.spiders.ohioenergyproviders import OhioenergyprovidersSpider\n\nif __name__ == \"__main__\":\n    \n    ## Create CrawlerProcess object. Initialize with Scrapy project's settings\n    process = CrawlerProcess(get_project_settings())\n    \n    ## Prepare crawl\n    process.crawl(OhioenergyprovidersSpider)\n    ## Start crawl\n    process.start()\n\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fredjax%2Fohio_utility_scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fredjax%2Fohio_utility_scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fredjax%2Fohio_utility_scraper/lists"}