{"id":18511329,"url":"https://github.com/hasnocool/indeed-job-scraper","last_synced_at":"2026-01-05T13:02:34.735Z","repository":{"id":194672809,"uuid":"691200599","full_name":"hasnocool/indeed-job-scraper","owner":"hasnocool","description":"A web scraper built using Selenium and Python to extract job listings from Indeed.com with rate limiting and logging features.","archived":false,"fork":false,"pushed_at":"2024-09-18T01:02:36.000Z","size":9,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2024-12-25T20:41:41.792Z","etag":null,"topics":["chromedriver","indeed","job","json","listings","logging","pagination","python","scraper","scraping","script","selenium","web","webdriver"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/hasnocool.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-13T17:38:06.000Z","updated_at":"2024-11-08T03:36:56.000Z","dependencies_parsed_at":"2024-05-06T07:43:12.246Z","dependency_job_id":"c3ce935d-58d0-4f54-ae0d-7f0135697684","html_url":"https://github.com/hasnocool/indeed-job-scraper","commit_stats":null,"previous_names":["hasnocool/python-indeed-scraper","hasnocool/indeed-job-scraper"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hasnocool%2Findeed-job-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hasnocool%2Findeed-job-scraper/tags","releases_url":"https:/
/repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hasnocool%2Findeed-job-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/hasnocool%2Findeed-job-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/hasnocool","download_url":"https://codeload.github.com/hasnocool/indeed-job-scraper/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":239225764,"owners_count":19603162,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chromedriver","indeed","job","json","listings","logging","pagination","python","scraper","scraping","script","selenium","web","webdriver"],"created_at":"2024-11-06T15:28:02.510Z","updated_at":"2025-10-31T19:30:32.210Z","avatar_url":"https://github.com/hasnocool.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"**README.md**\n================================\n\n**Indeed Job Scraper**\n======================\n\n**Project Title**: Indeed Job Scraper\n----------------------------------------\n\nI built this to automate the process of scraping job listings from Indeed.com, making it easier to collect and analyze data on job postings in a specific location. 
This project uses Selenium for web scraping and Python for JSON parsing.\n\n**Description**\n---------------\n\nIndeed Job Scraper is designed to fetch job listings from Indeed.com based on specified criteria (e.g., sponsorship, Chicago, IL), then parse the extracted data into a more structured format (JSON) for further analysis. The tool includes rate limiting to prevent overloading the website and ensure smooth operation.\n\n**Features**\n------------\n\n*   **Web Scraping**: Uses Selenium to fetch job listings from Indeed.com.\n*   **Rate Limiting**: Includes a retry mechanism with delays to avoid overwhelming the website.\n*   **JSON Output**: Exports extracted data in JSON format for further processing.\n*   **CSV Conversion**: Optionally converts the JSON output into a CSV file.\n\n**Installation**\n----------------\n\n### Prerequisites\n\n*   Python 3.x (preferably 3.9 or later)\n*   Selenium WebDriver (ChromeDriver)\n*   `json` and `csv` modules (included in the Python standard library; no separate installation needed)\n\n### Installation Steps\n\n1.  Clone this repository using Git.\n2.  Install required libraries using pip: `pip install selenium`\n3.  Download the ChromeDriver from [here](https://chromedriver.chromium.org/downloads) and add it to your system's PATH.\n\n**Usage**\n----------\n\n### Running the Scraper\n\n1.  Execute the `job_scraper_with_rate_limiting.py` script.\n2.  The tool will fetch job listings based on the specified criteria (sponsorship, Chicago, IL).\n3.  It will parse the extracted data into JSON format and save it to a file named `log_{timestamp}.json`.\n\n### Optional CSV Conversion\n\n1.  After running the scraper, execute the `parse_json_file_to_csv.py` script.\n2.  This will convert the JSON output from the previous step into a CSV file named `job_data_extended.csv`.\n\n**Contributing**\n---------------\n\nContributions are welcome! If you'd like to enhance this project or add new features, please follow these steps:\n\n1.  Fork this repository on GitHub.\n2.  
Make your changes in a new branch (e.g., `feature/new-feature`).\n3.  Commit your changes with descriptive commit messages.\n4.  Submit a pull request for review.\n\n**License**\n----------\n\nIndeed Job Scraper is released under the [MIT License](https://opensource.org/licenses/MIT).\n\n**Tags/Keywords**\n-----------------\n\nIndeed, web scraping, Selenium, rate limiting, JSON parsing, CSV conversion\n\n[![Python Version](https://img.shields.io/badge/Python-3.x-green.svg)](https://www.python.org/)\n[![Selenium](https://img.shields.io/badge/Selenium-4.0-green.svg)](https://selenium.dev/)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhasnocool%2Findeed-job-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fhasnocool%2Findeed-job-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fhasnocool%2Findeed-job-scraper/lists"}