{"id":18146457,"url":"https://github.com/sameermujahid/naukri-web-scraper","last_synced_at":"2026-04-27T22:31:21.258Z","repository":{"id":258267323,"uuid":"873746424","full_name":"sameermujahid/Naukri-web-scraper","owner":"sameermujahid","description":"Naukri Web Scraping is the project based on scraping jobs post related to any domains and their details.","archived":false,"fork":false,"pushed_at":"2024-11-05T10:44:19.000Z","size":485,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-02-13T02:47:32.087Z","etag":null,"topics":["chromedriver","python","selenium","webscraping"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sameermujahid.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2024-10-16T16:42:00.000Z","updated_at":"2024-12-10T21:33:35.000Z","dependencies_parsed_at":"2024-10-18T07:50:32.614Z","dependency_job_id":"175d2d45-6cfe-4f35-8ffd-ea352ef38d4a","html_url":"https://github.com/sameermujahid/Naukri-web-scraper","commit_stats":null,"previous_names":["sameermujahid/naukari-web-scraper","sameermujahid/naukri-web-scraper"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sameermujahid%2FNaukri-web-scraper","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sameermujahid%2FNaukri-web-scraper/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sameermujahid%2FNaukri-w
eb-scraper/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sameermujahid%2FNaukri-web-scraper/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sameermujahid","download_url":"https://codeload.github.com/sameermujahid/Naukri-web-scraper/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247550642,"owners_count":20956984,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["chromedriver","python","selenium","webscraping"],"created_at":"2024-11-01T21:07:48.461Z","updated_at":"2026-04-27T22:31:21.227Z","avatar_url":"https://github.com/sameermujahid.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# Job Scraper\n\n## What is this?\n\nThis Job Scraper is a Python-based web scraping tool that extracts job listings from various job categories on the Naukri.com website. 
The scraper gathers detailed job information, including job title, company, location, salary, key skills, and more, and saves the collected data in CSV format for further analysis or use.\n\n## Features\n\n- **Customizable Job Categories**: Scrapes job details from various categories, allowing you to add or modify categories as needed (e.g., Data Scientist, Data Analyst, DevOps, Full Stack, Python Developer).\n\n- **Flexible Data Fields**: Customize the specific data fields you want to scrape, ensuring you only collect information relevant to your needs.\n\n- **CSV Export**: Saves the scraped data into a CSV file, making it easy to access, analyze, and integrate with other tools.\n\n## Installation\n\nTo set up this project on your local machine, follow these steps:\n\n1. **Clone the repository**:\n   ```bash\n   git clone https://github.com/sameermujahid/Naukri-web-scraper.git\n   cd Naukri-web-scraper\n   ```\n\n2. **Install required packages**:\n   Make sure you have Python installed on your machine. Then, install the necessary packages using pip:\n   ```bash\n   pip install selenium beautifulsoup4\n   ```\n\n3. **Download ChromeDriver**:\n   - Ensure you have the Chrome browser installed.\n   - Download the ChromeDriver that matches your browser version from [ChromeDriver Download](https://googlechromelabs.github.io/chrome-for-testing/).\n   - Place the downloaded `chromedriver.exe` file in a suitable directory, e.g., `C:\\Downloads\\chromedriver-win64\\`.\n\n
4. **Update the script**:\n   - Open the scraper code in your preferred text editor.\n   - Update the `chrome_driver_path` variable in the script to point to the location of your `chromedriver.exe` file.\n\n## How to Run the Scraper\n\nTo run the job scraper, execute the following command in your terminal:\n\n```bash\npython scraper.py\n```\n\n## Modifying Job Categories and Data Columns\n\nYou can customize the job categories and the specific data fields you want to scrape by modifying the code.\n\n### To change job categories:\n1. Locate the `job_categories` dictionary in the script.\n2. Add or modify categories and their corresponding URLs as needed:\n   ```python\n   job_categories = {\n       \"New Category\": \"https://www.naukri.com/new-category-jobs\",\n       ...\n   }\n   ```\n\n### To change the columns in the dataset:\n1. Locate the `fieldnames` list in the script where the CSV header is defined.\n2. Add or remove fields from this list to match the data you want to collect:\n   ```python\n   fieldnames = [\n       \"Job ID\", \"Job Title\", \"Company\", \"Reviews\", ...\n       # Add or remove fields as needed\n   ]\n   ```\n\n### To customize the number of jobs:\nAdjust the number of jobs you want to scrape by modifying the function call in the script:\n```python\nscrape_jobs_from_category(url, 150)  # Replace 150 with your desired number of jobs\n```\n\n## Acknowledgements\n\n- Thanks to the developers of [Selenium](https://www.selenium.dev/) and [BeautifulSoup](https://www.crummy.com/software/BeautifulSoup/) for their excellent libraries that make web scraping easier.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsameermujahid%2Fnaukri-web-scraper","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsameermujahid%2Fnaukri-web-scraper","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsameermujahid%2Fnaukri-web-scraper/lists"}