Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/swarajkumarsingh/job-data-digger
Job data digger: scrapes job listings from various websites, makes them available through a single API, and refreshes the data every 24 hours.
api backend cache go golang google job-data-digger jobs redis rest-api scrapper
Last synced: about 1 month ago
- Host: GitHub
- URL: https://github.com/swarajkumarsingh/job-data-digger
- Owner: swarajkumarsingh
- License: mit
- Created: 2023-09-30T17:47:19.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-01-19T11:04:11.000Z (about 1 year ago)
- Last Synced: 2024-12-04T07:07:15.843Z (about 2 months ago)
- Topics: api, backend, cache, go, golang, google, job-data-digger, jobs, redis, rest-api, scrapper
- Language: Go
- Homepage: https://github.com/swarajkumarsingh/job-data-digger
- Size: 40 KB
- Stars: 3
- Watchers: 1
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: Readme.md
- License: LICENSE
# Job Data Digger
![GitHub License](https://img.shields.io/github/license/swarajkumarsingh/job-data-digger)
![GitHub Stars](https://img.shields.io/github/stars/swarajkumarsingh/job-data-digger)
![GitHub Issues](https://img.shields.io/github/issues/swarajkumarsingh/job-data-digger)
![GitHub Forks](https://img.shields.io/github/forks/swarajkumarsingh/job-data-digger)

Job Data Scraper is a Go-based web scraper application that collects job listings from the Google Careers page and stores them in Redis. It is designed to run in a Docker container and automatically refreshes the data every 24 hours by re-scraping the Google Careers page.
## Table of Contents
- [Features](#features)
- [Prerequisites](#prerequisites)
- [Getting Started](#getting-started)
- [Installation](#installation)
- [Configuration](#configuration)
- [Usage](#usage)
- [Docker](#docker)
- [Contributing](#contributing)
- [License](#license)

## Features
- **Google Careers Page Scraping**: Automatically scrapes job listings from the Google Careers page.
- **Data Storage**: Stores scraped job data in a Redis database.
- **Scheduled Refresh**: Refreshes job data by re-scraping the Google Careers page every 24 hours.
- **Dockerized**: Easily deploy the application using Docker.

## Realtime images
![image](https://github.com/gin-gonic/gin/assets/89764448/f7aa1778-56e1-4f79-8b6a-19b58edb9341)

## Prerequisites
Before you begin, ensure you have met the following requirements:
- Docker installed on your system.

**NOTE: If you don't have Docker installed, install the following instead:**

- Go (Golang) installed on your system.
- Redis server up and running.

## Getting Started
### Installation
To get started with the Job Data Scraper, follow these steps:
1. Clone the repository:
```bash
git clone https://github.com/swarajkumarsingh/job-data-digger.git
cd job-data-digger
```

2. Run the Go application:
```bash
make compose
```

### Configuration
No configuration is needed for this project.

### Usage
1. For development (make sure a Redis container is running):
```bash
./dev.sh
```

2. For running in production:
```bash
./run.sh
```

### Docker
You can also run the Job Data Scraper in a Docker container. To build the Docker image and start a container from it, use the following command:
```bash
docker build -t job-data-scraper . && docker run -p 8080:8080 job-data-scraper
```

or
```bash
make run
```

### Contributing
Contributions are welcome! If you'd like to contribute to this project, please follow these guidelines:

1. Fork the repository.
2. Create a new branch for your feature or bug fix.
3. Make your changes and test thoroughly.
4. Commit your changes with clear commit messages.
5. Create a pull request against the main branch.

### License
This project is licensed under the MIT License. See the LICENSE file for details.

Happy job data scraping! If you have any questions or run into any issues, feel free to open an issue or reach out to us.