https://github.com/sanlamamba/redbot
This project is a Python-based bot that scrapes job postings from multiple Reddit communities and sends relevant posts to a Discord channel.
- Host: GitHub
- URL: https://github.com/sanlamamba/redbot
- Owner: sanlamamba
- Created: 2024-10-20T19:40:01.000Z (2 months ago)
- Default Branch: main
- Last Pushed: 2024-11-18T14:34:36.000Z (about 1 month ago)
- Last Synced: 2024-11-18T15:14:45.224Z (about 1 month ago)
- Topics: automation, bot, discord, prawl, python, reddit
- Language: Python
- Homepage:
- Size: 15.6 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
# Reddit-Discord Job Scraper Bot
This project is a Python-based bot that scrapes job postings from multiple Reddit communities and sends relevant posts to a Discord channel. The bot is designed to be run continuously on an AWS EC2 instance or any server, with logging, duplicate post prevention, and configurable settings through environment variables and configuration files.
## Features
- **Scrapes Multiple Subreddits:** The bot scrapes job postings from specific subreddits (e.g., `forhire`, `jobopenings`, `remotejobs`) to find relevant job posts.
- **Keyword Filtering:** Uses keywords related to coding and development jobs to filter posts and avoid irrelevant content.
- **Prevents Duplicate Posts:** The bot tracks previously sent URLs to prevent posting the same job multiple times.
- **Bulk Deletes Discord Messages:** Deletes old messages in the Discord channel to keep the conversation relevant.
- **Configurable:** The bot uses environment variables and a configuration file to easily modify settings, such as subreddits, keywords, and post-check frequency.
- **Logging:** The bot logs its operations, including post-scraping activities, sending messages, and error handling, with options for adjusting log verbosity.

## Prerequisites
Before running the bot, make sure you have the following:
- **Python 3.7+**
- **Pipenv** or `pip` for managing dependencies
- A **Reddit** account and API keys for access (client ID, client secret)
- A **Discord** bot token and channel ID where job postings will be sent

## Setup
### 1. Clone the repository
```bash
git clone https://github.com/sanlamamba/redbot.git
cd redbot
```

### 2. Install dependencies
```bash
pip install -r requirements.txt
```

### 3. Configure environment variables
Create a `.env` file in the project root with the following content:
```bash
REDDIT_CLIENT_ID=
REDDIT_CLIENT_SECRET=
REDDIT_USER_AGENT=
DISCORD_TOKEN=
DISCORD_CHANNEL_ID=
```

### 4. Configure additional settings in `config.py`
Modify `config.py` to adjust subreddit lists, keywords, and other options:
```python
SUBREDDITS = ['forhire', 'jobbit', 'jobopenings', 'remotejs', 'remotejobs']
KEYWORDS = ['python', 'javascript', 'java', 'developer', 'software']
CHECK_FREQUENCY_SECONDS = 60
POST_LIMIT = 20
SENT_POSTS_FILE = 'sent_posts.csv'
```

### 5. Run the bot
To start the bot, simply run:
```bash
python main.py
```

### 6. Deploy on AWS EC2
1. **Launch an EC2 instance**: Choose a t2.micro or higher instance with Ubuntu 20.04.
2. **Set up the environment**:
   - SSH into the instance.
   - Install Python and Git.
   - Clone this repository.
   - Install dependencies.
3. **Run the bot**: Use a process manager such as `pm2`, or a simpler tool such as `nohup`, to keep the bot running in the background after you log out.

Example using `nohup`:
```bash
nohup python main.py &
```
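
### How keyword filtering and duplicate prevention might work

This README does not show the bot's internals, so the following is only a minimal sketch of how the `KEYWORDS` and `SENT_POSTS_FILE` settings from `config.py` could drive the keyword filtering and duplicate prevention described under Features. The function names (`matches_keywords`, `load_sent_urls`, `record_sent_url`, `select_new_posts`) and the one-URL-per-row CSV layout are assumptions made for illustration, not the repository's actual code.

```python
import csv
from pathlib import Path

# Values copied from the config.py example in this README.
KEYWORDS = ['python', 'javascript', 'java', 'developer', 'software']
SENT_POSTS_FILE = 'sent_posts.csv'


def matches_keywords(title, keywords=KEYWORDS):
    """Return True if any keyword occurs in the title (case-insensitive substring match)."""
    lowered = title.lower()
    return any(keyword in lowered for keyword in keywords)


def load_sent_urls(path=SENT_POSTS_FILE):
    """Read previously sent URLs from the first column of the CSV file.

    Assumed layout: one URL per row. A missing file means nothing was sent yet.
    """
    file = Path(path)
    if not file.exists():
        return set()
    with file.open(newline='') as handle:
        return {row[0] for row in csv.reader(handle) if row}


def record_sent_url(url, path=SENT_POSTS_FILE):
    """Append a URL to the CSV file so later runs skip it."""
    with open(path, 'a', newline='') as handle:
        csv.writer(handle).writerow([url])


def select_new_posts(posts, sent_urls, keywords=KEYWORDS):
    """Keep (title, url) pairs that match a keyword and were not sent before.

    Also drops duplicates that appear within the same batch.
    """
    seen = set(sent_urls)
    fresh = []
    for title, url in posts:
        if url in seen or not matches_keywords(title, keywords):
            continue
        seen.add(url)
        fresh.append((title, url))
    return fresh
```

In a loop that runs every `CHECK_FREQUENCY_SECONDS`, the bot could call `load_sent_urls()` once, pass each batch of scraped posts through `select_new_posts()`, and call `record_sent_url()` after each message is delivered to Discord. Substring matching is deliberately simple: `'java'` also matches titles containing "javascript", which may or may not be what you want.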