An open API service indexing awesome lists of open source software.

https://github.com/4uffin/auto-news-aggregator

An automated tech news aggregation and summarization tool that delivers concise TL;DR summaries from major tech publications every 12 hours.
https://github.com/4uffin/auto-news-aggregator

actions ai automated automation github-actions google-gemini llm python python-script python3 xml xml-parser xml-parsing yaml yml

Last synced: about 2 months ago
JSON representation

An automated tech news aggregation and summarization tool that delivers concise TL;DR summaries from major tech publications every 12 hours.

Awesome Lists containing this project

README

          

> [!WARNING]
> THIS README WAS GENERATED BY CLAUDE SONNET 4 AND HAS BEEN DOUBLE CHECKED BY 4UFFIN.
>
> THE REST OF THIS PROJECT WAS CREATED USING GOOGLE GEMINI 2.5 FLASH. EXPECT BUGS AND USE WITH CAUTION!

# Tech News Digest 📰🤖

![GitHub repo size](https://img.shields.io/github/repo-size/4uffin/auto-news-aggregator)
[![GitHub Actions](https://img.shields.io/badge/GitHub%20Actions-Automated-brightgreen)](https://github.com/features/actions)
[![Python](https://img.shields.io/badge/Python-3.12+-blue.svg)](https://www.python.org/downloads/)
[![License](https://img.shields.io/badge/License-MIT-yellow.svg)](LICENSE)
[![AI Powered](https://img.shields.io/badge/AI-Gemini%202.5%20Flash-orange)](https://openrouter.ai/)

An intelligent, automated tech news aggregation and summarization tool that delivers concise TL;DR summaries from major tech publications every 12 hours.

## 🚀 Quick Start

```bash
# Clone and setup
git clone
cd tech-news-digest
pip install -r requirements.txt

# Set API key
export OPENROUTER_API_KEY="your-openrouter-api-key"

# Generate your first digest
python generate_digest.py
```

## ✨ Features

| Feature | Description |
|---------|-------------|
| 🌐 **Multi-Source Aggregation** | Pulls from 5+ major tech publications |
| 🧠 **AI-Powered Summaries** | Uses Google Gemini 2.5 Flash for intelligent summarization |
| 🔄 **Robust Fallbacks** | RSS feeds → Web scraping → Error handling |
| ⏰ **Automated Scheduling** | Runs every 12 hours via GitHub Actions |
| 📁 **Organized Storage** | Timestamped digests with proper file structure |
| 🔧 **Easy Configuration** | Simple customization of sources and parameters |

## 📊 News Sources

| Publication | RSS Feed | Backup Scraping |
|-------------|----------|-----------------|
| 📱 The Verge | ✅ | ✅ |
| 💼 TechCrunch | ✅ | ✅ |
| 🔧 CNET | ✅ | ✅ |
| ⚗️ Ars Technica | ✅ | ❌ |
| 📟 Engadget | ✅ | ❌ |

## 🛠️ Installation & Setup

### Prerequisites

- **Python 3.12+** - [Download here](https://www.python.org/downloads/)
- **OpenRouter API Key** - [Get yours here](https://openrouter.ai/keys)
- **Git** (for cloning and automation)

### Local Setup

1. **Clone the repository:**
```bash
git clone
cd tech-news-digest
```

2. **Create a virtual environment (recommended):**
```bash
python -m venv venv
source venv/bin/activate # On Windows: venv\Scripts\activate
```

3. **Install dependencies:**
```bash
pip install openai feedparser requests beautifulsoup4
```

Or create a `requirements.txt`:
```txt
openai>=1.0.0
feedparser>=6.0.0
requests>=2.31.0
beautifulsoup4>=4.12.0
```

4. **Configure your API key:**
```bash
# Option 1: Environment variable
export OPENROUTER_API_KEY="your-api-key-here"

# Option 2: Create .env file (add to .gitignore!)
echo "OPENROUTER_API_KEY=your-api-key-here" > .env
```

### GitHub Actions Setup

1. **Add repository secret:**
- Go to `Settings` → `Secrets and variables` → `Actions`
- Click `New repository secret`
- Name: `OPENROUTER_API_KEY`
- Value: Your OpenRouter API key

2. **Enable Actions:**
- Go to `Actions` tab
- Click `I understand my workflows, go ahead and enable them`

3. **Manual trigger (optional):**
- Actions tab → `Tech News Digest` → `Run workflow`

## 📖 Usage Guide

### Command Line Interface

```bash
# Basic usage
python generate_digest.py

# The script will:
# 1. Fetch headlines from all configured sources
# 2. Generate AI summary using Gemini 2.5 Flash
# 3. Save to news_digests/digest_YYYY-MM-DD_HH-MM.txt
# 4. Display results in terminal
```

### Expected Output

```
Starting tech news digest generation...
Fetching headlines from The Verge...
Fetching headlines from TechCrunch...
Fetching headlines from CNET...
Generating summary with LLM...

==================================================
TECH NEWS DIGEST GENERATED
==================================================

Tech News Digest - 2024-01-15 02:30 PM
==================================================

🚀 **Major Product Launches**
• Apple announces new MacBook Pro with M4 chip
• Google releases Gemini Advanced with improved coding

📱 **Mobile & Hardware**
• Samsung Galaxy S25 leaked specifications surface
• iPhone 16 sales exceed expectations in Q4

🤖 **AI & Machine Learning**
• OpenAI unveils GPT-5 preview for developers
• Microsoft integrates Copilot into more Office apps

==================================================

Successfully wrote the digest to news_digests/digest_2024-01-15_14-30.txt
Tech news digest generation completed successfully!
```

## ⚙️ Configuration Options

### Customizing News Sources

Edit the `feeds` list in `generate_digest.py`:

```python
feeds = [
("Your Source", "https://example.com/rss.xml"),
("The Verge", "https://www.theverge.com/rss/index.xml"),
# Add more sources here
]
```

### AI Model Configuration

Change the model in `generate_digest.py`:

```python
# Available models on OpenRouter:
MODEL = "google/gemini-2.5-flash" # Current (fast, cost-effective)
MODEL = "anthropic/claude-sonnet-4" # Alternative (higher quality)
MODEL = "openai/gpt-4" # Alternative (reliable)
```

### Schedule Customization

Modify the cron expression in `.github/workflows/tech-news.yml`:

```yaml
schedule:
# Every 12 hours (current)
- cron: '0 */12 * * *'

# Every 6 hours
# - cron: '0 */6 * * *'

# Daily at 9 AM UTC
# - cron: '0 9 * * *'

# Weekdays only at 9 AM UTC
# - cron: '0 9 * * 1-5'
```

### Summary Customization

Adjust the AI prompt for different summary styles:

```python
# In generate_tech_news_digest(), modify the system message:
"content": f"""You are a tech news summarizer focused on [YOUR FOCUS AREA].
Create summaries that emphasize [YOUR PREFERENCES].
Target audience: [YOUR AUDIENCE]
Style: [FORMAL/CASUAL/TECHNICAL]"""
```

## 📁 Project Structure

```
tech-news-digest/
├── generate_digest.py # Main script
├── .github/
│ └── workflows/
│ └── tech-news.yml # GitHub Actions workflow
├── news_digests/ # Generated digest files
│ ├── digest_2024-01-15_14-30.txt
│ ├── digest_2024-01-15_02-30.txt
│ └── ...
├── requirements.txt # Python dependencies
├── README.md # This file
├── .gitignore # Git ignore rules
└── .env.example # Environment template
```

## 🔧 Advanced Usage

### Custom Error Handling

The script includes comprehensive error handling:

```python
# Automatic fallback chain:
RSS Feeds → Web Scraping → Graceful Error Messages

# Exit codes:
# 0: Success
# 1: Critical error (API key missing, file write failure, etc.)
```

### Integration with Other Tools

```bash
# Pipe output to other tools
python generate_digest.py > daily_digest.txt

# Run with custom timestamp
python generate_digest.py && echo "Digest generated at $(date)"

# Integration with cron (alternative to GitHub Actions)
0 */12 * * * cd /path/to/tech-news-digest && python generate_digest.py
```

### API Usage Optimization

Monitor your OpenRouter usage:

```python
# The script uses these parameters for cost optimization:
max_tokens=1024 # Reasonable limit for summaries
temperature=0.3 # Consistent, focused output
model="gemini-2.5-flash" # Cost-effective choice
```

## 🐛 Troubleshooting

### Common Issues & Solutions

| Issue | Symptoms | Solution |
|-------|----------|----------|
| **Missing API Key** | `OPENROUTER_API_KEY environment variable not set` | Set environment variable or add to GitHub secrets |
| **Network Issues** | `Unable to fetch current tech news` | Check internet connection, RSS feeds may be temporarily down |
| **API Quota** | `API-related error occurred` | Check OpenRouter billing and quota limits |
| **File Permissions** | `Error creating output directory` | Ensure write permissions in project directory |
| **GitHub Actions Failure** | Workflow shows red X | Check Actions logs, verify secrets are set |

### Debug Mode

Enable detailed logging by modifying the script:

```python
import logging
logging.basicConfig(level=logging.DEBUG)

# Add this at the start of generate_digest.py
```

### Manual Testing

Test individual components:

```bash
# Test RSS feed access
python -c "import feedparser; print(len(feedparser.parse('https://www.theverge.com/rss/index.xml').entries))"

# Test OpenRouter connection
python -c "import openai; client = openai.OpenAI(base_url='https://openrouter.ai/api/v1', api_key='your-key'); print('Connection OK')"
```

## 🤝 Contributing

We welcome contributions! Here's how to help:

### Development Setup

```bash
git clone
cd tech-news-digest
python -m venv venv
source venv/bin/activate
pip install -r requirements.txt
pip install pytest black flake8 # Development tools
```

### Contribution Guidelines

1. **Fork** the repository
2. **Create** a feature branch: `git checkout -b feature/amazing-feature`
3. **Test** your changes thoroughly
4. **Format** code: `black generate_digest.py`
5. **Lint** code: `flake8 generate_digest.py`
6. **Commit** changes: `git commit -m 'Add amazing feature'`
7. **Push** to branch: `git push origin feature/amazing-feature`
8. **Submit** a Pull Request

### Ideas for Contributions

- 📰 Add new news sources
- 🎨 Improve summary formatting
- 🔧 Add configuration file support
- 📊 Add analytics/statistics
- 🌐 Add internationalization
- 📱 Create web interface
- 🔔 Add notification systems

## 📊 Performance & Costs

### Typical Usage

- **Runtime**: 30-60 seconds per execution
- **API calls**: 1 per run (batch processing)
- **Storage**: ~2-5KB per digest file
- **Bandwidth**: ~100-500KB per run

### OpenRouter Costs (Approximate)

- **Gemini 2.5 Flash**: $0.001-0.002 per digest
- **Monthly cost (2x daily)**: ~$0.06-0.12
- **Annual cost**: ~$0.70-1.44

## 📄 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for more information.

## 🙏 Acknowledgments & Credits

- **🤖 AI Provider**: [OpenRouter](https://openrouter.ai/) & Google Gemini
- **📰 News Sources**: The Verge, TechCrunch, CNET, Ars Technica, Engadget
- **⚡ Automation**: GitHub Actions
- **🐍 Libraries**:
- [OpenAI Python Client](https://github.com/openai/openai-python)
- [feedparser](https://github.com/kurtmckee/feedparser)
- [Beautiful Soup](https://www.crummy.com/software/BeautifulSoup/)
- [Requests](https://requests.readthedocs.io/)

## 📞 Support & Contact

- **Issues**: [GitHub Issues](../../issues)
- **Discussions**: [GitHub Discussions](../../discussions)
- **Documentation**: This README and inline code comments