https://github.com/vitorpy/noreposts-atproto-feed
https://github.com/vitorpy/noreposts-atproto-feed
atproto bluesky feed-generator jetstream rust
Last synced: 2 months ago
JSON representation
- Host: GitHub
- URL: https://github.com/vitorpy/noreposts-atproto-feed
- Owner: vitorpy
- License: gpl-3.0
- Created: 2025-09-30T16:48:33.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2025-12-19T20:28:28.000Z (6 months ago)
- Last Synced: 2025-12-22T08:45:04.178Z (6 months ago)
- Topics: atproto, bluesky, feed-generator, jetstream, rust
- Language: Rust
- Size: 266 KB
- Stars: 0
- Watchers: 0
- Forks: 0
- Open Issues: 8
-
Metadata Files:
- Readme: README.md
- License: LICENSE
Awesome Lists containing this project
README
# Following No Reposts Feed Generator
A production-ready Bluesky feed generator written in Rust that shows posts from people you follow, excluding all reposts. Built using Jetstream for efficient real-time data consumption with full JWT signature verification.
## Table of Contents
- [Features](#features)
- [How It Works](#how-it-works)
- [Prerequisites](#prerequisites)
- [Installation](#installation)
- [Configuration](#configuration)
- [Running Locally](#running-locally)
- [Deployment](#deployment)
- [Publishing Your Feed](#publishing-your-feed)
- [Architecture](#architecture)
- [API Endpoints](#api-endpoints)
- [Performance](#performance)
- [Development](#development)
- [Troubleshooting](#troubleshooting)
- [Contributing](#contributing)
- [License](#license)
## Features
- **π« No Reposts**: Automatically filters out all reposts, showing only original content
- **π₯ Personalized**: Shows only posts from accounts you follow
- **β‘ Real-time**: Updates in real-time as new posts are created
- **π Secure**: Full ES256K JWT signature verification with DID resolution
- **π‘ Efficient**: Uses Jetstream for lightweight event consumption (~850 MB/day vs 200+ GB/day)
- **ποΈ Smart Caching**: Automatic cleanup of posts older than 48 hours
- **π Auto-recovery**: Automatic reconnection on Jetstream disconnects
- **π Observable**: Structured logging with configurable verbosity
- **ποΈ Production Ready**: Battle-tested error handling and recovery mechanisms
## How It Works
This feed generator:
1. **Consumes Events**: Connects to Bluesky's Jetstream to receive real-time events for posts and follows
2. **Filters Content**: Only subscribes to `app.bsky.feed.post` and `app.bsky.graph.follow` collections
3. **Stores Data**: Maintains a local SQLite database of recent posts and follow relationships
4. **Serves Feeds**: Provides personalized feeds via AT Protocol's `app.bsky.feed.getFeedSkeleton` endpoint
5. **Authenticates Users**: Validates JWT tokens by resolving user DIDs and verifying signatures
## Prerequisites
- **Rust** 1.70+ (install via [rustup](https://rustup.rs))
- **SQLite** 3.35+ (usually pre-installed on modern systems)
- **Domain** with HTTPS (required for production deployment)
- **Bluesky Account** (for publishing the feed)
## Installation
### 1. Clone the Repository
```bash
git clone https://github.com/vitorpy/noreposts-atproto-feed.git
cd noreposts-atproto-feed
```
### 2. Build the Project
```bash
cargo build --release
```
The compiled binary will be at `target/release/following-no-reposts-feed`.
## Configuration
### Environment Variables
Create a `.env` file in the project root (see `.env.example` for reference):
```bash
# Required: Database location
DATABASE_URL=sqlite:./feed.db
# Required: Server port
PORT=3000
# Required: Your domain name
FEEDGEN_HOSTNAME=your-domain.com
# Required: Your service DID
FEEDGEN_SERVICE_DID=did:web:your-domain.com
# Optional: Jetstream server (defaults to jetstream1.us-east.bsky.network)
JETSTREAM_HOSTNAME=jetstream1.us-east.bsky.network
```
### Service DID Setup
Your `FEEDGEN_SERVICE_DID` should match your domain. For `did:web`, it's typically:
- Domain: `feed.example.com` β DID: `did:web:feed.example.com`
- Domain: `example.com` β DID: `did:web:example.com`
## Running Locally
### Quick Start
```bash
# Set up environment
cp .env.example .env
# Edit .env with your configuration
nano .env
# Run the server
cargo run --release
```
The server will:
1. Automatically run database migrations
2. Connect to Jetstream and start consuming events
3. Start the HTTP server on the configured port
4. Serve the DID document at `/.well-known/did.json`
### Testing Locally
```bash
# Test DID document endpoint
curl http://localhost:3000/.well-known/did.json
# Test feed endpoint (requires authentication in production)
curl "http://localhost:3000/xrpc/app.bsky.feed.getFeedSkeleton?feed=at://did:web:your-domain.com/app.bsky.feed.generator/following-no-reposts&limit=10"
```
### Command-Line Options
```bash
# Override environment variables
./following-no-reposts-feed --port 8080 --hostname feed.example.com
# Run database migrations only
./following-no-reposts-feed migrate
# Publish feed to your Bluesky account
./following-no-reposts-feed publish \
--handle your-handle.bsky.social \
--password your-app-password \
--record-name following-no-reposts \
--display-name "Following (No Reposts)" \
--description "See posts from people you follow, without any reposts"
# Backfill posts from firehose (optional)
./following-no-reposts-feed backfill --cursor
```
## Deployment
### 1. Build for Production
```bash
cargo build --release --locked
```
### 2. Set Up Your Server
Transfer the binary to your server:
```bash
scp target/release/following-no-reposts-feed user@your-server:/opt/feed-generator/
```
### 3. Create a Systemd Service
Create `/etc/systemd/system/feed-generator.service`:
```ini
[Unit]
Description=Bluesky Feed Generator - Following No Reposts
After=network.target
[Service]
Type=simple
User=feedgen
WorkingDirectory=/opt/feed-generator
Environment="DATABASE_URL=sqlite:/opt/feed-generator/feed.db"
Environment="PORT=3000"
Environment="FEEDGEN_HOSTNAME=your-domain.com"
Environment="FEEDGEN_SERVICE_DID=did:web:your-domain.com"
ExecStart=/opt/feed-generator/following-no-reposts-feed
Restart=always
RestartSec=10
[Install]
WantedBy=multi-user.target
```
Enable and start the service:
```bash
sudo systemctl daemon-reload
sudo systemctl enable feed-generator
sudo systemctl start feed-generator
sudo systemctl status feed-generator
```
### 4. Configure Reverse Proxy
The feed generator must be accessible via HTTPS. Configure your reverse proxy:
#### Nginx
```nginx
server {
listen 443 ssl http2;
server_name your-domain.com;
ssl_certificate /etc/letsencrypt/live/your-domain.com/fullchain.pem;
ssl_certificate_key /etc/letsencrypt/live/your-domain.com/privkey.pem;
location / {
proxy_pass http://127.0.0.1:3000;
proxy_set_header Host $host;
proxy_set_header X-Real-IP $remote_addr;
proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
proxy_set_header X-Forwarded-Proto $scheme;
# WebSocket support for Jetstream
proxy_http_version 1.1;
proxy_set_header Upgrade $http_upgrade;
proxy_set_header Connection "upgrade";
}
}
```
#### Caddy
```caddyfile
your-domain.com {
reverse_proxy localhost:3000
}
```
### 5. Verify Deployment
```bash
# Test DID document
curl https://your-domain.com/.well-known/did.json
# Check if feed endpoint is accessible
curl https://your-domain.com/xrpc/app.bsky.feed.getFeedSkeleton
```
## Publishing Your Feed
Once your feed generator is deployed and accessible via HTTPS:
### Method 1: Using the Built-in Publish Command
```bash
./following-no-reposts-feed publish \
--handle your-handle.bsky.social \
--password your-app-password \
--record-name following-no-reposts \
--display-name "Following (No Reposts)" \
--description "See posts from people you follow, without any reposts" \
--avatar ./avatar.png
```
**Note**: Use an [App Password](https://bsky.app/settings/app-passwords), not your main account password!
### Method 2: Manual Publishing
1. **Get your DID**:
```bash
curl "https://bsky.social/xrpc/com.atproto.identity.resolveHandle?handle=yourhandle.bsky.social"
```
2. **Create a session**:
```bash
curl -X POST https://bsky.social/xrpc/com.atproto.server.createSession \
-H "Content-Type: application/json" \
-d '{"identifier": "yourhandle.bsky.social", "password": "your-app-password"}'
```
3. **Publish the feed generator record**:
```bash
curl -X POST https://bsky.social/xrpc/com.atproto.repo.putRecord \
-H "Authorization: Bearer YOUR_ACCESS_JWT" \
-H "Content-Type: application/json" \
-d '{
"repo": "your.did",
"collection": "app.bsky.feed.generator",
"rkey": "following-no-reposts",
"record": {
"$type": "app.bsky.feed.generator",
"did": "did:web:your-domain.com",
"displayName": "Following (No Reposts)",
"description": "See posts from people you follow, without any reposts",
"createdAt": "2025-01-01T00:00:00.000Z"
}
}'
```
### Finding Your Feed
After publishing, your feed will be available at:
```
https://bsky.app/profile/yourhandle.bsky.social/feed/following-no-reposts
```
## Architecture
### System Components
```
ββββββββββββββββ ββββββββββββββββ ββββββββββββββββ
β Jetstream ββββββββββΆβ Consumer ββββββββββΆβ Database β
β (WebSocket) β Events β (Async) β Store β (SQLite) β
ββββββββββββββββ ββββββββββββββββ ββββββββββββββββ
β
β Read
βΌ
ββββββββββββββββ ββββββββββββββββ ββββββββββββββββ
β Bluesky ββββββββββΆβ HTTP Server ββββββββββΆβ Feed β
β App β JWT β (Axum) β Query β Algorithm β
ββββββββββββββββ ββββββββββββββββ ββββββββββββββββ
```
### Code Structure
- **`main.rs`**: Application entry point, HTTP server setup, routing
- **`jetstream_consumer.rs`**: WebSocket client for Jetstream events
- **`database.rs`**: SQLite abstraction layer, queries, and migrations
- **`feed_algorithm.rs`**: Feed generation logic (filtering by follows, excluding reposts)
- **`auth.rs`**: JWT validation with ES256K signature verification
- **`backfill.rs`**: Optional historical data backfilling from firehose
- **`publish.rs`**: Feed generator publishing utilities
- **`admin_socket.rs`**: Unix socket for admin commands
- **`types.rs`**: Shared data structures
### Data Flow
1. **Event Ingestion**: Jetstream sends `commit` events when posts are created or follows happen
2. **Event Processing**: Consumer parses events and extracts relevant data
3. **Database Storage**: Posts and follows are stored in SQLite with TTL
4. **Feed Requests**: Bluesky app requests feed via `getFeedSkeleton`
5. **Authentication**: JWT is validated by resolving DID and verifying signature
6. **Feed Generation**: Algorithm queries posts from followed users, excludes reposts
7. **Response**: Ordered list of post URIs returned with pagination cursor
## API Endpoints
### `GET /.well-known/did.json`
Returns the DID document for the feed generator service.
**Response**:
```json
{
"@context": ["https://www.w3.org/ns/did/v1"],
"id": "did:web:your-domain.com",
"service": [{
"id": "#bsky_fg",
"type": "BskyFeedGenerator",
"serviceEndpoint": "https://your-domain.com"
}]
}
```
### `GET /xrpc/app.bsky.feed.getFeedSkeleton`
Returns a personalized feed skeleton for the authenticated user.
**Query Parameters**:
- `feed` (required): Feed AT-URI (e.g., `at://did:web:your-domain.com/app.bsky.feed.generator/following-no-reposts`)
- `limit` (optional): Number of posts (1-100, default: 50)
- `cursor` (optional): Pagination cursor
**Headers**:
- `Authorization`: Bearer JWT token from Bluesky app
**Response**:
```json
{
"feed": [
{"post": "at://did:plc:xxx/app.bsky.feed.post/abc123"},
{"post": "at://did:plc:yyy/app.bsky.feed.post/def456"}
],
"cursor": "1234567890"
}
```
## Performance
### Resource Usage
- **Memory**: ~50-100 MB (depends on database size)
- **CPU**: Minimal (<1% on modern hardware)
- **Bandwidth**: ~850 MB/day (Jetstream with compression)
- **Storage**: ~100-500 MB (48-hour post retention)
### Scalability
The feed generator can handle:
- **10,000+** active users
- **1M+** posts/day ingestion
- **100+** requests/second
### Database Optimization
```sql
-- Efficient indexes
CREATE INDEX idx_posts_author_did ON posts(author_did);
CREATE INDEX idx_posts_indexed_at ON posts(indexed_at);
CREATE INDEX idx_follows_follower ON follows(follower_did);
-- Automatic cleanup (posts > 48 hours old)
DELETE FROM posts WHERE indexed_at < datetime('now', '-2 days');
```
## Development
### Running Tests
```bash
cargo test
```
### Database Migrations
Create a new migration:
```bash
sqlx migrate add your_migration_name
```
Run migrations manually:
```bash
cargo install sqlx-cli --no-default-features --features sqlite
sqlx migrate run
```
### Logging
Set the `RUST_LOG` environment variable:
```bash
# Debug level (verbose)
RUST_LOG=debug cargo run
# Info level (default)
RUST_LOG=info cargo run
# Specific module logging
RUST_LOG=following_no_reposts_feed::jetstream_consumer=debug cargo run
```
### Code Quality
```bash
# Format code
cargo fmt
# Run linter
cargo clippy --all-targets
# Check for security vulnerabilities
cargo audit
```
## Troubleshooting
### Jetstream Connection Issues
**Problem**: Cannot connect to Jetstream
**Solutions**:
- Verify network connectivity: `ping jetstream1.us-east.bsky.network`
- Check firewall rules (port 443 outbound)
- Try alternative Jetstream servers
- Monitor logs for specific error messages
### Database Locked Errors
**Problem**: `database is locked` errors
**Solutions**:
- Ensure only one instance is running
- Check for long-running transactions
- Increase `busy_timeout` in database configuration
- Consider using WAL mode: `PRAGMA journal_mode=WAL;`
### Authentication Failures
**Problem**: JWT validation errors
**Solutions**:
- Verify `FEEDGEN_SERVICE_DID` matches your domain
- Check network access to `plc.directory` for DID resolution
- Enable debug logging: `RUST_LOG=following_no_reposts_feed::auth=debug`
- Verify your domain's HTTPS certificate is valid
### Feed Not Updating
**Problem**: Feed shows stale content
**Solutions**:
- Check Jetstream connection: look for "Connected to Jetstream" in logs
- Verify database is being updated: `sqlite3 feed.db "SELECT COUNT(*) FROM posts;"`
- Check for errors in logs: `journalctl -u feed-generator -n 100`
- Restart the service: `systemctl restart feed-generator`
### High Memory Usage
**Problem**: Memory usage growing over time
**Solutions**:
- Verify automatic cleanup is working: check `posts` table size
- Manually trigger cleanup: `DELETE FROM posts WHERE indexed_at < datetime('now', '-2 days');`
- Reduce retention period in code if needed
- Monitor with: `ps aux | grep following-no-reposts-feed`
### Debug Mode
Enable comprehensive debugging:
```bash
RUST_LOG=debug,hyper=info,tokio=info cargo run
```
Test endpoints directly:
```bash
# Test DID endpoint
curl -v https://your-domain.com/.well-known/did.json
# Test feed endpoint with authentication
curl -v -H "Authorization: Bearer YOUR_JWT" \
"https://your-domain.com/xrpc/app.bsky.feed.getFeedSkeleton?feed=at://did:web:your-domain.com/app.bsky.feed.generator/following-no-reposts&limit=5"
```
## Contributing
Contributions are welcome! Please:
1. Fork the repository
2. Create a feature branch (`git checkout -b feature/amazing-feature`)
3. Make your changes
4. Add tests if applicable
5. Run `cargo fmt` and `cargo clippy`
6. Commit your changes (`git commit -m 'Add amazing feature'`)
7. Push to the branch (`git push origin feature/amazing-feature`)
8. Open a Pull Request
### Development Guidelines
- Follow Rust best practices and idioms
- Add tests for new functionality
- Update documentation for user-facing changes
- Keep commits atomic and well-described
- Ensure CI passes before submitting PR
## License
This project is licensed under the GNU General Public License v3.0 - see the [LICENSE](LICENSE) file for details.
This ensures the code remains free and open source. If you modify and distribute this software, you must:
- Disclose your source code
- License your modifications under GPLv3
- State significant changes made
- Include the original copyright notice
## Resources
- [AT Protocol Documentation](https://atproto.com)
- [Bluesky API Reference](https://docs.bsky.app)
- [Jetstream Documentation](https://github.com/bluesky-social/jetstream)
- [ATrium Rust Library](https://github.com/sugyan/atrium)
- [Feed Generator Guide](https://docs.bsky.app/docs/starter-templates/custom-feeds)
## Acknowledgments
- Built with [ATrium](https://github.com/sugyan/atrium) - Rust libraries for AT Protocol
- Uses [Jetstream](https://github.com/bluesky-social/jetstream) for efficient event streaming
- Inspired by the Bluesky community's work on custom feeds
---
**Made with β€οΈ for the Bluesky community**