# Real-Time Log Processing API

## Project Overview

This project is a **Real-Time Log Processing API** built with **FastAPI**, **Celery**, **Redis**, and **MongoDB**. It accepts incoming log data, stores it in a MongoDB database, and hands further work (e.g., filtering sensitive information) to Celery workers that run in the background. The services are containerized with **Docker**, so the API and the workers can be deployed and scaled as separate containers. Possible applications include centralized log processing, fraud detection, and other asynchronous data processing.

The project demonstrates core **DevOps** principles like **containerization**, **task queuing**, **asynchronous processing**, and **microservice architecture**. It can be extended for logging, monitoring, and background data processing use cases in production environments.
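As an illustration of the "filtering sensitive information" step mentioned above, here is a minimal sketch in plain Python. The function name `redact`, the key set `SENSITIVE_KEYS`, and the mask string are assumptions made for this example; the project's actual Celery task may work differently.

```python
# Hypothetical set of field names to mask; adjust to your own log schema.
SENSITIVE_KEYS = {"email", "password", "token"}


def redact(log_data, sensitive_keys=SENSITIVE_KEYS, mask="***REDACTED***"):
    """Return a copy of log_data with sensitive values masked.

    Recurses into nested dicts and lists; the input is not modified.
    """
    if isinstance(log_data, dict):
        return {
            key: mask
            if str(key).lower() in sensitive_keys
            else redact(value, sensitive_keys, mask)
            for key, value in log_data.items()
        }
    if isinstance(log_data, list):
        return [redact(item, sensitive_keys, mask) for item in log_data]
    # Strings, numbers, booleans, and None pass through unchanged.
    return log_data
```

In a worker, a function like this would typically run on the stored log entry before the processed version is written back.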

---

## Features

- **FastAPI Backend**: A REST API to submit logs and retrieve log processing statuses.
- **Celery with Redis**: Asynchronous task processing to handle large amounts of data.
- **MongoDB**: Database for storing log entries.
- **Docker Compose**: Containerized architecture with easy deployment using Docker.
- **Asynchronous Log Processing**: Processing logs in the background without blocking the main API.

---

## Real-World Use Cases

1. **Centralized Logging and Monitoring**:
   - The API can be integrated into large-scale applications for real-time log aggregation and analysis.
   - It could work with a central logging service where logs from various services are processed and stored in a centralized database for monitoring and alerting purposes.

2. **Fraud Detection and Monitoring**:
   - The background task can analyze log entries for patterns of fraudulent activity, such as multiple failed login attempts or unusual transaction volumes.
   - In the financial or e-commerce sector, this system could be extended to alert when suspicious activity is detected.

3. **Asynchronous Data Processing**:
   - The same pattern can be used in systems where heavy data processing (e.g., image processing, video encoding) needs to happen in the background without blocking the main user-facing API.
   - The system can handle tasks like sending out emails, notifications, or generating reports asynchronously.
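The "multiple failed login attempts" pattern from the fraud detection use case could be detected with a sliding window over recent failures. The sketch below is one possible approach, not part of the project; the class name `FailedLoginMonitor` and its thresholds are hypothetical, and it assumes timestamps arrive in non-decreasing order.

```python
from collections import defaultdict, deque


class FailedLoginMonitor:
    """Flag a user once they reach max_failures failed logins within window_seconds."""

    def __init__(self, max_failures=3, window_seconds=60.0):
        self.max_failures = max_failures
        self.window_seconds = window_seconds
        # username -> timestamps (in seconds) of that user's recent failures
        self._failures = defaultdict(deque)

    def record(self, username, timestamp, success):
        """Record one login attempt; return True if the user now looks suspicious."""
        if success:
            return False
        recent = self._failures[username]
        recent.append(timestamp)
        # Drop failures that have fallen out of the sliding window.
        while recent and timestamp - recent[0] > self.window_seconds:
            recent.popleft()
        return len(recent) >= self.max_failures
```

A background task could call `record` for each `login_attempt` log entry and raise an alert when it returns `True`.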

---

## Architecture

- **FastAPI**: Handles incoming requests and submits logs for processing.
- **Celery**: Manages background tasks for log processing.
- **Redis**: Acts as the message broker and backend for Celery task states.
- **MongoDB**: Stores processed and raw log data.
- **Docker Compose**: Simplifies the deployment and orchestration of these services as separate containers.
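The interaction between these components can be illustrated with a small in-memory model. This is only a sketch of the submit-then-poll pattern: a `queue.Queue` stands in for the Redis broker, a dict for the task-state backend, a list for MongoDB, and a thread for a Celery worker. None of the names below come from the project.

```python
import queue
import threading
import uuid

logs = []                # stands in for the MongoDB collection
tasks = {}               # stands in for the task-state backend (Redis)
pending = queue.Queue()  # stands in for the message broker (Redis)


def submit_log(log_data):
    """API side: store the raw log, enqueue the work, and return immediately."""
    task_id = str(uuid.uuid4())
    logs.append(log_data)
    tasks[task_id] = {"status": "PENDING", "result": None}
    pending.put((task_id, log_data))
    return task_id


def worker():
    """Worker side: pull jobs off the queue and record their outcome."""
    while True:
        item = pending.get()
        if item is None:  # sentinel value: shut the worker down
            pending.task_done()
            return
        task_id, log_data = item
        try:
            result = {"field_count": len(log_data)}  # placeholder "processing"
            tasks[task_id] = {"status": "SUCCESS", "result": result}
        except Exception as exc:
            tasks[task_id] = {"status": "FAILURE", "result": str(exc)}
        finally:
            pending.task_done()
```

The key property is that `submit_log` returns a task ID before any processing happens; the caller looks up the result later, which is what the `/task/{task_id}` endpoint does in the real service.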

---

## Prerequisites

Ensure you have the following installed on your machine:

1. **Docker**: To run the containerized services.
2. **Docker Compose**: To orchestrate multiple containers.

---

## Project Setup

1. Clone the repository:

```bash
git clone https://github.com/rufilboss/real-time-log-processing-api.git
cd real-time-log-processing-api
```

2. Create a `.env` file in the root of the project (if needed), specifying environment variables for the API, MongoDB, Redis, etc. Example:

```sh
APP_NAME=real-time-log-processing-api
LOG_LEVEL=INFO
MONGO_URI=mongodb://mongo:27017/log_database
REDIS_URL=redis://redis:6379/0
```

3. Start the application using Docker Compose:

```bash
docker-compose up --build
```

This command will start the following services:
- **FastAPI** backend
- **Celery** workers
- **MongoDB**
- **Redis**
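Application code might read the `.env` values from step 2 roughly as follows. This is a sketch that assumes Docker Compose passes the variables into the container's environment; the `Settings` class and `load_settings` function are hypothetical names, and the defaults simply mirror the example values above.

```python
import os
from dataclasses import dataclass


@dataclass(frozen=True)
class Settings:
    app_name: str
    log_level: str
    mongo_uri: str
    redis_url: str


def load_settings(env=None):
    """Build Settings from a mapping (os.environ by default).

    Missing variables fall back to the example values from the README.
    """
    env = os.environ if env is None else env
    return Settings(
        app_name=env.get("APP_NAME", "real-time-log-processing-api"),
        log_level=env.get("LOG_LEVEL", "INFO").upper(),
        mongo_uri=env.get("MONGO_URI", "mongodb://mongo:27017/log_database"),
        redis_url=env.get("REDIS_URL", "redis://redis:6379/0"),
    )
```

Accepting the mapping as a parameter keeps the function easy to test without touching the real environment.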

---

## Step-by-Step Guide to Test the API

### 1. Submit a Log for Processing

To submit a log for processing, send a POST request to the `/logs` endpoint with the log data (either JSON or plain text).

#### Example (JSON Log Data)

```http
POST http://localhost:8000/logs
Content-Type: application/json

{
  "log_data": {
    "username": "john_doe",
    "email": "john@example.com",
    "activity": "login_attempt"
  }
}
```

Response:

```json
{
  "status": "Log stored and processing",
  "log_id": "64f64e0e924b81b597ca24f1",
  "task_id": "1b41b21d-dc5f-4ac9-9da4-6353b2de70a7"
}
```
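The same request can be sent from Python using only the standard library. The sketch below assumes the API is reachable at `http://localhost:8000`, as in the example; the helper names `build_log_request` and `post_log` are made up for illustration.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8000"  # taken from the example request


def build_log_request(log_data, base_url=BASE_URL):
    """Build (but do not send) the POST /logs request for the given log data."""
    body = json.dumps({"log_data": log_data}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/logs",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_log(log_data, base_url=BASE_URL, timeout=5.0):
    """Send the log to the API and return the decoded JSON response."""
    request = build_log_request(log_data, base_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With the services running, `post_log({"username": "john_doe", "activity": "login_attempt"})` should return a dictionary containing the `log_id` and `task_id` fields shown in the response above.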

### 2. Check the Status of the Task

To check the status of a log processing task, send a GET request to the `/task/{task_id}` endpoint using the `task_id` you received in the POST response.

#### Example

```http
GET http://localhost:8000/task/1b41b21d-dc5f-4ac9-9da4-6353b2de70a7
```

Response:

```json
{
  "task_id": "1b41b21d-dc5f-4ac9-9da4-6353b2de70a7",
  "status": "PENDING",
  "result": null
}
```
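Because the task runs in the background, a client usually has to poll this endpoint until the task finishes. A minimal polling sketch follows; it assumes the `status` field carries Celery's standard state names (`SUCCESS`, `FAILURE`, and `REVOKED` being the finished ones), and `fetch_task` and `wait_for_task` are hypothetical helper names.

```python
import json
import time
import urllib.request

# Celery states after which the task will not change any more.
TERMINAL_STATES = {"SUCCESS", "FAILURE", "REVOKED"}


def fetch_task(task_id, base_url="http://localhost:8000", timeout=5.0):
    """GET /task/{task_id} and return the decoded JSON body."""
    url = f"{base_url}/task/{task_id}"
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.load(response)


def wait_for_task(task_id, fetch=fetch_task, interval=1.0, max_attempts=30,
                  sleep=time.sleep):
    """Poll until the task reaches a terminal state, else raise TimeoutError."""
    for attempt in range(max_attempts):
        info = fetch(task_id)
        if info.get("status") in TERMINAL_STATES:
            return info
        if attempt < max_attempts - 1:
            sleep(interval)  # wait before the next poll
    raise TimeoutError(f"task {task_id} not finished after {max_attempts} polls")
```

The `fetch` and `sleep` parameters are injectable so the loop can be exercised without a running server.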

### 3. Check the Stored Logs

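Stored logs can be inspected directly in MongoDB. The commands below open a MongoDB shell inside the running container, so no local MongoDB tools are required. They assume the container is named `mongo`; check `docker ps` if yours differs.

```bash
# Open a shell in the container (on MongoDB 6.0+ images, use `mongosh` instead of `mongo`)
docker exec -it mongo mongo

# Then, at the MongoDB shell prompt:
use log_database
db.logs.find()
```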

---

## Troubleshooting

- **Redis Connection Error**: If you encounter a connection issue with Redis, ensure that the Redis container is running and accessible. Check your `.env` file or the Docker Compose file for correct Redis configurations.
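- **Pending Task Status**: If the task remains in a "PENDING" state for too long, check the Celery worker logs to ensure that the worker is running and properly connected to Redis. Celery also reports "PENDING" for any task ID it has no record of, so verify that the `task_id` in the URL matches the one returned by the POST request.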
- **MongoDB Connection Error**: Ensure that the MongoDB container is running and the connection URI in the `.env` file is correct.
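For connection errors, a quick TCP reachability check can help separate "service is down" from "URI is wrong". The following sketch uses only the standard library; `host_port` and `can_connect` are hypothetical helpers, and the default ports are the usual ones for Redis and MongoDB. Note that the hostnames `redis` and `mongo` normally resolve only inside the Docker Compose network, so run the check from a container or substitute `localhost` if the ports are published.

```python
import socket
from urllib.parse import urlsplit

# Conventional default ports, used when the URL does not specify one.
DEFAULT_PORTS = {"redis": 6379, "mongodb": 27017}


def host_port(url):
    """Extract (host, port) from a URL such as redis://redis:6379/0."""
    parts = urlsplit(url)
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    if parts.hostname is None or port is None:
        raise ValueError(f"cannot determine host and port from {url!r}")
    return parts.hostname, port


def can_connect(url, timeout=2.0):
    """Return True if a TCP connection to the URL's host and port succeeds."""
    host, port = host_port(url)
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False
```

This only confirms that something is listening on the port; it does not verify credentials or that the service is healthy.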

---

## Conclusion

This project demonstrates a microservice-based architecture for real-time log processing using Docker, FastAPI, Celery, Redis, and MongoDB. It showcases core DevOps principles such as containerization, task orchestration, and scalable background processing. The API can be easily extended to meet the requirements of real-world applications, including centralized logging, fraud detection, and large-scale data processing.

## LICENSE

[MIT](License)