https://github.com/onepunchmonk/fix-feeder
An open-source, modular FIX protocol ingestion and monitoring engine that supports high-performance message processing, multi-source integration, real-time dashboards, and fault-tolerant, cloud-ready deployment for financial market data.
https://github.com/onepunchmonk/fix-feeder
docker flask github-actions grafana kafka postgresql prometheus pytest python rest-api yaml
Last synced: 2 months ago
JSON representation
An open-source, modular FIX protocol ingestion and monitoring engine that supports high-performance message processing, multi-source integration, real-time dashboards, and fault-tolerant, cloud-ready deployment for financial market data.
- Host: GitHub
- URL: https://github.com/onepunchmonk/fix-feeder
- Owner: OnePunchMonk
- Created: 2025-08-06T19:09:15.000Z (11 months ago)
- Default Branch: main
- Last Pushed: 2025-08-17T18:01:17.000Z (10 months ago)
- Last Synced: 2025-08-17T20:18:38.033Z (10 months ago)
- Topics: docker, flask, github-actions, grafana, kafka, postgresql, prometheus, pytest, python, rest-api, yaml
- Language: Python
- Homepage:
- Size: 527 KB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# FIX Feeder
[](https://github.com/OnePunchMonk/FIX-Feeder)
**Modular, Fault-Tolerant FIX Protocol Feed Ingestion & Monitoring Engine**
> **Throughput Metrics**
> • **Peak Throughput:** 139,534 messages/second
> • **Data Volume:** ~17.4 MB/second
> • **Connection Handling:** 20 concurrent connections sustained
> • **Memory Usage:** Stable throughout test duration
> • **Error Rate:** 0% (all messages processed successfully)
>
> **System Resource Utilization**
> • **CPU Usage:** ~45% peak during burst
> • **Memory:** ~200MB steady state
> • **Network I/O:** Saturated during peak load
> • **Disk I/O:** Efficient batch writes to database
>

FixFeeder is an open-source, extensible, and production-ready ingestion engine for the Financial Information Exchange (FIX) protocol. It is designed for high-performance, reliability, and observability, making it ideal for trading firms, quantitative analysts, and infrastructure engineers.
## 📋 Table of Contents
- [Problem Statement](#-problem-statement)
- [Tech Stack](#-tech-stack)
- [Features](#-features)
- [Architecture](#-architecture)
- [Data Flow](#-data-flow)
- [Getting Started](#-getting-started)
- [Sending Mock Data](#-sending-mock-data)
- [Stress Testing](#-stress-testing)
- [Monitoring & Observability](#-monitoring--observability)
- [Configuration](#-configuration)
- [API Reference](#-api-reference)
- [Future Roadmap](#-future-roadmap)
- [Contributing](#-contributing)
## 🎯 Problem Statement
Financial markets generate massive volumes of structured messages through protocols like FIX (Financial Information Exchange). Organizations face several challenges:
### Core Challenges
1. **Real-time Processing**: Markets operate at microsecond latencies, requiring systems that can ingest and process thousands of messages per second
2. **Data Reliability**: Financial data must be captured with 100% accuracy - missing or corrupted messages can result in significant financial losses
3. **Multi-Source Integration**: Data comes from various sources (TCP sockets, files, message queues) requiring unified processing
4. **Observability**: Systems must provide real-time monitoring, alerting, and audit trails for regulatory compliance
5. **Scalability**: Architecture must handle varying load patterns from market open/close surges
### Current Market Gap
Existing solutions are either:
- **Proprietary & Expensive**: Vendor-locked solutions costing hundreds of thousands annually
- **Limited Flexibility**: Rigid architectures that don't adapt to changing requirements
- **Poor Observability**: Black-box systems with limited monitoring capabilities
- **Complex Deployment**: Monolithic architectures difficult to deploy and maintain
### FixFeeder Solution
FixFeeder addresses these challenges by providing:
- **Open Source**: No vendor lock-in, fully customizable
- **Modular Architecture**: Plug-and-play components for different requirements
- **Production-Ready**: Built-in monitoring, logging, and fault tolerance
- **High Performance**: Capable of processing 100,000+ messages/second
- **Cloud-Native**: Containerized with Docker, ready for Kubernetes deployment
## 🔧 Tech Stack
### Core Technologies
- **Python 3.9+**: Primary runtime for rapid development and extensive ecosystem
- **Apache Kafka**: Distributed streaming platform for real-time message processing
- **PostgreSQL**: Primary database with JSONB support for flexible message storage
- **SQLite**: Development/testing database for local environments
- **Flask**: Web framework for dashboard and REST API
- **Prometheus**: Metrics collection and monitoring
- **Grafana**: Visualization and alerting dashboards
### Infrastructure & DevOps
- **Docker**: Containerization for consistent deployments
- **Docker Compose**: Multi-service orchestration
- **GitHub Actions**: CI/CD pipeline for automated testing
- **YAML**: Configuration management
### Python Libraries
- **kafka-python**: Kafka client library
- **psycopg2**: PostgreSQL adapter
- **prometheus_client**: Metrics instrumentation
- **PyYAML**: Configuration parsing
- **pytest**: Testing framework
## ✨ Features
### 🔌 Multiple Ingestion Sources
- **TCP Socket Listener**: Real-time FIX session connectivity
- **File Reader**: Batch processing of historical log files
- **Kafka Consumer**: Integration with existing streaming infrastructure
- **Replay Engine**: Simulated live feeds from historical data with configurable speed
### 🎯 Flexible Output Sinks
- **Kafka Producer**: Real-time streaming to downstream systems
- **Database Writer**: Persistent storage with PostgreSQL/SQLite support
- **Dual-sink Architecture**: Simultaneous streaming and archival
### 📊 Real-time Monitoring
- **Prometheus Metrics**: Message counts, error rates, processing latency
- **Grafana Dashboards**: Visual monitoring with pre-built panels
- **Custom Metrics**: Extensible monitoring framework
### 🖥️ Live Dashboard
- **Web Interface**: Real-time message viewer with filtering
- **REST API**: Programmatic access to message data and statistics
- **Audit Trail**: Complete message history for compliance
### 🔧 Extensible Design
- **Protocol Abstraction**: Easy addition of ITCH, OUCH, and other protocols
- **Pluggable Components**: Swap implementations without code changes
- **Configuration-Driven**: YAML-based configuration for all components
### ☁️ Cloud-Native
- **Containerized**: Docker images for consistent deployment
- **Orchestration**: Docker Compose for development, Kubernetes-ready
- **Scalable**: Horizontal scaling support for high-volume environments
## 🏗️ Architecture
FixFeeder follows a modular, pipeline-based architecture designed for scalability and maintainability:

```mermaid
graph TD
A[External FIX Source] -->|TCP Socket| B[SocketListener]
A2[Log File] -->|File Read| B2[FileReader]
A3[Kafka Topic] -->|Consume| B3[KafkaConsumer]
A4[Replay File] -->|Timed Replay| B4[ReplayEngine]
B --> C[FixParser]
B2 --> C
B3 --> C
B4 --> C
C -->|Parsed Message| D[KafkaProducer]
C -->|Parsed Message| E[DBWriter]
D -->|JSON Messages| F[Kafka Topic: fix.messages]
E -->|Structured Data| G[PostgreSQL/SQLite]
H[Prometheus Client] -->|Metrics| I[Prometheus Server]
I -->|Data| J[Grafana Dashboard]
K[Flask Dashboard] -->|Query| G
L[REST API] -->|Query| G
M[Monitor Module] -->|Counters| H
C -.->|Increment Counters| M
N[main.py] -->|Orchestrates| B
N -->|Manages| C
N -->|Starts| K
N -->|Starts| L
N -->|Starts| H
O[Signal Handler] -->|Graceful Shutdown| N
P[Configuration] -->|YAML Config| N
style A fill:#f9f9f9
style F fill:#e1f5fe
style G fill:#e8f5e8
style I fill:#fff3e0
style J fill:#fce4ec
```
For detailed architecture information, see [ARCHITECTURE.md](docs/ARCH.md).
### High-Level Components
1. **Source Layer**: Message ingestion from multiple sources
2. **Parser Layer**: Protocol-specific message parsing and enrichment
3. **Queue Layer**: Asynchronous message distribution via Kafka
4. **Storage Layer**: Persistent storage for audit and analysis
5. **Monitoring Layer**: Metrics collection and observability
6. **API Layer**: Web dashboard and REST API for access
## 🌊 Data Flow
Understanding how data flows through FixFeeder is crucial for optimization and troubleshooting:
### 1. Message Ingestion
```
External Source → FixFeeder Source Component
```
- **TCP Socket**: Live FIX sessions connect directly to SocketListener
- **File**: FileReader processes log files line by line
- **Kafka**: KafkaConsumer subscribes to raw message topics
- **Replay**: ReplayEngine simulates live feeds from historical data
### 2. Protocol Parsing
```
Raw Message → FixParser → Structured Message
```
- **Input**: Raw FIX string (e.g., `8=FIX.4.2\x0135=D\x01...`)
- **Processing**:
- Split by SOH delimiter (`\x01`)
- Parse tag=value pairs
- Validate message structure
- Apply filtering rules (if configured)
- Enrich with metadata (timestamps, human-readable values)
- **Output**: JSON structure with raw and enriched data
### 3. Message Distribution
```
Structured Message → [KafkaProducer, DBWriter]
```
- **Kafka Stream**: Real-time distribution to downstream consumers
- **Database**: Persistent storage for audit and analysis
- **Monitoring**: Metrics collection (success/error counters)
### 4. Data Access
```
Stored Data → [Dashboard, REST API]
```
- **Dashboard**: Web interface for real-time monitoring
- **API**: Programmatic access for external systems
- **Grafana**: Metrics visualization and alerting
### Message Structure Example
```json
{
"raw": {
"8": "FIX.4.2",
"35": "D",
"55": "AAPL",
"54": "1",
"38": "100",
"ingestion_timestamp": "2025-01-07T12:30:00.123456Z"
},
"enriched": {
"tags": {
"BeginString": {
"tag": "8",
"value": "FIX.4.2",
"hr_value": "FIX.4.2"
},
"MsgType": {
"tag": "35",
"value": "D",
"hr_value": "D (NewOrderSingle)"
},
"Symbol": {
"tag": "55",
"value": "AAPL",
"hr_value": "AAPL"
}
}
}
}
```
### Error Handling & Resilience
- **Parse Failures**: Logged and counted, don't stop processing
- **Connection Issues**: Automatic reconnection with exponential backoff
- **Graceful Shutdown**: Signal handling ensures clean resource cleanup
- **Data Consistency**: Transactional writes to prevent partial data
## 🚀 Getting Started
### Prerequisites
- **Docker & Docker Compose**: For containerized deployment
- **Git**: For cloning the repository
- **Python 3.9+**: For local development (optional)
### Quick Start
1. **Clone the repository:**
```bash
git clone https://github.com/your-username/FixFeeder.git
cd FixFeeder
```
2. **Configure FixFeeder:**
Review and modify `config/config.yml` for your environment:
```yaml
source:
type: socket # socket, file, kafka, replay
host: 0.0.0.0
port: 9876
storage:
type: postgresql # postgresql, sqlite
monitoring:
enabled: true
prometheus_port: 8000
```
3. **Launch the complete stack:**
```bash
docker-compose up --build
```
4. **Verify deployment:**
- **FixFeeder Socket**: `telnet localhost 9876`
- **Dashboard**: http://localhost:5001
- **Prometheus Metrics**: http://localhost:8000
- **Grafana**: http://localhost:3000 (admin/admin)
### Service Ports
| Service | Port | Description |
|---------|------|-------------|
| FixFeeder Socket | 9876 | FIX message ingestion |
| Dashboard | 5001 | Web interface |
| Prometheus | 8000 | Metrics endpoint |
| Grafana | 3000 | Monitoring dashboards |
| PostgreSQL | 5432 | Database access |
| Kafka | 9092 | Message streaming |
### Local Development
For development without Docker:
1. **Install dependencies:**
```bash
pip install -r requirements.txt
```
2. **Start external services:**
```bash
docker-compose up kafka postgres prometheus grafana
```
3. **Run FixFeeder locally:**
```bash
python main.py
```
## 📡 Sending Mock Data
### Method 1: Using netcat (Linux/Mac/WSL)
```bash
echo -e "8=FIX.4.2\x0135=D\x0155=BASH_TEST\x0111=CMD_123\x0154=1\x0138=100\x0152=20250807-00:30:00\x0110=042\x01" | nc localhost 9876
```
### Method 2: Using PowerShell (Windows)
```powershell
$message = "8=FIX.4.2`0135=D`0155=BASH_TEST`0111=CMD_123`0154=1`0138=100`0152=20250807-00:30:00`0110=042`01"
$bytes = [System.Text.Encoding]::UTF8.GetBytes($message)
$tcpClient = New-Object System.Net.Sockets.TcpClient("localhost", 9876)
$stream = $tcpClient.GetStream()
$stream.Write($bytes, 0, $bytes.Length)
$tcpClient.Close()
```
### Method 3: Using Python Script
```python
import socket
def send_fix_message(message, host='localhost', port=9876):
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
s.connect((host, port))
s.sendall(message.encode('utf-8'))
print(f"Sent: {message}")
# Example usage
fix_msg = "8=FIX.4.2\x0135=D\x0155=AAPL\x0111=ORDER_123\x0154=1\x0138=100\x0152=20250807-10:30:00\x0110=123\x01"
send_fix_message(fix_msg)
```
### Sample FIX Messages
**New Order Single (35=D):**
```
8=FIX.4.2\x0135=D\x0155=AAPL\x0154=1\x0138=100\x0140=2\x0144=150.50\x0111=ORD_001\x0152=20250807-10:30:00\x0110=123\x01
```
**Order Cancel Request (35=F):**
```
8=FIX.4.2\x0135=F\x0111=CANCEL_001\x0141=ORD_001\x0155=AAPL\x0154=1\x0138=100\x0152=20250807-10:31:00\x0110=456\x01
```
## ⚡ Stress Testing
FixFeeder includes a comprehensive stress testing tool to validate performance under high load conditions.
### Running Stress Tests
The stress test simulates multiple concurrent clients sending bursts of messages:
```bash
# Test with default settings (10,000 messages, 10 clients)
python stress_test.py
# Custom test configuration
python stress_test.py --messages 100000 --clients 20
```
### Stress Test Results
**Test Configuration:**
- **Messages**: 100,000 total
- **Clients**: 20 concurrent connections
- **Message Size**: ~125 bytes per message
- **Test Environment**: Local Docker setup
**Performance Results:**
```
Preparing to send 100000 messages using 20 concurrent clients...
All clients started. Triggering barrier to send messages now!
...
--- Stress Test Complete ---
Sent 100000 messages in 0.72 seconds.
Average throughput: 139,534.78 messages/sec.
```
### Performance Analysis
**Throughput Metrics:**
- **Peak Throughput**: 139,534 messages/second
- **Data Volume**: ~17.4 MB/second
- **Connection Handling**: 20 concurrent connections sustained
- **Memory Usage**: Stable throughout test duration
- **Error Rate**: 0% (all messages processed successfully)
**System Resource Utilization:**
- **CPU Usage**: ~45% peak during burst
- **Memory**: ~200MB steady state
- **Network I/O**: Saturated during peak load
- **Disk I/O**: Efficient batch writes to database
### Stress Test Architecture
The stress testing tool (`stress_test.py`) implements:
1. **Barrier Synchronization**: All clients start simultaneously for accurate throughput measurement
2. **Concurrent Connections**: Simulates real-world multi-client scenarios
3. **Realistic Messages**: Uses valid FIX message structures
4. **Performance Metrics**: Detailed timing and throughput analysis
### Interpreting Results
**Key Performance Indicators:**
- **Messages/Second**: Primary throughput metric
- **Latency Distribution**: P50, P95, P99 processing times
- **Error Rate**: Parse failures or connection drops
- **Resource Efficiency**: CPU/Memory usage per message
**Scaling Considerations:**
- **Linear Scaling**: Performance scales with CPU cores
- **Memory Bounded**: Large message volumes require memory tuning
- **Network Limited**: Gigabit networks can become bottleneck
- **Database Performance**: Bulk inserts optimize storage throughput
## 📊 Monitoring & Observability
### Prometheus Metrics
FixFeeder exposes comprehensive metrics for monitoring:
```
# Message processing metrics
fix_messages_ingested_total{} 12450
fix_parse_errors_total{} 3
# System health metrics
fixfeeder_uptime_seconds{} 3600
fixfeeder_active_connections{} 5
```
### Grafana Dashboards
Pre-configured dashboards provide visual monitoring:
- **Message Throughput**: Real-time ingestion rates
- **Error Tracking**: Parse failures and system errors
- **Connection Monitoring**: Active socket connections
- **Resource Utilization**: CPU, memory, and network usage
### Log Analysis
Structured logging provides detailed operational insights:
```
2025-01-07 10:30:00 INFO [SocketListener] Connected by ('192.168.1.100', 54321)
2025-01-07 10:30:00 DEBUG [FixParser] Parsed message: {"35": "D", "55": "AAPL"}
2025-01-07 10:30:00 INFO [DBWriter] Wrote message to database
```
## ⚙️ Configuration
FixFeeder uses YAML configuration for all components:
```yaml
# config/config.yml
# Message source configuration
source:
type: socket # socket, file, kafka, replay
host: 0.0.0.0 # Bind address
port: 9876 # TCP port
file_path: ./logs/fix_sample.log # For file source
kafka_brokers: ["kafka:9092"] # For Kafka source
kafka_topic: "fix-raw-in" # Input topic
# Message filtering
filter:
enabled: true
include_only:
"55": ["AAPL", "GOOG", "MSFT"] # Only these symbols
# Output configuration
queue:
enabled: true
type: kafka
brokers: ["kafka:9092"]
topic: fix.messages
storage:
enabled: true
type: postgresql
postgres_config:
user: "user"
password: "password"
host: "postgres"
port: 5432
database: "fixfeeder"
# Monitoring
monitoring:
enabled: true
prometheus_port: 8000
dashboard:
enabled: true
host: 0.0.0.0
port: 5001
# Logging
core:
log_level: INFO # DEBUG, INFO, WARNING, ERROR
```
### Environment-Specific Configuration
Create environment-specific configurations:
```bash
# Development
cp config/config.yml config/config.dev.yml
# Production
cp config/config.yml config/config.prod.yml
```
Specify configuration file:
```bash
export FIXFEEDER_CONFIG=config/config.prod.yml
python main.py
```
## 🔌 API Reference
### REST API Endpoints
**Get Recent Messages:**
```http
GET /api/messages?limit=50
```
**Get Specific Message:**
```http
GET /api/message/12345
```
**Response Format:**
```json
{
"id": 12345,
"msg_seq_num": 42,
"sending_time": "2025-01-07T10:30:00Z",
"message": {
"raw": {"8": "FIX.4.2", "35": "D"},
"enriched": {...}
}
}
```
### Metrics Endpoints
**Prometheus Metrics:**
```http
GET :8000/metrics
```
**Health Check:**
```http
GET :5001/health
```
## 🛣️ Future Roadmap
### Phase 1: Enhanced Protocol Support
- **ITCH Protocol**: NASDAQ market data feeds
- **OUCH Protocol**: Order entry system integration
- **FIX/FAST**: Compressed FIX message support
- **WebSocket Sources**: Modern real-time data feeds
### Phase 2: Advanced Processing
- **Message Validation**: Schema-based validation engine
- **Complex Event Processing**: Multi-message pattern detection
- **Machine Learning**: Anomaly detection and prediction
- **Real-time Analytics**: Stream processing with Apache Flink
### Phase 3: Enterprise Features
- **Multi-tenancy**: Isolated environments per organization
- **Authentication**: OAuth2/SAML integration
- **Encryption**: End-to-end message encryption
- **Compliance**: SOX/RegT audit trail capabilities
### Phase 4: Cloud & Scale
- **Kubernetes**: Native cloud deployment
- **Auto-scaling**: Dynamic resource allocation
- **Global Distribution**: Multi-region deployment
- **Edge Processing**: Latency-optimized edge nodes
### Phase 5: Advanced Analytics
- **Time Series Database**: InfluxDB integration
- **Data Lake**: S3/HDFS long-term storage
- **Business Intelligence**: Tableau/PowerBI connectors
- **Algorithmic Trading**: Strategy backtesting framework
## 🤝 Contributing
We welcome contributions from the community! Here's how to get started:
### Development Setup
1. **Fork the repository**
2. **Clone your fork:**
```bash
git clone https://github.com/your-username/FixFeeder.git
```
3. **Create a feature branch:**
```bash
git checkout -b feature/your-feature-name
```
4. **Install development dependencies:**
```bash
pip install -r requirements.txt
pip install -r requirements-dev.txt
```
### Running Tests
```bash
# Unit tests
pytest tests/
# Integration tests
pytest tests/test_integration.py
# Coverage report
pytest --cov=. tests/
```
### Code Quality
We maintain high code quality standards:
```bash
# Code formatting
black .
# Linting
flake8 .
# Type checking
mypy .
```
### Contribution Guidelines
- **Code Style**: Follow PEP 8 with Black formatting
- **Testing**: Maintain 80%+ test coverage
- **Documentation**: Update docs for new features
- **Performance**: Benchmark performance-critical changes
- **Security**: Follow security best practices
### Submitting Changes
1. **Ensure tests pass:**
```bash
pytest tests/
```
2. **Update documentation**
3. **Submit pull request with:**
- Clear description of changes
- Test coverage for new features
- Performance impact analysis
- Breaking change notifications
## 📄 License
This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.
## 🆘 Support
- **GitHub Issues**: Bug reports and feature requests
- **Discussions**: Community Q&A and ideas
- **Documentation**: Comprehensive guides and API docs
- **Examples**: Sample configurations and use cases
---
*FixFeeder is designed by financial engineers, for financial engineers. We understand the unique challenges of market data processing and are building a solution that scales with your business.*