https://github.com/couchbase-examples/cpg-manufacturing-edge-ai-assistant



# CPG Manufacturing AI Assistant

A comprehensive AI-powered system for Consumer Packaged Goods (CPG) manufacturing operations, featuring real-time telemetry generation, intelligent troubleshooting, and edge computing capabilities.

## 🏗️ System Architecture

![System Architecture Overview](asserts/architecture.png)

### Detailed Component Architecture

**Device Layer (Raspberry Pi):**
The IoT device runs the `CPGIoTTelemetryGenerator` application, which simulates a production machine by generating real-time sensor readings, alerts, and quality metrics. This data is stored locally in a **Couchbase Lite** database.

**Edge Layer (On-prem Gateway, plant storage):**
This on-prem layer runs Couchbase Edge Server in Docker at each plant, providing low-latency plant-level storage and acting as a gateway/cache. It replicates data upstream exactly as generated, with no aggregation or transformation.

- **Couchbase Edge Server:** Runs on-prem in Docker at each plant.
  - **Pushes** operational data (`telemetry`, `alerts`, `quality_metrics`, `device_status`) upstream to Capella via Sync Gateway.
  - **Pulls** configuration (`plants`, `production_lines`, `machines`) **and solution** documents down from Capella so they are available locally even when offline.
  - Stores operational records locally for a configurable retention period; configuration & solution documents are cached permanently.
  - Exposes a REST API (`:59840`) that the `CPGCLIAgent` uses to retrieve real-time operational and hierarchical information.

**Sync Layer (Cloud Data Synchronization):**
This cloud-hosted layer (Capella App Services) bridges the Edge and Cloud tiers, handling secure replication.

- **App Services (Sync Gateway):** Runs in Couchbase Capella Cloud and acts as the secure synchronization intermediary between Edge Servers and Capella. It uses channel-based replication on port `4984` to route data.

**Cloud Layer (Central & AI):**

- **Couchbase Capella:** The central cloud database. It serves as the master source of truth for knowledge data (`manuals`, `solutions`), configuration data (`plants`, `machines`), and operational data for analytics.
- **Python Setup System:** A set of scripts used for one-time data initialization, including processing manuals and generating sample data in Capella.
- **Google Gemini AI:** An external Large Language Model, integrated via **LangChain4j**, that provides natural language understanding and reasoning for the CLI agent.

**CPG CLI Agent (Mobile/Field Application):**
This application represents a user-facing tool (like a mobile or tablet app for a field technician). It does *not* run on the Edge Server. It has two primary toolsets for interacting with the system:

- **EdgeServerTools:** Makes API calls to the **Couchbase Edge Server** to query real-time operational data, alerts, solutions and hierarchical information like plant and machine details.
- **ManualSearchTools:** Uses a Couchbase Lite **replicator** to pull `manuals` from Capella (via App Services) into a local CBLite database, where an **on-device vector index is built during initialization**. This enables fast, offline-capable semantic search of documentation.
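
To make the semantic-search step concrete, the following is a minimal sketch of what a nearest-neighbour lookup over manual chunks does conceptually. It is plain Python with brute-force cosine similarity, not the Couchbase Lite vector index that `ManualSearchTools` builds; the `search_manuals` function, the toy three-dimensional embeddings, and the chunk contents are hypothetical. Only the field names (`chunk_id`, `content`, `embedding`) come from the `manuals` collection.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def search_manuals(query_embedding, chunks, top_k=2):
    """Return the top_k chunks most similar to the query (brute-force scan)."""
    scored = [(cosine_similarity(query_embedding, c["embedding"]), c) for c in chunks]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [chunk for _, chunk in scored[:top_k]]


# Hypothetical manual chunks with toy 3-dimensional embeddings.
chunks = [
    {"chunk_id": "chunk-1", "content": "CO2 pressure calibration", "embedding": [0.9, 0.1, 0.0]},
    {"chunk_id": "chunk-2", "content": "Label alignment", "embedding": [0.0, 1.0, 0.0]},
    {"chunk_id": "chunk-3", "content": "Fill level adjustment", "embedding": [0.7, 0.0, 0.7]},
]

results = search_manuals([1.0, 0.0, 0.0], chunks)
```

A real vector index avoids the full scan, but the ranking idea is the same: the query text is embedded once and compared against the stored `embedding` of every chunk.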

### Data Flow Summary

- **Operational Data (Upward Push):** `telemetry`, `alerts`, `quality_metrics`, and `device_status` flow from the Raspberry Pi's CBLite DB → Edge Server → App Services, and are stored in Capella.
- **Knowledge Data (Downward Pull):**
  - `solutions` are pulled from Capella to the Edge Server.
  - `manuals` are pulled from Capella all the way to the CPG CLI Agent's local CBLite database for vector search.
- **Configuration Data (Downward Pull):** `plants`, `production_lines`, and `machines` are pulled from Capella to the Edge Server so the `CPGCLIAgent` can query them.
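
The summary above can be restated as a small lookup table. This sketch is only an illustration of the routing rules; the `ROUTES` table and `route_for` helper are hypothetical names and do not exist in the project.

```python
# Replication route for each collection, transcribed from the data flow summary.
# Each value is (direction, final destination).
ROUTES = {
    "telemetry": ("push", "capella"),
    "alerts": ("push", "capella"),
    "quality_metrics": ("push", "capella"),
    "device_status": ("push", "capella"),
    "solutions": ("pull", "edge_server"),
    "plants": ("pull", "edge_server"),
    "production_lines": ("pull", "edge_server"),
    "machines": ("pull", "edge_server"),
    "manuals": ("pull", "cli_agent"),
}


def route_for(collection):
    """Return (direction, final destination) for a collection name."""
    try:
        return ROUTES[collection]
    except KeyError:
        raise ValueError(f"Unknown collection: {collection}") from None


# Collections that originate on the device and flow up to Capella.
pushed = sorted(name for name, (direction, _) in ROUTES.items() if direction == "push")
```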

## 📊 Data Collections & Relationships

```mermaid
erDiagram
PLANTS {
string plant_id PK
string name
string location
string plant_code
string facility_type
string timezone
string status
string created_date
}

PRODUCTION_LINES {
string production_line_id PK
string plant_id FK
string name
string line_type
string primary_product
string status
string created_date
}

MACHINES {
string machine_id PK
string device_id
string production_line_id FK
string plant_id FK
string machine_type
string product_type
string model
string serial_number
string status
string created_date
}

TELEMETRY {
string device_id
string machine_id FK
string machine_type
string sensor_type
number value
string unit
string timestamp
string location
string status
number batch_number
string product_code
string production_line
string product_type
}

ALERTS {
string alert_id PK
string device_id
string machine_id FK
string machine_type
string error_code
string severity
string status
string timestamp
string production_line
string product_type
number batch_number
string product_code
number current_value
number threshold
boolean escalation_required
object context
}

QUALITY_METRICS {
string device_id
string machine_id FK
string machine_type
number fill_accuracy_percentage
number label_alignment_percentage
number cap_torque_inch_pounds
string quality_grade
string timestamp
number batch_number
string product_code
string production_line
string product_type
}

DEVICE_STATUS {
string device_id PK
string timestamp
string status
number uptime_seconds
number memory_usage_mb
number cpu_usage_percent
number disk_usage_percent
string network_status
string sync_status
number documents_pending_sync
}

SOLUTIONS {
string solution_id PK
string error_code
string solution_comment
array embeddings
number success_rate
string machine_type
string created_at
string updated_at
}

MANUALS {
string chunk_id PK
string content
array embedding
object metadata
string created_at
string last_updated
}

PLANTS ||--o{ PRODUCTION_LINES : contains
PRODUCTION_LINES ||--o{ MACHINES : contains
MACHINES ||--o{ TELEMETRY : generates
MACHINES ||--o{ ALERTS : produces
MACHINES ||--o{ QUALITY_METRICS : measures
ALERTS ||--o{ SOLUTIONS : references
MANUALS ||--|| MACHINES : documents
```
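
As an illustration of the `ALERTS` to `SOLUTIONS` relationship, the sketch below looks up candidate solutions for an alert. The diagram does not name the join key, so matching on the shared `error_code` and `machine_type` fields is an assumption, as are the `solutions_for_alert` helper and all sample values.

```python
def solutions_for_alert(alert, solutions):
    """Return solutions matching the alert's error code and machine type, best first."""
    matches = [
        s for s in solutions
        if s["error_code"] == alert["error_code"]
        and s["machine_type"] == alert["machine_type"]
    ]
    # Prefer the solutions that have worked most often.
    return sorted(matches, key=lambda s: s["success_rate"], reverse=True)


# Hypothetical documents using field names from the diagram.
alert = {"alert_id": "alert-1", "error_code": "E101", "machine_type": "beverage_filling_line"}
solutions = [
    {"solution_id": "sol-1", "error_code": "E101", "machine_type": "beverage_filling_line", "success_rate": 0.6},
    {"solution_id": "sol-2", "error_code": "E101", "machine_type": "beverage_filling_line", "success_rate": 0.9},
    {"solution_id": "sol-3", "error_code": "E202", "machine_type": "beverage_filling_line", "success_rate": 0.8},
]

ranked = solutions_for_alert(alert, solutions)
```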

## 🚀 Quick Start

### Prerequisites

- **Docker & Docker Compose** (20.10+)
- **Java 21+** (for local development)
- **Python 3.8+** (for setup scripts)
- **Google Gemini API Key** (for AI features)
- **Couchbase Capella Account** (for cloud sync)

### 1. Cloud Setup (Optional but Recommended)

#### Setup Couchbase Capella

1. **Create Capella Cluster:**

```bash
# Visit https://cloud.couchbase.com/
# Sign up/in and create a new cluster
# Choose "Developer Pro" or higher for App Services
```

2. **Create Database and Initialize Data:**

In the Capella UI, create a bucket named `cpg_manufacturing` and enable App Services.

After creating the bucket, run the setup script to create the required collections, scope, and synthetic data. This script requires Python dependencies.

```bash
# Install Python dependencies
pip install -r setup/requirements.txt

# Run system setup (creates collections, scope, indexes, and sample data)
cd setup
python setup_system.py
```

3. **Configure App Services:**

```bash
# In App Services section:
# 1. Create App Service endpoint in Capella.
# 2. Link all the collections to be synced: plants, production_lines, machines, solutions, telemetry, alerts, quality_metrics, device_status.
# 3. Create an App User under App Users.
# 4. In Access Control Validation, set custom sync scripts for plants, production_lines, and machines collections under custom scripts. For other collections, use the default script. These can be found in the "custom_scripts" directory in the project.
# 5. Go to App Users, select the user you created, and set the admin channel for the solutions collection to "solutions" and manuals collection to "manuals".
# 6. Update the App Services configuration to include these settings.
# 7. Resync all the collections
```

For a step-by-step video guide, see:

4. **Update Configuration:**

```bash
# Edit edge-server-config.json with your Capella details:
# - Replace "your-cluster.apps.cloud.couchbase.com" with actual URL
# - Update username/password in replication config
# - Configure channels and collection sync
```

### 2. Environment Setup

```bash
# Clone the repository
git clone https://github.com/couchbase-examples/cpg-manufacturing-edge-ai-assistant
cd cpg-manufacturing-edge-ai-assistant

# Set the environment variables (add these exports to setup.sh)
export GOOGLE_GEMINI_API_KEY="your-api-key-here"
export APP_SERVICES_ENDPOINT="your-cluster.apps.cloud.couchbase.com"
export APP_SERVICES_USERNAME="your-app-services-username"
export APP_SERVICES_PASSWORD="your-app-services-password"
export COUCHBASE_CONNECTION_STRING="your-couchbase-connection-string"
export COUCHBASE_USERNAME="your-couchbase-username"
export COUCHBASE_PASSWORD="your-couchbase-password"
export COUCHBASE_BUCKET_NAME="cpg_manufacturing"
export APP_NAME="CPG Manufacturing AI Assistant"

# Make scripts executable
chmod +x run-edge-server.sh run-java-app.sh setup.sh
# Source setup.sh so the variables are set in the current shell
source setup.sh
```

### 3. Start Edge Server

```bash
# Start Couchbase Edge Server
./run-edge-server.sh --build

# Verify Edge Server is running
curl http://localhost:59840/
```

### 4. Start IoT Data Generation

```bash
# Start telemetry data generation
./run-java-app.sh

# Or run locally with Maven
./run-java-app.sh --local
```

### 5. Use CLI Agent

```bash
# Interactive mode
mvn exec:java -Dexec.mainClass="com.couchbase.example.cli_agent.CPGCLIAgent"

```

## 📋 Usage Examples

### CLI Agent Commands

**Operational Data Queries:**

```bash
# Get machine telemetry
"Why did MACHINE_001 stop?"

# Check production alerts
"Show active alerts for beverage filling line"

# Device status check
"Check device status for rpi-cpg-line1-01"

# Quality metrics
"Get quality metrics for MACHINE_001"
```

**Documentation Search:**

```bash
# Search procedures
"Search manuals for CO2 pressure calibration"

# Find solutions
"Find solutions for fill level problems"

# Troubleshooting guides
"Get troubleshooting procedures for beverage_filling_line"
```

### REST API Endpoints

**Edge Server APIs:**

```bash
# Database info
GET http://localhost:59840/cpg_manufacturing

# Query telemetry
POST http://localhost:59840/cpg_manufacturing/_query
{
"query": "SELECT * FROM manufacturing.telemetry WHERE machine_id = 'MACHINE_001' ORDER BY timestamp DESC LIMIT 10"
}

# Get active alerts
POST http://localhost:59840/cpg_manufacturing/_query
{
"query": "SELECT * FROM manufacturing.alerts WHERE status = 'active' ORDER BY severity DESC"
}
```
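
The same query call can be prepared from Python. This is a minimal sketch using only the standard library; the `build_query_request` helper is a hypothetical name, and the shape of the response body is not documented here, so the sketch stops at building the request.

```python
import json
import urllib.request


def build_query_request(sql, base_url="http://localhost:59840", database="cpg_manufacturing"):
    """Build (but do not send) a POST request for the Edge Server _query endpoint."""
    body = json.dumps({"query": sql}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/{database}/_query",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_query_request(
    "SELECT * FROM manufacturing.alerts WHERE status = 'active' ORDER BY severity DESC"
)

# Sending it requires a running Edge Server:
# with urllib.request.urlopen(request, timeout=5) as response:
#     print(response.read().decode("utf-8"))
```

Keeping request construction separate from sending makes the helper easy to check without a live server.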

## 🔧 Configuration

### Edge Server Configuration

Key configuration files:

- `edge-server-config.json` - Edge server settings
- `docker-compose.yml` - Container orchestration
- `setup/config.py` - Python setup configuration

### Cloud Configuration (Capella & App Services)

#### Couchbase Capella Setup

1. **Create Capella Cluster:**

```bash
# Visit https://cloud.couchbase.com/
# Create a new cluster with CPG Manufacturing bucket
```

2. **Configure App Services:**
- Database Name: `cpg_manufacturing`
- Sync Gateway Port: `:4984`
- Admin Interface: `:4985`

3. **Replication Configuration in `edge-server-config.json`:**

```json
"replications": [
  {
    "source": "wss://your-cluster.apps.cloud.couchbase.com:4984/cpg_manufacturing",
    "target": "cpg_manufacturing",
    "continuous": true,
    "collections": {
      "manufacturing.plants": { "channels": ["CPG_PLANT_001"] },
      "manufacturing.production_lines": { "channels": ["CPG_PLANT_001"] },
      "manufacturing.machines": { "channels": ["CPG_PLANT_001"] },
      "manufacturing.solutions": { "channels": ["solutions"] }
    },
    "auth": {
      "user": "admin",
      "password": "Password@12345"
    }
  }
]
```
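
To double-check which channels each collection subscribes to, a short script can summarize the replication block. This is a sketch under assumptions: the fragment is wrapped in braces so that it parses as a JSON object, only two of the collections are reproduced, and `channels_by_collection` is a hypothetical helper, not part of the project.

```python
import json

# A shortened copy of the replication fragment, wrapped in braces so it is valid JSON.
config_text = """
{
  "replications": [
    {
      "source": "wss://your-cluster.apps.cloud.couchbase.com:4984/cpg_manufacturing",
      "target": "cpg_manufacturing",
      "continuous": true,
      "collections": {
        "manufacturing.plants": { "channels": ["CPG_PLANT_001"] },
        "manufacturing.solutions": { "channels": ["solutions"] }
      }
    }
  ]
}
"""


def channels_by_collection(config):
    """Map each replicated collection name to the list of channels it subscribes to."""
    result = {}
    for replication in config.get("replications", []):
        for name, settings in replication.get("collections", {}).items():
            result.setdefault(name, []).extend(settings.get("channels", []))
    return result


summary = channels_by_collection(json.loads(config_text))
```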

#### Data Flow Architecture

- **Edge → Cloud:** Telemetry, alerts, device status, quality metrics
- **Cloud → Edge:** Plants, production lines, machines, solutions
- **Bidirectional Sync:** Real-time synchronization with conflict resolution

### Environment Variables

| Variable | Description | Default |
|----------|-------------|---------|
| `GOOGLE_GEMINI_API_KEY` | AI service API key | Required |
| `EDGE_SERVER_HOST` | Edge server hostname | `couchbase-edge-server` |
| `EDGE_SERVER_PORT` | Edge server port | `59840` |
| `APP_SERVICES_CLUSTER` | Capella App Services endpoint | `your-cluster.apps.cloud.couchbase.com` |
| `APP_SERVICES_API_KEY` | App Services API key | Required for cloud sync |
| `DEVICE_ID` | IoT device identifier | `rpi-cpg-line1-01` |
| `MACHINE_ID` | Machine identifier | `MACHINE_001` |
| `PRODUCTION_LINE` | Production line name | `BEVERAGE_LINE_A` |

## 📊 Monitoring & Observability

### Health Checks

```bash
# Edge Server health
curl http://localhost:59840/

# Container status
docker compose ps

# Logs
docker compose logs -f couchbase-edge-server
docker compose logs -f java-cpg-app
```
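
For scripted checks, the `curl` probe can be wrapped in a small function. This sketch assumes that a healthy Edge Server answers the root URL with HTTP 200, which the README implies but does not state; `edge_server_is_up` and its `opener` parameter are hypothetical names introduced here so the function can be exercised without a live server.

```python
import urllib.request


def edge_server_is_up(url="http://localhost:59840/", timeout=3.0, opener=urllib.request.urlopen):
    """Return True if the URL answers with HTTP 200, False on connection or HTTP errors."""
    try:
        with opener(url, timeout=timeout) as response:
            return response.status == 200
    except OSError:
        # urllib.error.URLError and HTTPError are OSError subclasses,
        # as are refused connections and socket timeouts.
        return False


# Usage against a running Edge Server:
# print("Edge Server up:", edge_server_is_up())
```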

### Data Verification

```bash
# Check document counts
curl http://localhost:59840/cpg_manufacturing/manufacturing.telemetry/_count

# View recent telemetry
curl "http://localhost:59840/cpg_manufacturing/manufacturing.telemetry/_all_docs?limit=5&descending=true"
```

### Cloud Monitoring

#### Capella Dashboard

1. **Access Capella Console:**

```bash
# Visit https://cloud.couchbase.com/
# Navigate to your cluster dashboard
```

2. **Monitor Data Sync:**
- Check bucket document counts
- Monitor App Services sync status
- View replication activity logs

#### App Services Monitoring

```bash
# App Services Admin Interface
curl -u admin:password https://your-cluster.apps.cloud.couchbase.com:4985/

# Check sync gateway status
curl https://your-cluster.apps.cloud.couchbase.com:4985/_db/cpg_manufacturing

# Monitor replication activity
curl https://your-cluster.apps.cloud.couchbase.com:4985/_db/cpg_manufacturing/_active_tasks
```

#### Replication Health

```bash
# Check Edge Server replication status
curl http://localhost:59840/cpg_manufacturing/_replications

# View sync metrics
docker compose logs java-cpg-app | grep -i "sync\|replication"
```

## 🗂️ Project Structure

```text
cpg-manufacturing-edge-ai-assistant/
├── asserts/                       # Diagrams and images
│   └── architecture.png
├── config/                        # (Currently empty except for __pycache__)
├── custom_scripts/                # JavaScript scripts for capella app services custom script
│   ├── machines.js
│   ├── plants.js
│   └── production_lines.js
├── setup/                         # Python setup and utility scripts to create collections and synthetic data
│   ├── __init__.py
│   ├── alert_templates.json
│   ├── check_similarity.py
│   ├── COLLECTIONS.md
│   ├── config.py
│   ├── couchbase_client.py
│   ├── document_processor.py
│   ├── document_utils.py
│   ├── list_machine_alerts.py
│   ├── manual.txt
│   ├── requirements.txt
│   ├── sample_data_generator.py
│   ├── setup_system.py
│   └── solution_optimizer.py
├── src/
│   └── main/
│       ├── java/
│       │   └── com/
│       │       └── couchbase/
│       │           └── example/
│       │               ├── cli_agent/     # CLI agent and tools
│       │               │   ├── CPGCLIAgent.java
│       │               │   ├── EdgeServerTools.java
│       │               │   ├── ManualSearchTools.java
│       │               │   ├── ToolTraceLogger.java
│       │               │   └── VectorSearchUtil.java
│       │               └── iot_device/    # IoT telemetry and simulation
│       │                   ├── AlertGenerator.java
│       │                   ├── CPGIoTTelemetryGenerator.java
│       │                   ├── IoTDeviceConfig.java
│       │                   ├── SensorDataGenerator.java
│       │                   └── SimulationUtils.java
│       └── resources/
│           └── simplelogger.properties
├── c4_container.puml              # C4 model diagram
├── docker-compose.yml             # Container orchestration
├── Dockerfile.edge-server         # Dockerfile for Edge Server
├── Dockerfile.java                # Dockerfile for Java app
├── edge-server-config.json        # Edge server configuration
├── LANGCHAIN4J_INTEGRATION.md     # LangChain4j integration notes
├── pom.xml                        # Maven build file
├── README.md                      # Project documentation
├── run-edge-server.sh             # Edge server startup script
└── run-java-app.sh                # Java app startup script

```

## 🚨 Troubleshooting

### Common Issues

**Edge Server Won't Start:**

```bash
# Check port availability
lsof -i :59840

# View logs
docker compose logs couchbase-edge-server

# Restart with clean data
./run-edge-server.sh --clean
```

**Java App Compilation Errors:**

```bash
# Clean rebuild
mvn clean compile

# Check Java version
java -version

# Verify dependencies
mvn dependency:tree
```

**Replication Issues:**

```bash
# Check Edge Server connectivity
curl http://localhost:59840/cpg_manufacturing

# View replication status
docker compose logs java-cpg-app | grep -i replication
```

**Cloud Sync Issues:**

```bash
# Test Capella connectivity
curl -I https://your-cluster.apps.cloud.couchbase.com:4984/

# Check App Services authentication
curl -u admin:password https://your-cluster.apps.cloud.couchbase.com:4984/cpg_manufacturing

# Verify replication configuration
cat edge-server-config.json | jq '.replications'

# Check sync gateway logs in Capella Console
# Navigate to App Services → Logs in Capella dashboard
```

**Authentication Errors:**

```bash
# Update credentials in edge-server-config.json
# Check Capella user permissions
# Verify App Services database access
# Test with curl:
curl -u username:password https://your-cluster.apps.cloud.couchbase.com:4984/cpg_manufacturing
```

**Data Retention:**

```bash
# Adjust retention in IoTDeviceConfig.java
TELEMETRY_EXPIRATION_DAYS = 30
ALERT_EXPIRATION_DAYS = 90
```
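
To see what these retention values mean in calendar terms, the sketch below computes when a document created at a given time would expire. It only illustrates the date arithmetic; how `IoTDeviceConfig.java` actually applies the constants is not shown in this README, and `expiration_for` and `is_expired` are hypothetical helpers.

```python
from datetime import datetime, timedelta, timezone

# Values copied from the snippet above.
TELEMETRY_EXPIRATION_DAYS = 30
ALERT_EXPIRATION_DAYS = 90


def expiration_for(created_at, retention_days):
    """Return the moment a document created at created_at reaches the end of its retention."""
    return created_at + timedelta(days=retention_days)


def is_expired(created_at, retention_days, now):
    """True once now has reached or passed the expiration time."""
    return now >= expiration_for(created_at, retention_days)


created = datetime(2025, 1, 1, tzinfo=timezone.utc)
telemetry_expires = expiration_for(created, TELEMETRY_EXPIRATION_DAYS)
alert_expires = expiration_for(created, ALERT_EXPIRATION_DAYS)
```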

## 📝 License

This project is licensed under the MIT License - see the [LICENSE](LICENSE) file for details.

## 🔗 References

- [Couchbase Edge Server Documentation](https://docs.couchbase.com/edge-server/current/)
- [Couchbase Capella Documentation](https://docs.couchbase.com/cloud/current/)
- [App Services Documentation](https://docs.couchbase.com/couchbase-lite/current/android/replication.html)
- [Couchbase Mobile Sync](https://docs.couchbase.com/sync-gateway/current/)
- [LangChain4j Documentation](https://docs.langchain4j.dev/)
- [Google Gemini AI API](https://ai.google.dev/docs)
- [Couchbase Cloud Console](https://cloud.couchbase.com/)