https://github.com/ajitashwath/researcher
A web based AI application that creates comprehensive reports on any topic using AI agents.
https://github.com/ajitashwath/researcher
crewai docker gcp streamlit
Last synced: 10 months ago
JSON representation
A web based AI application that creates comprehensive reports on any topic using AI agents.
- Host: GitHub
- URL: https://github.com/ajitashwath/researcher
- Owner: ajitashwath
- Created: 2025-07-04T07:24:19.000Z (12 months ago)
- Default Branch: main
- Last Pushed: 2025-08-14T11:11:21.000Z (10 months ago)
- Last Synced: 2025-08-14T11:31:42.018Z (10 months ago)
- Topics: crewai, docker, gcp, streamlit
- Language: Python
- Homepage: https://create-report-kar.streamlit.app/
- Size: 344 KB
- Stars: 2
- Watchers: 0
- Forks: 0
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
README
# Researcher Agent
A web based AI application that creates comprehensive reports on any topic using AI agents.
## Features
- 🤖 **Multi-Agent AI System**: Research, analysis, writing, and review agents working together
- 📊 **Comprehensive Reports**: Generate detailed reports on any topic
- 🌐 **Multiple Interfaces**: Web UI (Streamlit), REST API (FastAPI), and CLI
- ☁️ **Cloud-Ready**: Docker containerization with GCP deployment support
- 🔧 **Configurable**: Customizable agents, tasks, and report formats
- 🔍 **Advanced Research**: Integration with search tools and OpenAI for comprehensive research
## Installation
Ensure you have Python >=3.10 <3.14 installed on your system. This project uses [UV](https://docs.astral.sh/uv/) for dependency management and package handling, offering a seamless setup and execution experience.
First, if you haven't already, install uv:
```bash
pip install uv
```
Next, navigate to your project directory and install the dependencies:
```bash
# Install using pip
pip install -r requirements.txt
# Or using crewai CLI (optional)
crewai install
```
## Environment Configuration
Create a `.env` file in the root directory with the following variables:
```env
# Required API Keys
OPENAI_API_KEY=your_openai_api_key_here
GROQ_API_KEY=your_groq_api_key_here
SERPER_API_KEY=your_serper_api_key_here
# Optional Configuration
PYTHONDONTWRITEBYTECODE=1
PYTHONUNBUFFERED=1
```
**Required API Keys:**
- `OPENAI_API_KEY`: For AI content generation and analysis
- `GROQ_API_KEY`: For enhanced AI processing capabilities
- `SERPER_API_KEY`: For web search functionality
## Usage Options
### 1. Streamlit Web Interface
Run the interactive web application:
```bash
streamlit run app.py
```
Access the application at `http://localhost:8501`
### 2. FastAPI REST API
Start the API server:
```bash
# Development
uvicorn api:app --reload --host 0.0.0.0 --port 8000
# Production
gunicorn -w 1 --threads 2 -k uvicorn.workers.UvicornWorker --bind 0.0.0.0:8080 api:app
```
**API Endpoints:**
- `GET /`: Health check
- `POST /research`: Generate reports
**Example API Usage:**
```bash
curl -X POST "http://localhost:8000/research" \
-H "Content-Type: application/json" \
-d '{
"topic": "How to improve infrastructure in Bangalore?",
"user_personalization": "Focus on transportation and urban planning"
}'
```
### 3. Command Line Interface
Run directly from the command line:
```bash
crewai run
```
This will generate a `report.md` file with research on LLMs (default example).
## Docker Deployment
### Building the Docker Image
The project includes a production-ready Dockerfile optimized for deployment:
```bash
# Build the image
docker build -t createreport-crew .
# Run locally
docker run -p 8080:8080 \
-e OPENAI_API_KEY=your_key_here \
-e GROQ_API_KEY=your_key_here \
-e SERPER_API_KEY=your_key_here \
createreport-crew
```
### Docker Configuration
The Dockerfile is configured with:
- **Base Image**: Python 3.11 slim (Debian Bullseye)
- **Port**: 8080 (Cloud Run compatible)
- **Server**: Gunicorn with Uvicorn workers
- **Optimization**: Multi-threaded, production-ready configuration
## Google Cloud Platform (GCP) Deployment
### Prerequisites
1. Install Google Cloud SDK:
```bash
# macOS
brew install google-cloud-sdk
# Ubuntu/Debian
sudo apt-get install google-cloud-sdk
# Or download from: https://cloud.google.com/sdk/docs/install
```
2. Authenticate with GCP:
```bash
gcloud auth login
```
### Step-by-Step GCP Deployment
#### 1. Project Setup
```bash
# List available projects
gcloud projects list
# Set your project ID
gcloud config set project YOUR_PROJECT_ID
# Enable required services
gcloud services enable cloudbuild.googleapis.com artifactregistry.googleapis.com run.googleapis.com
```
#### 2. Configure Variables
```powershell
# PowerShell (Windows)
$REPO_NAME = "createreport-crew"
$REGION = "us-central1" # or your preferred region
$SERVICE_NAME = "createreport-api"
```
```bash
# Bash (Linux/macOS)
REPO_NAME="createreport-crew"
REGION="us-central1" # or your preferred region
SERVICE_NAME="createreport-api"
```
#### 3. Create Artifact Registry Repository
```powershell
# PowerShell
gcloud artifacts repositories create $REPO_NAME `
--repository-format=docker `
--location=$REGION `
--description="CreateReport Crew Docker Repository"
```
```bash
# Bash
gcloud artifacts repositories create $REPO_NAME \
--repository-format=docker \
--location=$REGION \
--description="CreateReport Crew Docker Repository"
```
#### 4. Build and Push Docker Image
```powershell
# PowerShell
$PROJECT_ID = $(gcloud config get-value project)
$IMAGE_TAG = "$($REGION)-docker.pkg.dev/$($PROJECT_ID)/$($REPO_NAME)/createreport-api:latest"
# Build and push image
gcloud builds submit --tag $IMAGE_TAG
```
```bash
# Bash
PROJECT_ID=$(gcloud config get-value project)
IMAGE_TAG="$REGION-docker.pkg.dev/$PROJECT_ID/$REPO_NAME/createreport-api:latest"
# Build and push image
gcloud builds submit --tag $IMAGE_TAG
```
#### 5. Deploy to Cloud Run
```powershell
# PowerShell
gcloud run deploy $SERVICE_NAME `
--image=$IMAGE_TAG `
--platform=managed `
--region=$REGION `
--allow-unauthenticated `
--port=8080 `
--memory=2Gi `
--cpu=1 `
--timeout=900 `
--set-env-vars="PYTHONDONTWRITEBYTECODE=1,PYTHONUNBUFFERED=1"
```
```bash
# Bash
gcloud run deploy $SERVICE_NAME \
--image=$IMAGE_TAG \
--platform=managed \
--region=$REGION \
--allow-unauthenticated \
--port=8080 \
--memory=2Gi \
--cpu=1 \
--timeout=900 \
--set-env-vars="PYTHONDONTWRITEBYTECODE=1,PYTHONUNBUFFERED=1"
```
#### 6. Set Environment Variables (Secrets)
For production deployment, set your API keys as environment variables:
```bash
# Set environment variables with secrets
gcloud run services update $SERVICE_NAME \
--region=$REGION \
--set-env-vars="OPENAI_API_KEY=your_openai_key_here,GROQ_API_KEY=your_groq_key_here,SERPER_API_KEY=your_serper_key_here"
```
**Better approach using Secret Manager:**
```bash
# Create secrets
echo "your_openai_key_here" | gcloud secrets create openai-api-key --data-file=-
echo "your_groq_key_here" | gcloud secrets create groq-api-key --data-file=-
echo "your_serper_key_here" | gcloud secrets create serper-api-key --data-file=-
# Deploy with secrets
gcloud run deploy $SERVICE_NAME \
--image=$IMAGE_TAG \
--region=$REGION \
--set-secrets="OPENAI_API_KEY=openai-api-key:latest,GROQ_API_KEY=groq-api-key:latest,SERPER_API_KEY=serper-api-key:latest"
```
### Post-Deployment
After successful deployment, you'll receive a service URL. You can:
1. **Test the API**:
```bash
curl -X GET "https://your-service-url.run.app/"
```
2. **View logs**:
```bash
gcloud run services logs tail $SERVICE_NAME --region=$REGION
```
3. **Monitor the service**:
```bash
gcloud run services describe $SERVICE_NAME --region=$REGION
```
## Customization
### Agents Configuration
Modify `src/create_report/config/agents.yaml` to define your agents:
```yaml
researcher:
role: "Senior Research Analyst"
goal: "Conduct comprehensive research..."
backstory: "You are a senior research analyst..."
```
### Tasks Configuration
Modify `src/create_report/config/tasks.yaml` to define your tasks:
```yaml
research_task:
description: "Conduct comprehensive research on..."
expected_output: "A comprehensive research summary..."
agent: researcher
```
### Custom Logic
- **Modify `src/create_report/crew.py`**: Add custom logic, tools, and specific arguments
- **Modify `src/create_report/main.py`**: Add custom inputs and orchestration logic
- **Modify `api.py`**: Customize API endpoints and request handling
- **Modify `app.py`**: Customize the Streamlit interface
## Project Structure
```
create_report/
├── .dockerignore # Docker ignore patterns
├── .gitignore # Git ignore patterns
├── Dockerfile # Production Docker configuration
├── README.md # This file
├── api.py # FastAPI REST API
├── app.py # Streamlit web interface
├── requirements.txt # Python dependencies
├── pyproject.toml # Project configuration
├── knowledge/ # Knowledge base files
│ └── user_preference.txt
├── src/create_report/ # Main package
│ ├── __init__.py
│ ├── main.py # Main orchestration logic
│ ├── crew.py # Crew management
│ ├── config/ # Configuration files
│ │ ├── agents.yaml # Agent definitions
│ │ └── tasks.yaml # Task definitions
│ └── tools/ # Custom tools
│ ├── __init__.py
│ └── custom_tool.py
```
## Understanding Your Crew
The create-report Crew is composed of multiple AI agents, each with unique roles, goals, and tools:
- **🔍 Researcher**: Conducts comprehensive research and gathers information
- **📊 Analyst**: Analyzes data and identifies trends and insights
- **✍️ Writer**: Creates well-structured, engaging reports
- **🔍 Reviewer**: Reviews and improves report quality and accuracy
- **📋 Strategist**: Develops strategic recommendations and implementation plans
These agents collaborate on a series of tasks, leveraging their collective skills to achieve complex objectives.
## Troubleshooting
### Common Issues
1. **API Key Errors**: Ensure all required API keys are set in your environment
2. **Docker Build Failures**: Check that all dependencies are properly listed in `requirements.txt`
3. **GCP Deployment Issues**: Verify that all required GCP services are enabled
4. **Memory Issues**: Increase Cloud Run memory allocation if processing large reports
### Docker Debugging
```bash
# Build and run locally for debugging
docker build -t createreport-crew-debug .
docker run -it --entrypoint /bin/bash createreport-crew-debug
# Check logs
docker logs
```
### GCP Debugging
```bash
# View detailed logs
gcloud run services logs tail $SERVICE_NAME --region=$REGION --format="value(textPayload)"
# Check service status
gcloud run services list --region=$REGION
# Describe service configuration
gcloud run services describe $SERVICE_NAME --region=$REGION
```
## Performance Optimization
For better performance in production:
1. **Increase Resources**: Adjust CPU and memory in Cloud Run deployment
2. **Enable Caching**: Implement caching for frequently requested reports
3. **Database Integration**: Add database support for storing reports and user data
4. **Load Balancing**: Use multiple Cloud Run instances for high traffic
## Support
For support, questions, or feedback regarding the CreateReport Crew or crewAI:
- Visit our [documentation](https://docs.crewai.com)
- Reach out to us through our [GitHub repository](https://github.com/joaomdmoura/crewai)
- [Join our Discord](https://discord.com/invite/X4JWnZnxPb)
- [Chat with our docs](https://chatg.pt/DWjSBZn)