{"id":31228765,"url":"https://github.com/rjurney/florian","last_synced_at":"2025-10-14T00:59:49.227Z","repository":{"id":312965417,"uuid":"1049480559","full_name":"rjurney/florian","owner":"rjurney","description":"Experiments in Graph RAG for agent memory","archived":false,"fork":false,"pushed_at":"2025-09-04T00:46:34.000Z","size":141347,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-10-13T04:11:29.424Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/rjurney.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2025-09-03T03:37:23.000Z","updated_at":"2025-09-16T18:36:22.000Z","dependencies_parsed_at":"2025-09-03T05:28:23.435Z","dependency_job_id":"055f2d9e-790e-4eaf-948e-295fdb5eccff","html_url":"https://github.com/rjurney/florian","commit_stats":null,"previous_names":["rjurney/florian"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/rjurney/florian","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rjurney%2Fflorian","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rjurney%2Fflorian/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rjurney%2Fflorian/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rjurney%2Fflorian/manifests","own
er_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/rjurney","download_url":"https://codeload.github.com/rjurney/florian/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/rjurney%2Fflorian/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":279017356,"owners_count":26086054,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-10-13T02:00:06.723Z","response_time":61,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2025-09-22T07:01:54.171Z","updated_at":"2025-10-14T00:59:49.222Z","avatar_url":"https://github.com/rjurney.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"# Florian\n\nFlorian is a Python project that uses the Gmail API to download message threads and index them for Retrieval-Augmented Generation (RAG). It includes graph database capabilities through an integrated OpenSearch plugin for modeling and querying email communication networks.\n\n## System Requirements\n\n- Python 3.12\n- Poetry for dependency management\n- Docker and Docker Compose (for OpenSearch)\n\n## Quick Start\n\n1. Clone the repository:\n\n```bash\ngit clone https://github.com/rjurney/florian.git\ncd florian\n```\n\n2. 
Create and activate a virtual environment, using either conda (first block) or venv (second block):\n\n```bash\nconda create -n florian python=3.12 -y\nconda activate florian\n```\n\n```bash\npython -m venv venv\nsource venv/bin/activate\n```\n\n3. Install dependencies:\n\n```bash\npoetry install\n```\n\nSee [POETRY.md](assets/POETRY.md) for Poetry installation instructions.\n\n## Gmail API Setup\n\n### 1. Enable Gmail API\n\n1. Go to [Google Cloud Console](https://console.cloud.google.com/)\n2. Create a new project or select an existing one\n3. Enable the Gmail API for your project\n4. Configure the OAuth consent screen\n5. Create OAuth 2.0 credentials (Desktop application type)\n6. Download the credentials and save them as `auth/credentials.json`\n\n### 2. Authenticate\n\nFirst-time authentication:\n\n```bash\nflo gmail auth\n```\n\nThis opens a browser window for you to authorize the application. The access token is saved to `auth/token.json` for future use.\n\n## Usage\n\n### CLI Commands\n\nThe Florian CLI provides several commands for interacting with Gmail:\n\n```bash\n# Show help\nflo --help\nflo gmail --help\n\n# Authenticate with Gmail API\nflo gmail auth\n\n# List Gmail threads\nflo gmail list --max-results 50\nflo gmail list --query \"is:unread\"\n\n# Search for threads you've participated in\nflo gmail search  # Default: finds threads you've sent (from:me)\nflo gmail search --query \"from:me after:2024/1/1\"  # Your replies after Jan 1, 2024\nflo gmail search --max-results 1000 --output data/my_thread_ids.json\n\n# Fetch full content for searched threads\nflo gmail fetch-searched  # Uses data/thread_ids.json by default\nflo gmail fetch-searched --limit 50  # Fetch only first 50 threads\nflo gmail fetch-searched --input data/my_thread_ids.json --output data/my_threads.json\n\n# Fetch Gmail threads with full content (direct method)\nflo gmail fetch --max-threads 10\nflo gmail fetch --query \"from:important@example.com\" --output data/threads.json\n\n# Fetch a specific thread\nflo gmail thread THREAD_ID --output 
data/thread.json\n```\n\n### Python API\n\n```python\nfrom florian.gmail.auth import GmailAuth\nfrom florian.gmail.client import GmailClient\n\n# Create authentication instance\nauth = GmailAuth()\n\n# Create Gmail client\nclient = GmailClient(auth)\n\n# Fetch threads\nthreads = client.fetch_threads(\n    query=\"is:unread\",\n    max_threads=10,\n    save_to_file=\"data/gmail_threads.json\"\n)\n\n# Process threads\nfor thread in threads:\n    print(f\"Thread ID: {thread['id']}\")\n    for message in thread['messages']:\n        subject = message['headers'].get('subject', 'No subject')\n        print(f\"  - {subject}\")\n```\n\n## Gmail Search Queries\n\nYou can use Gmail's search operators in the `--query` parameter:\n\n- `is:unread` - Unread messages\n- `from:sender@example.com` - Messages from specific sender\n- `to:me` - Messages sent to you\n- `subject:meeting` - Messages with \"meeting\" in subject\n- `has:attachment` - Messages with attachments\n- `after:2024/1/1` - Messages after a date\n- `label:important` - Messages with specific label\n\nCombine operators: `is:unread from:boss@company.com`\n\n## OpenSearch Integration\n\nFlorian includes full OpenSearch integration for indexing and searching your email threads.\n\n### Setup OpenSearch\n\n```bash\n# Start OpenSearch with Docker Compose\nflo opensearch setup --start\n\n# Check status\nflo opensearch setup --status\n\n# Stop OpenSearch\nflo opensearch setup --stop\n```\n\nOpenSearch will be available at:\n\n- OpenSearch API: http://localhost:9200\n- OpenSearch Dashboards: http://localhost:5601\n\n### Index Email Threads\n\n```bash\n# Index threads from default location (data/gmail_threads.json)\nflo opensearch index\n\n# Index from custom file\nflo opensearch index --input data/my_threads.json\n\n# Recreate index (delete existing data)\nflo opensearch index --recreate\n```\n\n### Search Indexed Emails\n\n```bash\n# Search for emails\nflo opensearch search \"meeting\"\nflo opensearch search \"project 
deadline\" --size 20\n```\n\n### Complete Workflow\n\n1. **Fetch emails from Gmail:**\n   ```bash\n   flo gmail search\n   flo gmail fetch-searched\n   ```\n\n2. **Start OpenSearch:**\n   ```bash\n   flo opensearch setup --start\n   ```\n\n3. **Index emails:**\n   ```bash\n   flo opensearch index\n   ```\n\n4. **Search your emails:**\n   ```bash\n   flo opensearch search \"important topic\"\n   ```\n\n## Graph Database Plugin\n\nFlorian includes an OpenSearch graph database plugin (located in `src/open-graph-plugin/`) that adds simple graph traversals to OpenSearch for Graph RAG.\n\nThe plugin is integrated as a git subtree from the [Graphlet-AI/open-graph](https://github.com/Graphlet-AI/open-graph) repository, which allows it to be developed independently while staying integrated with Florian's email processing pipeline.\n\n## Data Storage\n\nFetched threads are saved as JSON files with the following structure:\n\n```json\n{\n  \"id\": \"thread_id\",\n  \"messages\": [\n    {\n      \"id\": \"message_id\",\n      \"headers\": {\n        \"from\": \"sender@example.com\",\n        \"to\": \"recipient@example.com\",\n        \"subject\": \"Email subject\",\n        \"date\": \"Mon, 1 Jan 2024 12:00:00 -0000\"\n      },\n      \"body\": {\n        \"text\": \"Plain text content\",\n        \"html\": \"HTML content\"\n      }\n    }\n  ]\n}\n```\n\nThese JSON structures are then processed and indexed into both the OpenSearch text index for full-text search and the graph database for relationship-based queries.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frjurney%2Fflorian","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Frjurney%2Fflorian","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Frjurney%2Fflorian/lists"}