{"id":29663116,"url":"https://github.com/auth0r-c0dez/mailvoyager","last_synced_at":"2026-04-15T20:02:15.129Z","repository":{"id":305041251,"uuid":"1021723983","full_name":"Auth0r-C0dez/MailVoyager","owner":"Auth0r-C0dez","description":"MailVoyager is a conversational AI agent that controls a web browser to perform tasks like sending emails through Gmail's UI. It demonstrates true browser automation by interacting with web elements like a human, capturing screenshots on completion, and providing visual feedback within a chat interface.","archived":false,"fork":false,"pushed_at":"2025-07-17T21:42:33.000Z","size":44988,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-07-18T01:27:08.255Z","etag":null,"topics":["aiagent","fastapi","llm","python"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Auth0r-C0dez.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-07-17T21:03:53.000Z","updated_at":"2025-07-17T21:46:51.000Z","dependencies_parsed_at":"2025-07-18T03:15:02.374Z","dependency_job_id":null,"html_url":"https://github.com/Auth0r-C0dez/MailVoyager","commit_stats":null,"previous_names":["auth0r-c0dez/mailvoyager"],"tags_count":null,"template":false,"template_full_name":null,"purl":"pkg:github/Auth0r-C0dez/MailVoyager","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Auth0r-C0dez%2FMailVoyager","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/A
uth0r-C0dez%2FMailVoyager/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Auth0r-C0dez%2FMailVoyager/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Auth0r-C0dez%2FMailVoyager/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Auth0r-C0dez","download_url":"https://codeload.github.com/Auth0r-C0dez/MailVoyager/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Auth0r-C0dez%2FMailVoyager/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":31857625,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-15T15:24:51.572Z","status":"ssl_error","status_checked_at":"2026-04-15T15:24:39.138Z","response_time":63,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["aiagent","fastapi","llm","python"],"created_at":"2025-07-22T11:08:25.308Z","updated_at":"2026-04-15T20:02:15.094Z","avatar_url":"https://github.com/Auth0r-C0dez.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Conversational Browser Control Agent\n\n## Project Overview\nThis solution implements a conversational AI agent that controls a web browser to send emails through Gmail's web interface. 
Unlike API-based solutions, this agent:\n- Opens a real browser instance\n- Navigates to gmail.com\n- Interacts with UI elements directly\n- Captures screenshots at each step\n- Embeds visual feedback in the chat interface\n\n## Architecture Diagram\n```mermaid\ngraph TD\n    A[User Interface] \u003c--\u003e B[Backend: FastAPI]\n    B \u003c--\u003e C[Conversation Manager]\n    C \u003c--\u003e D[Browser Automation Engine]\n    C \u003c--\u003e E[AI Content Generator]\n    D --\u003e F[Playwright]\n    E --\u003e G[OpenRouter API]\n```\n\n## Critical Requirements\n- ✅ **NO APIs USED**: Solution uses browser automation only\n- ✅ **Real Browser Control**: Playwright controls Chromium\n- ✅ **Visual Feedback**: Screenshots embedded in chat\n- ✅ **Natural Language Processing**: Understands user requests\n- ✅ **Extensible Architecture**: Separated layers for easy modification\n\n## User Journey\n1. User requests email sending via natural language\n2. Agent collects necessary information\n3. Agent opens browser and navigates to Gmail\n4. Step-by-step interaction with Gmail UI\n5. Screenshots captured and displayed in chat\n6. Email sent confirmation\n\n## Technical Implementation\n### Natural Language Understanding\n- Intent extraction from conversational inputs\n- Contextual question generation\n- Memory management for conversation flow\n\n### Browser Automation Engine\n- Playwright for browser control\n- Robust element selectors\n- Error handling for dynamic content\n- Screenshot capture at each step\n- Headless/headful mode support\n\n### Conversational Interface\n- FastAPI backend\n- WebSocket for real-time updates\n- Base64 image embedding\n- Responsive chat UI\n\n### AI-Powered Content Generation\n- OpenRouter API integration\n- Dynamic email content generation\n- Context-aware subject lines\n- Professional tone adaptation\n\n## Setup Instructions\n1. **Install dependencies**:\n```bash\npip install -r requirements.txt\nplaywright install chromium\n```\n\n2. 
**Configure environment variables**:\nCreate a `.env` file with:\n```\nOPENROUTER_API_KEY=your_api_key\n```\n\n3. **Run the application**:\n```bash\nuvicorn main:app --reload\n```\n\n4. **Access the UI**:\nOpen `http://localhost:8000` in your browser\n\n## Screenshots\n![Login Step](screenshots/login.png)\n![Compose Email](screenshots/compose.png)\n![Email Sent](screenshots/sent.png)\n\n## Proof of Functionality\n- Email sent to: reportinsurebuzz@gmail.com\n- Subject: \"AI Agent Task - Rana Talukdar\"\n- Sent via Gmail web interface (no APIs used)\n\n## Technology Stack\n- **Browser Automation**: Playwright\n- **Backend Framework**: FastAPI\n- **Frontend**: HTML/CSS/JavaScript\n- **AI Integration**: OpenRouter API\n- **Conversation Management**: Custom state machine\n\n## Challenges and Solutions\n1. **Dynamic Element Handling**:\n   - Implemented robust selectors with fallbacks\n   - Added explicit waits for element visibility\n   \n2. **Screenshot Integration**:\n   - Base64 encoding for inline display\n   - Compression to reduce payload size\n   \n3. **Session Management**:\n   - Isolated browser contexts per session\n   - Proper resource cleanup\n\n4. **Python Compatibility with Playwright**:\n   - Tried several Python versions before finding one that worked with Playwright\n   - Python 3.10.11 turned out to be the right fit\n\n## Future Extensions\n- Multi-website support\n- Voice command integration (to be added)\n- Cross-browser compatibility\n- Plugin system for new actions\n\n---\n*This solution demonstrates true browser automation - no email APIs were used in accordance with assignment requirements.*\n## Made with HaRd WoRk by Rana Talukdar\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fauth0r-c0dez%2Fmailvoyager","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fauth0r-c0dez%2Fmailvoyager","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fauth0r-c0dez%2Fmailvoyager/lists"}