{"id":28900267,"url":"https://github.com/sing1ee/a2a_llama_index_file_chat","last_synced_at":"2025-07-30T23:35:31.278Z","repository":{"id":296862802,"uuid":"994762782","full_name":"sing1ee/a2a_llama_index_file_chat","owner":"sing1ee","description":"This sample demonstrates a conversational agent built with LlamaIndex Workflows and exposed through the A2A protocol. It showcases file upload and parsing, conversational interactions with support for multi-turn dialogue, streaming responses/updates, and in-line citations.","archived":false,"fork":false,"pushed_at":"2025-06-03T01:53:58.000Z","size":252,"stargazers_count":0,"open_issues_count":0,"forks_count":0,"subscribers_count":0,"default_branch":"main","last_synced_at":"2025-06-21T09:46:15.503Z","etag":null,"topics":["a2a","a2a-client","a2a-protocol","a2a-server","llama-index","openrout"],"latest_commit_sha":null,"homepage":"https://a2aprotocol.ai/blog/a2a-samples-llama-index-file-chat-openrouter","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/sing1ee.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2025-06-02T12:50:34.000Z","updated_at":"2025-06-03T01:54:01.000Z","dependencies_parsed_at":null,"dependency_job_id":"752e6fec-7e0b-4ef0-8d14-7b1fa3b331dc","html_url":"https://github.com/sing1ee/a2a_llama_index_file_chat","commit_stats":null,"previous_names":["sing1ee/a2a_llama_index_file_chat"],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/sing1ee/a2a_llama_index_file_chat","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/G
itHub/repositories/sing1ee%2Fa2a_llama_index_file_chat","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sing1ee%2Fa2a_llama_index_file_chat/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sing1ee%2Fa2a_llama_index_file_chat/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sing1ee%2Fa2a_llama_index_file_chat/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/sing1ee","download_url":"https://codeload.github.com/sing1ee/a2a_llama_index_file_chat/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/sing1ee%2Fa2a_llama_index_file_chat/sbom","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":267960873,"owners_count":24172509,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-07-30T02:00:09.044Z","response_time":70,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["a2a","a2a-client","a2a-protocol","a2a-server","llama-index","openrout"],"created_at":"2025-06-21T09:39:25.395Z","updated_at":"2025-07-30T23:35:31.269Z","avatar_url":"https://github.com/sing1ee.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"\n# LlamaIndex File Chat Workflow with A2A Protocol\n\nThis sample demonstrates a conversational agent built with [LlamaIndex 
Workflows](https://docs.llamaindex.ai/en/stable/understanding/workflows/) and exposed through the A2A protocol. It showcases file upload and parsing, conversational interactions with support for multi-turn dialogue, streaming responses/updates, and in-line citations.\n\n## source code\n[a2a llama index file chat with openrouter](https://github.com/sing1ee/a2a_llama_index_file_chat)\n\n## How It Works\n\nThis agent uses LlamaIndex Workflows with OpenRouter to provide a conversational agent that can upload files, parse them, and answer questions about the content. The A2A protocol enables standardized interaction with the agent, allowing clients to send requests and receive real-time updates.\n\n```mermaid\nsequenceDiagram\n    participant Client as A2A Client\n    participant Server as A2A Server\n    participant Workflow as ParseAndChat Workflow\n    participant Services as External APIs\n\n    Client-\u003e\u003eServer: Send message (with or without attachment)\n    Server-\u003e\u003eWorkflow: Forward as InputEvent\n\n    alt Has Attachment\n        Workflow--\u003e\u003eServer: Stream LogEvent \"Parsing document...\"\n        Server--\u003e\u003eClient: Stream status update\n        Workflow-\u003e\u003eServices: Parse document\n        Workflow--\u003e\u003eServer: Stream LogEvent \"Document parsed successfully\"\n        Server--\u003e\u003eClient: Stream status update\n    end\n\n    Workflow--\u003e\u003eServer: Stream LogEvent about chat processing\n    Server--\u003e\u003eClient: Stream status update\n    \n    Workflow-\u003e\u003eServices: LLM Chat (with document context if available)\n    Services-\u003e\u003eWorkflow: Structured LLM Response\n    Workflow--\u003e\u003eServer: Stream LogEvent about response processing\n    Server--\u003e\u003eClient: Stream status update\n    \n    Workflow-\u003e\u003eServer: Return final ChatResponseEvent\n    Server-\u003e\u003eClient: Return response with citations (if available)\n\n    Note over Server: Context is 
maintained for follow-up questions\n```\n\n## Key Features\n\n- **File Upload**: Clients can upload files and parse them to provide context to the chat\n- **Multi-turn Conversations**: Agent can request additional information when needed\n- **Real-time Streaming**: Provides status updates during processing\n- **Push Notifications**: Support for webhook-based notifications\n- **Conversational Memory**: Maintains context across interactions in the same session\n- **LlamaParse Integration**: Uses LlamaParse to parse files accurately\n\n**NOTE:** This sample agent accepts multimodal inputs, but at the time of writing, the sample UI only supports text inputs. The UI will become multimodal in the future to handle this and other use cases.\n\n## Prerequisites\n\n- Python 3.12 or higher\n- [UV](https://docs.astral.sh/uv/)\n- Access to an LLM and API Key (Current code assumes using OpenRouter API)\n- A LlamaParse API key ([get one for free](https://cloud.llamaindex.ai))\n\n## Setup \u0026 Running\n\n1. Clone and navigate to the project directory:\n\n   ```bash\n   git clone \u003crepository-url\u003e\n   cd a2a_llama_index_file_chat\n   ```\n\n2. Create a virtual environment and install dependencies:\n\n   ```bash\n   uv venv\n   uv sync\n   ```\n\n3. Create an environment file with your API keys:\n\n   ```bash\n   echo \"OPENROUTER_API_KEY=your_api_key_here\" \u003e\u003e .env\n   echo \"LLAMA_CLOUD_API_KEY=your_api_key_here\" \u003e\u003e .env\n   ```\n\n   **Getting API Keys:**\n   - **OpenRouter API Key**: Sign up at [https://openrouter.ai](https://openrouter.ai) to get your free API key\n   - **LlamaCloud API Key**: Get one for free at [https://cloud.llamaindex.ai](https://cloud.llamaindex.ai)\n\n4. 
Run the agent:\n\n   ```bash\n   # Using uv\n   uv run a2a-file-chat\n\n   # Or activate the virtual environment and run directly\n   source .venv/bin/activate  # On Windows: .venv\\Scripts\\activate\n   python -m a2a_file_chat\n\n   # With custom host/port\n   uv run a2a-file-chat --host 0.0.0.0 --port 8080\n   ```\n\n5. In a separate terminal, run an A2A client CLI:\n\n   Download a file to parse, or link to your own file. For example:\n\n   ```bash\n   curl -L https://arxiv.org/pdf/1706.03762 -o attention.pdf\n   ```\n\n   ```bash\n   git clone https://github.com/google-a2a/a2a-samples.git\n   cd a2a-samples/samples/python/hosts/cli\n   uv run . --agent http://localhost:10010\n   ```\n\n   And enter something like the following:\n\n   ```bash\n   ======= Agent Card ========\n   {\"name\":\"Parse and Chat\",\"description\":\"Parses a file and then chats with a user using the parsed content as context.\",\"url\":\"http://localhost:10010/\",\"version\":\"1.0.0\",\"capabilities\":{\"streaming\":true,\"pushNotifications\":true,\"stateTransitionHistory\":false},\"defaultInputModes\":[\"text\",\"text/plain\"],\"defaultOutputModes\":[\"text\",\"text/plain\"],\"skills\":[{\"id\":\"parse_and_chat\",\"name\":\"Parse and Chat\",\"description\":\"Parses a file and then chats with a user using the parsed content as context.\",\"tags\":[\"parse\",\"chat\",\"file\",\"llama_parse\"],\"examples\":[\"What does this file talk about?\"]}]}\n   =========  starting a new task ======== \n\n   What do you want to send to the agent? (:q or quit to exit): What does this file talk about?\n   Select a file path to attach? 
(press enter to skip): ./attention.pdf\n   ```\n\n## Technical Implementation\n\n- **LlamaIndex Workflows**: Uses a custom workflow to parse the file and then chat with the user\n- **Streaming Support**: Provides incremental updates during processing\n- **Serializable Context**: Maintains conversation state between turns; it can optionally be persisted to Redis, MongoDB, disk, etc.\n- **Push Notification System**: Webhook-based updates with JWK authentication\n- **A2A Protocol Integration**: Full compliance with A2A specifications\n\n## Limitations\n\n- Only supports text-based output\n- LlamaParse is free for the first 10K credits (~3333 pages with basic settings)\n- Memory is session-based and in-memory, and therefore not persisted between server restarts\n- Inserting the entire document into the context window is not scalable for larger files. You may want to deploy a vector DB or use a cloud DB to run retrieval over one or more files for effective RAG. LlamaIndex integrates with a [ton of vector DBs and cloud DBs](https://docs.llamaindex.ai/en/stable/examples/#vector-stores).\n\n## Examples\n\n**Synchronous request**\n\nRequest:\n\n```\nPOST http://localhost:10010\nContent-Type: application/json\n\n{\n  \"jsonrpc\": \"2.0\",\n  \"id\": 11,\n  \"method\": \"tasks/send\",\n  \"params\": {\n    \"id\": \"129\",\n    \"sessionId\": \"8f01f3d172cd4396a0e535ae8aec6687\",\n    \"acceptedOutputModes\": [\n      \"text\"\n    ],\n    \"message\": {\n      \"role\": \"user\",\n      \"parts\": [\n        {\n          \"type\": \"text\",\n          \"text\": \"What does this file talk about?\"\n        },\n        {\n            \"type\": \"file\",\n            \"file\": {\n                \"bytes\": \"...\",\n                \"name\": \"attention.pdf\"\n            }\n        }\n      ]\n    }\n  }\n}\n```\n\nResponse:\n\n```\n{\n  \"jsonrpc\": \"2.0\",\n  \"id\": 11,\n  \"result\": {\n    \"id\": \"129\",\n    \"status\": {\n      \"state\": \"completed\",\n      
\"timestamp\": \"2025-04-02T16:53:29.301828\"\n    },\n    \"artifacts\": [\n      {\n        \"parts\": [\n          {\n            \"type\": \"text\",\n            \"text\": \"This file is about XYZ... [1]\"\n          }\n        ],\n        \"metadata\": {\n            \"1\": [\"Text for citation 1\"]\n        },\n        \"index\": 0\n      }\n    ]\n  }\n}\n```\n\n**Multi-turn example**\n\nRequest - Seq 1:\n\n```\nPOST http://localhost:10010\nContent-Type: application/json\n\n{\n  \"jsonrpc\": \"2.0\",\n  \"id\": 11,\n  \"method\": \"tasks/send\",\n  \"params\": {\n    \"id\": \"129\",\n    \"sessionId\": \"8f01f3d172cd4396a0e535ae8aec6687\",\n    \"acceptedOutputModes\": [\n      \"text\"\n    ],\n    \"message\": {\n      \"role\": \"user\",\n      \"parts\": [\n        {\n          \"type\": \"text\",\n          \"text\": \"What does this file talk about?\"\n        },\n        {\n            \"type\": \"file\",\n            \"file\": {\n                \"bytes\": \"...\",\n                \"name\": \"attention.pdf\"\n            }\n        }\n      ]\n    }\n  }\n}\n```\n\nResponse - Seq 2:\n\n```\n{\n  \"jsonrpc\": \"2.0\",\n  \"id\": 11,\n  \"result\": {\n    \"id\": \"129\",\n    \"status\": {\n      \"state\": \"completed\",\n      \"timestamp\": \"2025-04-02T16:53:29.301828\"\n    },\n    \"artifacts\": [\n      {\n        \"parts\": [\n          {\n            \"type\": \"text\",\n            \"text\": \"This file is about XYZ... 
[1]\"\n          }\n        ],\n        \"metadata\": {\n            \"1\": [\"Text for citation 1\"]\n        },\n        \"index\": 0\n      }\n    ]\n  }\n}\n```\n\nRequest - Seq 3:\n\n```\nPOST http://localhost:10010\nContent-Type: application/json\n\n{\n  \"jsonrpc\": \"2.0\",\n  \"id\": 11,\n  \"method\": \"tasks/send\",\n  \"params\": {\n    \"id\": \"130\",\n    \"sessionId\": \"8f01f3d172cd4396a0e535ae8aec6687\",\n    \"acceptedOutputModes\": [\n      \"text\"\n    ],\n    \"message\": {\n      \"role\": \"user\",\n      \"parts\": [\n        {\n          \"type\": \"text\",\n          \"text\": \"What about thing X?\"\n        }\n      ]\n    }\n  }\n}\n```\n\nResponse - Seq 4:\n\n```\n{\n  \"jsonrpc\": \"2.0\",\n  \"id\": 11,\n  \"result\": {\n    \"id\": \"130\",\n    \"status\": {\n      \"state\": \"completed\",\n      \"timestamp\": \"2025-04-02T16:53:29.301828\"\n    },\n    \"artifacts\": [\n      {\n        \"parts\": [\n          {\n            \"type\": \"text\",\n            \"text\": \"Thing X is ... 
[1]\"\n          }\n        ],\n        \"metadata\": {\n            \"1\": [\"Text for citation 1\"]\n        },\n        \"index\": 0\n      }\n    ]\n  }\n}\n```\n\n**Streaming example**\n\nRequest:\n\n```\n{\n  \"jsonrpc\": \"2.0\",\n  \"id\": 11,\n  \"method\": \"tasks/send\",\n  \"params\": {\n    \"id\": \"129\",\n    \"sessionId\": \"8f01f3d172cd4396a0e535ae8aec6687\",\n    \"acceptedOutputModes\": [\n      \"text\"\n    ],\n    \"message\": {\n      \"role\": \"user\",\n      \"parts\": [\n        {\n          \"type\": \"text\",\n          \"text\": \"What does this file talk about?\"\n        },\n        {\n            \"type\": \"file\",\n            \"file\": {\n                \"bytes\": \"...\",\n                \"name\": \"attention.pdf\"\n            }\n        }\n      ]\n    }\n  }\n}\n```\n\nResponse:\n\n```\nstream event =\u003e {\"jsonrpc\":\"2.0\",\"id\":\"367d0ba9af97457890261ac29a0f6f5b\",\"result\":{\"id\":\"373b26d64c5a4f0099fa906c6b7342d9\",\"status\":{\"state\":\"working\",\"message\":{\"role\":\"agent\",\"parts\":[{\"type\":\"text\",\"text\":\"Parsing document...\"}]},\"timestamp\":\"2025-04-15T16:05:18.283682\"},\"final\":false}}\n\nstream event =\u003e {\"jsonrpc\":\"2.0\",\"id\":\"367d0ba9af97457890261ac29a0f6f5b\",\"result\":{\"id\":\"373b26d64c5a4f0099fa906c6b7342d9\",\"status\":{\"state\":\"working\",\"message\":{\"role\":\"agent\",\"parts\":[{\"type\":\"text\",\"text\":\"Document parsed successfully.\"}]},\"timestamp\":\"2025-04-15T16:05:24.200133\"},\"final\":false}}\n\nstream event =\u003e {\"jsonrpc\":\"2.0\",\"id\":\"367d0ba9af97457890261ac29a0f6f5b\",\"result\":{\"id\":\"373b26d64c5a4f0099fa906c6b7342d9\",\"status\":{\"state\":\"working\",\"message\":{\"role\":\"agent\",\"parts\":[{\"type\":\"text\",\"text\":\"Chatting with 1 initial messages.\"}]},\"timestamp\":\"2025-04-15T16:05:24.204757\"},\"final\":false}}\n\nstream event =\u003e 
{\"jsonrpc\":\"2.0\",\"id\":\"367d0ba9af97457890261ac29a0f6f5b\",\"result\":{\"id\":\"373b26d64c5a4f0099fa906c6b7342d9\",\"status\":{\"state\":\"working\",\"message\":{\"role\":\"agent\",\"parts\":[{\"type\":\"text\",\"text\":\"Inserting system prompt...\"}]},\"timestamp\":\"2025-04-15T16:05:24.204810\"},\"final\":false}}\n\nstream event =\u003e {\"jsonrpc\":\"2.0\",\"id\":\"367d0ba9af97457890261ac29a0f6f5b\",\"result\":{\"id\":\"373b26d64c5a4f0099fa906c6b7342d9\",\"status\":{\"state\":\"working\",\"message\":{\"role\":\"agent\",\"parts\":[{\"type\":\"text\",\"text\":\"LLM response received, parsing citations...\"}]},\"timestamp\":\"2025-04-15T16:05:26.084829\"},\"final\":false}}\n\nstream event =\u003e {\"jsonrpc\":\"2.0\",\"id\":\"367d0ba9af97457890261ac29a0f6f5b\",\"result\":{\"id\":\"373b26d64c5a4f0099fa906c6b7342d9\",\"artifact\":{\"parts\":[{\"type\":\"text\",\"text\":\"This file discusses the Transformer, a novel neural network architecture based solely on attention mechanisms, dispensing with recurrence and convolutions entirely [1]. The document compares the Transformer to recurrent and convolutional layers [2], details the model architecture [3], and presents results from machine translation and English constituency parsing tasks [4].\"}],\"metadata\":{\"1\":[\"The dominant sequence transduction models are based on complex recurrent or convolutional neural networks that include an encoder and a decoder. The best performing models also connect the encoder and decoder through an attention mechanism. We propose a new simple network architecture, the Transformer, based solely on attention mechanisms, dispensing with recurrence and convolutions entirely. Experiments on two machine translation tasks show these models to be superior in quality while being more parallelizable and requiring significantly less time to train. 
Our model achieves 28.4 BLEU on the WMT 2014 English-to-German translation task, improving over the existing best results, including ensembles, by over 2 BLEU. On the WMT 2014 English-to-French translation task, our model establishes a new single-model state-of-the-art BLEU score of 41.8 after training for 3.5 days on eight GPUs, a small fraction of the training costs of the best models from the literature. We show that the Transformer generalizes well to other tasks by applying it successfully to English constituency parsing both with large and limited training data.\"],\"2\":[\"In this section we compare various aspects of self-attention layers to the recurrent and convolutional layers commonly used for mapping one variable-length sequence of symbol representations (x1, ..., xn) to another sequence of equal length (z1, ..., zn), with xi, zi ∈ Rd, such as a hidden layer in a typical sequence transduction encoder or decoder. Motivating our use of self-attention we consider three desiderata.\",\"\"],\"3\":[\"# 3 Model Architecture\"],\"4\":[\"# 6   Results\"]},\"index\":0,\"append\":false}}}\n\nstream event =\u003e {\"jsonrpc\":\"2.0\",\"id\":\"367d0ba9af97457890261ac29a0f6f5b\",\"result\":{\"id\":\"373b26d64c5a4f0099fa906c6b7342d9\",\"status\":{\"state\":\"completed\",\"timestamp\":\"2025-04-15T16:05:26.111314\"},\"final\":true}}\n```\n\nYou can see that the workflow produced an artifact with in-line citations, and the source text of those citations is included in the metadata of the artifact. 
If we send more messages in the same session, the agent will remember the previous messages and continue the conversation.\n\n## Learn More\n\n- [A2A Protocol Documentation](https://google.github.io/A2A/#/documentation)\n- [LlamaIndex Workflow Documentation](https://docs.llamaindex.ai/en/stable/understanding/workflows/)\n- [LlamaIndex Workflow Examples](https://docs.llamaindex.ai/en/stable/examples/#agentic-workflows)\n- [LlamaParse Documentation](https://github.com/run-llama/llama_cloud_services/blob/main/parse.md)\n- [OpenRouter API](https://openrouter.ai)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsing1ee%2Fa2a_llama_index_file_chat","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fsing1ee%2Fa2a_llama_index_file_chat","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fsing1ee%2Fa2a_llama_index_file_chat/lists"}