{"id":24719136,"url":"https://github.com/kyopark2014/agentic-rag","last_synced_at":"2025-03-22T11:41:48.325Z","repository":{"id":272686781,"uuid":"917434952","full_name":"kyopark2014/agentic-rag","owner":"kyopark2014","description":"It shows how to realize agentic RAG.","archived":false,"fork":false,"pushed_at":"2025-03-19T01:36:57.000Z","size":1242,"stargazers_count":11,"open_issues_count":0,"forks_count":13,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-19T02:34:14.525Z","etag":null,"topics":["agentic-rag","aws","rag","streamlit","workflow"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kyopark2014.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2025-01-16T01:20:52.000Z","updated_at":"2025-03-19T01:37:00.000Z","dependencies_parsed_at":"2025-01-16T18:49:42.214Z","dependency_job_id":"209f7541-bacb-401b-b526-62690782c11b","html_url":"https://github.com/kyopark2014/agentic-rag","commit_stats":null,"previous_names":["kyopark2014/nova-agentic-rag","kyopark2014/agentic-rag"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kyopark2014%2Fagentic-rag","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kyopark2014%2Fagentic-rag/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kyopark2014%2Fagentic-rag/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kyopark2014%2Fagentic-rag/m
anifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kyopark2014","download_url":"https://codeload.github.com/kyopark2014/agentic-rag/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":244952554,"owners_count":20537467,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agentic-rag","aws","rag","streamlit","workflow"],"created_at":"2025-01-27T11:16:50.747Z","updated_at":"2025-03-22T11:41:48.318Z","avatar_url":"https://github.com/kyopark2014.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Implementing Agentic RAG\n\n\u003cp align=\"left\"\u003e\n    \u003ca href=\"https://hits.seeyoufarm.com\"\u003e\u003cimg src=\"https://hits.seeyoufarm.com/api/count/incr/badge.svg?url=https%3A%2F%2Fgithub.com%2Fkyopark2014%2Fagentic-rag\u0026count_bg=%2379C83D\u0026title_bg=%23555555\u0026icon=\u0026icon_color=%23E7E7E7\u0026title=hits\u0026edge_flat=false\"/\u003e\u003c/a\u003e\n    \u003cimg alt=\"License\" src=\"https://img.shields.io/badge/license-Apache%202.0-blue\"\u003e\n\u003c/p\u003e\n\nThis repository explains how to implement Agentic RAG, Corrective RAG, and Self RAG, three techniques for improving RAG performance. It also covers the data-ingestion side of RAG: handling PDF headers/footers, extracting and analyzing images, and using contextual retrieval. Together these let you collect and use data effectively for generative AI applications. The workflows are built with [LangGraph](https://langchain-ai.github.io/langgraph/), an open-source LLM framework, and can be developed and tested with [Streamlit](https://streamlit.io/). 
[AWS CDK](https://docs.aws.amazon.com/cdk/api/v2/docs/aws-construct-library.html) deploys the whole stack in one step, and the [CloudFront](https://docs.aws.amazon.com/AmazonCloudFront/latest/DeveloperGuide/Introduction.html) - ALB structure provides secure access over HTTPS. \n\n## System Architecture \n\nThe overall architecture is shown below. The EC2 instance running Streamlit sits in a private subnet and is reached from outside through CloudFront and ALB. RAG uses OpenSearch. Internet search uses Tavily, and a weather API is used as well.\n\n\u003cimg width=\"800\" alt=\"image\" src=\"https://github.com/user-attachments/assets/3353ade1-db6e-4d30-baa5-be78d9820418\" /\u003e\n\nLambda-Document parses each uploaded document and pushes it to OpenSearch. The event-driven data processing flow is shown below.\n\n![image](https://github.com/user-attachments/assets/f89ac20b-91e3-490c-a34f-703c2957b022)\n\n\n## Implementation Details\n\nThe agentic workflow (tool use) can be implemented as shown below. See [chat.py](./application/chat.py) for details.\n\n### Basic Chat\n\nFor ordinary conversation, the result can be obtained as a stream, as shown below. This uses LangChain's ChatBedrock. The model ID selects which model to use. The example below uses \"us.amazon.nova-pro-v1:0\", the model ID of Nova Pro. Nova models are fast relative to models in the same class, cost-effective, and offer strong multimodal performance. 
To use Claude 3.5 Sonnet instead, set the model ID to \"anthropic.claude-3-5-sonnet-20240620-v1:0\".\n \n```python\nimport boto3\nfrom botocore.config import Config\nfrom langchain_aws import ChatBedrock\nfrom langchain_core.output_parsers import StrOutputParser\nfrom langchain_core.prompts import ChatPromptTemplate, MessagesPlaceholder\n\nmodelId = \"us.amazon.nova-pro-v1:0\"\nbedrock_region = \"us-west-2\"\nboto3_bedrock = boto3.client(\n    service_name='bedrock-runtime',\n    region_name=bedrock_region,\n    config=Config(\n        retries = {\n            'max_attempts': 30\n        }\n    )\n)\nparameters = {\n    \"max_tokens\":maxOutputTokens,     \n    \"temperature\":0.1,\n    \"top_k\":250,\n    \"top_p\":0.9,\n    \"stop_sequences\": [\"\\n\\n\u003cthinking\u003e\", \"\\n\u003cthinking\u003e\", \" \u003cthinking\u003e\"]\n}\n\nchat = ChatBedrock(  \n    model_id=modelId,\n    client=boto3_bedrock, \n    model_kwargs=parameters,\n    region_name=bedrock_region\n)\n\nsystem = (\n    \"당신의 이름은 서연이고, 질문에 대해 친절하게 답변하는 사려깊은 인공지능 도우미입니다.\"\n    \"상황에 맞는 구체적인 세부 정보를 충분히 제공합니다.\" \n    \"모르는 질문을 받으면 솔직히 모른다고 말합니다.\"\n)\n\nhuman = \"Question: {input}\"\n\nprompt = ChatPromptTemplate.from_messages([\n    (\"system\", system), \n    MessagesPlaceholder(variable_name=\"history\"), \n    (\"human\", human)\n])\n\n# Load the previous conversation turns from memory\nhistory = memory_chain.load_memory_variables({})[\"chat_history\"]\n\nchain = prompt | chat | StrOutputParser()\nstream = chain.stream(\n    {\n        \"history\": history,\n        \"input\": query,\n    }\n)\n# chain.stream() returns a generator; iterate over it to consume the text chunks\nfor chunk in stream:\n    print(chunk, end=\"\")\n```\n\n### Basic RAG\n\nOpenSearch is used here to implement RAG. 
\n\nRelevant documents are retrieved with LangChain's [OpenSearchVectorSearch](https://sj-langchain.readthedocs.io/en/latest/vectorstores/langchain.vectorstores.opensearch_vector_search.OpenSearchVectorSearch.html).\n\n```python\nvectorstore_opensearch = OpenSearchVectorSearch(\n    index_name = index_name,\n    is_aoss = False,\n    ef_search = 1024,\n    m=48,\n    embedding_function = bedrock_embedding,\n    opensearch_url=opensearch_url,\n    http_auth=(opensearch_account, opensearch_passwd)\n)\nrelevant_documents = vectorstore_opensearch.similarity_search_with_score(\n    query = query,\n    k = top_k\n)\n# Each result is a (Document, score) tuple\nfor i, document in enumerate(relevant_documents):\n    name = document[0].metadata['name']\n    url = document[0].metadata['url']\n    content = document[0].page_content\n```\n\nTo judge whether the retrieved documents are actually relevant, a grading prompt assesses each document against the question, and [structured output](https://github.com/kyopark2014/langgraph-agent/blob/main/structured-output.md) returns the verdict as a binary score.\n\n```python\nsystem = (\n    \"You are a grader assessing relevance of a retrieved document to a user question.\"\n    \"If the document contains keyword(s) or semantic meaning related to the question, grade it as relevant.\"\n    \"Give a binary score 'yes' or 'no' score to indicate whether the document is relevant to the question.\"\n)\n\ngrade_prompt = ChatPromptTemplate.from_messages(\n    [\n        (\"system\", system),\n        (\"human\", \"Retrieved document: \\n\\n {document} \\n\\n User question: {question}\"),\n    ]\n)\n\n# GradeDocuments is the structured-output schema; it exposes a `binary_score` field\nstructured_llm_grader = chat.with_structured_output(GradeDocuments)\nretrieval_grader = grade_prompt | structured_llm_grader\n\nfiltered_docs = []\nfor i, doc in enumerate(documents):\n    score = retrieval_grader.invoke({\"question\": question, \"document\": doc.page_content})\n                \n    grade = score.binary_score\n    if grade.lower() == \"yes\":\n        print(\"---GRADE: DOCUMENT RELEVANT---\")\n        filtered_docs.append(doc)\n    else:\n        print(\"---GRADE: DOCUMENT NOT 
RELEVANT---\")\n        continue\n```\n\nThe documents that pass grading are then used as RAG context to generate the answer, as shown below.\n\n```python\nsystem = (\n  \"당신의 이름은 서연이고, 질문에 대해 친절하게 답변하는 사려깊은 인공지능 도우미입니다.\"\n  \"다음의 Reference texts을 이용하여 user의 질문에 답변합니다.\"\n  \"모르는 질문을 받으면 솔직히 모른다고 말합니다.\"\n  \"답변의 이유를 풀어서 명확하게 설명합니다.\"\n)\n# Adjacent string literals are concatenated, so separate the two parts explicitly\nhuman = (\n    \"Question: {input}\\n\\n\"\n    \"Reference texts: \\n\"\n    \"{context}\"\n)    \nprompt = ChatPromptTemplate.from_messages([(\"system\", system), (\"human\", human)])\nchain = prompt | chat\nresponse = chain.invoke(\n    {\n        \"context\": context,\n        \"input\": revised_question,\n    }\n)\nprint(response.content)    \n```\n\n### Improving RAG Performance\n\n#### Hierarchical Chunking (Parent-Child Chunking)\n\nA document is split by size into parent chunks and child chunks. Searching over the small child chunks and then passing the matching parent chunk to the LLM as context raises retrieval accuracy while still giving the model enough surrounding text. The code below first creates the parent docs and then creates child docs from each of them. Each child doc stores the id of its parent doc in its metadata. The parent and child document ids are saved so that they can be used when a document is deleted or updated. See [Lambda-Document](https://github.com/kyopark2014/agentic-rag/blob/main/lambda-document-manager/lambda_function.py) for the detailed code.\n\n```python\nparent_splitter = RecursiveCharacterTextSplitter(\n    chunk_size=2000,\n    chunk_overlap=100,\n    separators=[\"\\n\\n\", \"\\n\", \".\", \" \", \"\"],\n    length_function = len,\n)\nchild_splitter = RecursiveCharacterTextSplitter(\n    chunk_size=400,\n    chunk_overlap=50,\n    # separators=[\"\\n\\n\", \"\\n\", \".\", \" \", \"\"],\n    length_function = len,\n)\nparent_docs = parent_splitter.split_documents(docs)\nparent_doc_ids = vectorstore.add_documents(parent_docs, bulk_size = 10000)\nids = parent_doc_ids\n\nfor i, doc in enumerate(parent_docs):\n    _id = parent_doc_ids[i]\n    sub_docs = child_splitter.split_documents([doc])\n    for _doc in sub_docs:\n        _doc.metadata[\"parent_doc_id\"] = _id\n        _doc.metadata[\"doc_level\"] = \"child\"\n\n    child_doc_ids = vectorstore.add_documents(sub_docs, bulk_size = 10000)\n    ids += 
child_doc_ids\n```\n\nAs in [chat.py](https://github.com/kyopark2014/agentic-rag/blob/main/application/chat.py), child documents are searched with pre_filter, and the parent document identified by parent_doc_id is used as context. Because several child docs can be selected from the same parent doc, duplicates are detected by parent_doc_id and removed.\n\n```python\nresult = vectorstore_opensearch.similarity_search_with_score(\n    query = query,\n    k = top_k*2,  \n    search_type=\"script_scoring\",\n    pre_filter={\"term\": {\"metadata.doc_level\": \"child\"}}\n)\nrelevant_documents = []\ndocList = []  # parent ids already selected\nfor hit in result:  # each hit is a (Document, score) tuple\n    if 'parent_doc_id' in hit[0].metadata:\n        parent_doc_id = hit[0].metadata['parent_doc_id']\n        doc_level = hit[0].metadata['doc_level']\n        if doc_level == 'child':\n            if parent_doc_id in docList:\n                print('duplicated!')\n            else:\n                relevant_documents.append(hit)\n                docList.append(parent_doc_id)\n                if len(relevant_documents)\u003e=top_k:\n                    break\n```\n\nThe parent_doc_id is then read from each retrieved child document, and the parent document is fetched and used.\n\n```python\ndef get_parent_content(parent_doc_id):\n    # Fetch the parent document from OpenSearch by its id\n    response = os_client.get(\n        index = index_name, \n        id = parent_doc_id\n    )    \n    source = response['_source']                                \n    return source['text']\n\nfor i, document in enumerate(relevant_documents):\n    parent_doc_id = document[0].metadata['parent_doc_id']\n    doc_level = document[0].metadata['doc_level']    \n    content = get_parent_content(parent_doc_id)\n```\n\n#### Removing Headers / Footers\n\n[Header/Footer](https://github.com/kyopark2014/ocean-agent/blob/main/lambda-document-manager/lambda_function.py) shows how to crop the header/footer out of a page image. 
Headers and footers differ from document to document, so the header/footer height has to be specified for each document.\n\n```python\nimage_obj = s3_client.get_object(Bucket=s3_bucket, Key=key)\n                    \nimage_content = image_obj['Body'].read()\nimg = Image.open(BytesIO(image_content))\n                    \nwidth, height = img.size     \npos = key.rfind('/')\nprefix = key[pos+1:pos+5]\nprint('img_prefix: ', prefix)    \nif pdf_profile=='ocean' and prefix == \"img_\":\n    # Crop 175 px from the top (header) and from the bottom (footer)\n    area = (0, 175, width, height-175)\n    img = img.crop(area)\n        \n    width, height = img.size \n            \nif width \u003c 100 or height \u003c 100:  # skip small size image\n    return []\n```\n\n#### Using Images/Tables with a Multimodal Model\n\nImages and tables in a document can carry important information that the body text does not. Images are extracted from documents such as PDFs and used in RAG. Each image is resized so that the LLM can process it, and then text is extracted from it. \n\n```python\nisResized = False\n# Halve the dimensions until the pixel count fits the limit\nwhile(width*height \u003e 5242880):\n    width = int(width/2)\n    height = int(height/2)\n    isResized = True\n       \nif isResized:\n    img = img.resize((width, height))\n                     \nbuffer = BytesIO()\nimg.save(buffer, format=\"PNG\")\nimg_base64 = base64.b64encode(buffer.getvalue()).decode(\"utf-8\")\n                                                                \nsummary = summary_image(img_base64)\n```\n\nText is extracted with a prompt that asks the model to describe what the image shows, as below.\n\n```python\ndef summary_image(img_base64):\n    chat = get_chat()\n    query = \"이미지가 의미하는 내용을 풀어서 자세히 알려주세요. 
markdown 포맷으로 답변을 작성합니다.\"\n        \n    messages = [\n        HumanMessage(\n            content=[\n                {\n                    \"type\": \"image_url\",\n                    \"image_url\": {\n                        \"url\": f\"data:image/png;base64,{img_base64}\", \n                    },\n                },\n                {\n                    \"type\": \"text\", \"text\": query\n                },\n            ]\n        )\n    ]\n    \n    result = chat.invoke(messages)\n    return result.content\n```\n\n#### Contextual Embedding \n\nAs described in [Contextual Retrieval](https://www.anthropic.com/news/contextual-retrieval), contextual embedding prepends a short description of each chunk's place in the document, which improves retrieval accuracy. BM25 (keyword) search can be implemented through OpenSearch's hybrid search. See [lambda_function.py](./lambda-document-manager/lambda_function.py) for the detailed code.\n\n```python\ndef get_contexual_docs(whole_doc, splitted_docs):\n    contextual_template = (\n        \"\u003cdocument\u003e\"\n        \"{WHOLE_DOCUMENT}\"\n        \"\u003c/document\u003e\"\n        \"Here is the chunk we want to situate within the whole document.\"\n        \"\u003cchunk\u003e\"\n        \"{CHUNK_CONTENT}\"\n        \"\u003c/chunk\u003e\"\n        \"Please give a short succinct context to situate this chunk within the overall document for the purposes of improving search retrieval of the chunk.\"\n        \"Answer only with the succinct context and nothing else.\"\n        \"Put it in \u003cresult\u003e tags.\"\n    )          \n    \n    contextual_prompt = ChatPromptTemplate([\n        ('human', contextual_template)\n    ])\n\n    docs = []\n    for i, doc in enumerate(splitted_docs):        \n        chat = get_contexual_retrieval_chat()\n        \n        contexual_chain = contextual_prompt | chat\n            \n        response = contexual_chain.invoke(\n            {\n                \"WHOLE_DOCUMENT\": whole_doc.page_content,\n                \"CHUNK_CONTENT\": doc.page_content\n            }\n        )\n  
      output = response.content\n        # Keep only the text between the \u003cresult\u003e tags\n        start = output.find('\u003cresult\u003e') + len('\u003cresult\u003e')\n        end = output.find('\u003c/result\u003e', start)\n        contextualized_chunk = output[start:end] if end != -1 else output[start:]\n        \n        docs.append(\n            Document(\n                page_content=contextualized_chunk+\"\\n\\n\"+doc.page_content,\n                metadata=doc.metadata\n            )\n        )\n    return docs\n```\n\n### Code Interpreter\n\nAsking \"How many 'r's are in Strawberry?\" makes the code interpreter generate and run code, which produces the result below.\n\n\u003cimg width=\"550\" alt=\"image\" src=\"https://github.com/user-attachments/assets/0b1f6ccd-618a-453b-8d63-f71bcf7ffa0b\" /\u003e\n\nThe code that was executed is shown below.\n\n```python\nos.environ[ 'MPLCONFIGDIR' ] = '/tmp/'\nword = \"Strawberry\"\nr_count = word.lower().count('r')\nprint(f\"'Strawberry'에서 'r'의 개수는 {r_count}개 입니다.\")\n```\n\nThe trace observed in LangSmith is shown below.\n\n![noname](https://github.com/user-attachments/assets/ef3f4d7b-620b-4257-a0b9-975598b27783)\n\n\n### Agentic RAG\n\nA tool-use agent made up of nodes, edges, and conditional edges can be implemented from an activity diagram like the one below.\n\n\u003cimg width=\"300\" alt=\"image\" src=\"https://github.com/user-attachments/assets/59c8dc05-c79c-4f63-b1ab-964dec259203\"/\u003e\n\n\nThe workflow of the tool-use agent is shown below. It consists of a call-model node that selects a function and a tool node that executes it. Depending on the result of the selected tool, the agent either cycles back for another step or ends and returns the result.\n\n```python\nfrom langchain_core.messages import HumanMessage\nfrom langgraph.graph import StateGraph, START, END\n\nworkflow = StateGraph(State)\n\nworkflow.add_node(\"agent\", call_model)\nworkflow.add_node(\"action\", tool_node)\nworkflow.add_edge(START, \"agent\")\nworkflow.add_conditional_edges(\n    \"agent\",\n    should_continue,\n    {\n        \"continue\": \"action\",\n        \"end\": END,\n    },\n)\n# After a tool runs, return to the agent so it can decide the next step\nworkflow.add_edge(\"action\", \"agent\")\n\napp = workflow.compile()\ninputs = [HumanMessage(content=query)]\nconfig = {\n    \"recursion_limit\": 50\n}\nmessage = app.invoke({\"messages\": inputs}, config)\n```\n\nA tool-use agent chooses the tool that fits the task based on the docstrings of the defined tool functions. 
The search_by_opensearch function below is an example of a tool that retrieves relevant documents using OpenSearch as the data store. Because its docstring says \"Search technical information by keyword\", search_by_opensearch is called when the question is technical.\n\n```python\n@tool    \ndef search_by_opensearch(keyword: str) -\u003e str:\n    \"\"\"\n    Search technical information by keyword and then return the result as a string.\n    keyword: search keyword\n    return: the technical information of keyword\n    \"\"\"    \n    \n    # Strip characters that would disturb the search query\n    keyword = keyword.replace('\\'','')\n    keyword = keyword.replace('|','')\n    keyword = keyword.replace('\\n','')\n    \n    # retrieve\n    relevant_docs = rag.retrieve_documents_from_opensearch(keyword, top_k=2)                            \n\n    # grade  \n    filtered_docs = chat.grade_documents(keyword, relevant_docs)\n\n    # Keep the graded documents so they can be shown as references\n    global reference_docs\n    if len(filtered_docs):\n        reference_docs += filtered_docs\n       \n    relevant_context = \"\" \n    for doc in filtered_docs:\n        content = doc.page_content        \n        relevant_context = relevant_context + f\"{content}\\n\\n\"\n        \n    return relevant_context  \n```\n\nAfter the tool functions are collected in tools, the call_model node is defined with [bind_tools](https://python.langchain.com/docs/how_to/chat_models_universal_init/#using-a-configurable-model-declaratively), as shown below. 
\n\n```python\ntools = [get_current_time, get_book_list, get_weather_info, search_by_tavily, search_by_knowledge_base]        \n\ndef call_model(state: State, config):\n    system = (\n        \"당신의 이름은 서연이고, 질문에 친근한 방식으로 대답하도록 설계된 대화형 AI입니다.\"\n        \"상황에 맞는 구체적인 세부 정보를 충분히 제공합니다.\"\n        \"모르는 질문을 받으면 솔직히 모른다고 말합니다.\"\n    )\n    \n    prompt = ChatPromptTemplate.from_messages(\n        [\n            (\"system\", system),\n            MessagesPlaceholder(variable_name=\"messages\"),\n        ]\n    )\n\n    # Expose the tools to the model so it can emit tool calls\n    model = chat.bind_tools(tools)\n    chain = prompt | model\n        \n    response = chain.invoke(state[\"messages\"])\n\n    return {\"messages\": [response]}\n```\n\nThe tool node is defined with [ToolNode](https://langchain-ai.github.io/langgraph/reference/prebuilt/#langgraph.prebuilt.tool_node.ToolNode), as shown below.\n\n```python\nfrom langgraph.prebuilt import ToolNode\n\ntool_node = ToolNode(tools)\n```\n\n\n### Corrective RAG\n\n[Corrective RAG (CRAG)](https://github.com/kyopark2014/langgraph-agent/blob/main/corrective-rag-agent.md) strengthens RAG by grading the retrieved documents and, when grading calls for it, rewriting the question and adding the results of an internet search before generation. \n\n![image](https://github.com/user-attachments/assets/27228159-b307-4588-8a8a-61d8deaa90e3)\n\nThe CRAG workflow is shown below. 
\n\n```python\nworkflow = StateGraph(State)\n    \n# Define the nodes\nworkflow.add_node(\"retrieve\", retrieve_node)  \nworkflow.add_node(\"grade_documents\", grade_documents_node)\nworkflow.add_node(\"generate\", generate_node)\nworkflow.add_node(\"rewrite\", rewrite_node)\nworkflow.add_node(\"websearch\", web_search_node)\n\n# Build graph\nworkflow.set_entry_point(\"retrieve\")\nworkflow.add_edge(\"retrieve\", \"grade_documents\")\nworkflow.add_conditional_edges(\n    \"grade_documents\",\n    decide_to_generate,\n    {\n        \"rewrite\": \"rewrite\",\n        \"generate\": \"generate\",\n    },\n)\nworkflow.add_edge(\"rewrite\", \"websearch\")\nworkflow.add_edge(\"websearch\", \"generate\")\nworkflow.add_edge(\"generate\", END)\n```\n\n### Self RAG\n\n[Self RAG](https://github.com/kyopark2014/langgraph-agent/blob/main/self-rag.md) generates an answer after retrieve/grading. Depending on the grading result it rewrites the question and retrieves again, and it checks whether the generated answer is a hallucination and whether it adequately answers the question, repeating rewrite / retrieve as needed. 
\n\n![image](https://github.com/user-attachments/assets/b1f2db6c-f23f-4382-86f6-0fa7d3fe0595)\n\nThe Self RAG workflow is shown below.\n\n```python\nworkflow = StateGraph(State)\n            \n# Define the nodes\nworkflow.add_node(\"retrieve\", retrieve_node)  \nworkflow.add_node(\"grade_documents\", grade_documents_node)\nworkflow.add_node(\"generate\", generate_node)\nworkflow.add_node(\"rewrite\", rewrite_node)\n\n# Build graph\nworkflow.set_entry_point(\"retrieve\")\nworkflow.add_edge(\"retrieve\", \"grade_documents\")\nworkflow.add_conditional_edges(\n    \"grade_documents\",\n    decide_to_generate,\n    {\n        \"no document\": \"rewrite\",\n        \"document\": \"generate\",\n        \"not available\": \"generate\",\n    },\n)\nworkflow.add_edge(\"rewrite\", \"retrieve\")\nworkflow.add_conditional_edges(\n    \"generate\",\n    grade_generation,\n    {\n        \"not supported\": \"generate\",\n        \"useful\": END,\n        \"not useful\": \"rewrite\",\n        \"not available\": END,\n    },\n)\n```\n\n### Self Corrective RAG\n\nSelf Corrective RAG, like Self RAG, checks after retrieve / generate whether the answer is a hallucination and whether it is adequate. When needed, it rewrites the question or runs an internet search to improve RAG performance. \n\n![image](https://github.com/user-attachments/assets/9a18f7f9-0249-42f7-983e-c5a7f9d18682)\n\nThe Self Corrective RAG workflow is shown below. 
\n\n```python\nworkflow = StateGraph(State)\n            \n# Define the nodes\nworkflow.add_node(\"retrieve\", retrieve_node)  \nworkflow.add_node(\"generate\", generate_node) \nworkflow.add_node(\"rewrite\", rewrite_node)\nworkflow.add_node(\"websearch\", web_search_node)\nworkflow.add_node(\"finalize_response\", finalize_response_node)\n\n# Build graph\nworkflow.set_entry_point(\"retrieve\")\nworkflow.add_edge(\"retrieve\", \"generate\")\nworkflow.add_edge(\"rewrite\", \"retrieve\")\nworkflow.add_edge(\"websearch\", \"generate\")\nworkflow.add_edge(\"finalize_response\", END)\n\nworkflow.add_conditional_edges(\n    \"generate\",\n    grade_generation,\n    {\n        \"generate\": \"generate\",\n        \"websearch\": \"websearch\",\n        \"rewrite\": \"rewrite\",\n        \"finalize_response\": \"finalize_response\",\n    },\n)\n```\n\n\n### Usage\n\nThe EC2 instance is in a private subnet, so it cannot be reached over SSH. Instead, open [Console-EC2](https://us-west-2.console.aws.amazon.com/ec2/home?region=us-west-2#Instances:), select \"app-for-llm-streamlit\", and connect by choosing Session Manager under Connect. \n\nIf the app code has been updated on GitHub, connect through Session Manager and pull the update with the command below. \n\n```text\nsudo runuser -l ec2-user -c 'cd /home/ec2-user/agentic-rag \u0026\u0026 git pull'\n```\n\nIf Streamlit needs to be restarted, stop and start the service with the commands below and then check its status.\n\n```text\nsudo systemctl stop streamlit\nsudo systemctl start streamlit\nsudo systemctl status streamlit -l\n```\n\nTo debug quickly on a local machine, follow [Running locally](https://github.com/kyopark2014/agentic-rag/blob/main/deployment.md#local%EC%97%90%EC%84%9C-%EC%8B%A4%ED%96%89%ED%95%98%EA%B8%B0) to set up the required packages and environment variables. Then run the app with the command below.\n\n```text\nstreamlit run application/app.py\n```\n\nThe following commands are for developing on EC2 while debugging.\n\nFirst, stop the streamlit service registered with the system.\n\n```text\nsudo systemctl stop streamlit\n```\n\nThen connect to the EC2 instance through Session Manager and start the app with the command below, which lets you watch the logs while making changes. 
\n\n```text\nsudo runuser -l ec2-user -c \"/home/ec2-user/.local/bin/streamlit run /home/ec2-user/agentic-rag/application/app.py\"\n```\n\n\n## Hands-on\n\n### Prerequisites\n\nThe following must be prepared before using this solution.\n\n- Prepare an account by following [Create an AWS Account](https://repost.aws/ko/knowledge-center/create-and-activate-aws-account).\n\n### Installing the Infrastructure with CDK\n\nThis hands-on uses the us-west-2 region. Follow [Infrastructure installation](./deployment.md) to install the infrastructure with CDK. \n\n## Results\n\nThe menu offers the items shown below.\n\n![image](https://github.com/user-attachments/assets/53049a34-68c8-4506-8b1e-d34ca70a6a7f)\n\n\nIn the menu, select [이미지 분석] (Image Analysis) and choose [Claude 3.5 Sonnet] as the model, then download the [photo of people waiting](./contents/waiting_people.jpg) and upload it. Entering \"How many people are in the photo?\" then gives the result below.\n\n\u003cimg width=\"600\" alt=\"image\" src=\"https://github.com/user-attachments/assets/3e1ea017-4e46-4340-87c6-4ebf019dae4f\" /\u003e\n\n\n\n### RAG (Knowledge Base)\n\n\n\n### Reference \n\n[Nova Pro User Guide](https://docs.aws.amazon.com/pdfs/nova/latest/userguide/nova-ug.pdf)\n\n[7 Agentic RAG System Architectures](https://www.linkedin.com/posts/greg-coquillo_7-agentic-rag-system-architectures-ugcPost-7286098967434534912-F2Ek/?utm_source=share\u0026utm_medium=member_android)\n\n[how multi-agent agentic RAG systems work](https://www.linkedin.com/posts/pavan-belagatti_lets-understand-how-multi-agent-agentic-activity-7286068649101008896-DmDB/?utm_source=share\u0026utm_medium=member_android)\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkyopark2014%2Fagentic-rag","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkyopark2014%2Fagentic-rag","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkyopark2014%2Fagentic-rag/lists"}