{"id":13813567,"url":"https://github.com/index-labs/evalgpt","last_synced_at":"2025-04-09T17:25:34.740Z","repository":{"id":192810353,"uuid":"687362389","full_name":"index-labs/evalgpt","owner":"index-labs","description":"EvalGPT is an code interpreter framework that utilizes large language models to automate the process of code-writing and execution, delivering precise results for user-defined tasks.","archived":false,"fork":false,"pushed_at":"2023-09-17T18:14:52.000Z","size":365,"stargazers_count":250,"open_issues_count":0,"forks_count":12,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-04-02T10:50:28.579Z","etag":null,"topics":["agent","chagpt","claude","code-interpreter","gpt-4","llm"],"latest_commit_sha":null,"homepage":"https://copilothub.ai","language":"Go","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/index-labs.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-09-05T07:50:41.000Z","updated_at":"2024-12-15T04:48:55.000Z","dependencies_parsed_at":null,"dependency_job_id":"fb5d4a5a-99a4-478f-ac11-544a1eb6bf26","html_url":"https://github.com/index-labs/evalgpt","commit_stats":null,"previous_names":["index-labs/evalgpt"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/index-labs%2Fevalgpt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/index-labs%2Fevalgpt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/index-labs%2Fevalgpt/releases","manifests_url":"https://re
pos.ecosyste.ms/api/v1/hosts/GitHub/repositories/index-labs%2Fevalgpt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/index-labs","download_url":"https://codeload.github.com/index-labs/evalgpt/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248075928,"owners_count":21043668,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["agent","chagpt","claude","code-interpreter","gpt-4","llm"],"created_at":"2024-08-04T04:01:21.648Z","updated_at":"2025-04-09T17:25:34.721Z","avatar_url":"https://github.com/index-labs.png","language":"Go","funding_links":[],"categories":["Go"],"sub_categories":[],"readme":"\u003cdiv align=\"center\"\u003e\n\n# EvalGPT\n\n\u003c/div\u003e\n\n## What is EvalGPT\n\n🧩 This project is still in the early stages of development, and we are actively working on it. If you have any questions or suggestions, please submit an issue or PR.\n\nEvalGPT is a code interpreter framework that leverages large language models such as GPT-4, CodeLlama, and Claude 2. Users describe a task, and EvalGPT writes the code, executes it, and delivers the results.\n\n![](images/architecture.png)\n\nEvalGPT's architecture draws inspiration from [Google's Borg system](https://research.google/pubs/pub43438/). 
It includes a master node, known as EvalGPT, composed of three components: planning, scheduler, and memory.\n\nWhen EvalGPT receives a request, it starts planning the task using a Large Language Model (LLM), dividing larger tasks into smaller, manageable ones. For each sub-task, EvalGPT will spawn a new node known as an EvalAgent.\n\nEach EvalAgent is responsible for generating the code based on the assigned small task. Once the code is generated, the EvalAgent initiates a runtime to execute the code, even harnessing external tools when necessary. The results are then collected by the EvalAgent.\n\nEvalAgent nodes can access the memory from the EvalGPT master node, allowing for efficient and effective communication. If an EvalAgent encounters any errors during the process, it reports the error to the EvalGPT master node, which then replans the task to avoid the error.\n\nFinally, the EvalGPT master node collates all results from the EvalAgent nodes and generates the final answer for the request.\n\n## Benefits\n\n1. **Automated Code Writing**: EvalGPT leverages advanced language models to auto-generate code, reducing manual effort and increasing productivity.\n2. **Efficient Task Execution**: By breaking down complex tasks into manageable sub-tasks, EvalGPT ensures efficient and parallel execution, speeding up the overall process.\n3. **Robust Error Handling**: With its ability to replan tasks in case of errors, EvalGPT ensures reliable operation and accurate results.\n4. **Scalability**: EvalGPT is built to handle tasks of varying complexity, making it a scalable solution for a wide range of coding needs.\n5. **Resource Optimization**: Inspired by Google Borg's resource management, EvalGPT optimally utilizes computational resources, leading to improved performance.\n6. 
**Extensibility**: With the ability to incorporate external tools into its runtime, EvalGPT is highly adaptable and can be extended to handle a diverse range of tasks.\n\n## Demo\n\nhttps://github.com/index-labs/evalgpt/assets/7857126/73417c1f-8866-47fb-951a-7fd03c9dbf41\n\n## Quick Start 🚀\n\n### Install `evalgpt`\n\nYou can install evalgpt using the following command:\n\n```bash\ngo install github.com/index-labs/evalgpt@latest\n```\n\nYou can verify the installation by running the following command:\n\n```bash\nevalgpt -h\n```\n\n### Build it from source code\n\n```bash\ngit clone https://github.com/index-labs/evalgpt.git\n\ncd evalgpt\n\ngo mod tidy \u0026\u0026 go mod vendor\n\nmkdir -p ./bin\n\ngo build -o ./bin/evalgpt ./*.go\n\n./bin/evalgpt -h\n```\n\nThe binary is then available in the `bin` directory.\n\n### Configuration\n\nAfter installing the evalgpt command-line tool, configure the following options before running it:\n\n**Configure OpenAI API Key**\n\n```bash\nexport OPENAI_API_KEY=sk_******\n```\n\nYou can also pass the OpenAI API key as a command-line argument, but this is not recommended:\n\n```bash\nevalgpt --openai-api-key sk_***** -q \u003cquery\u003e\n\n```\n\n**Configure Python Interpreter**\n\nBy default, the code interpreter uses the system's Python interpreter. However, you can create a completely new Python\ninterpreter using Python's virtual environment tools and configure it accordingly.\n\n```bash\npython3 -m venv /path/evalgpt/venv\n# install third-party Python libraries\n/path/evalgpt/venv/bin/pip3 install -r requirements.txt\n\n# configure the Python interpreter\nexport PYTHON_INTERPRETER=/path/evalgpt/venv/bin/python3\n```\n\nor\n\n```bash\nevalgpt --python-interpreter /path/evalgpt/venv/bin/python3 -q \u003cquery\u003e\n```\n\n**Note:**\n\nBefore tackling complex tasks, make sure to install the necessary Python third-party libraries. 
This equips your code\ninterpreter to handle the corresponding tasks, boosting efficiency and ensuring smooth operation.\n\n## Usage\n\n**Help**\n\n```bash\n\u003e evalgpt -h\nNAME:\n   evalgpt help - A new cli application\n\nUSAGE:\n   evalgpt help [global options] command [command options] [arguments...]\n\nDESCRIPTION:\n   description\n\nCOMMANDS:\n   help, h  Shows a list of commands or help for one command\n\nGLOBAL OPTIONS:\n   --openai-api-key value         Openai Api Key, if you use open ai model gpt3 or gpt4, you must set this flag [$OPENAI_API_KEY]\n   --model value                  LLM name (default: \"gpt-4-0613\") [$MODEL]\n   --python-interpreter value     python interpreter path (default: \"/usr/bin/python3\") [$PYTHON_INTERPRETER]\n   --verbose, -v                  print verbose log (default: false) [$VERBOSE]\n   --query value, -q value        what you want to ask\n   --file value [ --file value ]  the path to the file to be parsed and processed, eg. --file /tmp/a.txt --file /tmp/b.txt\n   --help, -h                     show help\n```\n\n**Note:**\n\nRemember to configure the OpenAI API key and Python interpreter before executing the code interpreter. The following\nexamples assume that environment variables for the OpenAI API key and the Python interpreter are already set.\n\n**Simple Query**\n\nGet the public IP address of the machine:\n\n```bash\n❯ evalgpt -q 'get the public IP of my computer'\nYour public IP is: 104.28.240.133\n```\n\nCalculate the SHA-256 hash of a string:\n\n```bash\n❯ evalgpt -q 'calculate the sha256 of the \"hello,world\"'\n77df263f49123356d28a4a8715d25bf5b980beeeb503cab46ea61ac9f3320eda\n```\n\nGet the title of a website:\n\n```bash\n❯ evalgpt -q \"get the title of a website: https://arxiv.org/abs/2302.04761\" -v\n[2302.04761] Toolformer: Language Models Can Teach Themselves to Use Tools\n```\n\n**Pipeline**\n\nYou can use a pipe to supply context data and query it:\n\n```bash\n\u003e cat 
a.csv\ndate,dau\n2023-08-20,1000\n2023-08-21,900\n2023-08-22,1100\n2023-08-23,2000\n2023-08-24,1800\n\n\u003e cat a.csv | evalgpt -q 'calculate the average dau'\nAverage DAU:  1360.0\n```\n\n**Interact with files**\n\nConvert a PNG file to a WebP file:\n\n```bash\n\u003e ls\na.png\n\n\u003e evalgpt -q 'convert this png file to webp' --file ./a.png\ncreated file: a.webp\n\n\u003e ls\na.png a.webp\n```\n\nDraw a line graph based on the data from the CSV:\n\n```bash\n\u003e cat a.csv\ndate,dau\n2023-08-20,1000\n2023-08-21,900\n2023-08-22,1100\n2023-08-23,2000\n2023-08-24,1800\n\n\u003e evalgpt -q 'draw a line graph based on the data from the CSV' --file ./a.csv\n```\n\nOutput:\n\n![](images/example_dau.png)\n\n## Architecture Details\n\n### EvalGPT Master Node\n\nThe EvalGPT master node serves as the control center of the framework. It houses three critical components: planning, scheduler, and memory.\n\nThe planning component leverages large language models to plan tasks based on the user's request. It breaks down complex tasks into smaller, manageable sub-tasks, each of which is handled by an individual EvalAgent node.\n\nThe scheduler component is responsible for task distribution. It assigns each sub-task to an EvalAgent node, ensuring efficient utilization of resources and parallel execution of tasks for optimal performance.\n\nThe memory component serves as the shared memory space for all EvalAgent nodes. It stores the results of executed tasks and provides a platform for data exchange between different nodes. This shared memory model facilitates complex computations and aids in error handling by allowing for task replanning in case of errors.\n\nIn the event of an error during code execution, the master node replans the task to avoid the error, thereby ensuring robust and reliable operation.\n\nFinally, the EvalGPT master node collects the results from all EvalAgent nodes, compiles them, and generates the final answer for the user's request. 
This centralized control and coordination make the EvalGPT master node a crucial part of the EvalGPT framework.\n\n### EvalAgent Node\n\nEvalAgent nodes are the workhorses of the EvalGPT framework. Spawned by the master node for each sub-task, they're responsible for code generation, execution, and result collection.\n\nThe code generation process in an EvalAgent node is guided by the specific task it's assigned. Using the large language model, it produces the necessary code to accomplish the task, ensuring it's suited to the task's requirements and complexity.\n\nOnce the code is generated, the EvalAgent node initiates a runtime environment to execute the code. This runtime is flexible, capable of incorporating external tools as needed, and provides a robust platform for code execution.\n\nDuring execution, the EvalAgent node collects the results and can access the shared memory from the EvalGPT master node. This allows for efficient data exchange and facilitates complex computations requiring significant data manipulation or access to previously computed results.\n\nIn case of any errors during code execution, the EvalAgent node reports these back to the EvalGPT master node. The master node then replans the task to avoid the error, ensuring a robust and reliable operation.\n\nIn essence, EvalAgent nodes are autonomous units within the EvalGPT framework, capable of generating and executing code, handling errors, and communicating results efficiently.\n\n### Runtime\n\nThe runtime of EvalGPT is managed by EvalAgent nodes. Each EvalAgent node generates code for a specific task and initiates a runtime to execute the code. The runtime environment is flexible and can incorporate external tools as necessary, providing a highly adaptable execution context.\n\nThe runtime also includes error handling mechanisms. If an EvalAgent node encounters any errors during code execution, it reports these back to the EvalGPT master node. 
The master node then replans the task to avoid the error, ensuring robust and reliable code execution.\n\nThe runtime can interact with the EvalGPT master node's memory, enabling efficient data exchange and facilitating complex computations. This shared memory model allows for the execution of tasks that require significant data manipulation or access to previously computed results.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Findex-labs%2Fevalgpt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Findex-labs%2Fevalgpt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Findex-labs%2Fevalgpt/lists"}