{"id":13435442,"url":"https://github.com/jina-ai/thinkgpt","last_synced_at":"2025-04-08T02:42:40.983Z","repository":{"id":153470373,"uuid":"627959104","full_name":"jina-ai/thinkgpt","owner":"jina-ai","description":"Agent techniques to augment your LLM and push it beyong its limits","archived":false,"fork":false,"pushed_at":"2024-05-23T13:04:46.000Z","size":59,"stargazers_count":1570,"open_issues_count":16,"forks_count":137,"subscribers_count":26,"default_branch":"main","last_synced_at":"2025-04-02T23:54:53.362Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"apache-2.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/jina-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-14T15:21:18.000Z","updated_at":"2025-04-02T11:16:23.000Z","dependencies_parsed_at":"2024-11-21T02:03:48.884Z","dependency_job_id":null,"html_url":"https://github.com/jina-ai/thinkgpt","commit_stats":null,"previous_names":["alaeddine-13/thinkgpt"],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jina-ai%2Fthinkgpt","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jina-ai%2Fthinkgpt/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jina-ai%2Fthinkgpt/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/jina-ai%2Fthinkgpt/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/jina-ai","download_url":"https://codeload.github.com/jina-ai/
thinkgpt/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247767232,"owners_count":20992538,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-07-31T03:00:35.723Z","updated_at":"2025-04-08T02:42:40.951Z","avatar_url":"https://github.com/jina-ai.png","language":"Python","funding_links":[],"categories":["LangChain Agents List","Python","A01_文本生成_文本对话","Tools and Frameworks"],"sub_categories":["Open Source","大语言对话模型及数据","Reflection and Meta-Cognition"],"readme":"# ThinkGPT 🧠🤖\n\u003ca href=\"https://discord.jina.ai\"\u003e\u003cimg src=\"https://img.shields.io/discord/1106542220112302130?logo=discord\u0026logoColor=white\u0026style=flat-square\"\u003e\u003c/a\u003e\n\n\nThinkGPT is a Python library aimed at implementing Chain of Thoughts for Large Language Models (LLMs), prompting the model to think, reason, and to create generative agents. 
\nThe library aims to help with the following:\n* solve limited context with long memory and compressed knowledge\n* enhance LLMs' one-shot reasoning with higher-order reasoning primitives\n* add intelligent decisions to your code base\n\n\n## Key Features ✨\n* Thinking building blocks 🧱:\n    * Memory 🧠: GPTs that can remember experience\n    * Self-refinement 🔧: Improve model-generated content by addressing criticism\n    * Compress knowledge 🌐: Compress knowledge and fit it into the LLM's context, either by abstracting rules out of observations or by summarizing large content\n    * Inference 💡️: Make educated guesses based on available information\n    * Natural Language Conditions 📝: Easily express choices and conditions in natural language\n* Efficient and Measurable GPT context length 📐\n* Extremely easy setup and pythonic API 🎯 thanks to [DocArray](https://github.com/docarray/docarray)\n\n## Installation 💻\nYou can install ThinkGPT using pip:\n\n```shell\npip install git+https://github.com/alaeddine-13/thinkgpt.git\n```\n\n## API Documentation 📚\n### Basic usage:\n```python\nfrom thinkgpt.llm import ThinkGPT\nllm = ThinkGPT(model_name=\"gpt-3.5-turbo\")\n# Make the llm object learn new concepts\nllm.memorize(['DocArray is a library for representing, sending and storing multi-modal data.'])\nllm.predict('what is DocArray ?', remember=llm.remember('DocArray definition'))\n```\n\n### Memorizing and Remembering information\n```python\nllm.memorize([\n    'DocArray allows you to send your data, in an ML-native way.',\n    'This means there is native support for Protobuf and gRPC, on top of HTTP and serialization to JSON, JSONSchema, Base64, and Bytes.',\n])\n\nprint(llm.remember('Sending data with DocArray', limit=1))\n```\n```text\n['DocArray allows you to send your data, in an ML-native way.']\n```\n\nUse the `limit` parameter to specify the maximum number of documents to retrieve.\nIn case you want to fit documents into a certain context size, you can also use the 
`max_tokens` parameter to specify the maximum number of tokens to retrieve.\nFor instance:\n```python\nfrom examples.knowledge_base import knowledge\nfrom thinkgpt.helper import get_n_tokens\n\nllm.memorize(knowledge)\nresults = llm.remember('hello', max_tokens=1000, limit=1000)\nprint(get_n_tokens(''.join(results)))\n```\n```text\n1000\n```\nHowever, keep in mind that concatenating documents with a separator will add more tokens to the final result.\nThe `remember` method does not account for those tokens.\n\n### Predicting with context from long memory\n```python\nfrom examples.knowledge_base import knowledge\nllm.memorize(knowledge)\nllm.predict('Implement a DocArray schema with 2 fields: image and TorchTensor', remember=llm.remember('DocArray schemas and types'))\n```\n\n### Self-refinement\n\n```python\nprint(llm.refine(\n    content=\"\"\"\nimport re\n    print('hello world')\n        \"\"\",\n    critics=[\n        'File \"/Users/user/PyCharm2022.3/scratches/scratch_166.py\", line 2',\n        \"  print('hello world')\",\n        'IndentationError: unexpected indent'\n    ],\n    instruction_hint=\"Fix the code snippet based on the error provided. Only provide the fixed code snippet between `` and nothing else.\"))\n\n```\n\n```text\nimport re\nprint('hello world')\n```\n\nOne application is self-healing code generation, as implemented by projects like [gptdeploy](https://github.com/jina-ai/gptdeploy) and [wolverine](https://github.com/biobootloader/wolverine).\n\n### Compressing knowledge\nIn case you want your knowledge to fit into the LLM's context, you can use the following techniques to compress it:\n#### Summarize content\nSummarize content using the LLM itself.\nWe offer 2 methods:\n1. 
One-shot summarization using the LLM\n```python\nllm.summarize(\n  large_content,\n  max_tokens=1000,\n  instruction_hint='Pay attention to code snippets, links and scientific terms.'\n)\n```\nSince this technique relies on summarizing using a single LLM call, you can only pass content that does not exceed the LLM's context length.\n\n2. Chunked summarization\n```python\nllm.chunked_summarize(\n  very_large_content,\n  max_tokens=4096,\n  instruction_hint='Pay attention to code snippets, links and scientific terms.'\n)\n```\nThis technique relies on splitting the content into different chunks, summarizing each of those chunks and then combining them all together using an LLM.\n\n#### Induce rules from observations\nDerive higher-level, more general observations from the current ones:\n```python\nllm.abstract(observations=[\n    \"in tunisian, I did not eat is \\\"ma khditech\\\"\",\n    \"I did not work is \\\"ma khdemtech\\\"\",\n    \"I did not go is \\\"ma mchitech\\\"\",\n])\n```\n\n```text\n['Negation in Tunisian Arabic uses \"ma\" + verb + \"tech\" where \"ma\" means \"not\" and \"tech\" at the end indicates the negation in the past tense.']\n```\n\nThis can help you end up with compressed knowledge that fits better within the limited context length of LLMs.\nFor instance, instead of trying to fit code examples in the LLM's context, use this to prompt it to understand high-level rules and fit them in the context.\n\n### Natural language condition\nIntroduce intelligent conditions to your code and let the LLM make decisions:\n```python\nllm.condition(f'Does this represent an error message ? 
\"IndentationError: unexpected indent\"')\n```\n```text\nTrue\n```\n### Natural language select\nAlternatively, let the LLM choose among a list of options:\n```python\nllm.select(\n    question=\"Which animal is the king of the jungle?\",\n    options=[\"Lion\", \"Elephant\", \"Tiger\", \"Giraffe\"]\n)\n```\n```text\n['Lion']\n```\n\nYou can also prompt the LLM to choose an exact number of answers using `num_choices`. By default, it is set to `None`, which means the LLM selects however many options it considers correct.\n\n## Use Cases 🚀\nBelow are example demos you can build with `thinkgpt`:\n### Teaching ThinkGPT a new language\n```python\nfrom thinkgpt.llm import ThinkGPT\n\nllm = ThinkGPT(model_name=\"gpt-3.5-turbo\")\n\nrules = llm.abstract(observations=[\n    \"in tunisian, I did not eat is \\\"ma khditech\\\"\",\n    \"I did not work is \\\"ma khdemtech\\\"\",\n    \"I did not go is \\\"ma mchitech\\\"\",\n], instruction_hint=\"output the rule in french\")\nllm.memorize(rules)\n\nllm.memorize(\"in tunisian, I studied is \\\"9rit\\\"\")\n\ntask = \"translate to Tunisian: I didn't study\"\nllm.predict(task, remember=llm.remember(task))\n```\n```text\nThe translation of \"I didn't study\" to Tunisian language would be \"ma 9ritech\".\n```\n\n### Teaching ThinkGPT how to code with the `thinkgpt` library\n```python\nfrom thinkgpt.llm import ThinkGPT\nfrom examples.knowledge_base import knowledge\n\nllm = ThinkGPT(model_name=\"gpt-3.5-turbo\")\n\nllm.memorize(knowledge)\n\ntask = 'Implement python code that uses thinkgpt to learn about docarray v2 code and then predict with remembered information about docarray v2. 
Only give the code between `` and nothing else'\nprint(llm.predict(task, remember=llm.remember(task, limit=10, sort_by_order=True)))\n```\n\nCode generated by the LLM:\n```text\nfrom thinkgpt.llm import ThinkGPT\nfrom docarray import BaseDoc\nfrom docarray.typing import TorchTensor, ImageUrl\n\nllm = ThinkGPT(model_name=\"gpt-3.5-turbo\")\n\n# Memorize information\nllm.memorize('DocArray V2 allows you to represent your data, in an ML-native way')\n\n\n# Predict with the memory\nmemory = llm.remember('DocArray V2')\nllm.predict('write python code about DocArray v2', remember=memory)\n```\n### Replay Agent memory and infer new observations\nRefer to the following script for an example of an Agent that replays its memory and induces new observations.\nThis concept was introduced in [the Generative Agents: Interactive Simulacra of Human Behavior paper](https://arxiv.org/abs/2304.03442).\n\n```shell\npython -m examples.replay_expand_memory\n```\n```text\nnew thoughts:\nKlaus Mueller is interested in multiple topics\nKlaus Mueller may have a diverse range of interests and hobbies\n```\n\n### Replay Agent memory, criticize and refine the knowledge in memory\nRefer to the following script for an example of an Agent that replays its memory, performs self-criticism and adjusts its memory knowledge based on the criticism.\n```shell\npython -m examples.replay_criticize_refine\n```\n```text\nrefined \"the second number in Fibonacci sequence is 2\" into \"Observation: The second number in the Fibonacci sequence is actually 1, not 2, and the sequence starts with 0, 1.\"\n...\n```\nThis technique was introduced in [the Self-Refine: Iterative Refinement with Self-Feedback paper](https://arxiv.org/abs/2303.17651).\n\n\nFor more detailed usage and code examples, check 
`./examples`.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjina-ai%2Fthinkgpt","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fjina-ai%2Fthinkgpt","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fjina-ai%2Fthinkgpt/lists"}