{"id":13441538,"url":"https://github.com/lucidrains/toolformer-pytorch","last_synced_at":"2025-05-14T20:08:03.367Z","repository":{"id":65803280,"uuid":"600229707","full_name":"lucidrains/toolformer-pytorch","owner":"lucidrains","description":"Implementation of Toolformer, Language Models That Can Use Tools, by MetaAI","archived":false,"fork":false,"pushed_at":"2024-07-22T06:42:35.000Z","size":165,"stargazers_count":2018,"open_issues_count":11,"forks_count":125,"subscribers_count":37,"default_branch":"main","last_synced_at":"2025-04-03T03:08:29.340Z","etag":null,"topics":["api-calling","artificial-intelligence","attention-mechanisms","deep-learning","transformers"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lucidrains.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-02-10T21:59:25.000Z","updated_at":"2025-04-03T02:48:12.000Z","dependencies_parsed_at":null,"dependency_job_id":"9cfe290b-d16f-4a37-baad-621a5aaa088b","html_url":"https://github.com/lucidrains/toolformer-pytorch","commit_stats":null,"previous_names":[],"tags_count":24,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Ftoolformer-pytorch","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Ftoolformer-pytorch/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lucidrains%2Ftoolformer-pytorch/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/rep
ositories/lucidrains%2Ftoolformer-pytorch/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lucidrains","download_url":"https://codeload.github.com/lucidrains/toolformer-pytorch/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":248161255,"owners_count":21057553,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["api-calling","artificial-intelligence","attention-mechanisms","deep-learning","transformers"],"created_at":"2024-07-31T03:01:35.210Z","updated_at":"2025-04-10T04:54:49.305Z","avatar_url":"https://github.com/lucidrains.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"\u003cimg src=\"./toolformer.png\" width=\"500px\"\u003e\u003c/img\u003e\n\n## Toolformer - Pytorch (wip)\n\nImplementation of \u003ca href=\"https://arxiv.org/abs/2302.04761\"\u003eToolformer\u003c/a\u003e, Language Models That Can Use Tools, by MetaAI\n\n## Appreciation\n\n- \u003ca href=\"https://stability.ai/\"\u003eStability.ai\u003c/a\u003e for the generous sponsorship to work and open source cutting edge artificial intelligence research\n\n- \u003ca href=\"https://github.com/conceptofmind\"\u003eEnrico\u003c/a\u003e for getting the ball rolling with the initial commit of different tools!\n\n- Thanks goes out to ChatGPT for doing all the regular expressions in this repository for parsing the functions and parameters for the API calls. 
I am terrible at regular expressions, so this was an enormous help from the AI (with no hitches, it was perfect).\n\n## Install\n\n```bash\n$ pip install toolformer-pytorch\n```\n\n## Usage\n\nExample usage, giving a language model awareness of the current date and time.\n\n```python\nimport torch\nfrom toolformer_pytorch import Toolformer, PaLM\n\n# simple calendar api call - function that returns a string\n\ndef Calendar():\n    import datetime\n    from calendar import day_name, month_name\n    now = datetime.datetime.now()\n    return f'Today is {day_name[now.weekday()]}, {month_name[now.month]} {now.day}, {now.year}.'\n\n# prompt for teaching it to use the Calendar function from above\n\nprompt = f\"\"\"\nYour task is to add calls to a Calendar API to a piece of text.\nThe API calls should help you get information required to complete the text.\nYou can call the API by writing \"[Calendar()]\"\nHere are some examples of API calls:\nInput: Today is the first Friday of the year.\nOutput: Today is the first [Calendar()] Friday of the year.\nInput: The president of the United States is Joe Biden.\nOutput: The president of the United States is [Calendar()] Joe Biden.\nInput: [input]\nOutput: \n\"\"\"\n\ndata = [\n    \"The store is never open on the weekend, so today it is closed.\",\n    \"The number of days from now until Christmas is 30\",\n    \"The current day of the week is Wednesday.\"\n]\n\n# model - here using PaLM, but any nn.Module that returns logits in the shape (batch, seq, num_tokens) is fine\n\nmodel = PaLM(\n    dim = 512,\n    depth = 2,\n    heads = 8,\n    dim_head = 64\n).cuda()\n\n# toolformer\n\ntoolformer = Toolformer(\n    model = model,\n    model_seq_len = 256,\n    teach_tool_prompt = prompt,\n    tool_id = 'Calendar',\n    tool = Calendar,\n    finetune = True\n)\n\n# invoking this will\n# (1) prompt the model with your inputs (data), inserted into [input] tag\n# (2) with the sampled outputs, filter out the ones that made proper API 
calls\n# (3) execute the API calls with the `tool` given\n# (4) filter with the specialized filter function (which can be used independently as shown in the next section)\n# (5) fine-tune on the filtered results\n\nfiltered_stats = toolformer(data)\n\n# then, once you see the 'finetune complete' message\n\nresponse = toolformer.sample_model_with_api_calls(\"How many days until the next new years?\")\n\n# hopefully you see it invoke the calendar and utilize the response of the api call...\n\n```\n\nThe main novelty of the paper is defining a fitness score for the outputs from a transformer instructed to insert API calls. The score is used to filter the sampled outputs for finetuning the transformer to make API calls that decrease the perplexity of the text that follows them.\n\n```python\nimport torch\n\nfrom toolformer_pytorch import (\n    Toolformer,\n    PaLM,\n    filter_tokens_with_api_response\n)\n\n# model\n\npalm = PaLM(\n    dim = 512,\n    num_tokens = 20000,\n    depth = 2,\n    heads = 8,\n    dim_head = 64\n).cuda()\n\n# mock some tokens\n\nmock_start_pos = 512\nmock_api_call_length = 10\nmock_api_start_id = 19998\nmock_api_stop_id = 19999\n\ntokens = torch.randint(0, 20000, (10, 1024)).cuda()\ntokens_with_api_response = torch.randint(0, 20000, (10, 1024)).cuda()\ntokens_without_api_response = torch.randint(0, 20000, (10, 1024)).cuda()\n\ntokens_with_api_response[:, mock_start_pos] = mock_api_start_id\ntokens_with_api_response[:, mock_start_pos + mock_api_call_length] = mock_api_stop_id\n\ntokens_without_api_response[:, mock_start_pos] = mock_api_start_id\ntokens_without_api_response[:, mock_start_pos + mock_api_call_length] = mock_api_stop_id\n\n# filter\n\nfiltered_results = filter_tokens_with_api_response(\n    model = palm,\n    tokens = tokens,\n    tokens_with_api_response = tokens_with_api_response,\n    tokens_without_api_response = tokens_without_api_response,\n    filter_threshold = 1.,\n    api_start_token_id = mock_api_start_id,\n    
api_end_token_id = mock_api_stop_id\n)\n```\n\nTo invoke the tools on a string generated by the language model, use `invoke_tools`.\n\n```python\nfrom toolformer_pytorch import invoke_tools\n\ndef inc(i):\n    return i + 1\n\ndef dec(i):\n    return i - 1\n\nfunction_registry = dict(\n    inc = inc,\n    dec = dec\n)\n\ntext = 'make the following api calls: [inc(1)] and [dec(2)] and [ignored(3)]'\n\ninvoke_tools(function_registry, text)\n\n# make the following api calls: [inc(1) → 2] and [dec(2) → 1] and [ignored(3)]\n```\n\n## Todo\n\n- [x] create custom generate function for palm that can do external API calls\n    - [x] allow for generating tokens at different cursor indices\n    - [x] api token (which was left and right brackets in paper) needs to be customizable\n    - [ ] allow for fine-grained customization of how errors in function name, parameters, or execution and output are handled\n- [ ] Toolformer should eventually calculate all statistics (how many properly sampled, filtered out by different criteria, the distribution of scores as well as how many were rejected) before the final fine-tuning\n- [ ] do end-to-end training in `Toolformer`\n    - [x] doing the prompting and bootstrapping the data\n    - [x] prefiltering of bootstrapped data followed by api calls and then another round of filtering\n        - [ ] keep track of all stats\n    - [x] take care of fine-tuning\n        - [ ] interleaving of datasets + optimizer hyperparams\n- [ ] hook up gpt-j\n- [ ] test for a simple calculator eval dataset\n- [ ] add a default callback within the Toolformer that automatically aligns the text and checks for validity before the filtering step - if the text was not copied correctly, the filtering step is not valid.\n- [ ] make sure final model, trained on many `Toolformer` instances, can be invoked with multiple tools - start with batch size of 1 and work the way up\n\n## Citations\n\n```bibtex\n@inproceedings{Schick2023ToolformerLM,\n    title   = {Toolformer: Language Models 
Can Teach Themselves to Use Tools},\n    author  = {Timo Schick and Jane Dwivedi-Yu and Roberto Dessi and Roberta Raileanu and Maria Lomeli and Luke Zettlemoyer and Nicola Cancedda and Thomas Scialom},\n    year    = {2023}\n}\n```\n\n```bibtex\n@article{Gao2022PALPL,\n    title   = {PAL: Program-aided Language Models},\n    author  = {Luyu Gao and Aman Madaan and Shuyan Zhou and Uri Alon and Pengfei Liu and Yiming Yang and Jamie Callan and Graham Neubig},\n    journal = {ArXiv},\n    year    = {2022},\n    volume  = {abs/2211.10435}\n}\n```\n\n*Reality is that which, when you stop believing it, doesn't go away.* – Philip K. Dick.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Ftoolformer-pytorch","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flucidrains%2Ftoolformer-pytorch","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flucidrains%2Ftoolformer-pytorch/lists"}