Solving Inverse Kinematics using Large Language Models
- Host: GitHub
- URL: https://github.com/stevenrice99/llm-ik
- Owner: StevenRice99
- License: MIT
- Created: 2024-05-21T21:50:49.000Z (11 months ago)
- Default Branch: master
- Last Pushed: 2025-03-03T15:47:12.000Z (about 2 months ago)
- Last Synced: 2025-03-03T16:41:16.168Z (about 2 months ago)
- Topics: ai, artificial-intelligence, deepseek, deepseek-r1, ik, inverse-kinematics, large-language-model, large-language-models, llm, llms, machine-learning, ml, o1, o3-mini, openai, openai-api, prompt-engineering, robot, robotics, serial-manipulator
- Language: Python
- Homepage: https://stevenrice.ca
- Size: 47.4 MB
- Stars: 2
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: readme.md
- License: LICENSE
# Solving Inverse Kinematics with Large Language Models
This repository generates and tests [inverse kinematics](https://en.wikipedia.org/wiki/Inverse_kinematics "Inverse Kinematics Wikipedia") solutions produced by [large language models (LLMs)](https://wikipedia.org/wiki/Large_language_model "Large Language Models Wikipedia") for kinematic chains with a single "end effector".
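To make the task concrete, below is a minimal sketch of the kind of closed-form solver an LLM might be asked to produce, here for a hypothetical chain with a single revolute joint rotating about the Z axis. The function name and signature are illustrative only, not the exact ones the repository prompts for.

```python
import math


def inverse_kinematics(p: list) -> list:
    """Illustrative closed-form IK for one revolute joint about the Z axis.

    p is the target position [x, y, z]. Every reachable target lies on a
    circle in the XY plane, so the joint angle is simply atan2(y, x).
    Returns the joint values (here, a single angle in radians).
    """
    x, y, _ = p
    return [math.atan2(y, x)]
```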
# Features
- Can load URDF files.
- Supports both Chat-based and API-based LLMs.
- Can solve in five modes.
- **Normal:** Directly attempts to solve the chain.
- **Extend:** Tries to extend an existing solution for a chain one link shorter than the current one.
- **Dynamic:** Tries to base the solution on already solved sub-chains.
- **Cumulative:** Like dynamic, but passes all possible solved sub-chains.
- **Transfer:** Tries to base a position-and-orientation solver on a position-only solver.
- Model inheritance, where more expensive models can extend or dynamically build from the solutions of cheaper models.
# Setup
1. **Recommended:** Create a virtual environment ``python3 -m venv .venv``.
- Activate the virtual environment.
- Windows (Command Prompt): ``.venv\Scripts\activate.bat``.
- Windows (PowerShell): ``.venv\Scripts\Activate.ps1``.
- Linux and Mac: ``source .venv/bin/activate``.
2. Install all requirements with ``pip install -r requirements.txt``.
3. If doing your own experiments, it is recommended to delete all folders in this project except for ``Robots``, ``Models``, and ``Providers``, which you can keep if you wish to use some of the same robots or LLMs as we have.
4. In the root directory, ensure there is a folder named ``Robots``, and place the URDF files of the robots you wish to use inside.
5. In the root directory, ensure there is a folder named ``Models``, and place all your LLM specification files you wish to use inside as detailed in the [Models](#models "Models") section.
6. **Optional:** These steps only apply if you wish to use OpenAI API compatible APIs.
- In the root directory, ensure there is a folder named ``Providers``, and place your OpenAI API compatible specification files you wish to use inside as detailed in the [Providers](#providers "Providers") section.
- In the root directory, ensure there is a folder named ``Keys``, and make ``.txt`` files named the same as the OpenAI API compatible specification files in the ``Providers`` folder and paste the appropriate API keys into each.
7. Run ``llm_ik`` with the parameters outlined in the [Usage](#usage "Usage") section.
8. View the results in the ``Results`` folder in the root directory.
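Putting the steps above together, the project root might look like the following illustrative layout (``Interactions`` and ``Results`` are generated by the program):

```
llm-ik/
├── Robots/        # URDF files of the robots to solve (step 4)
├── Models/        # LLM specification .txt files (step 5)
├── Providers/     # OpenAI API compatible specification files (optional, step 6)
├── Keys/          # one .txt API key per provider file (optional, step 6)
├── Interactions/  # generated prompts and responses (see Manual Chat)
└── Results/       # generated results (step 8)
```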
# Models
- Models are specified in ``.txt`` files in the ``Models`` folder in the root directory.
- The name of the file is what will appear in results.
- Each line of the file provides a piece of information about the model, with only the first line being needed for non-API models.
## Format
1. Whether the model is a reasoning model, specified by either ``True`` or ``False`` and defaulting to ``False``. If not a reasoning model, the prompts will include a statement to "think step by step and show all your work" to elicit some benefits of chain-of-thought thinking. Otherwise, this is omitted, as reasoning models already perform a process like this internally.
2. The name of the "provider" of the model being the name of the OpenAI API compatible specification file (without the ``.txt`` extension) to use from the ``Providers`` folder. See the [Providers](#providers "Providers") section for how to configure these files themselves.
3. The input cost per token of this model. If unspecified, this model cannot be inherited by other API models.
4. The output cost per token of this model. If unspecified, this model cannot be inherited by other API models.
5. Whether this model supports function calling via the OpenAI API, specified by either ``True`` or ``False`` and defaulting to whether its provider supports functions. This is useful because some providers, such as OpenRouter, support function calling, but not all models they provide do, giving you a per-model override. However, if the provider does not support function calls and this is set to ``True``, the provider's configuration will override this to ``False``, so this can only be used to disable function calling, not enable it. If this is ``False``, additional details are added to the prompt so models can still call methods, just not through the OpenAI API functions; instead, the regular message response is parsed.
6. The API name to use for this model. If omitted, the file name (without the ``.txt`` extension) will be used.
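For example, a hypothetical ``Models/GPT-4o.txt`` describing a non-reasoning API model served through a ``Providers/OpenAI.txt`` provider could read as follows, one item per line in the order above (the costs are placeholder values, not real pricing):

```
False
OpenAI
0.0000025
0.00001
True
gpt-4o
```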
# Providers
- OpenAI API compatible providers are specified in ``.txt`` files in the ``Providers`` folder in the root directory.
## Format
1. The API endpoint of the provider.
2. Whether the provider supports function calling via the OpenAI API, specified by either ``True`` or ``False`` and defaulting to ``False``. If the provider supports function calling but a given model does not, as explained in the [Models](#models "Models") section, this will be overwritten to ``False`` for that model only.
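For example, a hypothetical ``Providers/OpenRouter.txt`` would hold the API endpoint on the first line and the function calling flag on the second, with the matching API key pasted into ``Keys/OpenRouter.txt``:

```
https://openrouter.ai/api/v1
True
```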
# Usage
## Arguments
- ``-r`` or ``--robots`` - The names of the robots. Defaults to ``None`` which will load all robot URDF files in the ``Robots`` folder.
- ``-m`` or ``--max`` - The maximum chain length to run. Defaults to ``0`` which means there is no limit.
- ``-o`` or ``--orientation`` - If we want to solve for orientation in addition to position. Defaults to ``True``.
- ``-t`` or ``--types`` - The highest solving type to run. Defaults to ``Transfer``, meaning all are run.
- ``-f`` or ``--feedbacks`` - The max number of times to give feedback. Defaults to ``5``.
- ``-e`` or ``--examples`` - The number of examples to give with feedbacks. Defaults to ``10``.
- ``-a`` or ``--training`` - The number of training samples. Defaults to ``1000``.
- ``-v`` or ``--evaluating`` - The number of evaluating samples. Defaults to ``1000``.
- ``-s`` or ``--seed`` - The samples generation seed. Defaults to ``42``.
- ``-d`` or ``--distance`` - The acceptable distance error. Defaults to ``0.001``.
- ``-n`` or ``--angle`` - The acceptable angle error. Defaults to ``0.001``.
- ``-c`` or ``--cwd`` - The working directory. Defaults to ``None`` which gets the current working directory.
- ``-l`` or ``--logging`` - The logging level. Defaults to ``INFO``.
- ``-w`` or ``--wait`` - How long to wait between API calls. Defaults to ``1`` second.
- ``-u`` or ``--run`` - **Flag** - Enable API running.
- ``-b`` or ``--bypass`` - **Flag** - Bypass the confirmation for API running.
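For example, a run against a single robot up to the ``Dynamic`` solving type, with three feedback rounds and API running enabled, might look like the following. The entry point invocation and the robot name ``UR5`` are illustrative; use the URDF names in your ``Robots`` folder.

```
python llm_ik.py -r UR5 -t Dynamic -f 3 -u
```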
## Manual Chat
- If manually chatting with an LLM, after running, look in the ``Interactions`` folder until you find the robot, model, and solving type you are looking for.
- Copy the last ``X-Prompt.txt``, ``X-Feedback.txt``, ``X-Forward.txt``, or ``X-Test.txt`` into your chat interface and wait for a response where ``X`` is a number.
- **Copy the entire response, not just the code.** The program looks for a Python code block to extract from the response, so if you manually extract the code yourself, the program will not recognize it (see the sketch after this list).
- Once a response is received, make a text file named ``X-Response.txt``, where ``X`` is the next number in the chat history, and run the program again. Repeat the previous step and this one until a file named ``X-Done.txt`` appears, where ``X`` is a number.
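To illustrate why the full response matters, here is a minimal sketch of how a fenced Python code block can be pulled from a response with a regular expression; the repository's actual parsing logic may differ.

```python
import re

# A fenced block is delimited by three backticks; build the delimiter
# programmatically so no literal fence appears inside this example.
FENCE = "`" * 3
PATTERN = re.compile(FENCE + r"python\s*\n(.*?)" + FENCE, re.DOTALL)


def extract_python_block(response: str):
    """Return the first fenced Python code block in a response, or None."""
    match = PATTERN.search(response)
    return match.group(1) if match else None
```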