https://github.com/michaelhirn/cmod
- Host: GitHub
- URL: https://github.com/michaelhirn/cmod
- Owner: MichaelHirn
- Created: 2024-03-04T13:32:27.000Z (about 1 year ago)
- Default Branch: master
- Last Pushed: 2024-03-06T11:11:53.000Z (about 1 year ago)
- Last Synced: 2025-01-30T06:44:18.358Z (4 months ago)
- Language: Rust
- Size: 9.77 KB
- Stars: 1
- Watchers: 1
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
# README
This is an early POC that probes whether tree-sitter can be used to automate vendor migrations. As an example codebase we use exhaustive scenarios of the OpenAI TypeScript SDK and aim to automate the migration of all GPT-3.5 calls to Mistral's SDK. The goal is to integrate Mistral correctly and automatically in more than 80% of cases and, for the remaining cases, to provide a clear and correct to-do list so the developer can complete the migration.
> The project is at a very early stage; the aim is to find obvious roadblocks early on. See the Development section for more details.
## Usage
The repo exposes a CLI that runs the parser on a given directory. To use it on the example codebase, run:
```sh
cargo build && target/debug/cmod -p ./examples/openai-starter/src
```

For a more systematic probing of the functionality, you can run the tests:
```sh
cargo test
```

> Note: The example codebase(s) to vendor-integrate are in the [`examples`](./examples) directory.
## Development
The to-do list for the objective stated above (migrating GPT-3.5 calls to Mistral) is as follows:
1. [x] find all target files, i.e. importing 'openai' (see: get_target_files)
2. [ ] search for all `createCompletion` calls (probably doable)
3. [ ] rewrite them to use Mistral's equivalent (more research needed)
   - [ ] investigate how to use tree-sitter to rewrite the AST; check ast-grep
4. [ ] insert the import statement for the Mistral SDK file (trivial)
5. [ ] create the Mistral SDK file (trivial)
6. [ ] add Mistral as a dependency and prompt to run `npm i` (trivial)

#### Notes on Step 1
Finding the target files is fairly trivial (excluding extreme edge cases such as dynamic imports), and the results are essentially compiler-correct. The core implementation is the `get_target_files` function in `parser.rs`.
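To illustrate the idea of Step 1 without pulling in tree-sitter, here is a minimal std-only sketch. It is a text-based approximation, not the repository's `get_target_files`: the names `imports_openai` and `find_openai_importers` are hypothetical, and, like the real step, it ignores dynamic imports.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Heuristic: does this source text import the `openai` package, either via
/// an ES module import or a CommonJS `require`? (Hypothetical helper; a real
/// implementation would inspect import nodes in the syntax tree instead.)
fn imports_openai(source: &str) -> bool {
    source.lines().map(str::trim).any(|line| {
        // Skip single-line comments so commented-out imports do not count.
        if line.starts_with("//") {
            return false;
        }
        let names_module = line.contains("'openai'") || line.contains("\"openai\"");
        let is_import = line.starts_with("import ")
            || line.contains("from ")
            || line.contains("require(");
        names_module && is_import
    })
}

/// Recursively collect JS/TS files under `dir` that import `openai`.
/// `node_modules` is skipped; results are sorted for stable output.
fn find_openai_importers(dir: &Path) -> io::Result<Vec<PathBuf>> {
    let mut targets = Vec::new();
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            if path.file_name().map_or(false, |name| name == "node_modules") {
                continue;
            }
            targets.extend(find_openai_importers(&path)?);
        } else if matches!(
            path.extension().and_then(|ext| ext.to_str()),
            Some("ts" | "tsx" | "js" | "jsx")
        ) {
            if imports_openai(&fs::read_to_string(&path)?) {
                targets.push(path);
            }
        }
    }
    targets.sort();
    Ok(targets)
}
```

Calling `find_openai_importers(Path::new("./examples/openai-starter/src"))` would list the candidate files. A line-based check like this can be fooled by strings or block comments, which is one reason a parser-based approach is preferable.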
#### Notes on Step 2
Once you have the target files, targeting the correct function calls to change, such as `createCompletion`, adds one complexity: each call has to be resolved to the entity imported from the `openai` module. This should be achievable via tree-sitter's [tagging](https://tree-sitter.github.io/tree-sitter/code-navigation-systems) system, although it is not implemented yet. Alternative approaches would be LSP/LSIF or Sourcegraph's successor to LSIF, [SCIP](https://sourcegraph.com/blog/announcing-scip), which, unlike tree-sitter, were designed for codebase-level (rather than file-level) traversal. However, tree-sitter is significantly faster, more efficient, and easier to extend than SCIP/LSIF.
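The resolution problem can be sketched in three stages: find the names bound by the `openai` import, find variables constructed from those names, then keep only `createCompletion` calls made on those variables. The following is a rough std-only approximation under strong assumptions (single-line imports, `new X(...)` assignments on one line, purely textual matching). All function names are hypothetical, and none of this is how tree-sitter tagging works internally.

```rust
/// Local names bound by single-line `import ... from 'openai'` statements.
fn openai_import_bindings(source: &str) -> Vec<String> {
    let mut names = Vec::new();
    for line in source.lines().map(str::trim) {
        let Some(rest) = line.strip_prefix("import ") else { continue };
        let Some((clause, module)) = rest.split_once(" from ") else { continue };
        let module = module
            .trim()
            .trim_end_matches(';')
            .trim_matches(|c| c == '\'' || c == '"');
        if module != "openai" {
            continue;
        }
        for part in clause.split(|c| c == ',' || c == '{' || c == '}') {
            // `A as B` (and `* as B`) bind the local name `B`.
            let local = part.rsplit(" as ").next().unwrap_or(part).trim();
            if !local.is_empty() {
                names.push(local.to_string());
            }
        }
    }
    names
}

/// Variables initialised with `new <Binding>(...)`,
/// e.g. `const client = new OpenAIApi(config);`.
fn client_variables(source: &str, bindings: &[String]) -> Vec<String> {
    let mut vars = Vec::new();
    for line in source.lines().map(str::trim) {
        let Some((lhs, rhs)) = line.split_once('=') else { continue };
        let constructs_client = bindings
            .iter()
            .any(|b| rhs.trim_start().starts_with(&format!("new {b}(")));
        if !constructs_client {
            continue;
        }
        // Drop a type annotation (`name: Type`), then take the last word
        // before `=`, which follows `const` / `let` / `var`.
        let before_annotation = lhs.split(':').next().unwrap_or(lhs);
        if let Some(name) = before_annotation.split_whitespace().last() {
            vars.push(name.to_string());
        }
    }
    vars
}

/// 1-based line numbers of `createCompletion` calls on resolved clients.
fn find_create_completion_calls(source: &str) -> Vec<usize> {
    let bindings = openai_import_bindings(source);
    let clients = client_variables(source, &bindings);
    source
        .lines()
        .enumerate()
        .filter(|(_, line)| {
            clients
                .iter()
                .any(|c| line.contains(&format!("{c}.createCompletion(")))
        })
        .map(|(index, _)| index + 1)
        .collect()
}
```

The point of the sketch is the filtering in the last stage: a `createCompletion` call on an object that does not trace back to the `openai` import is left alone. Scoping, re-assignment, and cross-file flows are exactly where this text-level approach stops and a tagging or indexing system is needed.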
#### Notes on Step 3
This step is the most uncertain. The question is not so much whether it is doable (libraries like ast-grep implement this in a general manner in <1k LOC) but whether it bends tree-sitter too far into doing something it was not designed for. A lot of inspiration can be taken from other codemod/rewrite tools. For our use case a declarative API might be sufficient and preferable.
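As a rough idea of what a declarative rewrite API could look like, the sketch below separates rules (what to replace) from edits (byte ranges in the original source) and applies the edits back to front so earlier offsets stay valid. In a tree-sitter based version the byte ranges would come from matched syntax nodes rather than substring search. The `Rule` and `Edit` types are hypothetical, matches are assumed not to overlap, and the replacement text `mistral.chat(` used in the usage note is only a placeholder, not a confirmed Mistral SDK call.

```rust
/// A declarative rewrite rule: every occurrence of `pattern` becomes `replacement`.
struct Rule {
    pattern: &'static str,
    replacement: &'static str,
}

/// A concrete edit against the original source, expressed as a byte range.
struct Edit {
    start: usize,
    end: usize,
    text: String,
}

/// Turn rules into edits without touching the source yet.
fn plan_edits(source: &str, rules: &[Rule]) -> Vec<Edit> {
    let mut edits = Vec::new();
    for rule in rules {
        for (start, matched) in source.match_indices(rule.pattern) {
            edits.push(Edit {
                start,
                end: start + matched.len(),
                text: rule.replacement.to_string(),
            });
        }
    }
    edits
}

/// Apply non-overlapping edits from the end of the file towards the start,
/// so that the byte offsets of not-yet-applied edits remain correct.
fn apply_edits(source: &str, mut edits: Vec<Edit>) -> String {
    edits.sort_by(|a, b| b.start.cmp(&a.start));
    let mut out = source.to_string();
    for edit in edits {
        out.replace_range(edit.start..edit.end, &edit.text);
    }
    out
}
```

With a rule such as `Rule { pattern: "openai.createCompletion(", replacement: "mistral.chat(" }`, calling `apply_edits(src, plan_edits(src, &rules))` rewrites every matching call site. Keeping the planning stage separate also makes it easy to emit the unapplied edits as a to-do list for cases that cannot be migrated automatically.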
#### Notes on Step 4-6
These steps are trivial and amount to basic codegen (rather than codemod). A fully working POC would still need to implement them.
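For Step 4, a minimal sketch of inserting an import statement could look like this. It assumes single-line `import` statements (a multi-line import at the end of the import block would need real parsing), and the function name `insert_import` and the `./mistral` module path in the usage note are hypothetical.

```rust
/// Insert `import_line` after the last single-line `import` statement,
/// or at the top of the file if there are no imports. The call is a no-op
/// when the exact line is already present, so it can be run repeatedly.
fn insert_import(source: &str, import_line: &str) -> String {
    let lines: Vec<&str> = source.lines().collect();
    if lines.iter().any(|line| line.trim() == import_line) {
        return source.to_string();
    }

    // Index just past the last line that starts an import statement.
    let insert_at = lines
        .iter()
        .rposition(|line| line.trim_start().starts_with("import "))
        .map_or(0, |index| index + 1);

    let mut out: Vec<&str> = Vec::with_capacity(lines.len() + 1);
    out.extend(&lines[..insert_at]);
    out.push(import_line);
    out.extend(&lines[insert_at..]);

    // `lines()` drops the trailing newline, so restore it if it was there.
    let mut result = out.join("\n");
    if source.ends_with('\n') {
        result.push('\n');
    }
    result
}
```

For example, `insert_import(src, "import { mistral } from './mistral';")` places the new line directly below the existing imports. Steps 5 and 6 are similar in spirit: write a file from a template and add an entry to `package.json`.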