Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/microsoft/coml
Interactive coding assistant for data scientists and machine learning developers, empowered by large language models.
https://github.com/microsoft/coml
automated-machine automl copilot data-science hyperparameter-optimization jupyter jupyter-lab large-language-models llm machine-learning
Last synced: about 1 month ago
JSON representation
Interactive coding assistant for data scientists and machine learning developers, empowered by large language models.
- Host: GitHub
- URL: https://github.com/microsoft/coml
- Owner: microsoft
- License: mit
- Created: 2023-04-27T07:46:20.000Z (over 1 year ago)
- Default Branch: main
- Last Pushed: 2024-04-28T06:46:14.000Z (7 months ago)
- Last Synced: 2024-10-07T10:40:59.133Z (about 1 month ago)
- Topics: automated-machine, automl, copilot, data-science, hyperparameter-optimization, jupyter, jupyter-lab, large-language-models, llm, machine-learning
- Language: Python
- Homepage:
- Size: 12.6 MB
- Stars: 75
- Watchers: 5
- Forks: 14
- Open Issues: 0
-
Metadata Files:
- Readme: README.md
- License: LICENSE
- Code of conduct: CODE_OF_CONDUCT.md
- Security: SECURITY.md
- Support: SUPPORT.md
Awesome Lists containing this project
README
# CoML
CoML (formerly MLCopilot) is an interactive coding assistant for data scientists and machine learning developers, empowered on large language models.
Highlight features:
* Out-of-the-box interactive natural language programming interface for data mining and machine learning tasks.
* Integration with Jupyter lab and Jupyter notebook.
* Built-in large knowledge base of machine learning, enhancing the ability of solving complex tasks.## Installation
```bash
pip install mlcopilot
```(We can't have the name `coml` on PyPI, so we use `mlcopilot` instead.)
## CoML in Jupyter Lab
We recommend trying CoML in a Jupyter Lab environment. Before using CoML, please make sure that:
1. You have exported `OPENAI_API_KEY=sk-xxxx` in your environment. Alternatively, you can also use a `.env` file.
2. Use `%load_ext coml` in your notebook to active CoML extension.Then we have provided several commands to assist your journey with interactive coding in Jupyter Lab.
1. `%coml ` to prompt CoML to write a cell for your task.
![](assets/demo_coml.gif)
2. `%comlfix` to fix the cell just above the current cell. You can also use `%comlfix ` to provide details for what's wrong.
![](assets/demo_comlfix.gif)
3. `%comlinspire` to inspire you with a cell describing what to do next.
![](assets/demo_comlinspire.gif)
**Limitations:**
* Currently, CoML only supports Jupyter Lab and classical Jupyter notebook (nbclassic, and only on Linux platforms). We are still working on supports of newer Jupyter notebook, Jupyter-vscode and Google Colab.
* CoML uses gpt-3.5-turbo-16k model in its implementation. There is no way to change the model for now. The cost of using this model is around $0.04 per request. Please be aware of this cost.## CoML Config Agent
CoML config agent is the implementation of [MLCopilot]((https://arxiv.org/abs/2304.14979)), which can suggest a ML configuration within a specific task, for a specific task. Currently, it is an independent component residing in `coml.configagent`. In the future, we will integrate it into the CoML system.
![](assets/demo.gif)
(TODO: The demo needs an update.)
#### Extra preparation steps
Please follow the steps to use CoML config agent:
1. Clone this repo: `git clone REPO_URL; cd coml`
2. Put assets/coml.db in your home directory: `cp assets/coml.db ~/.coml/coml.db`
3. Copy `coml/.env.template` to `~/.coml/.env` and put your API keys in the file.#### Command line utility
Currently, it can only be invoked independently. You can use the following command line:
```
coml-configagent --space --task
```If you feel uncertain about what to put into `` or ``, see the demo above, or try the interactive usage below:
```
coml-configagent --interactive
```#### API Usage
```python
from coml.configagent.suggest import suggestspace = import_space("YOUR_SPACE_ID")
task_desc = "YOUR_TASK_DESCRIPTION_FOR_NEW_TASK"
suggest_configs, knowledge = suggest(space, task_desc)
```## Development
Development documentation stays here for now. It shall be moved to a separate document later.
### Project structure
Important files and folders:
```
CoML
├── assets # data, examples, demos
├── coml # Python package
├── examples # example scripts
├── install.json # Jupyter lab extension installation file
├── package.json # Jupyter lab extension package file
├── pyproject.toml # Python package configuration
├── src # Jupyter lab extension source code
├── test # Python package tests
└── tsconfig.json # Jupyter lab extension TypeScript configuration
```### Installation and uninstallation
You can use the following command for development installation:
```
pip install -e .[dev]
```If you are to develop Jupyter lab extension, you will also need to install NodeJS and npm, and run the following command:
```
# Link your development version of the extension with JupyterLab
jupyter labextension develop . --overwrite
# Rebuild extension Typescript source after making changes
jlpm run build
```To uninstall, you can run the following commands:
```bash
# Server extension must be manually disabled in develop mode
jupyter server extension disable coml# Uninstall the Python package
pip uninstall mlcopilot
```In development mode, you will also need to remove the symlink created by `jupyter labextension develop` command.
To find its location, you can run `jupyter labextension list` to figure out where the `labextensions` folder is located.
Then you can remove the symlink named `coml` within that folder.### Packaging
```bash
hatch build
```## Citation
If you find this work useful, please cite the paper as below:
@article{zhang2023mlcopilot,
title={MLCopilot: Unleashing the Power of Large Language Models in Solving Machine Learning Tasks},
author={Zhang, Lei and Zhang, Yuge and Ren, Kan and Li, Dongsheng and Yang, Yuqing},
journal={arXiv preprint arXiv:2304.14979},
year={2023}
}## License
The entire codebase is under [MIT license](LICENSE).