{"id":15012458,"url":"https://github.com/microsoft/coml","last_synced_at":"2025-04-04T17:11:04.328Z","repository":{"id":200548338,"uuid":"633298050","full_name":"microsoft/CoML","owner":"microsoft","description":"Interactive coding assistant for data scientists and machine learning developers, empowered by large language models.","archived":false,"fork":false,"pushed_at":"2024-10-08T04:54:10.000Z","size":13175,"stargazers_count":91,"open_issues_count":0,"forks_count":15,"subscribers_count":5,"default_branch":"main","last_synced_at":"2025-03-28T16:07:04.583Z","etag":null,"topics":["automated-machine","automl","copilot","data-science","hyperparameter-optimization","jupyter","jupyter-lab","large-language-models","llm","machine-learning"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/microsoft.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":"CODE_OF_CONDUCT.md","threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":"SECURITY.md","support":"SUPPORT.md","governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-04-27T07:46:20.000Z","updated_at":"2025-02-21T00:03:14.000Z","dependencies_parsed_at":null,"dependency_job_id":"ca59850f-9ded-49c4-86ca-43933b81c4b7","html_url":"https://github.com/microsoft/CoML","commit_stats":{"total_commits":23,"total_committers":6,"mean_commits":"3.8333333333333335","dds":0.5652173913043479,"last_synced_commit":"4e0cfb449d2b5989e10a2e0f6255996398670aaf"},"previous_names":["microsoft/coml"],"tags_count":1,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microsoft%2FCoML","tags_url":"https://repos.ecosyste.ms/api
/v1/hosts/GitHub/repositories/microsoft%2FCoML/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microsoft%2FCoML/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/microsoft%2FCoML/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/microsoft","download_url":"https://codeload.github.com/microsoft/CoML/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247217222,"owners_count":20903009,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["automated-machine","automl","copilot","data-science","hyperparameter-optimization","jupyter","jupyter-lab","large-language-models","llm","machine-learning"],"created_at":"2024-09-24T19:42:39.930Z","updated_at":"2025-04-04T17:11:04.304Z","avatar_url":"https://github.com/microsoft.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# CoML\n\nCoML (formerly MLCopilot) is an interactive coding assistant for data scientists and machine learning developers, powered by large language models.\n\nHighlight features:\n\n* Out-of-the-box interactive natural language programming interface for data mining and machine learning tasks.\n* Integration with Jupyter Lab and Jupyter notebook.\n* Built-in large knowledge base of machine learning, enhancing the ability to solve complex tasks.\n\n## Installation\n\n```bash\npip install mlcopilot\n```\n\n(We can't have the name `coml` on PyPI, so we use `mlcopilot` instead.)\n\n## CoML in Jupyter Lab\n\nWe recommend trying CoML 
in a Jupyter Lab environment. Before using CoML, please make sure that:\n\n1. You have exported `OPENAI_API_KEY=sk-xxxx` in your environment. Alternatively, you can also use a `.env` file.\n2. You have run `%load_ext coml` in your notebook to activate the CoML extension.\n\nCoML then provides several commands to assist with interactive coding in Jupyter Lab.\n\n1. `%coml \u003ctask\u003e` to prompt CoML to write a cell for your task.\n\n![](assets/demo_coml.gif)\n\n2. `%comlfix` to fix the cell just above the current cell. You can also use `%comlfix \u003creason\u003e` to provide details about what's wrong.\n\n![](assets/demo_comlfix.gif)\n\n3. `%comlinspire` to inspire you with a cell describing what to do next.\n\n![](assets/demo_comlinspire.gif)\n\n**Limitations:**\n\n* Currently, CoML only supports Jupyter Lab and classical Jupyter notebook (nbclassic, and only on Linux platforms). We are still working on support for the newer Jupyter notebook, Jupyter-vscode and Google Colab.\n* CoML uses the gpt-3.5-turbo-16k model in its implementation. There is no way to change the model for now. The cost of using this model is around $0.04 per request. Please be aware of this cost.\n\n## CoML Config Agent\n\nCoML config agent is the implementation of [MLCopilot](https://arxiv.org/abs/2304.14979), which can suggest an ML configuration within a specific space, for a specific task. Currently, it is an independent component residing in `coml.configagent`. In the future, we will integrate it into the CoML system.\n\n![](assets/demo.gif)\n\n(TODO: The demo needs an update.)\n\n#### Extra preparation steps\n\nPlease follow these steps to use the CoML config agent:\n\n1. Clone this repo: `git clone REPO_URL; cd coml`\n2. Put `assets/coml.db` in the `.coml` folder under your home directory: `cp assets/coml.db ~/.coml/coml.db`\n3. Copy `coml/.env.template` to `~/.coml/.env` and put your API keys in the file.\n\n#### Command line utility\n\nCurrently, it can only be invoked independently. 
You can use the following command line:\n\n```\ncoml-configagent --space \u003cspace\u003e --task \u003ctask\u003e\n```\n\nIf you feel uncertain about what to put into `\u003cspace\u003e` or `\u003ctask\u003e`, see the demo above, or try the interactive usage below:\n\n```\ncoml-configagent --interactive\n```\n\n#### API Usage\n\n```python\nfrom coml.configagent.suggest import suggest\n\nspace = import_space(\"YOUR_SPACE_ID\")\ntask_desc = \"YOUR_TASK_DESCRIPTION_FOR_NEW_TASK\"\nsuggest_configs, knowledge = suggest(space, task_desc)\n```\n\n## Development\n\nDevelopment documentation stays here for now. It will be moved to a separate document later.\n\n### Project structure\n\nImportant files and folders:\n\n```\nCoML\n├── assets          # data, examples, demos\n├── coml            # Python package\n├── examples        # example scripts\n├── install.json    # Jupyter lab extension installation file\n├── package.json    # Jupyter lab extension package file\n├── pyproject.toml  # Python package configuration\n├── src             # Jupyter lab extension source code\n├── test            # Python package tests\n└── tsconfig.json   # Jupyter lab extension TypeScript configuration\n```\n\n### Installation and uninstallation\n\nYou can use the following command for development installation:\n\n```\npip install -e .[dev]\n```\n\nIf you plan to develop the Jupyter Lab extension, you will also need to install NodeJS and npm, and run the following commands:\n\n```\n# Link your development version of the extension with JupyterLab\njupyter labextension develop . 
--overwrite\n# Rebuild the extension's TypeScript source after making changes\njlpm run build\n```\n\nTo uninstall, you can run the following commands:\n\n```bash\n# Server extension must be manually disabled in develop mode\njupyter server extension disable coml\n\n# Uninstall the Python package\npip uninstall mlcopilot\n```\n\nIn development mode, you will also need to remove the symlink created by the `jupyter labextension develop` command.\nTo find its location, run `jupyter labextension list` to see where the `labextensions` folder is located.\nThen you can remove the symlink named `coml` within that folder.\n\n### Packaging\n\n```bash\nhatch build\n```\n\n## Citation\n\nIf you find this work useful, please cite the paper as below:\n\n    @article{zhang2023mlcopilot,\n        title={MLCopilot: Unleashing the Power of Large Language Models in Solving Machine Learning Tasks},\n        author={Zhang, Lei and Zhang, Yuge and Ren, Kan and Li, Dongsheng and Yang, Yuqing},\n        journal={arXiv preprint arXiv:2304.14979},\n        year={2023}\n    }\n\n## License\n\nThe entire codebase is under the [MIT license](LICENSE).\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmicrosoft%2Fcoml","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fmicrosoft%2Fcoml","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fmicrosoft%2Fcoml/lists"}