{"id":19369955,"url":"https://github.com/coqui-ai/coqpit","last_synced_at":"2025-04-05T17:06:55.827Z","repository":{"id":41950666,"uuid":"351835164","full_name":"coqui-ai/coqpit","owner":"coqui-ai","description":"Simple but maybe too simple config management through python data classes.  We use it for machine learning.","archived":false,"fork":false,"pushed_at":"2023-04-12T09:37:26.000Z","size":8006,"stargazers_count":104,"open_issues_count":11,"forks_count":35,"subscribers_count":6,"default_branch":"main","last_synced_at":"2025-03-29T16:09:09.600Z","etag":null,"topics":["config-management","dataclasses","json","machine-learning","python","python-data","serialization","typing","yaml"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/coqui-ai.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.txt","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-03-26T15:53:40.000Z","updated_at":"2025-02-08T07:44:58.000Z","dependencies_parsed_at":"2024-06-18T17:58:57.838Z","dependency_job_id":null,"html_url":"https://github.com/coqui-ai/coqpit","commit_stats":{"total_commits":147,"total_committers":6,"mean_commits":24.5,"dds":"0.38095238095238093","last_synced_commit":"47f39eaca7d7cc46deb0c54bbcff40f455261d38"},"previous_names":[],"tags_count":25,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coqui-ai%2Fcoqpit","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coqui-ai%2Fcoqpit/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/re
positories/coqui-ai%2Fcoqpit/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/coqui-ai%2Fcoqpit/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/coqui-ai","download_url":"https://codeload.github.com/coqui-ai/coqpit/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":247369952,"owners_count":20927928,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["config-management","dataclasses","json","machine-learning","python","python-data","serialization","typing","yaml"],"created_at":"2024-11-10T08:13:49.444Z","updated_at":"2025-04-05T17:06:55.807Z","avatar_url":"https://github.com/coqui-ai.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# 👩‍✈️ Coqpit\n\n[![CI](https://github.com/coqui-ai/coqpit/actions/workflows/main.yml/badge.svg?branch=main)](https://github.com/coqui-ai/coqpit/actions/workflows/main.yml)\n\nSimple, light-weight and no dependency config handling through python data classes with to/from JSON serialization/deserialization.\n\nCurrently it is being used by [🐸TTS](https://github.com/coqui-ai/TTS).\n## ❔ Why I need this\nWhat I need from a ML configuration library...\n\n1. Fixing a general config schema in Python to guide users about expected values.\n\n    Python is good but not universal. Sometimes you train a ML model and use it on a different platform. So, you\n    need your model configuration file importable by other programming languages.\n\n2. 
Simple dynamic value and type checking with default values.\n\n    If you are a beginner in an ML project, it is hard to guess the right values for your ML experiment. Therefore, it is important\n    to have some default values and know what range and type of input are expected for each field.\n\n3. Ability to decompose large configs.\n\n    As you define more fields for the training dataset, data preprocessing, model parameters, etc., your config file tends\n    to get quite large but in most cases, they can be decomposed, enabling flexibility and readability.\n\n4. Inheritance and nested configurations.\n\n    Simply helps to keep configurations consistent and easier to maintain.\n\n5. Ability to override values from the command line when necessary.\n\n    For instance, you might need to define a path for your dataset, and this changes for almost every run. Then the user\n    should be able to override this value easily from the command line.\n\n    It also allows easy hyper-parameter search without changing your original code. Basically, you can run different models\n    with different parameters just using command line arguments.\n\n6. Defining dynamic or conditional config values.\n\n    Sometimes you need to define certain values depending on the other values. Using Python helps to define the underlying\n    logic for such config values.\n\n7. No dependencies.\n\n    You don't want to install a ton of libraries for just configuration management. If you install one, then it\n    is better for it to be just native Python.\n\n## 🚫 Limitations\n- `Union` type dataclass fields cannot be parsed from console arguments due to the type ambiguity.\n- `JSON` is the only supported serialization format, although the others can be easily integrated.\n- `List` type with multiple item type annotations is not supported (e.g. `List[int, str]`).\n- `dict` fields are parsed from console arguments as JSON str without type checking. 
(e.g `--val_dict '{\"a\":10, \"b\":100}'`).\n- `MISSING` fields cannot be avoided when parsing console arguments.\n\n## 🔍 Examples\n\n### 👉 Simple Coqpit\n```python\nimport os\nfrom dataclasses import asdict, dataclass, field\nfrom typing import List, Union\nfrom coqpit import MISSING, Coqpit, check_argument\n\n\n@dataclass\nclass SimpleConfig(Coqpit):\n    val_a: int = 10\n    val_b: int = None\n    val_d: float = 10.21\n    val_c: str = \"Coqpit is great!\"\n    # mandatory field\n    # raise an error when accessing the value if it is not changed. It is a way to define\n    val_k: int = MISSING\n    # optional field\n    val_dict: dict = field(default_factory=lambda: {\"val_aa\": 10, \"val_ss\": \"This is in a dict.\"})\n    # list of list\n    val_listoflist: List[List] = field(default_factory=lambda: [[1, 2], [3, 4]])\n    val_listofunion: List[List[Union[str,int]]] = field(default_factory=lambda: [[1, 3], [1, \"Hi!\"]])\n\n    def check_values(\n        self,\n    ):  # you can define explicit constraints on the fields using `check_argument()`\n        \"\"\"Check config fields\"\"\"\n        c = asdict(self)\n        check_argument(\"val_a\", c, restricted=True, min_val=10, max_val=2056)\n        check_argument(\"val_b\", c, restricted=True, min_val=128, max_val=4058, allow_none=True)\n        check_argument(\"val_c\", c, restricted=True)\n\n\nif __name__ == \"__main__\":\n    file_path = os.path.dirname(os.path.abspath(__file__))\n    config = SimpleConfig()\n\n    # try MISSING class argument\n    try:\n        k = config.val_k\n    except AttributeError:\n        print(\" val_k needs a different value before accessing it.\")\n    config.val_k = 1000\n\n    # try serialization and deserialization\n    print(config.serialize())\n    print(config.to_json())\n    config.save_json(os.path.join(file_path, \"example_config.json\"))\n    config.load_json(os.path.join(file_path, \"example_config.json\"))\n    print(config.pprint())\n\n    # try `dict` interface\n   
 print(*config)\n    print(dict(**config))\n\n    # value assignment by mapping\n    config[\"val_a\"] = -999\n    print(config[\"val_a\"])\n    assert config.val_a == -999\n```\n### 👉 Serialization\n```python\nimport os\nfrom dataclasses import asdict, dataclass, field\nfrom coqpit import Coqpit, check_argument\nfrom typing import List, Union\n\n\n@dataclass\nclass SimpleConfig(Coqpit):\n    val_a: int = 10\n    val_b: int = None\n    val_c: str = \"Coqpit is great!\"\n\n    def check_values(self,):\n        '''Check config fields'''\n        c = asdict(self)\n        check_argument('val_a', c, restricted=True, min_val=10, max_val=2056)\n        check_argument('val_b', c, restricted=True, min_val=128, max_val=4058, allow_none=True)\n        check_argument('val_c', c, restricted=True)\n\n\n@dataclass\nclass NestedConfig(Coqpit):\n    val_d: int = 10\n    val_e: int = None\n    val_f: str = \"Coqpit is great!\"\n    sc_list: List[SimpleConfig] = None\n    sc: SimpleConfig = SimpleConfig()\n    union_var: Union[List[SimpleConfig], SimpleConfig] = field(default_factory=lambda: [SimpleConfig(),SimpleConfig()])\n\n    def check_values(self,):\n        '''Check config fields'''\n        c = asdict(self)\n        check_argument('val_d', c, restricted=True, min_val=10, max_val=2056)\n        check_argument('val_e', c, restricted=True, min_val=128, max_val=4058, allow_none=True)\n        check_argument('val_f', c, restricted=True)\n        check_argument('sc_list', c, restricted=True, allow_none=True)\n        check_argument('sc', c, restricted=True, allow_none=True)\n\n\nif __name__ == '__main__':\n    file_path = os.path.dirname(os.path.abspath(__file__))\n    # init 🐸 dataclass\n    config = NestedConfig()\n\n    # save to a json file\n    config.save_json(os.path.join(file_path, 'example_config.json'))\n    # load a json file\n    config2 = NestedConfig(val_d=None, val_e=500, val_f=None, sc_list=None, sc=None, union_var=None)\n    # update the config with the json 
file.\n    config2.load_json(os.path.join(file_path, 'example_config.json'))\n    # now they should have the same values.\n    assert config == config2\n\n    # pretty print the dataclass\n    print(config.pprint())\n\n    # export values to a dict\n    config_dict = config.to_dict()\n    # create a new config with different values than the defaults\n    config2 = NestedConfig(val_d=None, val_e=500, val_f=None, sc_list=None, sc=None, union_var=None)\n    # update the config with the exported values from the previous config.\n    config2.from_dict(config_dict)\n    # now they should have the same values.\n    assert config == config2\n```\n\n\n### 👉 ```argparse``` handling and parsing.\n```python\nimport argparse\nimport os\nfrom dataclasses import asdict, dataclass, field\nfrom typing import List\n\nfrom coqpit import Coqpit, check_argument\nimport sys\n\n\n@dataclass\nclass SimplerConfig(Coqpit):\n    val_a: int = field(default=None, metadata={'help': 'this is val_a'})\n\n\n@dataclass\nclass SimpleConfig(Coqpit):\n    val_req: str # required field\n    val_a: int = field(default=10,\n                       metadata={'help': 'this is val_a of SimpleConfig'})\n    val_b: int = field(default=None, metadata={'help': 'this is val_b'})\n    nested_config: SimplerConfig = SimplerConfig()\n    mylist_with_default: List[SimplerConfig] = field(\n        default_factory=lambda:\n        [SimplerConfig(val_a=100),\n         SimplerConfig(val_a=999)],\n        metadata={'help': 'list of SimplerConfig'})\n\n    # mylist_without_default: List[SimplerConfig] = field(default=None, metadata={'help': 'list of SimplerConfig'})  # NOT SUPPORTED YET!\n\n    def check_values(self, ):\n        '''Check config fields'''\n        c = asdict(self)\n        check_argument('val_a', c, restricted=True, min_val=10, max_val=2056)\n        check_argument('val_b',\n                       c,\n                       restricted=True,\n                       min_val=128,\n                    
   max_val=4058,\n                       allow_none=True)\n        check_argument('val_req', c, restricted=True)\n\n\ndef main():\n    # reference config that we like to match with the one parsed from argparse\n    config_ref = SimpleConfig(val_req='this is different',\n                              val_a=222,\n                              val_b=999,\n                              nested_config=SimplerConfig(val_a=333),\n                              mylist_with_default=[\n                                  SimplerConfig(val_a=222),\n                                  SimplerConfig(val_a=111)\n                              ])\n\n    # create new config object from CLI inputs\n    parsed = SimpleConfig.init_from_argparse()\n    parsed.pprint()\n\n    # check the parsed config with the reference config\n    assert parsed == config_ref\n\n\nif __name__ == '__main__':\n    sys.argv.extend(['--coqpit.val_req', 'this is different'])\n    sys.argv.extend(['--coqpit.val_a', '222'])\n    sys.argv.extend(['--coqpit.val_b', '999'])\n    sys.argv.extend(['--coqpit.nested_config.val_a', '333'])\n    sys.argv.extend(['--coqpit.mylist_with_default.0.val_a', '222'])\n    sys.argv.extend(['--coqpit.mylist_with_default.1.val_a', '111'])\n    main()\n```\n\n### 🤸‍♀️ Merging coqpits\n```python\nimport os\nfrom dataclasses import dataclass\nfrom coqpit import Coqpit, check_argument\n\n\n@dataclass\nclass CoqpitA(Coqpit):\n    val_a: int = 10\n    val_b: int = None\n    val_d: float = 10.21\n    val_c: str = \"Coqpit is great!\"\n\n\n@dataclass\nclass CoqpitB(Coqpit):\n    val_d: int = 25\n    val_e: int = 257\n    val_f: float = -10.21\n    val_g: str = \"Coqpit is really great!\"\n\n\nif __name__ == '__main__':\n    file_path = os.path.dirname(os.path.abspath(__file__))\n    coqpita = CoqpitA()\n    coqpitb = CoqpitB()\n    coqpitb.merge(coqpita)\n    print(coqpitb.val_a)\n    print(coqpitb.pprint())\n```\n\n## Development\n\nInstall the pre-commit hook to automatically check your 
commits for style and hinting issues:\n\n```bash\n$ python .pre-commit-2.12.1.pyz install\n```\n\n\u003cimg src=\"https://static.scarf.sh/a.png?x-pxid=cd0232a8-ead2-4f1f-87f5-0dd8ec33ee51\" /\u003e\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoqui-ai%2Fcoqpit","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcoqui-ai%2Fcoqpit","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcoqui-ai%2Fcoqpit/lists"}