{"id":15014157,"url":"https://github.com/kororo/excelcy","last_synced_at":"2025-08-21T04:30:43.086Z","repository":{"id":37677648,"uuid":"141208839","full_name":"kororo/excelcy","owner":"kororo","description":"Excel Integration with spaCy. Training NER using Excel/XLSX from PDF, DOCX, PPT, PNG or JPG.","archived":false,"fork":false,"pushed_at":"2022-12-08T11:26:10.000Z","size":479,"stargazers_count":105,"open_issues_count":9,"forks_count":11,"subscribers_count":4,"default_branch":"master","last_synced_at":"2025-07-09T20:37:18.799Z","etag":null,"topics":["entity","excel","nlp","python","python3","spacy","spacy-extensions","spacy-nlp","spacy-pipeline","training","xlsx"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/kororo.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2018-07-17T00:18:05.000Z","updated_at":"2024-03-25T10:08:13.000Z","dependencies_parsed_at":"2023-01-25T12:31:05.563Z","dependency_job_id":null,"html_url":"https://github.com/kororo/excelcy","commit_stats":null,"previous_names":[],"tags_count":11,"template":false,"template_full_name":null,"purl":"pkg:github/kororo/excelcy","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kororo%2Fexcelcy","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kororo%2Fexcelcy/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kororo%2Fexcelcy/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kororo%2Fexcelcy/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/kororo","download
_url":"https://codeload.github.com/kororo/excelcy/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/kororo%2Fexcelcy/sbom","scorecard":{"id":567838,"data":{"date":"2025-08-11","repo":{"name":"github.com/kororo/excelcy","commit":"25263d16db0cda24fe66ab3d52ff08a770117dc1"},"scorecard":{"version":"v5.2.1-40-gf6ed084d","commit":"f6ed084d17c9236477efd66e5b258b9d4cc7b389"},"score":1.7,"checks":[{"name":"Token-Permissions","score":-1,"reason":"No tokens found","details":null,"documentation":{"short":"Determines if the project's workflows follow the principle of least privilege.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#token-permissions"}},{"name":"Maintained","score":0,"reason":"0 commit(s) and 0 issue activity found in the last 90 days -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project is \"actively maintained\".","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#maintained"}},{"name":"SAST","score":0,"reason":"no SAST tool detected","details":["Warn: no pull requests merged into dev branch"],"documentation":{"short":"Determines if the project uses static code analysis.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#sast"}},{"name":"Code-Review","score":0,"reason":"Found 0/30 approved changesets -- score normalized to 0","details":null,"documentation":{"short":"Determines if the project requires human code review before pull requests (aka merge requests) are merged.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#code-review"}},{"name":"Packaging","score":-1,"reason":"packaging workflow not detected","details":["Warn: no GitHub/GitLab publishing workflow detected."],"documentation":{"short":"Determines if the project is published as a package that others can 
easily download, install, easily update, and uninstall.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#packaging"}},{"name":"Dangerous-Workflow","score":-1,"reason":"no workflows found","details":null,"documentation":{"short":"Determines if the project's GitHub Action workflows avoid dangerous patterns.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#dangerous-workflow"}},{"name":"Pinned-Dependencies","score":-1,"reason":"no dependencies found","details":null,"documentation":{"short":"Determines if the project has declared and pinned the dependencies of its build process.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#pinned-dependencies"}},{"name":"Binary-Artifacts","score":10,"reason":"no binaries found in the repo","details":null,"documentation":{"short":"Determines if the project has generated executable (binary) artifacts in the source repository.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#binary-artifacts"}},{"name":"CII-Best-Practices","score":0,"reason":"no effort to earn an OpenSSF best practices badge detected","details":null,"documentation":{"short":"Determines if the project has an OpenSSF (formerly CII) Best Practices Badge.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#cii-best-practices"}},{"name":"Security-Policy","score":0,"reason":"security policy file not detected","details":["Warn: no security policy file detected","Warn: no security file to analyze","Warn: no security file to analyze","Warn: no security file to analyze"],"documentation":{"short":"Determines if the project has published a security 
policy.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#security-policy"}},{"name":"Fuzzing","score":0,"reason":"project is not fuzzed","details":["Warn: no fuzzer integrations found"],"documentation":{"short":"Determines if the project uses fuzzing.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#fuzzing"}},{"name":"License","score":10,"reason":"license file detected","details":["Info: project has a license file: LICENSE:0","Info: FSF or OSI recognized license: MIT License: LICENSE:0"],"documentation":{"short":"Determines if the project has defined a license.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#license"}},{"name":"Signed-Releases","score":-1,"reason":"no releases found","details":null,"documentation":{"short":"Determines if the project cryptographically signs release artifacts.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#signed-releases"}},{"name":"Branch-Protection","score":0,"reason":"branch protection not enabled on development/release branches","details":["Warn: branch protection not enabled for branch 'master'"],"documentation":{"short":"Determines if the default and release branches are protected with GitHub's branch protection settings.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#branch-protection"}},{"name":"Vulnerabilities","score":0,"reason":"57 existing vulnerabilities detected","details":["Warn: Project is vulnerable to: PYSEC-2022-42986 / GHSA-43fp-rhv2-5gv8","Warn: Project is vulnerable to: PYSEC-2023-135 / GHSA-xqr8-7jwr-rhp7","Warn: Project is vulnerable to: PYSEC-2024-60 / GHSA-jjg7-2v4v-x38h","Warn: Project is vulnerable to: GHSA-55x5-fj6c-h6m8","Warn: Project is vulnerable to: PYSEC-2021-19 / GHSA-jq4v-f5q6-mjqq","Warn: Project is vulnerable to: 
PYSEC-2020-62 / GHSA-pgww-xf46-h92r","Warn: Project is vulnerable to: PYSEC-2022-230 / GHSA-wrxv-2j5q-m38w","Warn: Project is vulnerable to: GHSA-6p56-wp2h-9hxr","Warn: Project is vulnerable to: GHSA-fpfv-jqm9-f5jm","Warn: Project is vulnerable to: GHSA-3f63-hfp8-52jq","Warn: Project is vulnerable to: PYSEC-2021-41 / GHSA-3wvg-mj6g-m9cv","Warn: Project is vulnerable to: GHSA-44wm-f244-xhp3","Warn: Project is vulnerable to: GHSA-4fx9-vc88-q2xc","Warn: Project is vulnerable to: PYSEC-2021-35 / GHSA-57h3-9rgr-c24m","Warn: Project is vulnerable to: PYSEC-2021-331 / GHSA-7534-mm45-c74v","Warn: Project is vulnerable to: PYSEC-2021-137 / GHSA-77gc-v2xv-rvvh","Warn: Project is vulnerable to: PYSEC-2021-92 / GHSA-7r7m-5h27-29hp","Warn: Project is vulnerable to: PYSEC-2023-227 / GHSA-8ghj-p4vj-mr35","Warn: Project is vulnerable to: PYSEC-2022-10 / GHSA-8vj2-vxx3-667w","Warn: Project is vulnerable to: PYSEC-2021-36 / GHSA-8xjq-8fcg-g5hw","Warn: Project is vulnerable to: PYSEC-2021-42 / GHSA-95q3-8gr9-gm8w","Warn: Project is vulnerable to: PYSEC-2021-317 / GHSA-98vv-pw6r-q6q4","Warn: Project is vulnerable to: PYSEC-2021-38 / GHSA-9hx2-hgq2-2g4f","Warn: Project is vulnerable to: PYSEC-2022-168 / GHSA-9j59-75qj-795w","Warn: Project is vulnerable to: PYSEC-2021-40 / GHSA-f4w8-cv6p-x6r5","Warn: Project is vulnerable to: PYSEC-2021-69 / GHSA-f5g8-5qq7-938w","Warn: Project is vulnerable to: PYSEC-2021-139 / GHSA-g6rj-rv7j-xwp4","Warn: Project is vulnerable to: PYSEC-2021-71 / GHSA-hf64-x4gq-p99h","Warn: Project is vulnerable to: PYSEC-2021-94 / GHSA-hjfx-8p6c-g7gx","Warn: Project is vulnerable to: GHSA-j7hp-h8jx-5ppr","Warn: Project is vulnerable to: GHSA-jgpv-4h4c-xhw3","Warn: Project is vulnerable to: PYSEC-2022-42979 / GHSA-m2vv-5vj5-2hm7","Warn: Project is vulnerable to: PYSEC-2021-37 / GHSA-mvg9-xffr-p774","Warn: Project is vulnerable to: PYSEC-2021-39 / GHSA-p43w-g3c5-g5mq","Warn: Project is vulnerable to: PYSEC-2022-8 / GHSA-pw3c-h7wp-cvhx","Warn: Project is vulnerable to: 
PYSEC-2021-93 / GHSA-q5hq-fp76-qmrc","Warn: Project is vulnerable to: PYSEC-2021-138 / GHSA-rwv7-3v45-hg29","Warn: Project is vulnerable to: PYSEC-2021-70 / GHSA-vqcj-wrf2-7v73","Warn: Project is vulnerable to: PYSEC-2022-9 / GHSA-xrcv-f9gm-v42c","Warn: Project is vulnerable to: PYSEC-2023-175","Warn: Project is vulnerable to: PYSEC-2020-92 / GHSA-hj5v-574p-mj7c","Warn: Project is vulnerable to: PYSEC-2022-42969","Warn: Project is vulnerable to: GHSA-j225-cvw7-qrx7","Warn: Project is vulnerable to: PYSEC-2021-142 / GHSA-8q59-q68h-6hv4","Warn: Project is vulnerable to: GHSA-9hjg-9r4m-mvj7","Warn: Project is vulnerable to: GHSA-9wx4-h78v-vm56","Warn: Project is vulnerable to: PYSEC-2023-74 / GHSA-j8r2-6x86-q33q","Warn: Project is vulnerable to: GHSA-g7vv-2v7x-gj9p","Warn: Project is vulnerable to: GHSA-34jh-p97f-mpxf","Warn: Project is vulnerable to: PYSEC-2023-212 / GHSA-g4mx-q9vg-27p4","Warn: Project is vulnerable to: GHSA-pq67-6m6q-mj2v","Warn: Project is vulnerable to: PYSEC-2021-108 / GHSA-q2q7-5pp4-w6pg","Warn: Project is vulnerable to: PYSEC-2023-192 / GHSA-v845-jxx5-vc9f","Warn: Project is vulnerable to: GHSA-jfmj-5v4g-7637","Warn: Project is vulnerable to: PYSEC-2022-266 / GHSA-9xgj-fcgf-x6mw","Warn: Project is vulnerable to: PYSEC-2022-43179 / GHSA-j4j9-7hg9-97g6","Warn: Project is vulnerable to: PYSEC-2022-234 / GHSA-xr2c-5w89-63pv"],"documentation":{"short":"Determines if the project has open, known unfixed 
vulnerabilities.","url":"https://github.com/ossf/scorecard/blob/f6ed084d17c9236477efd66e5b258b9d4cc7b389/docs/checks.md#vulnerabilities"}}]},"last_synced_at":"2025-08-20T15:28:00.129Z","repository_id":37677648,"created_at":"2025-08-20T15:28:00.129Z","updated_at":"2025-08-20T15:28:00.129Z"},"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":271424952,"owners_count":24757373,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","status":"online","status_checked_at":"2025-08-21T02:00:08.990Z","response_time":74,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["entity","excel","nlp","python","python3","spacy","spacy-extensions","spacy-nlp","spacy-pipeline","training","xlsx"],"created_at":"2024-09-24T19:45:16.324Z","updated_at":"2025-08-21T04:30:42.764Z","avatar_url":"https://github.com/kororo.png","language":"Python","funding_links":[],"categories":["Python"],"sub_categories":[],"readme":"ExcelCy\n=======\n\n[![Build Status](https://travis-ci.com/kororo/excelcy.svg?branch=master)](https://travis-ci.com/kororo/excelcy)\n[![Coverage Status](https://coveralls.io/repos/github/kororo/excelcy/badge.svg)](https://coveralls.io/github/kororo/excelcy)\n[![MIT license](https://img.shields.io/badge/License-MIT-blue.svg)](https://lbesson.mit-license.org/)\n[![PyPI pyversions](https://img.shields.io/pypi/pyversions/excelcy.svg)](https://pypi.python.org/project/excelcy/)\n[![PyPI - 
Downloads](https://img.shields.io/pypi/dm/excelcy)](https://pypi.python.org/project/excelcy/)\n\n* * * * *\n\nExcelCy is an NER trainer that works from XLSX, PDF, DOCX, PPT, PNG or JPG. ExcelCy uses the spaCy framework to match entities with PhraseMatcher, or with Matcher using regular expressions.\n\nExcelCy is convenience\n----------------------\n\nThis example is taken from the spaCy documentation, [Simple Style Training](https://spacy.io/usage/training#training-simple-style). It demonstrates how to train NER using spaCy:\n\n```python\nimport spacy\nimport random\n\nTRAIN_DATA = [\n     (\"Uber blew through $1 million a week\", {'entities': [(0, 4, 'ORG')]}), # note: it is required to supply the character position\n     (\"Google rebrands its business apps\", {'entities': [(0, 6, \"ORG\")]})] # note: it is required to supply the character position\n\nnlp = spacy.blank('en')\noptimizer = nlp.begin_training()\nfor i in range(20):\n    random.shuffle(TRAIN_DATA)\n    for text, annotations in TRAIN_DATA:\n        nlp.update([text], [annotations], sgd=optimizer)\n\nnlp.to_disk('test_model')\n```\n\nThe **TRAIN\\_DATA** describes the sentences and annotated entities to be trained. It is cumbersome to always count the characters. 
With ExcelCy, the (start, end) character positions can be omitted.\n\n```python\n# install excelcy\n# pip install excelcy\n\n# download the en model from spacy\n# python -m spacy download en\n\n# run this inside a Python shell or a file\nfrom excelcy import ExcelCy\n\n# Test: John is the CEO of this_is_a_unique_company_name\nexcelcy = ExcelCy()\n# by default, nlp_base is assumed to be the model en_core_web_sm\n# excelcy.storage.config = Config(nlp_base='en_core_web_sm')\n# if you have an existing model, use this\n# excelcy.storage.config = Config(nlp_path='/path/model')\ndoc = excelcy.nlp('John is the CEO of this_is_a_unique_company_name')\n# it will show no company entities\nprint([(ent.label_, ent.text) for ent in doc.ents])\n# run this in the root of the repo, or use https://github.com/kororo/excelcy/raw/master/tests/data/test_data_01.xlsx\nexcelcy = ExcelCy.execute(file_path='tests/data/test_data_01.xlsx')\n# use the nlp object as per the spaCy API\ndoc = excelcy.nlp('John is the CEO of this_is_a_unique_company_name')\n# now it recognises the company name\nprint([(ent.label_, ent.text) for ent in doc.ents])\n# NOTE: if the entity is not showing, it may be necessary to increase \"train_iteration\" or lower \"train_drop\" in the \"config\" sheet in Excel\n```\n\nExcelCy is friendly\n-------------------\n\nBy default, ExcelCy training is divided into the following phases (the example Excel file can be found in [tests/data/test\\_data\\_01.xlsx](https://github.com/kororo/excelcy/raw/master/tests/data/test_data_01.xlsx)):\n\n### 1. Discovery\n\nThe first phase is to collect sentences from the data source in the sheet \"source\". 
The data source can be either:\n\n-   Text: Direct sentence values.\n-   Files: PDF, DOCX, PPT, PNG or JPG will be parsed using\n    [textract](https://github.com/deanmalmgren/textract).\n\nNote: See textract source examples in [tests/data/test\\_data\\_03.xlsx](https://github.com/kororo/excelcy/raw/master/tests/data/test_data_03.xlsx)\n\nNote: The \"textract\" dependency is not included in ExcelCy; it must be installed manually.\n\n### 2. Preparation\n\nIn the next phase, the Gold annotations need to be defined in the sheet \"prepare\", based on:\n\n-   Current Data Model: Using the spaCy API, **nlp(sentence).ents**\n-   Phrase pattern: Robbie, Uber, Google, Amazon\n-   Regex pattern: \\^([0-1]?[0-9]|2[0-3]):[0-5][0-9]\\$\n\nAll annotations here are considered Gold annotations, which are described [here](https://spacy.io/usage/training#example-new-entity-type).\n\n### 3. Training\n\nThe main phase is NER training, which is described in [Simple Style Training](https://spacy.io/usage/training#training-simple-style).\nThe data is iterated from the sheet \"train\"; check the sheet \"config\" to control the parameters.\n\n### 4. Consolidation\n\nThe last phase is to test/save the results and repeat the phases if required.\n\nExcelCy is flexible\n-------------------\n\nNeed more specific exports and phases? It is possible to control them using the phase API.\nThis is an illustration of a real-world scenario:\n\n1.  Train from\n    [tests/data/test\\_data\\_05.xlsx](https://github.com/kororo/excelcy/raw/master/tests/data/test_data_05.xlsx)\n\n    ```shell script\n    # download the dataset\n    $ wget https://github.com/kororo/excelcy/raw/master/tests/data/test_data_05.xlsx\n    # this will create a directory and file \"export/train_05.xlsx\"\n    $ excelcy execute test_data_05.xlsx\n    ```\n\n2.  Open the result in \"export/train\\_05.xlsx\"; it shows all sentences identified from the given source. However, there is an error: \"Himalayas\" is identified as \"PRODUCT\".\n    \n3.  
To fix this, add a phrase matcher for \"Himalayas = FAC\". This is illustrated in\n    [tests/data/test\\_data\\_05a.xlsx](https://github.com/kororo/excelcy/raw/master/tests/data/test_data_05a.xlsx)\n    \n4.  Train again and check the result in \"export/train\\_05a.xlsx\"\n\n    ```shell script\n    # download the dataset\n    $ wget https://github.com/kororo/excelcy/raw/master/tests/data/test_data_05a.xlsx\n    # this will create a directory \"nlp/data\" and file \"export/train_05a.xlsx\"\n    $ excelcy execute test_data_05a.xlsx\n    ```\n\n5.  Check that there is a backed-up nlp data model in \"nlp\" and that the result is corrected in \"export/train\\_05a.xlsx\".\n\n6.  Keep training the data model. If there is unexpected behaviour, the backup data model is available in case it is needed.\n\nExcelCy is comprehensive\n------------------------\n\nUnder the hood, ExcelCy has strong and well-defined data storage. At any of the phases above, the data can be inspected.\n\n```python\nfrom excelcy import ExcelCy\nfrom excelcy.storage import Config\n\n# Test: John is the CEO of this_is_a_unique_company_name\nexcelcy = ExcelCy()\nexcelcy.storage.config = Config(nlp_base='en_core_web_sm', train_iteration=10, train_drop=0.2)\ndoc = excelcy.nlp('John is the CEO of this_is_a_unique_company_name')\n# shows no ORG\nprint([(ent.label_, ent.text) for ent in doc.ents])\nexcelcy.storage.source.add(kind='text', value='John is the CEO of this_is_a_unique_company_name')\nexcelcy.discover()\nexcelcy.storage.prepare.add(kind='phrase', value='this_is_a_unique_company_name', entity='ORG')\nexcelcy.prepare()\nexcelcy.train()\ndoc = excelcy.nlp('John is the CEO of this_is_a_unique_company_name')\n# ORG is now recognised\nprint([(ent.label_, ent.text) for ent in doc.ents])\n# NOTE: if the entity is not showing, it may be necessary to increase \"train_iteration\" or lower \"train_drop\" in the \"config\" sheet in Excel\n```\n\nFeatures\n--------\n\n-   Load multiple data sources such as Word documents, 
PowerPoint presentations, PDF or images.\n-   Import/Export configuration with JSON, YML or Excel.\n-   Add custom Entity labels.\n-   Rule-based phrase matching using [PhraseMatcher](https://spacy.io/usage/linguistic-features#adding-phrase-patterns)\n-   Rule-based matching using [regex + Matcher](https://spacy.io/usage/linguistic-features#regex)\n-   Train Named Entity Recogniser with ease\n\nInstall\n-------\n\nEither use the famous pip or clone this repository and execute the\nsetup.py file.\n\n```shell script\n$ pip install excelcy\n# ensure the language model is installed before use\n$ spacy download en\n```\n\nTrain\n-----\n\nTo train the spaCy model:\n\n```python\nfrom excelcy import ExcelCy\nexcelcy = ExcelCy.execute(file_path='test_data_01.xlsx')\n```\n\nNote: [tests/data/test\\_data\\_01.xlsx](https://github.com/kororo/excelcy/raw/master/tests/data/test_data_01.xlsx)\n\nCLI\n---\n\nExcelCy has a basic CLI command for execute:\n\n```shell script\n$ excelcy execute https://github.com/kororo/excelcy/raw/master/tests/data/test_data_01.xlsx\n```\n\nTest\n----\n\nRun the tests by installing the packages and running tox:\n\n```shell script\n$ pip install poetry tox\n$ tox\n$ tox -e py36 -- tests/test_readme.py\n```\n\nFor hot-reload during development:\n```shell script\n$ npm i -g nodemon\n$ nodemon\n```\n\nData Definition\n---------------\n\nExcelCy has a data definition, which is expressed in [api.yml](https://github.com/kororo/excelcy/raw/master/data/api.yml).\nAs long as the data is given in this specific format and structure, ExcelCy will be able to support any type of data format.\nCheck out the Excel file format in [api.xlsx](https://github.com/kororo/excelcy/raw/master/data/api.xlsx).\nData classes are defined with [attrs](https://github.com/python-attrs/attrs);\ncheck [storage.py](https://github.com/kororo/excelcy/raw/master/excelcy/storage.py) for more detail.\n\nPublishing\n----------\n```shell script\n# this is a note for contributors\n# ensure all tests pass locally\nnpm 
run test\n\n# prepare for new version\npoetry version 0.4.1\nnpm run export\n\n# commit the changes in git, especially on the release branch, and check the Travis build\n# https://travis-ci.com/github/kororo/excelcy\n\n# if all goes well, push to master\n\n```\nFAQ\n---\n\n**What are the idx columns in the Excel sheet?**\n\nThe idea is to provide a reference between two things. For example, in the sheet \"train\", you may want to know which row of the sheet \"source\" a sentence was generated\nfrom. Also, because Excel lets you sort rows, the idx column is a safeguard that keeps things in\nthe correct order.\n\n**Can ExcelCy import/export to X, Y, Z data format?**\n\nExcelCy has strong and well-defined data storage, thanks to [attrs](https://github.com/python-attrs/attrs).\nIt is possible to import/export data in any format.\n\n**Error: ModuleNotFoundError: No module named 'pip'**\n\nThere are several possible causes. Try downgrading pip (version 19.0.3 was buggy).\n\n**Does ExcelCy accept suggestions/ideas?**\n\nYes! Please submit them as a new issue with the label \"enhancement\".\n\nAcknowledgement\n---------------\n\nThis project uses other awesome projects:\n\n-   [attrs](https://github.com/python-attrs/attrs): Python Classes Without Boilerplate.\n-   [pyexcel](https://github.com/pyexcel/pyexcel): Single API for reading, manipulating and writing data in csv, ods, xls, xlsx and xlsm files.\n-   [pyyaml](https://github.com/yaml/pyyaml): The next generation YAML parser and emitter for Python.\n-   [spacy](https://github.com/explosion/spaCy): Industrial-strength Natural Language Processing (NLP) with Python and Cython.\n-   [textract](https://github.com/deanmalmgren/textract): extract text from any document. no muss. no fuss.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkororo%2Fexcelcy","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fkororo%2Fexcelcy","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fkororo%2Fexcelcy/lists"}