{"id":19391572,"url":"https://github.com/zevio/pcu","last_synced_at":"2026-04-13T17:07:15.835Z","repository":{"id":70861107,"uuid":"148149683","full_name":"zevio/PCU","owner":"zevio","description":"Plateforme de Connaissances Unifiées (PCU) project (i.e Unified Knowledge Platform)","archived":false,"fork":false,"pushed_at":"2018-11-28T21:53:19.000Z","size":57,"stargazers_count":1,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-01-07T10:10:04.367Z","etag":null,"topics":["extraction","json","keyphrase-extraction","kleis","knowledge","knowledge-extraction","langdetect","pcu","pcu-io","pcu-json","pcu-keyphrase","pcu-language","pcu-nlp","pcu-pdf","pcu-relation","pdf","python","spacy","text","workflow"],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/zevio.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2018-09-10T12:11:01.000Z","updated_at":"2018-11-29T14:23:56.000Z","dependencies_parsed_at":"2023-03-04T04:31:07.030Z","dependency_job_id":null,"html_url":"https://github.com/zevio/PCU","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zevio%2FPCU","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zevio%2FPCU/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/zevio%2FPCU/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/
repositories/zevio%2FPCU/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/zevio","download_url":"https://codeload.github.com/zevio/PCU/tar.gz/refs/heads/master","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":240557484,"owners_count":19820360,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["extraction","json","keyphrase-extraction","kleis","knowledge","knowledge-extraction","langdetect","pcu","pcu-io","pcu-json","pcu-keyphrase","pcu-language","pcu-nlp","pcu-pdf","pcu-relation","pdf","python","spacy","text","workflow"],"created_at":"2024-11-10T10:27:34.028Z","updated_at":"2026-04-13T17:07:15.752Z","avatar_url":"https://github.com/zevio.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# PCU\nPlateforme de Connaissances Unifiées (PCU) project (*i.e.* Unified Knowledge Platform).\n\nSemantic platform for extracting value from data. Open-source, configurable, written in Python 3.\n\n## Components\n\nThe platform is composed of several components:\n\n* **[pcu_io][pcu_io]** : Parse a file to get its textual content. \n  * **[pcu_pdf][pcu_pdf]** : Parse PDF files (and, to an extent, other file formats supported by **[Apache Tika][tika]**).\n  * **[pcu_json][pcu_json]** : Parse JSON files.\n* **[pcu_language][pcu_language]** : Detect the main language of a text, or all the languages used in it. Based on **[langdetect][langdetect]**.\n* **[pcu_nlp][pcu_nlp]** : Get syntactic annotations of a text. Based on **[spacy.io][spacy]**.\n* **[pcu_keyphrase][pcu_keyphrase]** : Get keyphrases of a text. 
Based on **[kleis][kleis]**.\n* **[pcu_relation][pcu_relation]** : Get the semantic relationships between keyphrases of a text. Based on **[Kata Gábor][gabor]**'s algorithm.\n\n[tika]: https://tika.apache.org\n[langdetect]: https://pypi.org/project/langdetect/\n[spacy]: https://spacy.io\n[kleis]: https://github.com/sdhdez/kleis-keyphrase-extraction\n[gabor]: http://www.inalco.fr/enseignant-chercheur/kata-gabor\n[pcu_io]: https://github.com/zevio/pcu_io\n[pcu_pdf]: https://github.com/zevio/pcu_pdf \n[pcu_json]: https://github.com/zevio/pcu_json\n[pcu_language]: https://github.com/zevio/pcu_language\n[pcu_nlp]: https://github.com/zevio/pcu_nlp\n[pcu_keyphrase]: https://github.com/zevio/pcu_keyphrase\n[pcu_relation]: https://github.com/zevio/pcu_relation\n\n![PCU components](https://framapic.org/1U5tcQkBJuNo/EW3JA763WxDy.png)\n\n## Installation\n\nTo install the requirements, run the Makefile with the following command:\n\n`make install`\n\n## Usage\n\nThe semantic platform is entirely configurable. To use it, download the sources, go to the pcu/ directory, and adjust the configuration file as needed.\n\n```\n[pipeline]\nlanguage=\n; default language : if empty, language will be automatically detected\nnlp=spacy \n; name of the NLP pipeline to use\nkeyphrase=yes\n; yes if keyphrase extraction is enabled, no otherwise\n```\n\n* **language** : default language (en for English, fr for French). 
If empty, the language is detected automatically.\n* **nlp** : name of the NLP pipeline to use (spacy)\n* **keyphrase** : yes if keyphrase extraction is enabled, no otherwise\n\u003c!--** **relation** : yes if semantic relations extraction algorithm is enabled, no otherwise--\u003e\n\nTo run the workflow on your data, use the following command:\n\n```\npython3 core.py path/to/data/to/process\n```\n\n## Solutions for problems encountered\n\n### For Windows users\n\nSome Windows users might encounter linking problems when installing spacy. If so, run ```make install``` as an administrator.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzevio%2Fpcu","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fzevio%2Fpcu","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fzevio%2Fpcu/lists"}