{"id":42937508,"url":"https://github.com/lab42-team/ontogen","last_synced_at":"2026-01-30T19:36:26.105Z","repository":{"id":46570080,"uuid":"355043105","full_name":"Lab42-Team/ontogen","owner":"Lab42-Team","description":"The generator of OWL ontologies based on relational tables in the CSV format.","archived":false,"fork":false,"pushed_at":"2025-09-27T01:49:01.000Z","size":1139,"stargazers_count":0,"open_issues_count":0,"forks_count":1,"subscribers_count":2,"default_branch":"master","last_synced_at":"2025-09-27T03:29:41.228Z","etag":null,"topics":["csv","ontology-engineering","ontology-generation","owl-ontology","table-transformation"],"latest_commit_sha":null,"homepage":"","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/Lab42-Team.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":null,"code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null}},"created_at":"2021-04-06T03:19:56.000Z","updated_at":"2025-09-27T01:53:35.000Z","dependencies_parsed_at":"2022-09-10T03:32:55.493Z","dependency_job_id":null,"html_url":"https://github.com/Lab42-Team/ontogen","commit_stats":null,"previous_names":[],"tags_count":2,"template":false,"template_full_name":null,"purl":"pkg:github/Lab42-Team/ontogen","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Lab42-Team%2Fontogen","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Lab42-Team%2Fontogen/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Lab42-Team%2Fontogen/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Lab42-Team%2Fontogen/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/Lab42-Team","download_url":"https://c
odeload.github.com/Lab42-Team/ontogen/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/Lab42-Team%2Fontogen/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":28918222,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-01-30T19:10:10.838Z","status":"ssl_error","status_checked_at":"2026-01-30T19:06:40.573Z","response_time":66,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.5:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["csv","ontology-engineering","ontology-generation","owl-ontology","table-transformation"],"created_at":"2026-01-30T19:36:25.410Z","updated_at":"2026-01-30T19:36:26.081Z","avatar_url":"https://github.com/Lab42-Team.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# OntoGen\n\n**OntoGen** is a command-line tool for analyzing source spreadsheet data (CSV) and transforming it into an ontology (OWL/XML).\n\n## Version\n\n1.1\n\n## Preliminaries\n\nA source (input) spreadsheet represents a set of entities of the same type in relational form (a subset of the Cartesian product of *K* data domains), where:\n1.\t*Attribute (a column name)* is the name of a data domain in a relation schema;\n2.\t*Metadata (a schema)* is an ordered set of *K* attributes of a relational table;\n3.\t*Tuple (a record)* is an ordered set of *K* atomic values (one 
for each attribute of a relation);\n4.\t*Data (a recordset)* is a set of tuples of a relational table.\n\nA spreadsheet of entities of the same type (*a canonicalized form*) is a relational table in the third normal form (3NF), which contains an ordered set of *N* rows and *M* columns.\n\nA table represents a set of entities of the same type, where:\n1.\t*Categorical column or Named entities column (NE-column)* contains names (text mentions) of some named entities;\n2.\t*Literal column (L-column)* contains literal values (e.g. dates, numbers);\n3.\t*Subject (thematic) column (S-column)* is an *NE*-column that serves as a potential primary key and defines the subject of a source table;\n4.\t*Other (non-subject) columns* represent entity properties, including their relationships with other entities.\n\n**Assumption 1.** *The first row of a source spreadsheet is a header containing attribute (column) names.*\n\n**Assumption 2.** *All cell values within a column of a source spreadsheet have the same entity type and data type.*\n\n**Assumption 3.** *Source spreadsheets should be presented in the CSV format.*\n\n**OntoGen** supports the process of ontology engineering based on spreadsheet data transformation.\n\n**Assumption 4.** *A target ontology is presented in the [OWL2 DL](https://www.w3.org/TR/owl2-overview/) format.*\n\n## Installation\n\nFirst, clone the project into a local directory:\n\n```\ngit clone https://github.com/Lab42-Team/ontogen.git\n```\n\nNext, install the project requirements:\n\n```\npip install -r requirements.txt\n```\n\n*We recommend using Python 3.0 or later.*\n\n## Directory Structure\n\n* `datasets` contains datasets of source spreadsheets in the CSV format:\n    * `tough-tables` contains the [Tough Tables (2T)](https://zenodo.org/record/4246370#.Yf5AO-pBw2w) dataset with noise spreadsheets excluded;\n    * `wiki-uku-49` contains spreadsheets describing the main concepts and relationships in the field of education, in 
particular, universities in the United Kingdom (see [wiki-UKU-49: United Kingdom Universities from Wikipedia](https://data.mendeley.com/datasets/33v9tk6jjb/1));\n    * `isi-167e` contains spreadsheets describing the main concepts and relationships in the field of Industrial Safety Inspection (see [ISI-167E: Entity spreadsheet tables](https://data.mendeley.com/datasets/3gjy46mx88/1)).\n* `examples` contains spreadsheet examples for testing.\n* `ontogen` contains software modules (py-scripts), including `main.py`.\n* `results` contains processing results (target ontologies).\n\n## Usage\n\n#### Usage: python main.py [OPTIONS]\n**Options:**\n- `--name=c:\\userpath` -- Create ontologies from the source spreadsheets found at the given path\n#### A simple example\n```\npython main.py --name=C:/test\n```\nor, to be prompted for the path:\n\n```\npython main.py\nYour path to source spreadsheets: C:/test\n```\n\n## Authors\n\n* [Daria A. Denisova](mailto:daryalich@mail.ru)\n* [Nikita O. Dorodnykh](mailto:tualatin32@mail.ru)","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flab42-team%2Fontogen","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flab42-team%2Fontogen","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flab42-team%2Fontogen/lists"}