{"id":17052463,"url":"https://github.com/crayxt/ecl_tokenizer","last_synced_at":"2026-04-20T10:31:22.804Z","repository":{"id":79922531,"uuid":"370560240","full_name":"crayxt/ecl_tokenizer","owner":"crayxt","description":"Simple tokenizator of Eclipse data decks.","archived":false,"fork":false,"pushed_at":"2021-12-24T13:32:26.000Z","size":65,"stargazers_count":1,"open_issues_count":0,"forks_count":0,"subscribers_count":1,"default_branch":"main","last_synced_at":"2025-03-23T05:23:43.463Z","etag":null,"topics":[],"latest_commit_sha":null,"homepage":null,"language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"gpl-3.0","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/crayxt.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2021-05-25T04:09:19.000Z","updated_at":"2023-02-27T15:14:24.000Z","dependencies_parsed_at":"2023-04-23T21:17:11.053Z","dependency_job_id":null,"html_url":"https://github.com/crayxt/ecl_tokenizer","commit_stats":null,"previous_names":[],"tags_count":0,"template":false,"template_full_name":null,"purl":"pkg:github/crayxt/ecl_tokenizer","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crayxt%2Fecl_tokenizer","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crayxt%2Fecl_tokenizer/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crayxt%2Fecl_tokenizer/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crayxt%2Fecl_tokenizer/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/crayxt","download_url":"https://codeload.github.com/c
rayxt/ecl_tokenizer/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/crayxt%2Fecl_tokenizer/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32042946,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-04-20T00:18:06.643Z","status":"online","status_checked_at":"2026-04-20T02:00:06.527Z","response_time":94,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":[],"created_at":"2024-10-14T10:09:31.778Z","updated_at":"2026-04-20T10:31:22.778Z","avatar_url":"https://github.com/crayxt.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# ecl_tokenizer\nSimple tokenizator of Eclipse data decks. Pure Python code with no dependencies.\nPandas and Numpy are required for further data manipulations.\n\n## Detection rules\nEclipse data deck (Eclipse case) is a collection of ASCII files.\nEach file has a set of data records, specified by keywords.\nEclipse has several types of keywords. 
We do not care about those types.\nWe only split a case into keyword:value pairs.\nEmpty lines and comments are ignored.\nA keyword owns the data located between that keyword and the next keyword.\nIf data without a keyword is found, an exception is raised.\n\nThis is not a strict parser: as we do not know all possible keywords and their syntax,\nwe rely on simple detection of possible keywords.\n\nWe assume that a keyword is a command up to 8 chars long that starts at column 0 and\nconsists of A-Z, 0-9, + and - symbols. A keyword should be uppercase and start with a letter.\n\nThere is a special keyword `INCLUDE`, which includes another file into the current file. When such a\nkeyword is found, the included file is parsed right away, and the parser continues reading the current file after that.\n`INCLUDE` keywords can be nested.\n\nWe keep track of parsed files to prevent recursion. We also keep track of missing files that `INCLUDE` keywords refer to.\n\n## Workflow\n### Initialization\n```\nimport ecl_tokenizer as et\ncase = et.EclCase(r\"C:\\GitHub\\opm-tests\\model5\\0_BASE_MODEL5.DATA\")\n```\n### Show the structure of case\n```\n\u003e\u003e\u003e case.describe()\n+ C:\\GitHub\\opm-tests\\model5\\0_BASE_MODEL5.DATA\n|-- C:\\GitHub\\opm-tests\\model5\\0_BASE_MODEL5.DATA\n|-- C:\\GitHub\\opm-tests\\model5\\include\\test1_20x30x10.grdecl\n|-- C:\\GitHub\\opm-tests\\model5\\include\\permx_model5.grdecl\n|-- C:\\GitHub\\opm-tests\\model5\\include\\pvt_live_oil_dgas.ecl\n|-- C:\\GitHub\\opm-tests\\model5\\include\\rock.inc\n|-- C:\\GitHub\\opm-tests\\model5\\include\\relperm.inc\n|-- C:\\GitHub\\opm-tests\\model5\\include\\summary.inc\n|-- C:\\GitHub\\opm-tests\\model5\\include\\well_vfp.ecl\n|-- C:\\GitHub\\opm-tests\\model5\\include\\flowl_b_vfp.ecl\n|-- C:\\GitHub\\opm-tests\\model5\\include\\flowl_c_vfp.ecl\n```\n### Look up keywords\n```\n# Query for keyword presence.\n\u003e\u003e\u003e case.has_kwd(\"DIMENS\")\nTrue\n# Get a list of all instances of the given keyword.\n\u003e\u003e\u003e 
case.get_kwds(\"DIMENS\")\n[\u003cEclKwd: DIMENS    Section \"RUNSPEC  \" Parent: \"C:\\GitHub\\opm-tests\\model5\\0_BASE_MODEL5.DATA\" Line_number: \"17\"\u003e\n]\n```\n### Extracting data of keywords\n```\n# Get raw data of first instance of given keyword.\n\u003e\u003e\u003e case.get_kwds(\"DIMENS\")[0].value\n[' 20 30 10 /']\n# Get combined data from records of all instances of given keyword.\n\u003e\u003e\u003e case.get_kwds_data(\"DIMENS\")\n[['20', '30', '10']]\n```\n### Data extraction example\n```\n\u003e\u003e\u003e import pandas as pd\n\u003e\u003e\u003e import numpy as np\n# By default, N*M type of records are expanded.\n\u003e\u003e\u003e wsp = case.get_kwds_data(\"WELSPECS\")\n\u003e\u003e\u003e df1 = pd.DataFrame(wsp)\n\u003e\u003e\u003e df1\n       0     1   2   3  4      5  6  7     8  9  10 11\n0  'B-1H'  'B1'  11   3       OIL        SHUT\n1  'B-2H'  'B1'   4   7       OIL        SHUT\n2  'B-3H'  'B1'  11  12       OIL        SHUT\n3  'C-1H'  'C1'  13  20       OIL        SHUT\n4  'C-2H'  'C1'  12  27       OIL        SHUT\n5  'F-1H'  'F1'  19   4     WATER        SHUT\n6  'F-2H'  'F1'  19  12     WATER        SHUT\n7  'G-3H'  'G1'  19  21     WATER        SHUT\n8  'G-4H'  'G1'  19  25     WATER        SHUT\n\n# Disable expansion of N*M type of records.\n\u003e\u003e\u003e wsp2 = case.get_kwds_data(\"WELSPECS\", expand=False)\n\u003e\u003e\u003e df2 = pd.DataFrame(wsp2)\n\u003e\u003e\u003e df2\n       0     1   2   3   4      5   6   7     8   9   10  11\n0  'B-1H'  'B1'  11   3  1*    OIL  1*  1*  SHUT  1*  1*  1*\n1  'B-2H'  'B1'   4   7  1*    OIL  1*  1*  SHUT  1*  1*  1*\n2  'B-3H'  'B1'  11  12  1*    OIL  1*  1*  SHUT  1*  1*  1*\n3  'C-1H'  'C1'  13  20  1*    OIL  1*  1*  SHUT  1*  1*  1*\n4  'C-2H'  'C1'  12  27  1*    OIL  1*  1*  SHUT  1*  1*  1*\n5  'F-1H'  'F1'  19   4  1*  WATER  1*  1*  SHUT  1*  1*  1*\n6  'F-2H'  'F1'  19  12  1*  WATER  1*  1*  SHUT  1*  1*  1*\n7  'G-3H'  'G1'  19  21  1*  WATER  1*  1*  SHUT  1*  1*  1*\n8  
'G-4H'  'G1'  19  25  1*  WATER  1*  1*  SHUT  1*  1*  1*\n# By default, all columns are of string type. See the next section for a workaround.\n\u003e\u003e\u003e df2.dtypes\n0     object\n1     object\n2     object\n3     object\n4     object\n5     object\n6     object\n7     object\n8     object\n9     object\n10    object\n11    object\ndtype: object\n```\n### Data types\nBy default, data returned from `case.get_kwds_data(\"kwd\")` is of string type.\nTo have numeric columns recognized as numbers, supply the `dtype=\"float32\"` argument to the `pd.DataFrame` constructor.\n(It seems like older versions of pandas do not support integer columns).\n```\n\u003e\u003e\u003e df2 = pd.DataFrame(wsp2, dtype=\"float32\")\n\u003e\u003e\u003e df2\n       0     1     2     3   4      5   6   7     8   9   10  11\n0  'B-1H'  'B1'  11.0   3.0  1*    OIL  1*  1*  SHUT  1*  1*  1*\n1  'B-2H'  'B1'   4.0   7.0  1*    OIL  1*  1*  SHUT  1*  1*  1*\n2  'B-3H'  'B1'  11.0  12.0  1*    OIL  1*  1*  SHUT  1*  1*  1*\n3  'C-1H'  'C1'  13.0  20.0  1*    OIL  1*  1*  SHUT  1*  1*  1*\n4  'C-2H'  'C1'  12.0  27.0  1*    OIL  1*  1*  SHUT  1*  1*  1*\n5  'F-1H'  'F1'  19.0   4.0  1*  WATER  1*  1*  SHUT  1*  1*  1*\n6  'F-2H'  'F1'  19.0  12.0  1*  WATER  1*  1*  SHUT  1*  1*  1*\n7  'G-3H'  'G1'  19.0  21.0  1*  WATER  1*  1*  SHUT  1*  1*  1*\n8  'G-4H'  'G1'  19.0  25.0  1*  WATER  1*  1*  SHUT  1*  1*  1*\n\u003e\u003e\u003e df2.dtypes\n0      object\n1      object\n2     float32\n3     float32\n4      object\n5      object\n6      object\n7      object\n8      object\n9      object\n10     object\n11     object\ndtype: object\n```\n### Example of PERMX/PORO extraction\n```\n# Get the model dimensions.\n\u003e\u003e\u003e case.get_kwds_data(\"DIMENS\")\n[['20', '30', '10']]\n# How many PERMX values should we have?\n\u003e\u003e\u003e 20*30*10\n6000\n# Check for presence of PERMX and PORO keywords in this case.\n\u003e\u003e\u003e case.has_kwd(\"PERMX\")\nTrue\n\u003e\u003e\u003e 
case.has_kwd(\"PORO\")\nTrue\n# Get PERMX data and convert to numpy array of proper type.\n\u003e\u003e\u003e permx = case.get_kwds_data(\"PERMX\")\n\u003e\u003e\u003e permx_arr = np.array(permx, dtype=\"float32\")\n# Number of data records is correct.\n\u003e\u003e\u003e permx_arr.shape\n(1, 6000)\n```\n## Usage\nYou are welcome to use this code in accordance with its license.\nCopyrights should be preserved.\nThe Author is not responsible for any outcome that you experience from this code.\nSee the license for more details.\n\n## Contributing\nAll code contains bugs. This code has reached the state where it deserves sharing.\nIf you find a bug, please contribute by raising tickets or submitting pull requests, as long as you agree with the license terms.\n\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcrayxt%2Fecl_tokenizer","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fcrayxt%2Fecl_tokenizer","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fcrayxt%2Fecl_tokenizer/lists"}