{"id":38514658,"url":"https://github.com/oreum-industries/oreum_core","last_synced_at":"2026-03-11T09:09:21.724Z","repository":{"id":40560074,"uuid":"336530541","full_name":"oreum-industries/oreum_core","owner":"oreum-industries","description":"Core tools for use on client projects","archived":false,"fork":false,"pushed_at":"2026-03-07T19:40:45.000Z","size":835,"stargazers_count":1,"open_issues_count":3,"forks_count":0,"subscribers_count":0,"default_branch":"master","last_synced_at":"2026-03-07T19:43:10.912Z","etag":null,"topics":["arviz","core-tools","data-science","insurance","internal-project","pymc"],"latest_commit_sha":null,"homepage":"https://pypi.org/project/oreum_core","language":"Python","has_issues":false,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/oreum-industries.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE.md","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-02-06T12:17:54.000Z","updated_at":"2026-03-07T14:14:50.000Z","dependencies_parsed_at":"2023-09-23T09:23:41.222Z","dependency_job_id":"605319ed-3cfe-4619-9d94-c37c96a40460","html_url":"https://github.com/oreum-industries/oreum_core","commit_stats":null,"previous_names":[],"tags_count":129,"template":false,"template_full_name":null,"purl":"pkg:github/oreum-industries/oreum_core","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oreum-industries%2Foreum_core","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oreum-industries%2Foreum_core/tags","releases_url":"https://repos.ec
osyste.ms/api/v1/hosts/GitHub/repositories/oreum-industries%2Foreum_core/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oreum-industries%2Foreum_core/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/oreum-industries","download_url":"https://codeload.github.com/oreum-industries/oreum_core/tar.gz/refs/heads/master","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/oreum-industries%2Foreum_core/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30376810,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-11T06:09:32.197Z","status":"ssl_error","status_checked_at":"2026-03-11T06:09:17.086Z","response_time":84,"last_error":"SSL_read: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["arviz","core-tools","data-science","insurance","internal-project","pymc"],"created_at":"2026-01-17T06:27:02.820Z","updated_at":"2026-03-11T09:09:21.718Z","avatar_url":"https://github.com/oreum-industries.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Oreum Core Tools `oreum_core`\n\n[![Python](https://img.shields.io/badge/python-3.13-blue)](https://www.python.org)\n[![License](https://img.shields.io/badge/license-Apache2.0-blue.svg)](https://choosealicense.com/licenses/apache-2.0/)\n[![GitHub 
Release](https://img.shields.io/github/v/release/oreum-industries/oreum_core?display_name=tag\u0026sort=semver)](https://github.com/oreum-industries/oreum_core/releases)\n[![PyPI](https://img.shields.io/pypi/v/oreum_core)](https://pypi.org/project/oreum_core)\n[![lint](https://github.com/oreum-industries/oreum_core/workflows/lint/badge.svg)](https://github.com/oreum-industries/oreum_core/actions/workflows/lint.yml)\n[![test](https://github.com/oreum-industries/oreum_core/workflows/test/badge.svg)](https://github.com/oreum-industries/oreum_core/actions/workflows/test.yml)\n[![publish](https://github.com/oreum-industries/oreum_core/actions/workflows/publish.yml/badge.svg)](https://github.com/oreum-industries/oreum_core/actions/workflows/publish.yml)\n[![code style: ruff](https://img.shields.io/endpoint?url=https://raw.githubusercontent.com/astral-sh/ruff/main/assets/badge/v2.json)](https://github.com/astral-sh/ruff)\n[![code security: bandit](https://img.shields.io/badge/code%20security-bandit-yellow.svg)](https://github.com/PyCQA/bandit)\n[![code style: interrogate](https://raw.githubusercontent.com/oreum-industries/oreum_core/master/assets/img/interrogate_badge.svg)](https://pypi.org/project/interrogate/)\n[![test coverage](https://raw.githubusercontent.com/oreum-industries/oreum_core/master/assets/img/coverage_badge.svg)](https://github.com/oreum-industries/oreum_core/actions/workflows/test.yml)\n\n---\n\n## 1. Description and Scope\n\n`oreum_core` is an ever-evolving package of core tools for use on client\nprojects by Oreum Industries.\n\n+ Provides an essential workflow for data curation, EDA, basic ML using the core\n  scientific Python stack incl. `numpy`, `scipy`, `matplotlib`, `seaborn`,\n  `pandas`, `scikit-learn`\n+ Optionally provides an advanced Bayesian modeling workflow in R\u0026D and\n  Production using a leading probabilistic programming stack incl. 
`pymc`,\n  `pytensor`, `arviz`\n  (do `pip install oreum_core[pymc]`)\n+ Optionally enables a generalist black-box ML workflow in R\u0026D using a leading\n  Gradient Boosted Trees stack using `xgboost`\n  (do `pip install oreum_core[tree]`)\n+ Also includes several utilities for text cleaning, sql scripting, file handling\n\n\nThis package **is**:\n\n+ A work in progress (v0.y.z) and liable to breaking changes and inconvenience\n  to the user\n+ Solely designed for ease of use and rapid development by employees of\n  Oreum Industries, and selected clients with guidance\n\nThis package **is not**:\n\n+ Intended for public usage and will not be supported for public usage\n+ Intended for contributions by anyone not an employee of Oreum Industries,\n  and unsolicited contributions will not be accepted.\n\n\n### Notes\n\n+ Project began on 2021-01-01\n+ The `README.md` is MacOS and POSIX oriented\n+ See `LICENCE.md` for licensing and copyright details\n+ See `pyproject.toml` for various package details\n+ See `CLAUDE.md` for Claude Code rules\n+ This uses a logger named `'oreum_core'`, feel free to incorporate or ignore\n  see `__init__.py` for details\n+ Hosting:\n  + Source code repo on [GitHub](https://github.com/oreum-industries/oreum_core)\n  + Source code release on [GitHub](https://github.com/oreum-industries/oreum_core/releases)\n  + Package release on [PyPi](https://pypi.org/project/oreum_core)\n+ Implementation:\n  + This project is enabled by a modern, open-source, advanced software stack\n    for data curation, statistical analysis and predictive modelling\n  + Specifically we use an open-source Python-based suite of software packages,\n    the core of which is often known as the Scientific Python stack, supported\n    by [NumFOCUS](https://numfocus.org)\n  + Once installed (see section 2), see `LICENSES_3P.md` for full\n    details of all package licences\n+ Environments: this project was originally developed on a Macbook Air M2\n  (Apple Silicon ARM64) 
running MacOS 15 (Sequoia) using `osx-arm64` Accelerate\n\n\n### Package Structure\n\nTop-level:\n```\noreum_core/\n├── curate/      # Data ingestion \u0026 transformation\n├── eda/         # Exploratory data analysis\n├── model_pymc/  # Bayesian modeling (optional dep: pip install oreum_core[pymc])\n├── model_tree/  # Gradient-boosted trees (optional dep: pip install oreum_core[tree])\n└── utils/       # BaseFileIO base class for all I/O handlers, also string sanitization\n```\n\n---\n\n\n## 2. Instructions to Create Dev Environment\n\nFor local development on MacOS\n\n### 2.0 Pre-requisite installs via `homebrew`\n\n1. Install Homebrew, see instructions at [https://brew.sh](https://brew.sh)\n2. Install system-level tools incl. `direnv`, `gcc`, `git`, `graphviz`, `uv`:\n\n```zsh\n$\u003e make brew\n```\n\n### 2.1 Git clone the repo\n\nAssumes system-level tools installed as above:\n\n```zsh\n$\u003e git clone https://github.com/oreum-industries/oreum_core\n$\u003e cd oreum_core\n```\nThen allow `direnv` on MacOS to autorun file `.envrc` upon directory open\n\n\n### 2.2 Create virtual environment and install dev packages\n\nNotes:\n\n+ We use local `.venv/` virtual env via [`uv`](https://github.com/astral-sh/uv)\n+ Packages are technically articulated in `pyproject.toml` and might not be the\n  latest - to aid stability for `pymc` (usually in a state of development flux)\n\n\n#### 2.2.1 Create the dev environment\n\nFrom the dir above `oreum_core/` project dir:\n\n```zsh\n$\u003e make -C oreum_core/ dev\n```\n\nThis will also create some files to help confirm / diagnose successful installation:\n\n+ `dev/install_log/blas_info.txt` for the `BLAS MKL` installation for `numpy`\n+ `LICENSES_3P.md` details the license for each third-party package used\n\n\n#### 2.2.2 (Optional best practice) Test successful installation of dev env\n\nFrom the dir above `oreum_core/` project dir:\n\n```zsh\n$\u003e make -C oreum_core/ dev-test\n```\n\nThis will also add files 
`dev/install_log/tests_[numpy|scipy].txt` which detail\nsuccessful installation (or not) for `numpy`, `scipy`\n\n\n#### 2.2.3 (Useful during env install experimentation): To remove the dev env\n\nFrom the dir above `oreum_core/` project dir:\n\n```zsh\n$\u003e make -C oreum_core/ dev-uninstall\n```\n\n### 2.3 Code Linting \u0026 Repo Control\n\n#### 2.3.1 Pre-commit\n\nWe use [pre-commit](https://pre-commit.com) to run a suite of automated tests\nfor code linting \u0026 quality control and repo control prior to commit on local\ndevelopment machines.\n\n+ Precommit is already installed by the `make dev` command (which itself calls\n`pip install -e .[dev]`)\n+ The pre-commit script will then run on your system upon `git commit`\n+ See this project's `.pre-commit-config.yaml` for details\n\n\n#### 2.3.2 Github Actions\n\nWe use [Github Actions](https://docs.github.com/en/actions/using-workflows) aka\nGithub Workflows to run:\n\n1. A suite of automated tests for commits received at the origin (i.e. GitHub)\n2. Publishing to PyPi upon creating a GH Release\n\n+ See `Makefile` for the CLI commands that are issued\n+ See `.github/workflows/*` for workflow details\n\n\n#### 2.3.3 Git LFS\n\nWe use [Git LFS](https://git-lfs.github.com) to store any large files alongside\nthe repo. 
This can be useful to replicate exact environments during development\nand/or for automated tests\n\n+ This requires a local machine install\n  (see [Getting Started](https://git-lfs.github.com))\n+ See `.gitattributes` for details\n\n\n### 2.4 Configs for Local Development\n\nSome notes to help configure local development environment\n\n#### 2.4.1 Git config `~/.gitconfig`\n\n```yaml\n[user]\n    name = \u003cYOUR NAME\u003e\n    email = \u003cYOUR EMAIL ADDRESS\u003e\n```\n\n\n### 2.5 Install VSCode IDE\n\nWe strongly recommend using [VSCode](https://code.visualstudio.com) for all\ndevelopment on local machines, and this is a hard pre-requisite to use\nthe `.devcontainer` environment (see section 3)\n\nThis repo includes relevant lightweight project control and config in:\n\n```zsh\noreum_core.code-workspace\n.vscode/extensions.json\n.vscode/settings.json\n```\n\n### 2.6 Publishing to PyPi\n\nA note for maintainers (Oreum Industries only), publishing to pypi, ensure\nlocal dev machine presence of the following in a config file `~/.pypirc`\n\n```yaml\n[distutils]\nindex-servers =\n   pypi\n   testpypi\n\n[pypi]\nrepository = https://upload.pypi.org/legacy/\nusername = __token__\n\n[testpypi]\nrepository = https://test.pypi.org/legacy/\nusername = __token__\n\n```\n\n---\n\n## 3. Code Standards\n\nEven when writing R\u0026D code, we strive to meet and exceed (even define) best\npractices for code quality, documentation and reproducibility for modern\ndata science projects.\n\n### 3.1 Code Linting \u0026 Repo Control\n\nWe use a suite of automated tools to check and enforce code quality. We indicate\nthe relevant shields at the top of this README. 
See section 2.3 above for how\nthis is enforced at pre-commit on developer machines and upon PR at the origin as\npart of our CI process, prior to master branch merge.\n\nThese include:\n\n+ [`ruff`](https://docs.astral.sh/ruff/) - extremely fast standardised linting\n  and formatting, which replaces `black`, `flake8`, `isort`\n+ [`interrogate`](https://pypi.org/project/interrogate/) - ensure complete Python\n  docstrings\n+ [`bandit`](https://github.com/PyCQA/bandit) - test for common Python security\n  issues\n\nWe also run a suite of general tests pre-packaged in\n[`pre-commit`](https://pre-commit.com).\n\n\n---\n\n## 4. Usage\n\n### 4.1 Plot theming\n\n```python\nfrom oreum_core.eda import set_plot_theme\nset_plot_theme()  # or pass overrides: set_plot_theme(context=\"paper\")\n```\n\n---\n\nCopyright 2026 Oreum FZCO t/a Oreum Industries. All rights reserved.\nOreum FZCO, IFZA, Dubai Silicon Oasis, Dubai, UAE, reg. 25515\n[oreum.io](https://oreum.io)\n\n---\nOreum Industries \u0026copy; 2026\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foreum-industries%2Foreum_core","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Foreum-industries%2Foreum_core","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Foreum-industries%2Foreum_core/lists"}