{"id":34667066,"url":"https://github.com/gmgeorg/pypress","last_synced_at":"2026-05-07T02:39:45.270Z","repository":{"id":45251988,"uuid":"432821066","full_name":"gmgeorg/pypress","owner":"gmgeorg","description":"PRESS: Predictive State Smoothing in Python (tf.keras)","archived":false,"fork":false,"pushed_at":"2025-12-24T17:35:07.000Z","size":1975,"stargazers_count":2,"open_issues_count":0,"forks_count":0,"subscribers_count":2,"default_branch":"main","last_synced_at":"2025-12-26T07:57:16.268Z","etag":null,"topics":["keras","kernel-smoothing","mixture-model","neural-network","predictive-models","semi-parametric-regression","smoothing-methods","tensorflow"],"latest_commit_sha":null,"homepage":"","language":"Jupyter Notebook","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"mit","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/gmgeorg.png","metadata":{"files":{"readme":"README.md","changelog":"CHANGELOG.md","contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null,"notice":null,"maintainers":null,"copyright":null,"agents":null,"dco":null,"cla":null}},"created_at":"2021-11-28T20:46:26.000Z","updated_at":"2025-12-24T17:30:31.000Z","dependencies_parsed_at":"2025-03-18T03:35:14.973Z","dependency_job_id":null,"html_url":"https://github.com/gmgeorg/pypress","commit_stats":null,"previous_names":[],"tags_count":9,"template":false,"template_full_name":null,"purl":"pkg:github/gmgeorg/pypress","repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gmgeorg%2Fpypress","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gmgeorg%2Fpypress/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gmgeorg%2Fpypress/re
leases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gmgeorg%2Fpypress/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/gmgeorg","download_url":"https://codeload.github.com/gmgeorg/pypress/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/gmgeorg%2Fpypress/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":32720771,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-05-07T02:14:30.463Z","status":"ssl_error","status_checked_at":"2026-05-07T02:14:29.405Z","response_time":62,"last_error":"SSL_connect returned=1 errno=0 peeraddr=140.82.121.6:443 state=error: unexpected eof while reading","robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":false,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["keras","kernel-smoothing","mixture-model","neural-network","predictive-models","semi-parametric-regression","smoothing-methods","tensorflow"],"created_at":"2025-12-24T19:18:00.914Z","updated_at":"2026-05-07T02:39:45.254Z","avatar_url":"https://github.com/gmgeorg.png","language":"Jupyter Notebook","funding_links":[],"categories":[],"sub_categories":[],"readme":"# pypress: Predictive State Smoothing (PRESS) in Python 
(`tf.keras`)\n\n![Python](https://img.shields.io/badge/python-3670A0?style=for-the-badge\u0026logo=python\u0026logoColor=ffdd54)\n![TensorFlow](https://img.shields.io/badge/TensorFlow-%23FF6F00.svg?style=for-the-badge\u0026logo=TensorFlow\u0026logoColor=white)\n[![PRs Welcome](https://img.shields.io/badge/PRs-welcome-brightgreen.svg?style=flat-square)](http://makeapullrequest.com)\n[![License](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)\n[![Github All Releases](https://img.shields.io/github/downloads/gmgeorg/pypress/total.svg)]()\n\nPredictive State Smoothing (PRESS) is a semi-parametric statistical machine learning algorithm\nfor regression and classification problems. `pypress` uses TensorFlow Keras to implement\nthe predictive learning algorithms proposed in\n\n\n* Goerg (2018) *[Classification using Predictive State Smoothing (PRESS): A scalable kernel classifier for high-dimensional features with variable selection](https://research.google/pubs/pub46767/)*.\n\n* Goerg (2017) *[Predictive State Smoothing (PRESS): Scalable non-parametric regression for high-dimensional data with variable selection](https://research.google/pubs/pub46141/)*.\n\nSee [below](#nutshell) for details on how PRESS works in a nutshell.\n\n\n# Installation\n\nInstall it directly from GitHub with:\n```\npip install git+https://github.com/gmgeorg/pypress.git\n```\n\n\n# Example usage\n\nPRESS is available as two layers that must be added one after the other; alternatively,\nthere is a `PRESS()` wrapper feed-forward layer that applies both layers at once.\n\n\n```python\nfrom sklearn.datasets import load_breast_cancer\nimport sklearn.preprocessing\nX, y = load_breast_cancer(return_X_y=True, as_frame=True)\nX_s = sklearn.preprocessing.robust_scale(X)  # See demo.ipynb to properly scale X with train/test split\n\n\nimport tensorflow as tf\nfrom pypress.keras import layers\nfrom pypress.keras import regularizers\n\nmod = 
tf.keras.Sequential()\n# see layers.PRESS() for a single-layer wrapper\nmod.add(layers.PredictiveStateSimplex(\n            n_states=6,\n            activity_regularizer=regularizers.Uniform(0.01),\n            input_dim=X.shape[1]))\nmod.add(layers.PredictiveStateMeans(units=1, activation=\"sigmoid\"))\nmod.compile(loss=\"binary_crossentropy\",\n            optimizer=tf.keras.optimizers.Nadam(learning_rate=0.01),\n            metrics=[tf.keras.metrics.AUC(curve=\"PR\", name=\"auc_pr\")])\nmod.summary()\nmod.fit(X_s, y, epochs=10, validation_split=0.2)\n```\n\n```\nModel: \"sequential_12\"\n_________________________________________________________________\n Layer (type)                Output Shape              Param #\n=================================================================\n predictive_state_simplex_1  (None, 6)                186\n 1 (PredictiveStateSimplex)\n\n predictive_state_means_11 (  (None, 1)                6\n PredictiveStateMeans)\n\n=================================================================\nTotal params: 192\nTrainable params: 192\nNon-trainable params: 0\n```\n\n\nSee also [`notebooks/demo.ipynb`](notebooks/demo.ipynb) for end-to-end examples of PRESS regression and classification models.\n\n# PRESS in a nutshell \u003ca name=\"nutshell\"/\u003e\n\nThe figure below, adapted from **Goerg (2018)**, contrasts the architecture of a standard feed-forward Deep Neural Network (DNN) with the **Predictive State Smoothing (PRESS)** approach.\n\n![PRESS architecture](imgs/press_architecture.png)\n\n### 1. Standard Feed-Forward DNNs\nIn typical prediction problems, our goal is to model the conditional distribution $p(y \\mid X)$ or the conditional expectation $E[y \\mid X]$. A standard feed-forward network estimates this by directly mapping features ($X$) to an output through a series of highly non-linear transformations (as seen in Figure 3a).\n\n### 2. 
The PRESS Decomposition\nIn contrast, PRESS decomposes the predictive distribution into a mixture distribution over **predictive states** ($S$). This architecture relies on a critical property: conditioned on a predictive state $j$, the output ($y$) becomes conditionally independent of the input features ($X$).\n\nMathematically, this is expressed as:\n\n![PRESS equation](imgs/press_decomposition_equation.png)\n\nThe second equality holds because the state $j$ captures all relevant information from $X$ necessary to predict $y$, rendering the raw features redundant once the state is known.\n\n### 3. Key Advantages and Clustering\nThe primary strength of this decomposition is that predictive states serve as **minimal sufficient statistics** for $y$. They provide an optimal informational summary—maximizing compression while retaining full predictive power.\n\nAn important byproduct of this framework is the ability to perform **predictive clustering**:\n* Once the mapping from features ($X$) to the predictive state simplex is learned, observations can be clustered within the state space.\n* Observations sharing similar predictive states are guaranteed to have similar predictive distributions for $y$, providing a principled way to group data based on future outcomes rather than raw input similarity.\n\n### 4. Comparison to Mixture Density Networks (MDN)\n\nWhile PRESS shares similarities with [Mixture Density Networks (MDN)](https://publications.aston.ac.uk/id/eprint/373/1/NCRG_94_004.pdf), there is a fundamental distinction. In an MDN, the output parameters are often direct functions of the features. In **PRESS**, the conditional independence of $y$ and $X$ given $S$ ensures that the output means are conditioned *only* on the predictive state, not the raw features.\n\n\n## License\n\nThis project is licensed under the terms of the MIT license. 
See [LICENSE](https://github.com/gmgeorg/pypress/blob/main/LICENSE) for additional details.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgmgeorg%2Fpypress","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Fgmgeorg%2Fpypress","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Fgmgeorg%2Fpypress/lists"}