{"id":22668579,"url":"https://github.com/lanl/pycp_apr","last_synced_at":"2026-03-05T03:31:30.215Z","repository":{"id":40309771,"uuid":"371850664","full_name":"lanl/pyCP_APR","owner":"lanl","description":"CP-APR Tensor Decomposition with PyTorch backend. pyCP_APR can perform non-negative Poisson Tensor Factorization on GPU, and includes an interface for anomaly detection using the extracted latent patterns.","archived":false,"fork":false,"pushed_at":"2023-12-18T23:23:21.000Z","size":6176,"stargazers_count":15,"open_issues_count":0,"forks_count":7,"subscribers_count":4,"default_branch":"main","last_synced_at":"2025-05-16T11:26:49.818Z","etag":null,"topics":["anomaly-detection","candecomp-parafac","canonical-polyadic","cpd","cybersecurity","dense","dense-tensors","gpu","latent-features","non-negative-tensor-factorization","numpy","poisson-distribution","pytorch","sparse","sparse-tensors","tensor-decomposition","tensor-factorization","tensors"],"latest_commit_sha":null,"homepage":"https://lanl.github.io/pyCP_APR/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":null,"status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lanl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":null,"codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null,"zenodo":null}},"created_at":"2021-05-29T00:55:33.000Z","updated_at":"2025-02-17T03:37:23.000Z","dependencies_parsed_at":"2025-04-12T11:08:54.453Z","dependency_job_id":"bcabddc8-d657-4669-9840-ad29b2859a83","html_url":"https://github.com/lanl/pyCP_APR","commit_stats":null,"previous_names":[],"tags_count":3,"template":false,"template_full_name":null,"purl":"pkg:github/lanl/pyCP_APR","repository_url":"https://repos.ecosyste.
ms/api/v1/hosts/GitHub/repositories/lanl%2FpyCP_APR","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lanl%2FpyCP_APR/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lanl%2FpyCP_APR/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lanl%2FpyCP_APR/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lanl","download_url":"https://codeload.github.com/lanl/pyCP_APR/tar.gz/refs/heads/main","sbom_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lanl%2FpyCP_APR/sbom","scorecard":null,"host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":286080680,"owners_count":30108605,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2026-03-05T01:39:18.192Z","status":"online","status_checked_at":"2026-03-05T02:00:06.710Z","response_time":93,"last_error":null,"robots_txt_status":"success","robots_txt_updated_at":"2025-07-24T06:49:26.215Z","robots_txt_url":"https://github.com/robots.txt","online":true,"can_crawl_api":true,"host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["anomaly-detection","candecomp-parafac","canonical-polyadic","cpd","cybersecurity","dense","dense-tensors","gpu","latent-features","non-negative-tensor-factorization","numpy","poisson-distribution","pytorch","sparse","sparse-tensors","tensor-decomposition","tensor-factorization","tensors"],"created_at":"2024-12-09T15:15:50.794Z","updated_at":"2026-03-05T03:31:30.197Z","avatar_url":"https://github.com/lanl.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# pyCP_APR\n\n\u003cdiv align=\"center\", 
style=\"font-size: 50px\"\u003e\n\n[![Build Status](https://github.com/lanl/pyCP_APR/actions/workflows/ci_tests.yml/badge.svg?branch=main)](https://github.com/lanl/pyCP_APR/actions/workflows/ci_tests.yml/badge.svg?branch=main) [![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg) [![Python Version](https://img.shields.io/badge/python-v3.9-blue)](https://img.shields.io/badge/python-v3.8.5-blue) [![DOI](https://img.shields.io/badge/DOI-10.5281%2Fzenodo.4840598-blue.svg)](https://doi.org/10.5281/zenodo.4840598)\n\n\u003c/div\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg width=\"324\" height=\"200\" src=\"docs/rd100.png\"\u003e\n\u003c/p\u003e\n\n**pyCP_APR** is a Python library for tensor decomposition and anomaly detection, developed as part of the R\u0026D 100 award-winning **[SmartTensors AI](https://smart-tensors.lanl.gov/software/)** project. It is designed for fast analysis of large datasets by accelerating computation on GPUs. pyCP_APR implements the CANDECOMP/PARAFAC Alternating Poisson Regression (CP-APR) tensor factorization algorithm with both NumPy and PyTorch backends. While the NumPy backend can be used to analyze both sparse and dense tensors, the PyTorch backend provides faster decomposition of large, sparse tensors on the GPU. pyCP_APR's scikit-learn-like API makes the library easy to work with, and includes methods for anomaly detection via the p-values obtained from the CP-APR factorization. This p-value-based anomaly detection method was introduced by Eren et al. in [6] using the [Unified Host and Network Dataset](https://csr.lanl.gov/data/2017/) [5]. 
Our work follows the [MATLAB Tensor Toolbox](https://www.tensortoolbox.org/cp.html) [1-3] implementation of CP-APR [4].\n\n\n\u003cdiv align=\"center\", style=\"font-size: 50px\"\u003e\n\n### [:information_source: Documentation](https://lanl.github.io/pyCP_APR/) \u0026emsp; [:orange_book: Example Notebooks](examples/) \u0026emsp; [:bar_chart: Datasets](data/tensors) \n  \n### [:page_facing_up: Paper 1](https://ieeexplore.ieee.org/abstract/document/9280524) \u0026emsp; [:page_facing_up: Paper 2](https://dl.acm.org/doi/abs/10.1145/3519602)\n    \n### [:link: Website](https://smart-tensors.LANL.gov)\n\n\u003c/div\u003e\n\n\n## Installation\n\n#### Option 1: Install using *pip*\n```shell\npip install git+https://github.com/lanl/pyCP_APR.git\n```\n\n#### Option 2: Install from source\n```shell\ngit clone https://github.com/lanl/pyCP_APR.git\ncd pyCP_APR\nconda create --name pyCP_APR python=3.9\nsource activate pyCP_APR\npip install -e . # or \u003cpython setup.py install\u003e\n```\n\n#### Jupyter Setup Tutorial for using the examples ([Link](https://www.maksimeren.com/post/conda-and-jupyter-setup-for-research/))\n\n## Example Usage\n```python\nfrom pyCP_APR import CP_APR\nfrom pyCP_APR.datasets import load_dataset\n\n# Load a sample tensor\ndata = load_dataset(name=\"TOY\")\n\n# Training and test tensor in COO format\n# Non-zero coordinates and corresponding non-zero values\ncoords_train, nnz_train = data['train_coords'], data['train_count']\ncoords_test, nnz_test = data['test_coords'], data['test_binary']\n\n# CP-APR Object with PyTorch backend on a GPU. 
\n# Transfer the latent factors back to Numpy arrays.\nmodel = CP_APR(n_iters=10,\n               random_state=42,\n               verbose=1,\n               method='torch',\n               device='gpu',\n               return_type='numpy')\n\n# Take rank 45 decomposition\nM = model.fit(coords=coords_train, values=nnz_train, rank=45)\n\n# Predict the scores over the trained tensor\ny_score = model.predict_scores(coords=coords_test, values=nnz_test)\n```\n**See the [examples](examples/) for more.**\n\n\n## How to Cite pyCP_APR?\nIf you use pyCP_APR, please cite the [original paper](https://doi.org/10.1109/ISI49825.2020.9280524) that introduces our anomaly detection framework, and the [follow-up paper](https://doi.org/10.1145/3519602) that generalizes the method to a number of other anomaly detection problems and introduces the library alongside a new ensemble-based extension of our anomaly detection method:\n```latex\n@article{10.1145/3519602,\n  author = {Eren, Maksim E. and Moore, Juston S. and Skau, Erik and Moore, Elisabeth and Bhattarai, Manish and Chennupati, Gopinath and Alexandrov, Boian S.},\n  title = {General-Purpose Unsupervised Cyber Anomaly Detection via Non-Negative Tensor Factorization},\n  year = {2022},\n  publisher = {Association for Computing Machinery},\n  address = {New York, NY, USA},\n  issn = {2692-1626},\n  url = {https://doi.org/10.1145/3519602},\n  doi = {10.1145/3519602},\n  note = {Just Accepted},\n  journal = {Digital Threats},\n  month = {feb},\n  keywords = {malware, anomaly detection, CPD, ensemble learning, non-negative tensor factorization, data fusion, GPU, Poisson tensor factorization, cyber security, unsupervised learning}\n}\n\n@INPROCEEDINGS{Eren2020ISI,\n  author={M. E. {Eren} and J. S. {Moore} and B. S. 
{Alexandrov}},\n  booktitle={2020 IEEE International Conference on Intelligence and Security Informatics (ISI)},\n  title={Multi-Dimensional Anomalous Entity Detection via Poisson Tensor Factorization},\n  year={2020},\n  pages={1-6},\n  doi={10.1109/ISI49825.2020.9280524}\n}\n\n@MISC{Eren2021pyCPAPR,\n  author = {M. E. {Eren} and J. S. {Moore} and E. {Skau} and M. {Bhattarai} and G. {Chennupati} and B. S. {Alexandrov}},\n  title = {pyCP\\_APR},\n  year = {2021},\n  publisher = {GitHub},\n  journal = {GitHub repository},\n  doi = {10.5281/zenodo.4840598},\n  howpublished = {\\url{https://github.com/lanl/pyCP\\_APR}}\n}\n```\n\n## Authors\n- [Maksim Ekin Eren](mailto:maksim@lanl.gov): Advanced Research in Cyber Systems, Los Alamos National Laboratory\n- [Juston S. Moore](mailto:jmoore01@lanl.gov): Advanced Research in Cyber Systems, Los Alamos National Laboratory\n- [Erik Skau](mailto:ewskau@lanl.gov): Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory\n- [Manish Bhattarai](mailto:ceodspspectrum@lanl.gov): Theoretical Division, Los Alamos National Laboratory\n- [Gopinath Chennupati](mailto:cgnath.dr@gmail.com): Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory\n- [Boian S. Alexandrov](mailto:boian@lanl.gov): Theoretical Division, Los Alamos National Laboratory\n\n\n## Copyright Notice\n\u003e© 2021. Triad National Security, LLC. All rights reserved.\nThis program was produced under U.S. Government contract 89233218CNA000001 for Los Alamos\nNational Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S.\nDepartment of Energy/National Nuclear Security Administration. All rights in the program are\nreserved by Triad National Security, LLC, and the U.S. Department of Energy/National Nuclear\nSecurity Administration. 
The Government is granted for itself and others acting on its behalf a\nnonexclusive, paid-up, irrevocable worldwide license in this material to reproduce, prepare\nderivative works, distribute copies to the public, perform publicly and display publicly, and to permit\nothers to do so.\n\n**LANL C Number: C21028**\n\n## License:\nThis program is open source under the BSD-3 License.\nRedistribution and use in source and binary forms, with or without modification, are permitted\nprovided that the following conditions are met:\n\n1. Redistributions of source code must retain the above copyright notice, this list of conditions and the following disclaimer.\n\n2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions and the following disclaimer in the documentation and/or other materials provided with the distribution.\n\n3. Neither the name of the copyright holder nor the names of its contributors may be used to endorse or promote products derived from this software without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS\nIS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR\nPURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR\nCONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,\nEXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,\nPROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;\nOR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,\nWHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR\nOTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF\nADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\n\n## Developer Test Suite\nDeveloper test suites are located under the [```tests/```](tests/) directory. 
Tests can be run from this folder using ```python -m unittest *```.\n\n\n## Acknowledgments\nWe thank Austin Thresher for the valuable feedback on our software design.\n\n\n## References\n[1] General software, latest release: Brett W. Bader, Tamara G. Kolda and others, Tensor Toolbox for MATLAB, Version 3.2.1, www.tensortoolbox.org, April 5, 2021.\n\n[2] Dense tensors: B. W. Bader and T. G. Kolda, Algorithm 862: MATLAB Tensor Classes for Fast Algorithm Prototyping, ACM Trans. Mathematical Software, 32(4):635-653, 2006, http://dx.doi.org/10.1145/1186785.1186794.\n\n[3] Sparse, Kruskal, and Tucker tensors: B. W. Bader and T. G. Kolda, Efficient MATLAB Computations with Sparse and Factored Tensors, SIAM J. Scientific Computing, 30(1):205-231, 2007, http://dx.doi.org/10.1137/060676489.\n\n[4] Chi, E.C. and Kolda, T.G., 2012. On tensors, sparsity, and nonnegative factorizations. SIAM Journal on Matrix Analysis and Applications, 33(4), pp.1272-1299.\n\n[5] M. Turcotte, A. Kent and C. Hash, “Unified Host and Network Data Set”, in Data Science for Cyber-Security. November 2018, 1-22.\n\n[6] M. E. Eren, J. S. Moore and B. S. Alexandrov, \"Multi-Dimensional Anomalous Entity Detection via Poisson Tensor Factorization,\" 2020 IEEE International Conference on Intelligence and Security Informatics (ISI), 2020, pp. 1-6, doi: 10.1109/ISI49825.2020.9280524.\n","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flanl%2Fpycp_apr","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flanl%2Fpycp_apr","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flanl%2Fpycp_apr/lists"}