{"id":22668287,"url":"https://github.com/lanl/t-elf","last_synced_at":"2025-04-12T11:04:06.311Z","repository":{"id":208726708,"uuid":"703212457","full_name":"lanl/T-ELF","owner":"lanl","description":"Tensor Extraction of Latent Features (T-ELF). Within T-ELF's arsenal are non-negative matrix and tensor factorization solutions, equipped with automatic model determination (also known as the estimation of latent factors - rank) for accurate data modeling. Our software suite encompasses cutting-edge data pre-processing and post-processing modules.","archived":false,"fork":false,"pushed_at":"2024-07-24T20:16:27.000Z","size":39342,"stargazers_count":6,"open_issues_count":23,"forks_count":2,"subscribers_count":7,"default_branch":"main","last_synced_at":"2024-07-24T22:46:47.235Z","etag":null,"topics":["blind-source-separation","dimensionality-reduction","feature-extraction","gpu","high-performance-computing","hpc","latent-variables","machine-learning","matrix","matrix-completion","matrix-factorization","non-negative-matrix-factorization","pattern-extraction","semi-supervised-learning","tensor-decomposition","tensor-factorization","tensors","text-preprocessing","unsupervised-learning"],"latest_commit_sha":null,"homepage":"https://lanl.github.io/T-ELF/","language":"Python","has_issues":true,"has_wiki":null,"has_pages":null,"mirror_url":null,"source_name":null,"license":"other","status":null,"scm":"git","pull_requests_enabled":true,"icon_url":"https://github.com/lanl.png","metadata":{"files":{"readme":"README.md","changelog":null,"contributing":null,"funding":null,"license":"LICENSE","code_of_conduct":null,"threat_model":null,"audit":null,"citation":"CITATION.cff","codeowners":null,"security":null,"support":null,"governance":null,"roadmap":null,"authors":null,"dei":null,"publiccode":null,"codemeta":null}},"created_at":"2023-10-10T20:08:39.000Z","updated_at":"2024-07-24T20:16:31.000Z","dependencies_parsed_at":"2024-01-10T21:57:34.295Z","dependency_job_id":"10903499-8c1d-47f4-bf
e9-8060fd6e79a3","html_url":"https://github.com/lanl/T-ELF","commit_stats":null,"previous_names":["lanl/t-elf"],"tags_count":19,"template":false,"template_full_name":null,"repository_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lanl%2FT-ELF","tags_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lanl%2FT-ELF/tags","releases_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lanl%2FT-ELF/releases","manifests_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories/lanl%2FT-ELF/manifests","owner_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners/lanl","download_url":"https://codeload.github.com/lanl/T-ELF/tar.gz/refs/heads/main","host":{"name":"GitHub","url":"https://github.com","kind":"github","repositories_count":228911888,"owners_count":17990774,"icon_url":"https://github.com/github.png","version":null,"created_at":"2022-05-30T11:31:42.601Z","updated_at":"2022-07-04T15:15:14.044Z","host_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub","repositories_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repositories","repository_names_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/repository_names","owners_url":"https://repos.ecosyste.ms/api/v1/hosts/GitHub/owners"}},"keywords":["blind-source-separation","dimensionality-reduction","feature-extraction","gpu","high-performance-computing","hpc","latent-variables","machine-learning","matrix","matrix-completion","matrix-factorization","non-negative-matrix-factorization","pattern-extraction","semi-supervised-learning","tensor-decomposition","tensor-factorization","tensors","text-preprocessing","unsupervised-learning"],"created_at":"2024-12-09T15:14:39.402Z","updated_at":"2025-04-12T11:04:06.271Z","avatar_url":"https://github.com/lanl.png","language":"Python","funding_links":[],"categories":[],"sub_categories":[],"readme":"# Tensor Extraction of Latent Features (T-ELF) \u003cimg align=\"left\" width=\"50\" height=\"50\" 
src=\"docs/cube.jpg\"\u003e\n\n\u003cdiv align=\"center\", style=\"font-size: 50px\"\u003e\n\n[![Build Status](https://github.com/lanl/T-ELF/actions/workflows/ci_tests.yml/badge.svg?branch=main)](https://github.com/lanl/T-ELF/actions/workflows/ci_tests.yml/badge.svg?branch=main) [![License](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg)](https://img.shields.io/badge/License-BSD%203--Clause-blue.svg) [![Python Version](https://img.shields.io/badge/python-v3.11.10-blue)](https://img.shields.io/badge/python-v3.11.10-blue) [![DOI](https://zenodo.org/badge/703212457.svg)](https://zenodo.org/doi/10.5281/zenodo.10257896)\n\n\u003c/div\u003e \n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/tensorsrnd.png\"\u003e\n\u003c/p\u003e\n\n\u003cdiv align=\"center\", style=\"font-size: 50px\"\u003e\n\n### [:information_source: Documentation](https://lanl.github.io/T-ELF/) \u0026emsp; [:orange_book: Examples](examples/) \u0026emsp; [:page_with_curl: Publications](https://smart-tensors.lanl.gov/publications/) \u0026emsp; [:link: Website](https://smart-tensors.LANL.gov)\n\n\u003c/div\u003e\n\nT-ELF is one of the machine learning software packages developed as part of the [R\u0026D 100](https://smart-tensors.lanl.gov/news/rnd100_smarttensors/) winning **[SmartTensors AI](https://smart-tensors.lanl.gov/software/)** project at Los Alamos National Laboratory (LANL). T-ELF presents an array of customizable software solutions crafted for analysis of datasets. Acting as a comprehensive toolbox, T-ELF specializes in data pre-processing, extraction of latent features, and structuring results to facilitate informed decision-making. 
Leveraging high-performance computing and cutting-edge GPU architectures, our toolbox is optimized for analyzing large datasets from a diverse set of problems.\n\nAt the core of T-ELF are non-negative matrix and tensor factorization solutions for discovering multi-faceted hidden details in data, featuring automated model determination, which estimates the number of latent factors (the rank). This functionality supports accurate data modeling and the extraction of concealed patterns. Additionally, our software suite incorporates cutting-edge modules for both pre-processing and post-processing of data, tailored for diverse tasks including text mining and Natural Language Processing, along with robust tools for matrix and tensor analysis and construction.\n\n\u003cdiv align=\"center\", style=\"font-size: 50px\"\u003e\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/smart_tensors_image.png\"\u003e\n\u003c/p\u003e\n\n\u003c/div\u003e\n\n\nT-ELF's adaptability spans a multitude of disciplines, positioning it as a robust AI and data analytics solution. Its proven efficacy extends across various fields such as Large-scale Text Mining, High Performance Computing, Computer Security, Applied Mathematics, Dynamic Networks and Ranking, Biology, Material Science, Medicine, Chemistry, Data Compression, Climate Studies, Relational Databases, Data Privacy, Economy, and Agriculture.\n\n\n## Installation\n\n### Step 1: [Install Poetry on your system](https://python-poetry.org/docs/)\nThis step is optional. 
Use Pip or Conda if Poetry is not available.\n\n### Step 2: Install the Library\n\n**Option 1: Install via Poetry or Pip**\n```shell\nconda create --name TELF python=3.11.10\nsource activate TELF # or \u003cconda activate TELF\u003e\npoetry install # or \u003cpip install .\u003e\n```\n\n**Option 2: Install via Conda**\n```shell\ngit clone https://gitlab.lanl.gov/maksim/telf_internal\ncd telf_internal\nconda env create --file environment_gpu.yml # use \u003cconda env create --file environment_cpu.yml\u003e for CPU only\nconda activate TELF_conda\nconda develop .\n```\n\n### Step 3: Post-installation Dependencies\nNext, install the additional dependencies. These include optional dependencies for GPU and HPC capabilities, as well as required dependencies such as the spaCy language models.\nTo view all available options, please run:\n```shell\npython post_install.py --help\n```\nInstall the additional dependencies:\n```shell\npython post_install.py # use the following, for example, for a GPU system: \u003cpython post_install.py --gpu\u003e\n```\n\n#### Jupyter Setup Tutorial for using the examples ([Link](https://www.maksimeren.com/post/conda-and-jupyter-setup-for-research/))\n\n\n\n## Capabilities\n\n\u003cdiv align=\"center\", style=\"font-size: 50px\"\u003e\n\n\u003cp align=\"center\"\u003e\n  \u003cimg src=\"docs/Second_image_TensorNetworks.jpg\"\u003e\n\u003c/p\u003e\n\n### Please see our [:page_with_curl: Publications](https://smart-tensors.lanl.gov/publications/) for the capabilities\n\n\u003c/div\u003e\n\n\n## Modules\n\n### TELF.factorization\n\n|         **Method**        |      **Dense**     |     **Sparse**     |       **GPU**      |       **CPU**      | **Multiprocessing** |       **HPC**      |                          **Description**                         | **Example** 
|\n|:-------------------------:|:------------------:|:------------------:|:------------------:|:------------------:|:-------------------:|:------------------:|:----------------------------------------------------------------:|:-----------:|\n|            NMFk           | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |  :heavy_check_mark: | :heavy_check_mark: |              NMF with Automatic Model Determination                              |   [Link](examples/NMFk/NMFk.ipynb)  |\n|        Custom NMFk        | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |  :heavy_check_mark: | :heavy_check_mark: |                Use Custom NMF Functions with NMFk                                |   [Link](examples/NMFk/Custom_NMF_NMFk.ipynb)  |\n|          TriNMFk          | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |  :heavy_check_mark: |                    | NMF with Automatic Model Determination for Clusters and Patterns                 |   [Link](examples/TriNMFk/TriNMFk.ipynb)  |\n|          RESCALk          | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |  :heavy_check_mark: | :heavy_check_mark: |             RESCAL with Automatic Model Determination                            |   [Link](examples/RESCALk/RESCALk.ipynb)  |\n|           RNMFk           | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |  :heavy_check_mark: | :heavy_check_mark: |                         Recommender NMFk                                         |   [Link](examples/RNMFk/RNMFk.ipynb)  |\n|           SymNMFk         | :heavy_check_mark: |                    | :heavy_check_mark: | :heavy_check_mark: |  :heavy_check_mark: | :heavy_check_mark: |                         NMFk with Symmetric Clustering                           |   [Link](examples/SymNMFk/SymNMFk.ipynb)          |\n|           WNMFk           | 
:heavy_check_mark: |                    | :heavy_check_mark: | :heavy_check_mark: |  :heavy_check_mark: | :heavy_check_mark: |                         NMFk with weighting - used for recommendation system     |   [Link](examples/WNMFk/WNMFk.ipynb)          |\n|           HNMFk           | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |  :heavy_check_mark: | :heavy_check_mark: |                         Hierarchical NMFk                                        |   [Link](examples/HNMFk/HNMFk.ipynb)       |\n|           BNMFk           | :heavy_check_mark: |                    | :heavy_check_mark: | :heavy_check_mark: |  :heavy_check_mark: | :heavy_check_mark: |                           Boolean NMFk                                           |   [Link](examples/BNMFk/BNMFk.ipynb) |\n|           LMF             | :heavy_check_mark: |                    | :heavy_check_mark: | :heavy_check_mark: |                     |                    |                           Logistic Matrix Factorization                          |   [Link](examples/LMF/LMF.ipynb) |\n|         SPLIT             | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: |                     |        Joint NMFk factorization of multiple data via SPLIT                       | [Link](examples/SPLIT/00-SPLIT.ipynb) |\n| SPLITTransfer | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark: | :heavy_check_mark:  |                    |      Supervised transfer learning method via SPLIT and NMFk                      | [Link](examples/SPLITTransfer/00-SPLITTransfer.ipynb) |\n\n### TELF.pre_processing\n\n| **Method** | **Multiprocessing** |       **HPC**       |                           **Description**                          | **Example** |\n|:----------:|:-------------------:|:-------------------:|:------------------------------------------------------------------:|:-----------:|\n|   Vulture  | 
:heavy_check_mark:  | :heavy_check_mark:  |         Advanced text processing tool for cleaning and NLP         |  [Link](examples/Vulture)  |\n|   Beaver   | :heavy_check_mark:  | :heavy_check_mark:  |        Fast matrix and tensor building tool for text mining        |  [Link](examples/Beaver)  |\n|  iPenguin  | :heavy_check_mark:  |                     |         Online information retrieval tool for Scopus, SemanticScholar, and OSTI         | [Link](examples/iPenguin) |\n|    Orca    | :heavy_check_mark:  |                     | Duplicate author detector for text mining and information retrieval |   [Link](examples/Orca)          |\n\n### TELF.post_processing\n\n| **Method** |                       **Description**                      | **Example** |\n|:----------:|:----------------------------------------------------------:|:-----------:|\n|    Wolf    |              Graph centrality and ranking tool             |      [Link](examples/Wolf)       |\n|   Peacock  | Data visualization and generation of actionable statistics |  [Link](examples/Peacock) |\n|    SeaLion    |              Generic report generation tool            | [Link](examples/SeaLion) |\n|    Fox    |              Report generation tool for text data from NMFk using OpenAI            | [Link](examples/Fox)  |\n|    ArcticFox    |        Report generation tool for text data from HNMFk using local LLMs            | [Link](examples/ArcticFox)  |\n\n### TELF.applications\n\n| **Method** |                            **Description**                           | **Example** |\n|:----------:|:--------------------------------------------------------------------:|:-----------:|\n|   Cheetah  |                        Fast search by keywords and phrases                       |    [Link](examples/Cheetah)         |\n|    Bunny   | Dataset generation tool for documents and their citations/references |  [Link](examples/Bunny)  |\n|  Penguin   |         Text storage tool                                    | 
[Link](examples/Penguin) |\n|    Termite   | Knowledge graph building tool | :soon: |\n\n\n## How to Cite T-ELF?\nIf you use T-ELF, please cite it.\n\n**APA:**\n```latex\nEren, M., Solovyev, N., Barron, R., Bhattarai, M., Truong, D., Boureima, I., Skau, E., Rasmussen, K., \u0026 Alexandrov, B. (2023). Tensor Extraction of Latent Features (T-ELF) [Computer software]. https://doi.org/10.5281/zenodo.10257897\n```\n\n**BibTeX:**\n```latex\n@software{TELF,\n  author = {Eren, Maksim and Solovyev, Nick and Barron, Ryan and Bhattarai, Manish and Truong, Duc and Boureima, Ismael and Skau, Erik and Rasmussen, Kim and Alexandrov, Boian},\n  month = oct,\n  title = {{Tensor Extraction of Latent Features (T-ELF)}},\n  url = {https://github.com/lanl/T-ELF},\n  doi = {10.5281/zenodo.10257897},\n  year = {2023}\n}\n```\n\n## Authors\n- [Maksim Ekin Eren](mailto:maksim@lanl.gov): Information Systems and Modeling Group, Los Alamos National Laboratory ([Website](https://www.maksimeren.com/))\n- [Nicholas Solovyev](mailto:nks@lanl.gov): Theoretical Division, Los Alamos National Laboratory\n- [Ryan Barron](mailto:barron@lanl.gov): Theoretical Division, Los Alamos National Laboratory\n- [Manish Bhattarai](mailto:ceodspspectrum@lanl.gov): Theoretical Division, Los Alamos National Laboratory\n- [Duc Truong](mailto:dptruong@lanl.gov): Theoretical Division, Los Alamos National Laboratory\n- [Ismael Boureima](mailto:iboureima@lanl.gov): Theoretical Division, Los Alamos National Laboratory\n- [Erik Skau](mailto:ewskau@lanl.gov): Computer, Computational, and Statistical Sciences Division, Los Alamos National Laboratory\n- [Kim Rasmussen](mailto:kor@lanl.gov): Theoretical Division, Los Alamos National Laboratory\n- [Boian S. Alexandrov](mailto:boian@lanl.gov): Theoretical Division, Los Alamos National Laboratory\n\n## Patents\n\u003eBoian ALEXANDROV, o. S. 
F., New Mexico, Maksim Ekin EREN, of Santa Fe, New Mexico, Manish BHATTARAI, of Albuquerque, New Mexico, Kim Orskov RASMUSSEN of Santa Fe, New Mexico, and Charles K. NICHOLAS, of Columbia, Maryland, (“Assignor”) DATA IDENTIFICATION AND CLASSIFICATION METHOD, APPARATUS, AND SYSTEM. No. 63/472,188. Triad National Security, LLC. (June 9, 2023).\n\n\u003eBS. Alexandrov, LB. Alexandrov, and VG. Stanev et al. 2020. Source identification by non-negative matrix factorization combined with semi-supervised clustering. US Patent S10,776,718 (2020).\n\n## Copyright Notice\n\u003e© 2022. Triad National Security, LLC. All rights reserved.\nThis program was produced under U.S. Government contract 89233218CNA000001 for Los Alamos\nNational Laboratory (LANL), which is operated by Triad National Security, LLC for the U.S.\nDepartment of Energy/National Nuclear Security Administration. All rights in the program are\nreserved by Triad National Security, LLC, and the U.S. Department of Energy/National Nuclear\nSecurity Administration. The Government is granted for itself and others acting on its behalf a\nnonexclusive, paid-up, irrevocable worldwide license in this material to reproduce, prepare\nderivative works, distribute copies to the public, perform publicly and display publicly, and to permit\nothers to do so.\n\n**LANL C Number: C22048**\n\n## License\nThis program is open source under the BSD-3 License.\nRedistribution and use in source and binary forms, with or without modification, are permitted\nprovided that the following conditions are met:\n\n1. Redistributions of source code must retain the above copyright notice, this list of conditions and\nthe following disclaimer.\n \n2. Redistributions in binary form must reproduce the above copyright notice, this list of conditions\nand the following disclaimer in the documentation and/or other materials provided with the\ndistribution.\n \n3. 
Neither the name of the copyright holder nor the names of its contributors may be used to endorse\nor promote products derived from this software without specific prior written permission.\n\nTHIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS \"AS\nIS\" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE\nIMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR\nPURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR\nCONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL,\nEXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO,\nPROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS;\nOR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY,\nWHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR\nOTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF\nADVISED OF THE POSSIBILITY OF SUCH DAMAGE.\n\n\n## Developer Test Suite\nDeveloper test suites are located under the [```tests/```](tests/) directory. 
Tests can be run from this folder using ```python -m pytest *```.\n\n## LANL HPC Installation Notes\n\n### Chicoma\n```shell\n# replace \u003cpath to your conda environments under projects\u003e with your own path below.\nconda create --prefix=\u003cpath to your conda environments under projects\u003e python=3.11.10\nsource activate \u003cpath to your conda environments under projects\u003e # or use conda activate \u003c...\u003e\npip install .\npython post_install.py --gpu --hpc-conda\n```\n\n### Darwin\n```shell\nsalloc -n 1 -p shared-gpu\nmodule load openmpi\nmodule load miniconda3\nconda create --name TELF python=3.11.10\nconda activate TELF # or \u003csource activate TELF\u003e\npip install .\npython post_install.py --gpu --hpc\n```","project_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flanl%2Ft-elf","html_url":"https://awesome.ecosyste.ms/projects/github.com%2Flanl%2Ft-elf","lists_url":"https://awesome.ecosyste.ms/api/v1/projects/github.com%2Flanl%2Ft-elf/lists"}