https://github.com/synbiodex/seqtrainer
ML training on SBOL data
- Host: GitHub
- URL: https://github.com/synbiodex/seqtrainer
- Owner: SynBioDex
- License: MIT
- Created: 2025-05-29T22:56:34.000Z (about 1 year ago)
- Default Branch: main
- Last Pushed: 2025-09-26T01:32:56.000Z (9 months ago)
- Last Synced: 2025-10-05T23:25:38.399Z (8 months ago)
- Language: Jupyter Notebook
- Size: 75.2 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 2
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# SeqTrainer
ML package for generating tabular and graph datasets from Synthetic Biology data in SBOL, with preprocessing, ML models, training, and metrics.
## High Performance Computing (HPC) Usage
1. First, get the original datasets: navigate to https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE144621
2. Download `GSE144621_U00096.2_frag-rLP5_LB_expression.txt.gz` and `GSE144621_U00096.2_frag-rLP5_M9_expression.txt.gz`.
3. Unzip those files and add them to the `data/original_data` folder. The names that we used for those files are `frag-rLP5-LB_expression.txt` and `frag-rLP5-M9_expression.txt`.
4. Navigate to the `hpc` folder.
5. Run `300K_preprocessing.py`.
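
The unzip-and-rename step can be scripted. The following is a minimal sketch, not part of the repository: the helper name `unpack_geo_files` and its parameters are hypothetical, and it assumes the two `.gz` files have already been downloaded manually from the GEO page into a local folder. Only the file names and the `data/original_data` target come from the instructions above.

```python
import gzip
import shutil
from pathlib import Path

# GEO download name -> name used for the unzipped file in this README.
RENAMES = {
    "GSE144621_U00096.2_frag-rLP5_LB_expression.txt.gz": "frag-rLP5-LB_expression.txt",
    "GSE144621_U00096.2_frag-rLP5_M9_expression.txt.gz": "frag-rLP5-M9_expression.txt",
}


def unpack_geo_files(download_dir, target_dir="data/original_data"):
    """Decompress the two GEO files into target_dir under the README's names.

    Hypothetical helper: download_dir is wherever the .gz files were saved.
    Returns the list of paths that were written.
    """
    download_dir = Path(download_dir).expanduser()
    target_dir = Path(target_dir).expanduser()
    target_dir.mkdir(parents=True, exist_ok=True)

    written = []
    for gz_name, out_name in RENAMES.items():
        src = download_dir / gz_name
        if not src.exists():
            raise FileNotFoundError(f"Expected downloaded file not found: {src}")
        dst = target_dir / out_name
        # Stream the decompressed bytes to disk without loading the whole file.
        with gzip.open(src, "rb") as fin, open(dst, "wb") as fout:
            shutil.copyfileobj(fin, fout)
        written.append(dst)
    return written
```

Called from the repository root as `unpack_geo_files("~/Downloads")` (the download location is an assumption), this leaves both text files in `data/original_data`, ready for the preprocessing script.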