https://github.com/cmdoret/sequence_loopability
Investigating sequence features underlying chromatin loops
- Host: GitHub
- URL: https://github.com/cmdoret/sequence_loopability
- Owner: cmdoret
- License: gpl-3.0
- Created: 2021-03-27T09:37:56.000Z (about 5 years ago)
- Default Branch: master
- Last Pushed: 2021-06-03T13:58:42.000Z (about 5 years ago)
- Last Synced: 2025-02-17T23:47:34.978Z (over 1 year ago)
- Language: Jupyter Notebook
- Size: 7.45 MB
- Stars: 1
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# sequence_loopability
Investigating the sequence features underlying chromatin loops.
## Project structure
* src: Code for the main pipeline (data extraction, processing, training, evaluation...).
* notebooks: Exploratory analyses in the form of Jupyter notebooks.
* seqloops: Boilerplate code and utilities meant to be imported as a Python package.
* scripts: Various scripts that were used to generate the input data.
## Setup
All dependencies can be installed using:
```bash
make deps
```
To make `seqloops` importable in Python scripts and notebooks, run `make setup`.
All input and output data are managed via DVC. They can be pulled as follows:
```bash
# Quote the extra so that shells such as zsh do not treat the brackets as a glob pattern.
pip install 'dvc[gdrive]'
dvc pull
```
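The commands above assume that `git` and `dvc` are already on your `PATH`. As a minimal sketch (the `check_tools` helper is hypothetical and not part of this repository), a preflight check could look like this:

```bash
# Hypothetical helper, not part of this repository.
# Prints the name of every requested command that is missing from PATH
# and returns 0 only if all of them are available.
check_tools() {
    local missing=0 tool
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            printf 'missing: %s\n' "$tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

It could then guard the download step, for example `check_tools git dvc && dvc pull`.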
## Workflow
Code changes are managed via `git`. Data changes are managed via `dvc`, which is connected to a Google Drive folder.
When you modify or add data files in the `data` folder, the changes must be uploaded to the DVC remote.
The updated small tracker file (`.dvc`) must then be committed to git to keep track of the changes.
The standard process is as follows:
```bash
dvc add data
dvc push
git add data.dvc
git commit -m 'added new file'
git push
```
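Because the order of these steps matters, they can be bundled into one function. The following is only a sketch under stated assumptions: `run`, `sync_data` and the `DRY_RUN` variable are hypothetical names that do not exist in this repository. By default it only prints the commands; set `DRY_RUN=0` to execute them.

```bash
# Hypothetical helpers, not part of this repository.

# Print the command when DRY_RUN is 1 (the default), otherwise execute it.
run() {
    if [ "${DRY_RUN:-1}" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Run the standard data update steps in order, stopping at the first failure.
# $1: commit message (defaults to "update data").
sync_data() {
    local msg="${1:-update data}"
    run dvc add data &&
    run dvc push &&
    run git add data.dvc &&
    run git commit -m "$msg" &&
    run git push
}
```

Calling `sync_data 'added new file'` lists the five commands without touching anything, and `DRY_RUN=0 sync_data 'added new file'` runs them. Chaining with `&&` means a failed `dvc push` stops the sequence before the tracker file is committed.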