https://github.com/linas/biome-distribution
Research into statistical distributions of genomes, proteomes and reactomes
genomic-data-analysis genomics-visualization proteogenomics proteomics reactome reactome-pathway
- Host: GitHub
- URL: https://github.com/linas/biome-distribution
- Owner: linas
- License: other
- Created: 2020-01-23T21:16:36.000Z (over 6 years ago)
- Default Branch: master
- Last Pushed: 2022-02-14T20:19:41.000Z (over 4 years ago)
- Last Synced: 2025-01-22T22:29:11.424Z (over 1 year ago)
- Topics: genomic-data-analysis, genomics-visualization, proteogenomics, proteomics, reactome, reactome-pathway
- Language: Scheme
- Size: 14.2 MB
- Stars: 6
- Watchers: 3
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE.txt
README
Biome Distributions
-------------------
Research concerning the statistical distribution of genetic
interactions, the proteins expressed by genes, and their
participation in reaction pathways.
The primary result is a paper, currently being prepared for publication.
It is in the [paper](./paper) directory; the
[PDF is here](./paper/biome-distributions.pdf).
It was naively hypothesized that genome/proteome reaction pathways
form a scale-free network, and thus would have a Zipfian distribution.
Much to our surprise, this is not the case! It seems like *everything*
follows a square-root Zipfian distribution instead. I do not know of any
network theory or biology theory that would explain this.
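To make the distinction concrete, here is a minimal sketch of how one might estimate the exponent of a rank-frequency distribution. It assumes that "square-root Zipfian" means counts falling off as `rank**(-1/2)`, as opposed to `rank**(-1)` for classic Zipf; that reading is an assumption, and the paper has the actual definition. The function name `zipf_exponent` and the synthetic data are hypothetical, and are not taken from the code in [src](./src).

```python
import math

def zipf_exponent(counts):
    """Estimate s in count ~ rank**(-s) by least squares on log-log axes."""
    # Rank the positive counts from largest to smallest.
    ranked = sorted((c for c in counts if c > 0), reverse=True)
    xs = [math.log(rank) for rank in range(1, len(ranked) + 1)]
    ys = [math.log(c) for c in ranked]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # The fitted slope is -s, so flip the sign.
    return -sxy / sxx

# Synthetic (made-up) data: an exact classic Zipf law and an exact
# square-root law, 200 ranks each.
classic = [1000.0 / rank for rank in range(1, 201)]
sqrt_zipf = [1000.0 / math.sqrt(rank) for rank in range(1, 201)]

print("classic exponent:", zipf_exponent(classic))
print("square-root exponent:", zipf_exponent(sqrt_zipf))
```

A straight least-squares fit on log-log axes is the simplest possible estimator. It is known to be biased on real, noisy data, so treat it as an illustration and not as the method used in the paper.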
The mutual information of interaction pathways is also explored.
Its distribution appears to be easily fit with a bimodal Gaussian
distribution.
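As an illustration of what such a fit involves, here is a minimal sketch that fits a mixture of two Gaussians to one-dimensional data with expectation-maximization. It assumes that "bimodal Gaussian" means a two-component Gaussian mixture. The function name `fit_two_gaussians` and the sample data are hypothetical stand-ins for mutual-information values; the fitting procedure actually used for the paper may differ.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_two_gaussians(values, iterations=100):
    """EM fit of a two-component 1-D Gaussian mixture.

    Returns [(weight, mean, sigma), (weight, mean, sigma)], sorted by mean.
    """
    xs = sorted(values)
    n = len(xs)
    # Crude initialization: one component per half of the sorted data.
    params = []
    for part in (xs[: n // 2], xs[n // 2:]):
        mu = sum(part) / len(part)
        var = sum((x - mu) ** 2 for x in part) / len(part)
        params.append((0.5, mu, max(math.sqrt(var), 1e-6)))
    for _ in range(iterations):
        # E-step: responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [w * normal_pdf(x, mu, s) for w, mu, s in params]
            total = sum(p) or 1e-300  # guard against underflow
            resp.append([pk / total for pk in p])
        # M-step: re-estimate weight, mean and sigma of each component.
        new_params = []
        for k in range(2):
            nk = sum(r[k] for r in resp)
            mu = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - mu) ** 2 for r, x in zip(resp, xs)) / nk
            new_params.append((nk / n, mu, max(math.sqrt(var), 1e-6)))
        params = new_params
    return sorted(params, key=lambda p: p[1])

# Made-up sample: 600 points around -2 and 400 points around 3.
rng = random.Random(0)
sample = [rng.gauss(-2.0, 1.0) for _ in range(600)]
sample += [rng.gauss(3.0, 0.5) for _ in range(400)]

for weight, mean, sigma in fit_two_gaussians(sample):
    print(f"weight={weight:.2f} mean={mean:.2f} sigma={sigma:.2f}")
```

The two modes in this sample are well separated, so the crude initialization is good enough. Overlapping modes would call for several restarts from different starting points.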
These results are for human genome/reactome data. I don't doubt that
they are generic in biology.
### Directory layout
Other directories here:
* [diary](./diary) A research diary of notes.
* [graphs](./graphs) Graphs and the tools used to prepare them.
* [src](./src) Source code for loading, processing and analyzing
the genomic and proteomic data, including reactome data, etc.
### License

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.