https://github.com/althonos/pymemesuite
Cython bindings and Python interface to the MEME suite, a collection of tools for the analysis of sequence motifs.
https://github.com/althonos/pymemesuite
bioinformatics cython-library genomics meme-suite sequence-analysis sequence-motif
Last synced: 2 months ago
JSON representation
Cython bindings and Python interface to the MEME suite, a collection of tools for the analysis of sequence motifs.
- Host: GitHub
- URL: https://github.com/althonos/pymemesuite
- Owner: althonos
- License: mit
- Created: 2022-05-13T16:34:30.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2025-02-14T18:05:59.000Z (3 months ago)
- Last Synced: 2025-03-15T19:18:40.771Z (2 months ago)
- Topics: bioinformatics, cython-library, genomics, meme-suite, sequence-analysis, sequence-motif
- Language: Cython
- Homepage:
- Size: 23.7 MB
- Stars: 13
- Watchers: 2
- Forks: 0
- Open Issues: 1
-
Metadata Files:
- Readme: README.md
- Changelog: CHANGELOG.md
- Contributing: CONTRIBUTING.md
- License: COPYING
Awesome Lists containing this project
README
# πβοΈ PyMEMEsuite [](https://github.com/althonos/pymemesuite/stargazers)
*[Cython](https://cython.org/) bindings and Python interface to the [MEME suite](https://meme-suite.org), a collection of tools for the analysis of sequence motifs.*
[](https://github.com/althonos/pymemesuite/actions)
[](https://codecov.io/gh/althonos/pymemesuite/)
[](https://pypi.org/project/pymemesuite)
[](https://pypi.org/project/pymemesuite/#files)
[](https://pypi.org/project/pymemesuite/#files)
[](https://pypi.org/project/pymemesuite/#files)
[](https://choosealicense.com/licenses/mit/)
[](https://github.com/althonos/pymemesuite/)
[](https://github.com/althonos/pymemesuite/issues)
[](https://pymemesuite.readthedocs.io)
[](https://github.com/althonos/pymemesuite/blob/master/CHANGELOG.md)
[](https://pepy.tech/project/pymemesuite)**π© If you are interested in running sequence motif analyses with PSSMs, I recommend using the [`lightmotif`](https://github.com/althonos/lightmotif/tree/main/lightmotif-py) project, which is a re-implementation of the algorithm used in FIMO, which can process sequences at speeds of GB/s.**
## πΊοΈ Overview
The [MEME suite](https://meme-suite.org/) is a collection of tools used for
discovery and analysis of biological sequence motifs.`pymemesuite` is a Python module, implemented using the [Cython](https://cython.org)
language, that provides bindings to the [MEME](https://meme-suite.org/) suite.
It directly interacts with the MEME internals, which has the following
advantages over CLI wrappers:- **single dependency**: If your software or your analysis pipeline is
distributed as a Python package, you can add `pymemesuite` as a dependency
to your project, and stop worrying about the MEME binaries being properly
setup on the end-user machine.
- **no intermediate files**: Everything happens in memory, in Python objects
you have control on, making it easier to pass your inputs to MEME without
needing to write them to a temporary file. Output retrieval is also done
in memory.*This library is still a work-in-progress, and in an experimental stage,
but it should already pack enough features to run biological analyses or
workflows involving [FIMO](https://meme-suite.org/meme/doc/fimo.html).*## π§ Installing
`pymemesuite` can be installed from [PyPI](https://pypi.org/project/pymemesuite/),
which hosts some pre-built CPython wheels for x86-64 Linux, as well as the
code required to compile from source with Cython:
```console
$ pip install pymemesuite
```## π‘ Example
Use `MotifFile` to load a motif from a MEME motif file, and display the
consensus motif sequence followed by the letter frequencies:```python
from pymemesuite.common import MotifFilewith MotifFile("tests/data/fimo/prodoric_mx000001_meme.txt") as motif_file:
motif = motif_file.read()print(motif.name.decode())
print(motif.consensus)for row in motif.frequencies:
print(" ".join(f'{freq:.2f}' for freq in row))
```Then use `FIMO` to find occurences of this particular motif in a collection of
sequences, and show coordinates of matches:```python
import Bio.SeqIO
from pymemesuite.common import Sequence
from pymemesuite.fimo import FIMOsequences = [
Sequence(str(record.seq), name=record.id.encode())
for record in Bio.SeqIO.parse("tests/data/fimo/mibig-genes.fna", "fasta")
]fimo = FIMO(both_strands=False)
pattern = fimo.score_motif(motif, sequences, motif_file.background)for m in pattern.matched_elements:
print(
m.source.accession.decode(),
m.start,
m.stop,
m.strand,
m.score,
m.pvalue,
m.qvalue
)
```You should then get a single matched element on the forward strand:
```
BGC0002035.1_3425_15590 6700 6714 + 9.328571428571422 1.1024163606971822e-05 0.6174858127445146
```## π Features
### π§Ά Thread-safety
`FIMO` objects are thread-safe, and the `FIMO.score_motif` and `FIMO.score_pssm`
methods are re-entrant. This means you can search occurences of several
motifs in parallel with a `ThreadPool` and a single `FIMO` instance:
```python
from multiprocessing.pool import ThreadPool
from pymemesuite.fimo import FIMOfimo = FIMO()
with ThreadPool() as pool:
patterns = pool.map(
lambda motif: fimo.score_motif(motif, sequences, background),
motifs
)
```### π Roadmap
- [ ] **error management**: Make sure to catch exceptions raised by the MEME core without exiting forcefully.
- [ ] **transfac**: Support for TRANSFAC motifs in addition to MEME motifs.
- [ ] **meme**: Motif discovery through enrichment analysis between two collections of sequences.## π Feedback
### β οΈ Issue Tracker
Found a bug ? Have an enhancement request ? Head over to the [GitHub issue
tracker](https://github.com/althonos/pymemesuite/issues) if you need to report
or ask something. If you are filing in on a bug, please include as much
information as you can about the issue, and try to recreate the same bug
in a simple, easily reproducible situation.### ποΈ Contributing
Contributions are more than welcome! See [`CONTRIBUTING.md`](https://github.com/althonos/pymemesuite/blob/master/CONTRIBUTING.md) for more details.
## βοΈ License
This library is provided under the [MIT License](https://choosealicense.com/licenses/mit/).
The MEME suite code is available under an academic license which allows
distribution and non-commercial usage. See `vendor/meme/COPYING` for more
information.Test sequence data were obtained from [MIBiG](https://mibig.secondarymetabolites.org/)
and are distributed under the [CC BY 4.0](https://creativecommons.org/licenses/by/4.0/)
license. Test motifs were obtained from [PRODORIC](https://www.prodoric.de) and are
distributed under the [CC BY-NC 4.0](https://creativecommons.org/licenses/by/4.0/)
license.*This project is in no way affiliated, sponsored, or otherwise endorsed by
the [original MEME suite authors](https://meme-suite.org/meme/doc/authors.html).
It was developed by [Martin Larralde](https://github.com/althonos/pymemesuite)
during his PhD project at the [European Molecular Biology Laboratory](https://www.embl.de/)
in the [Zeller team](https://github.com/zellerlab).*