Ecosyste.ms: Awesome
An open API service indexing awesome lists of open source software.
https://github.com/servierhub/top-life-sciences
Top Life Sciences open-source software
List: top-life-sciences
ai awesome awesome-list awesome-lists bioinformatics biology biology-ai computational-biology computational-chemistry ebiology life-sciences lifescience lifesciences pharma pharmaceuticals servier
Last synced: about 1 month ago
Top Life Sciences open-source software
- Host: GitHub
- URL: https://github.com/servierhub/top-life-sciences
- Owner: servierhub
- Created: 2024-05-31T22:55:08.000Z (9 months ago)
- Default Branch: main
- Last Pushed: 2024-06-09T21:14:12.000Z (9 months ago)
- Last Synced: 2025-01-08T09:02:38.239Z (about 1 month ago)
- Topics: ai, awesome, awesome-list, awesome-lists, bioinformatics, biology, biology-ai, computational-biology, computational-chemistry, ebiology, life-sciences, lifescience, lifesciences, pharma, pharmaceuticals, servier
- Language: Makefile
- Homepage:
- Size: 1.72 MB
- Stars: 6
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
Awesome Lists containing this project
- top-pharma-fr - **servierhub/top-life-sciences** - Top Life Sciences open-source software. Topics: `ai`, `awesome`, `awesome-list`, `awesome-lists`, `bioinformatics`, `biology`, `biology-ai`, `computational-biology`, `computational-chemistry`, `ebiology`, `life-sciences`, `lifescience`, `lifesciences`, `pharma`, `pharmaceuticals`, `servier`. Stars: 1. Last update: 2024-06-06 07:56:59. (Ranked by starred repositories)
- ultimate-awesome - top-life-sciences - Top Life Sciences open-source software. (Other Lists / Julia Lists)
README
# Top life sciences open source software
This is an automatically generated[^1] **ranked list** of [open source](https://opensource.org/osd) software from
[pharmaceutical companies](https://en.wikipedia.org/wiki/List_of_pharmaceutical_companies) and cross organizations,
[biotechnology companies](https://en.wikipedia.org/wiki/Category:Biotechnology_companies),
research institutes,
open source communities and individuals,
plus some life-science software from technology companies.

It's made from a **curated** list of [GitHub accounts](Results/SOURCES.md), and will be periodically refreshed from these sources' repositories.
You can also access [what they have updated lately](Results/NEW.md)
and [which topics are covered](Results/TOPICS.md) by this software.

## Ranked by starred repositories
> [!NOTE]
> - **stars** - number of people who especially appreciated the repository
> - **forks** - number of people who have cloned the repository in order to modify it
> - **watchers** - number of people who are monitoring changes in the repository
> - **main programming language**
> - **license**
> - **last update date & time**
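
The table below gives repositories with the same star count the same rank, and the next rank is not skipped (two entries at rank 67 are followed by rank 68). A minimal sketch of that dense-ranking scheme follows. It is not the generator's actual code: the function name `dense_rank_by_stars` is hypothetical, the sample star counts are copied from the table, and keeping tied entries in their input order is an assumption.

```python
def dense_rank_by_stars(repos):
    """Return (rank, name, stars) tuples, most-starred first.

    Repositories with equal star counts share a rank, and the next
    distinct star count gets the next consecutive rank (dense ranking).
    """
    # sorted() is stable, so tied entries keep their input order (assumption).
    ordered = sorted(repos, key=lambda repo: -repo[1])
    ranked = []
    rank = 0
    previous_stars = None
    for name, stars in ordered:
        if stars != previous_stars:
            rank += 1
            previous_stars = stars
        ranked.append((rank, name, stars))
    return ranked


# Star counts taken from the table; the input order is arbitrary.
sample = [
    ("MolecularAI/GraphINVENT", 356),
    ("chembl/chembl_webresource_client", 356),
    ("shenwei356/taxonkit", 342),
    ("ome/bioformats", 367),
]

for rank, name, stars in dense_rank_by_stars(sample):
    print(f"|{rank}|{name} ({stars} stars)|")
```

Dense ranking keeps the rank column contiguous, which is why the last rank in the table can be lower than the number of rows.
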
|Rank|Software|
|---|:---|
|1|[**google-deepmind/alphafold**](https://github.com/google-deepmind/alphafold)<br>Open source code for AlphaFold.<br>stars: 11987, forks: 2135, watchers: 226<br>Python<br>Apache-2.0 license<br>2023-04-05 09:45:53|
|2|[**deepchem/deepchem**](https://github.com/deepchem/deepchem)<br>Democratizing Deep-Learning for Drug Discovery, Quantum Chemistry, Materials Science and Biology<br>`biology`, `deep-learning`, `drug-discovery`, `hacktoberfest`, `materials-science`, `quantum-chemistry`<br>stars: 5220, forks: 1626<br>Python<br>MIT License<br>2024-06-08 13:03:11|
|3|[**biopython/biopython**](https://github.com/biopython/biopython)<br>Official git repository for Biopython (originally converted from CVS)<br>`bioinformatics`, `biopython`, `dna`, `genomics`, `phylogenetics`, `protein`, `protein-structure`, `python`, `sequence-alignment`<br>stars: 4213, forks: 1728, watchers: 168<br>Python<br>Unknown LICENSE|
|4|[**google/deepvariant**](https://github.com/google/deepvariant)<br>DeepVariant is an analysis pipeline that uses a deep neural network to call genetic variants from next-generation DNA sequencing data.<br>`bioinformatics`, `deep-learning`, `deep-neural-network`, `deepvariant`, `dna`, `genome`, `genomics`, `machine-learning`, `ngs`, `science`, `sequencing`, `tensorflow`<br>stars: 3100, forks: 698, watchers: 159<br>Python<br>BSD-3-Clause license<br>2024-03-19 19:20:10|
|5|[**facebookresearch/esm**](https://github.com/facebookresearch/esm)<br>Evolutionary Scale Modeling (esm): Pretrained language models for proteins<br>stars: 2917, forks: 577, watchers: 63<br>Python<br>MIT license<br>2022-10-18 13:38:47|
|6|[**aqlaboratory/openfold**](https://github.com/aqlaboratory/openfold)<br>Trainable, memory-efficient, and GPU-friendly PyTorch reproduction of AlphaFold 2<br>`alphafold2`, `protein-structure`, `pytorch`<br>stars: 2572, forks: 466<br>Python<br>Apache License 2.0<br>2024-06-04 08:33:28|
|7|[**rdkit/rdkit**](https://github.com/rdkit/rdkit)<br>The official sources for the RDKit library<br>`c-plus-plus`, `cheminformatics`, `python`, `rdkit`<br>stars: 2483, forks: 845<br>HTML<br>BSD 3-Clause "New" or "Revised" License<br>2024-06-08 03:18:22|
|8|[**AstraZeneca/awesome-explainable-graph-reasoning**](https://github.com/AstraZeneca/awesome-explainable-graph-reasoning)<br>A collection of research papers and software related to explainability in graph machine learning.<br>`awesome-list`, `deep-learning`, `explainable-ai`, `explainable-ml`, `graph`, `graph-algorithms`, `graphml`<br>stars: 1941, forks: 129<br>Apache License 2.0<br>2022-04-04 14:54:08|
|9|[**OpenGene/fastp**](https://github.com/OpenGene/fastp)<br>An ultra-fast all-in-one FASTQ preprocessor (QC/adapters/trimming/filtering/splitting/merging...)<br>`adapter`, `bioinformatics`, `duplication`, `fastq`, `filter`, `filtering`, `illumina`, `merging`, `ngs`, `overlap`, `polyg`, `preprocessing`, `qc`, `quality`, `quality-control`, `sequencing`, `splitting`, `trimming`, `umi`<br>stars: 1803, forks: 333<br>C++<br>MIT License<br>2024-04-07 08:16:11|
|10|[**scverse/scanpy**](https://github.com/scverse/scanpy)<br>Single-cell analysis in Python. Scales to >1M cells.<br>`anndata`, `bioinformatics`, `data-science`, `machine-learning`, `python`, `scanpy`, `scverse`, `transcriptomics`, `visualize-data`<br>stars: 1789, forks: 579<br>Python<br>BSD 3-Clause "New" or "Revised" License<br>2024-06-07 08:43:34|
|11|[**lh3/minimap2**](https://github.com/lh3/minimap2)<br>A versatile pairwise aligner for genomic and spliced nucleotide sequences<br>`bioinformatics`, `genomics`, `sequence-alignment`, `spliced-alignment`<br>stars: 1708, forks: 396<br>C<br>Other<br>2024-05-22 19:58:33|
|12|[**allenai/scispacy**](https://github.com/allenai/scispacy)<br>A full spaCy pipeline and models for scientific/biomedical documents.<br>`bioinformatics`, `biomedical`, `custom-pipes`, `nlp`, `scientific-documents`, `spacy`<br>stars: 1629, forks: 221, watchers: 52<br>Python<br>Apache-2.0 license<br>2024-03-08 05:57:56|
|13|[**broadinstitute/gatk**](https://github.com/broadinstitute/gatk)<br>Official code repository for GATK versions 4 and up<br>`bioinformatics`, `dna`, `gatk`, `genome`, `genomics`, `ngs`, `science`, `sequencing`, `spark`<br>stars: 1621, forks: 577, watchers: 156<br>Java<br>specific<br>2023-12-13 22:53:56|
|14|[**bioconda/bioconda-recipes**](https://github.com/bioconda/bioconda-recipes)<br>Conda recipes for the bioconda channel.<br>`bioinformatics`, `conda`, `hacktoberfest`, `package-management`<br>stars: 1595, forks: 3089, watchers: 96<br>Shell<br>MIT license|
|15|[**samtools/samtools**](https://github.com/samtools/samtools)<br>Tools (written in C using htslib) for manipulating next-generation sequencing data<br>stars: 1572, forks: 572<br>C<br>Other<br>2024-06-07 09:32:59|
|16|[**Slicer/Slicer**](https://github.com/Slicer/Slicer)<br>Multi-platform, free open source software for visualization and image computing.<br>`3d-printing`, `3d-slicer`, `c-plus-plus`, `computed-tomography`, `image-guided-therapy`, `image-processing`, `itk`, `kitware`, `medical-image-computing`, `medical-imaging`, `national-institutes-of-health`, `neuroimaging`, `nih`, `python`, `qt`, `registration`, `segmentation`, `tcia-dac`, `tractography`, `vtk`<br>stars: 1521, forks: 520, watchers: 38<br>C++<br>specific|
|17|[**lh3/bwa**](https://github.com/lh3/bwa)<br>Burrow-Wheeler Aligner for short-read alignment (see minimap2 for long-read alignment)<br>`bioinformatics`, `fm-index`, `genomics`, `sequence-alignment`<br>stars: 1468, forks: 547<br>C<br>GNU General Public License v3.0<br>2024-04-15 02:54:32|
|18|[**DeepGraphLearning/torchdrug**](https://github.com/DeepGraphLearning/torchdrug)<br>A powerful and flexible machine learning platform for drug discovery<br>`deep-learning`, `drug-discovery`, `graph-neural-networks`, `pytorch`<br>stars: 1407, forks: 194, watchers: 31<br>Python<br>Apache-2.0 license<br>2023-07-16 22:37:17|
|19|[**lh3/seqtk**](https://github.com/lh3/seqtk)<br>Toolkit for processing sequences in FASTA/Q formats<br>`bioinformatics`, `sequence-analysis`<br>stars: 1332, forks: 310<br>C<br>MIT License<br>2023-10-24 15:01:39|
|20|[**galaxyproject/galaxy**](https://github.com/galaxyproject/galaxy)<br>Data intensive science for everyone.<br>`bioinformatics`, `dna`, `genomics`, `hacktoberfest`, `ngs`, `pipeline`, `science`, `sequencing`, `usegalaxy`, `workflow`, `workflow-engine`<br>stars: 1329, forks: 967, watchers: 69<br>Python<br>specific<br>2024-05-07 13:56:26|
|21|[**schrodinger/fixed-data-table-2**](https://github.com/schrodinger/fixed-data-table-2)<br>A React table component designed to allow presenting millions of rows of data.<br>stars: 1290, forks: 289<br>JavaScript<br>Other<br>2024-05-23 05:13:10|
|22|[**soedinglab/MMseqs2**](https://github.com/soedinglab/MMseqs2)<br>MMseqs2: ultra fast and sensitive search and clustering suite<br>`alignment`, `bioinformatics`, `blast`, `linclust`, `metagenomics`, `mmseqs`, `profile-search`, `sequence-clustering`, `sequence-search`, `taxonomy`<br>stars: 1281, forks: 181<br>C<br>GNU General Public License v3.0<br>2024-05-23 07:07:21|
|23|[**facebookresearch/fastMRI**](https://github.com/facebookresearch/fastMRI)<br>A large-scale dataset of both raw MRI measurements and clinical MRI images.<br>`convolutional-neural-networks`, `deep-learning`, `fastmri`, `fastmri-challenge`, `fastmri-dataset`, `medical-imaging`, `mri`, `mri-reconstruction`, `pytorch`<br>stars: 1259, forks: 370, watchers: 74<br>Python<br>MIT license<br>2023-06-26 17:17:06|
|24|[**greenelab/deep-review**](https://github.com/greenelab/deep-review)<br>A collaboratively written review paper on deep learning, genomics, and precision medicine<br>`deep-learning`, `genomics`, `manubot`, `manuscript`, `neural-networks`, `review`<br>stars: 1235, forks: 271, watchers: 129<br>HTML<br>Unknown LICENSE.md<br>2018-03-12 15:06:48|
|25|[**shenwei356/seqkit**](https://github.com/shenwei356/seqkit)<br>A cross-platform and ultrafast toolkit for FASTA/Q file manipulation<br>`bioinformatics`, `cross-platform`, `fasta`, `fastq`, `golang`, `manipulation`, `sequence`, `tool`, `toolkit`<br>stars: 1226, forks: 157, watchers: 26<br>Go<br>MIT license<br>2024-05-17 15:59:35|
|26|[**MultiQC/MultiQC**](https://github.com/MultiQC/MultiQC)<br>Aggregate results from bioinformatics analyses across many samples into a single report.<br>`analysis`, `bioconda`, `bioinformatics`, `data-visualization`, `multiqc`, `pypi`, `python`, `quality-control`, `reporting`, `seqera`, `vizualisation`<br>stars: 1185, forks: 582, watchers: 37<br>JavaScript<br>GPL-3.0 license<br>2024-05-31 18:30:12|
|27|[**dcm4che/dcm4che**](https://github.com/dcm4che/dcm4che)<br>DICOM Implementation in JAVA<br>stars: 1165, forks: 637, watchers: 119<br>Java<br>specific<br>2024-04-22 10:59:11|
|28|[**scverse/scvi-tools**](https://github.com/scverse/scvi-tools)<br>Deep probabilistic analysis of single-cell and spatial omics data<br>`cite-seq`, `deep-generative-model`, `deep-learning`, `human-cell-atlas`, `scrna-seq`, `scverse`, `single-cell-genomics`, `single-cell-rna-seq`, `variational-autoencoder`, `variational-bayes`<br>stars: 1149, forks: 342<br>Python<br>BSD 3-Clause "New" or "Revised" License<br>2024-06-05 17:01:13|
|29|[**vgteam/vg**](https://github.com/vgteam/vg)<br>tools for working with genome variation graphs<br>`dna`, `genome-graph`, `genomics`, `graph`, `variation-graph`<br>stars: 1072, forks: 191, watchers: 48<br>C++<br>specific<br>2024-05-20 18:50:28|
|30|[**schrodinger/pymol-open-source**](https://github.com/schrodinger/pymol-open-source)<br>Open-source foundation of the user-sponsored PyMOL molecular visualization system.<br>stars: 1071, forks: 260<br>C<br>Other<br>2024-06-06 19:36:48|
|31|[**scipipe/scipipe**](https://github.com/scipipe/scipipe)<br>Robust, flexible and resource-efficient pipelines using Go and the commandline<br>`bioinformatics`, `bioinformatics-pipeline`, `cheminformatics`, `dataflow`, `fbp`, `go`, `golang`, `pipeline`, `scientific-workflows`, `scipipe`, `workflow`, `workflow-engine`<br>stars: 1055, forks: 72, watchers: 38<br>Go<br>MIT license<br>2021-10-14 09:11:34|
|32|[**shenwei356/csvtk**](https://github.com/shenwei356/csvtk)<br>A cross-platform, efficient and practical CSV/TSV toolkit in Golang<br>`bioinformatics`, `command-line`, `cross-platform`, `csv`, `golang`, `tool`, `toolkit`, `tsv`<br>stars: 972, forks: 85, watchers: 25<br>Go<br>MIT license<br>2024-05-29 15:30:38|
|33|[**bigdatagenomics/adam**](https://github.com/bigdatagenomics/adam)<br>ADAM is a genomics analysis platform with specialized file formats built using Apache Avro, Apache Spark, and Apache Parquet. Apache 2 licensed.<br>`avro`, `big-data`, `bioinformatics`, `genomics`, `java`, `parquet`, `python`, `r`, `scala`, `spark`<br>stars: 967, forks: 304<br>Scala<br>Apache License 2.0<br>2024-03-23 13:27:52|
|34|[**broadinstitute/cromwell**](https://github.com/broadinstitute/cromwell)<br>Scientific workflow engine designed for simplicity & scalability. Trivially transition between one off use cases to massive scale production environments<br>`application`, `bioinformatics`, `cloud`, `containers`, `docker`, `executor`, `ga4gh`, `hpc`, `scala`, `wdl`, `workflow`, `workflow-description-language`, `workflow-execution`<br>stars: 965, forks: 351, watchers: 112<br>Scala<br>BSD-3-Clause LICENSE.txt<br>2024-05-07 17:47:13|
|35|[**hail-is/hail**](https://github.com/hail-is/hail)<br>Cloud-native genomic dataframes and batch computing<br>`bioinformatics`, `genetics`, `genomics`, `gwas`, `hail`, `python`, `software`, `vcf`<br>stars: 946, forks: 238, watchers: 55<br>Python<br>MIT license<br>2024-06-05 17:48:05|
|36|[**broadinstitute/picard**](https://github.com/broadinstitute/picard)<br>A set of command line tools (in Java) for manipulating high-throughput sequencing (HTS) data and formats such as SAM/BAM/CRAM and VCF.<br>stars: 944, forks: 365, watchers: 160<br>Java<br>MIT license<br>2023-11-14 22:01:18|
|37|[**aqlaboratory/proteinnet**](https://github.com/aqlaboratory/proteinnet)<br>Standardized data set for machine learning of protein structure<br>`dataset`, `deep-learning`, `machine-learning`, `protein-sequence`, `protein-structure`, `proteins`<br>stars: 849, forks: 130<br>Python<br>MIT License<br>2020-11-18 23:43:32|
|38|[**shenwei356/rush**](https://github.com/shenwei356/rush)<br>A cross-platform command-line tool for executing jobs in parallel<br>`bioinformatics`, `command`, `cross-platform`, `execute`, `golang`, `parallel`, `pipeline`, `shell`, `windows`<br>stars: 834, forks: 63, watchers: 20<br>Go<br>MIT license<br>2023-11-13 17:53:58|
|39|[**evo-design/evo**](https://github.com/evo-design/evo)<br>DNA foundation modeling from molecular to genome scale<br>stars: 832, forks: 97<br>Jupyter Notebook<br>Apache License 2.0<br>2024-04-30 22:35:34|
|40|[**PaddlePaddle/PaddleHelix**](https://github.com/PaddlePaddle/PaddleHelix)<br>Bio-Computing Platform Featuring Large-Scale Representation Learning and Multi-Task Deep Learning “螺旋桨”生物计算工具集<br>`biocomputing`, `ddi`, `deeplearning`, `dti`, `graph-networks`, `machine-learning`, `molecule-design`, `ppi`, `protein-design`, `protein-docking`, `protein-folding`, `protein-structure-prediction`, `representation-learning`, `rna-structure-prediction`, `self-supervised-learning`<br>stars: 799, forks: 188, watchers: 25<br>Python<br>Apache-2.0 license<br>2023-08-01 09:31:36|
|41|[**samtools/htslib**](https://github.com/samtools/htslib)<br>C library for high-throughput sequencing data formats<br>`bam`, `bcf`, `bioinformatics`, `cram`, `htslib`, `ngs`, `sam`, `vcf`<br>stars: 779, forks: 448<br>C<br>Other<br>2024-06-06 15:40:15|
|42|[**google/nucleus**](https://github.com/google/nucleus)<br>Python and C++ code for reading and writing genomics data.<br>`bioinformatics`, `dna`, `genomics`, `tensorflow`<br>stars: 777, forks: 126, watchers: 53<br>C++<br>specific<br>2021-08-31 23:19:33|
|43|[**nroduit/Weasis**](https://github.com/nroduit/Weasis)<br>Weasis is a DICOM viewer available as a desktop application or as a web-based application.<br>`dicom`, `dicom-image`, `dicom-image-viewer`, `dicom-images`, `dicom-pr`, `dicom-rt`, `dicom-seg`, `dicom-viewer`, `dicom-web-viewer`, `dicomweb`, `ecg`, `export-dicom`, `medical`, `medical-imaging`, `multiplanar-reconstruction`, `viewer`, `volume-rendering`, `weasis`<br>stars: 763, forks: 281, watchers: 49<br>Java<br>specific<br>2024-05-06 18:42:54|
|44|[**baidu-research/NCRF**](https://github.com/baidu-research/NCRF)<br>Cancer metastasis detection with neural conditional random field (NCRF)<br>`camelyon16`, `conditional-random-fields`, `deep-learning`, `pathology`, `whole-slide-imaging`<br>stars: 749, forks: 184, watchers: 37<br>Python<br>Apache-2.0 license<br>2018-06-17 18:22:34|
|45|[**AstraZeneca/chemicalx**](https://github.com/AstraZeneca/chemicalx)<br>A PyTorch and TorchDrug based deep learning library for drug pair scoring. (KDD 2022)<br>`biology`, `chemistry`, `deep-chemistry`, `deep-learning`, `drug`, `drug-discovery`, `drug-interaction`, `drug-pair`, `geometric-deep-learning`, `geometry`, `graph-neural-network`, `machine-learning`, `pharma`, `polypharmacy`, `pytorch`, `smiles`, `smiles-strings`, `torch`, `torchdrug`<br>stars: 701, forks: 89<br>Python<br>Apache License 2.0<br>2023-09-11 08:01:43|
|46|[**samtools/hts-specs**](https://github.com/samtools/hts-specs)<br>Specifications of SAM/BAM and related high-throughput sequencing file formats<br>stars: 627, forks: 173<br>TeX<br>2024-06-06 06:50:26|
|47|[**samtools/bcftools**](https://github.com/samtools/bcftools)<br>This is the official development repository for BCFtools. See installation instructions and other documentation here http://samtools.github.io/bcftools/howtos/install.html<br>stars: 626, forks: 241<br>C<br>Other<br>2024-06-07 13:13:17|
|48|[**insilicomedicine/GENTRL**](https://github.com/insilicomedicine/GENTRL)<br>Generative Tensorial Reinforcement Learning (GENTRL) model<br>stars: 596, forks: 216<br>Python<br>2020-04-28 11:58:05|
|49|[**shenwei356/awesome**](https://github.com/shenwei356/awesome)<br>Awesome resources on Bioinformatics, data science, machine learning, programming language (Python, Golang, R, Perl) and miscellaneous stuff.<br>`awesome`, `data-science`, `git`, `golang`, `linux`, `perl`, `programing-language`, `python`<br>stars: 593, forks: 163, watchers: 35<br>MIT license<br>2023-09-25 02:09:01|
|50|[**chanzuckerberg/cellxgene**](https://github.com/chanzuckerberg/cellxgene)<br>An interactive explorer for single-cell transcriptomics data<br>`dataviz`, `scientific`, `scrna-seq`, `transcriptomics`, `visualization`<br>stars: 591, forks: 111, watchers: 33<br>JavaScript<br>MIT license<br>2023-12-19 22:19:07|
|51|[**invesalius/invesalius3**](https://github.com/invesalius/invesalius3)<br>3D medical imaging reconstruction software<br>stars: 584, forks: 277, watchers: 37<br>Python<br>GPL-2.0 license<br>2022-04-14 02:28:31|
|52|[**lh3/bioawk**](https://github.com/lh3/bioawk)<br>BWK awk modified for biological data<br>`bioinformatics`, `sequence-analysis`<br>stars: 582, forks: 121<br>C<br>2022-08-11 01:06:45|
|53|[**MolecularAI/aizynthfinder**](https://github.com/MolecularAI/aizynthfinder)<br>A tool for retrosynthetic planning<br>`astrazeneca`, `chemical-reactions`, `cheminformatics`, `monte-carlo-tree-search`, `neural-networks`, `reaction-informatics`<br>stars: 548, forks: 125<br>Python<br>MIT License<br>2024-06-03 13:34:33|
|54|[**owkin/PyDESeq2**](https://github.com/owkin/PyDESeq2)<br>A Python implementation of the DESeq2 pipeline for bulk RNA-seq DEA.<br>`bioinformatics`, `differential-expression`, `python`, `rna-seq`, `transcriptomics`<br>stars: 533, forks: 58<br>Python<br>MIT License<br>2024-06-06 01:43:52|
|55|[**broadinstitute/infercnv**](https://github.com/broadinstitute/infercnv)<br>Inferring CNV from Single-Cell RNA-Seq<br>stars: 520, forks: 159, watchers: 42<br>R<br>specific<br>2020-02-07 20:29:28|
|56|[**scverse/anndata**](https://github.com/scverse/anndata)<br>Annotated data.<br>`anndata`, `bioinformatics`, `data-science`, `machine-learning`, `scanpy`, `scverse`, `transcriptomics`<br>stars: 511, forks: 148<br>Python<br>BSD 3-Clause "New" or "Revised" License<br>2024-06-07 16:03:50|
|57|[**soedinglab/hh-suite**](https://github.com/soedinglab/hh-suite)<br>Remote protein homology detection suite.<br>`alignment`, `bioinformatics`, `cpp`, `hh-suite`, `hhblits`, `hhpred`, `hhsearch`, `opensource`, `profile-profile-search`, `profile-search`, `protein-structure`, `sequence-search`, `simd`, `viterbi`<br>stars: 509, forks: 128<br>C<br>GNU General Public License v3.0<br>2023-08-13 08:44:05|
|58|[**chhylp123/hifiasm**](https://github.com/chhylp123/hifiasm)<br>Hifiasm: a haplotype-resolved assembler for accurate Hifi reads<br>`bioinformatics`, `denovo-assembly`, `genomics`, `hifi-read`, `pacbio`<br>stars: 490, forks: 84, watchers: 28<br>C++<br>MIT license<br>2024-05-06 14:29:45|
|59|[**insitro/redun**](https://github.com/insitro/redun)<br>Yet another redundant workflow engine<br>`aws`, `bioinformatics`, `data-engineering`, `data-science`, `docker`, `etl`, `gcp`, `ml`, `python`, `workflow-engine`<br>stars: 489, forks: 40<br>Python<br>Apache License 2.0<br>2024-06-06 18:52:56|
|60|[**biosustain/potion**](https://github.com/biosustain/potion)<br>Flask-Potion is a RESTful API framework for Flask and SQLAlchemy, Peewee or MongoEngine<br>`flask`, `flask-extensions`, `mongoengine`, `peewee`, `sqlalchemy`<br>stars: 488, forks: 51<br>Python<br>Other<br>2019-04-23 17:00:39|
|61|[**google-deepmind/alphamissense**](https://github.com/google-deepmind/alphamissense)<br>stars: 461, forks: 58, watchers: 25<br>Python<br>Apache-2.0 license|
|62|[**scverse/squidpy**](https://github.com/scverse/squidpy)<br>Spatial Single Cell Analysis in Python<br>`data-visualization`, `image-analysis`, `single-cell-genomics`, `single-cell-rna-seq`, `spatial-analysis`, `spatial-transcriptomics`, `squidpy`<br>stars: 399, forks: 71<br>Python<br>BSD 3-Clause "New" or "Revised" License<br>2024-06-08 21:22:47|
|63|[**lh3/minigraph**](https://github.com/lh3/minigraph)<br>Sequence-to-graph mapper and graph generator<br>`bioinformatics`, `genome-graph`, `genomics`, `pan-genome`, `sequence-alignment`<br>stars: 394, forks: 38<br>C<br>MIT License<br>2024-05-22 00:59:12|
|64|[**benevolentAI/guacamol**](https://github.com/benevolentAI/guacamol)<br>Benchmarks for generative chemistry<br>stars: 383, forks: 82<br>Python<br>MIT License<br>2024-02-11 08:59:38|
|65|[**calico/basenji**](https://github.com/calico/basenji)<br>Sequential regulatory activity predictions with deep convolutional neural networks.<br>stars: 373, forks: 119<br>Python<br>Apache License 2.0<br>2024-05-28 20:08:23|
|66|[**ome/bioformats**](https://github.com/ome/bioformats)<br>Bio-Formats is a Java library for reading and writing data in life sciences image file formats. It is developed by the Open Microscopy Environment. Bio-Formats is released under the GNU General Public License (GPL); commercial licenses are available from Glencoe Software.<br>`bio-formats`, `format-converter`, `format-reader`, `image`, `java`, `life-sciences-image`, `lightsheet`, `metadata`, `whole-slide-imaging`, `wsi`<br>stars: 367, forks: 239<br>Java<br>GNU General Public License v2.0<br>2024-06-07 19:34:33|
|67|[**MolecularAI/GraphINVENT**](https://github.com/MolecularAI/GraphINVENT)<br>Graph neural networks for molecular design.<br>stars: 356, forks: 74<br>Python<br>MIT License<br>2023-03-11 11:55:32|
|67|[**chembl/chembl_webresource_client**](https://github.com/chembl/chembl_webresource_client)<br>Official Python client for accessing ChEMBL API<br>`chembl`, `cheminformatics`, `chemistry`, `chemoinformatics`, `python`, `rest`, `rest-client`<br>stars: 356, forks: 95<br>Python<br>Other<br>2024-02-26 15:44:57|
|68|[**shenwei356/taxonkit**](https://github.com/shenwei356/taxonkit)<br>A Practical and Efficient NCBI Taxonomy Toolkit, also supports creating NCBI-style taxdump files for custom taxonomies like GTDB/ICTV<br>`bioinformatics`, `cross-platform`, `lca`, `lineage`, `taxdump`, `taxid`, `taxonkit`, `taxonomy`<br>stars: 342, forks: 29, watchers: 10<br>Go<br>MIT license<br>2024-04-25 17:15:34|
|69|[**deepchem/DeepLearningLifeSciences**](https://github.com/deepchem/DeepLearningLifeSciences)<br>Example code from the book "Deep Learning for the Life Sciences"<br>stars: 338, forks: 150<br>Jupyter Notebook<br>MIT License<br>2021-09-17 05:10:37|
|70|[**MolecularAI/Reinvent**](https://github.com/MolecularAI/Reinvent)<br>`astrazeneca`, `cheminformatics`, `denovo-design`, `neural-networks`, `reinforcement-learning`, `transfer-learning`<br>stars: 332, forks: 108<br>Python<br>Apache License 2.0<br>2023-10-19 05:26:16|
|71|[**aqlaboratory/rgn**](https://github.com/aqlaboratory/rgn)<br>Recurrent Geometric Networks for end-to-end differentiable learning of protein structure<br>`deep-learning`, `deep-neural-networks`, `protein-structure`, `protein-structure-prediction`<br>stars: 326, forks: 89<br>Python<br>MIT License<br>2019-08-01 14:17:59|
|72|[**tencent-ailab/grover**](https://github.com/tencent-ailab/grover)<br>This is a Pytorch implementation of the paper: Self-Supervised Graph Transformer on Large-Scale Molecular Data<br>stars: 313, forks: 68, watchers: 7<br>Python<br>specific<br>2021-01-18 09:06:32|
|73|[**lh3/miniprot**](https://github.com/lh3/miniprot)<br>Align proteins to genomes with splicing and frameshift<br>`bioinformatics`, `sequence-alignment`<br>stars: 305, forks: 16<br>C<br>MIT License<br>2024-04-12 21:01:25|
|74|[**Roche/pyreadstat**](https://github.com/Roche/pyreadstat)<br>Python package to read sas, spss and stata files into pandas data frames. It is a wrapper for the C library readstat.<br>`conversion`, `pandas-dataframe`, `python`, `readstat`, `sas7bdat`, `spss`, `stata-files`<br>stars: 303, forks: 55<br>C<br>Other<br>2024-06-04 09:55:07|
|75|[**lh3/miniasm**](https://github.com/lh3/miniasm)<br>Ultrafast de novo assembly for long noisy reads (though having no consensus step)<br>`bioinformatics`, `denovo-assembly`, `genomics`<br>stars: 293, forks: 68<br>TeX<br>MIT License<br>2023-12-13 01:35:58|
|76|[**chanzuckerberg/MedMentions**](https://github.com/chanzuckerberg/MedMentions)<br>A corpus of Biomedical papers annotated with mentions of UMLS entities.<br>stars: 291, forks: 31, watchers: 25|
|77|[**AstraZeneca/rexmex**](https://github.com/AstraZeneca/rexmex)<br>A general purpose recommender metrics library for fair evaluation.<br>`coverage`, `deep-learning`, `evaluation`, `machine-learning`, `metric`, `metrics`, `mrr`, `personalization`, `precision`, `rank`, `ranking`, `recall`, `recommender`, `recommender-system`, `recsys`, `rsquared`<br>stars: 275, forks: 25<br>Python<br>2023-08-22 09:22:20|
|78|[**samtools/htsjdk**](https://github.com/samtools/htsjdk)<br>A Java API for high-throughput sequencing data (HTS) formats.<br>`bam`, `cram`, `dna`, `fasta`, `genomics`, `java`, `java-api`, `ngs`, `sam`, `sequencing`, `vcf`<br>stars: 274, forks: 244<br>Java<br>2024-06-04 18:40:43|
|79|[**shenwei356/brename**](https://github.com/shenwei356/brename)<br>A practical cross-platform command-line tool for safely batch renaming files/directories via regular expression<br>`batch`, `batch-rename`, `batch-rename-files`, `batch-renamer`, `go`, `golang`, `rename`, `safe`, `windows`<br>stars: 254, forks: 21, watchers: 6<br>Go<br>MIT license<br>2024-04-14 08:22:45|
|80|[**lh3/wgsim**](https://github.com/lh3/wgsim)<br>Reads simulator<br>`bioinformatics`, `genomics`<br>stars: 252, forks: 90<br>C<br>2021-09-03 14:58:22|
|81|[**Acellera/htmd**](https://github.com/Acellera/htmd)<br>HTMD: Programming Environment for Molecular Discovery<br>`automate`, `drug-discovery`, `htmd`, `molecular-simulations`<br>stars: 250, forks: 58<br>Rich Text Format<br>Other<br>2024-06-07 15:24:26|
|82|[**DeepGraphLearning/GearNet**](https://github.com/DeepGraphLearning/GearNet)<br>GearNet and Geometric Pretraining Methods for Protein Structure Representation Learning, ICLR'2023 (https://arxiv.org/abs/2203.06125)<br>`graph-neural-networks`, `pre-training`, `protein-representation-learning`<br>stars: 249, forks: 26, watchers: 10<br>Python<br>MIT license|
|83|[**MolecularAI/REINVENT4**](https://github.com/MolecularAI/REINVENT4)<br>AI molecular design tool for de novo design, scaffold hopping, R-group replacement, linker design and molecule optimization.<br>`ai`, `astrazeneca`, `cheminformatics`, `chemistry`, `deep-learning`, `denovo-design`, `drug-design`, `drug-discovery`, `generative-ai`, `ml`, `molecule-generation`, `neural-networks`, `reinforcement-learning`, `transfer-learning`<br>stars: 247, forks: 57<br>Python<br>Apache License 2.0<br>2024-04-27 11:00:08|
|84|[**rdkit/rdkit-tutorials**](https://github.com/rdkit/rdkit-tutorials)<br>Tutorials to learn how to work with the RDKit<br>stars: 239, forks: 71<br>Jupyter Notebook<br>Other<br>2023-03-19 13:36:55|
|85|[**insightsengineering/rtables**](https://github.com/insightsengineering/rtables)<br>Reporting tables with R<br>`pharmaceuticals`, `r`, `tables`<br>stars: 213, forks: 49<br>R<br>Other<br>2024-06-07 21:27:39|
|86|[**Bayer-Group/cloudformation-template-generator**](https://github.com/Bayer-Group/cloudformation-template-generator)<br>A type-safe Scala DSL for generating CloudFormation templates<br>stars: 211, forks: 71<br>Scala<br>BSD 3-Clause "New" or "Revised" License<br>2022-07-29 11:32:04|
|87|[**pharmaverse/admiral**](https://github.com/pharmaverse/admiral)<br>ADaM in R Asset Library<br>`cdisc`, `clinical-trials`, `open-source`, `r`<br>stars: 207, forks: 53<br>R<br>Apache License 2.0<br>2024-06-07 18:23:44|
|87|[**OpenGene/awesome-bio-datasets**](https://github.com/OpenGene/awesome-bio-datasets)<br>awesome-bio-datasets<br>stars: 207, forks: 42<br>MIT License<br>2017-10-28 12:32:15|
|88|[**OpenGene/AfterQC**](https://github.com/OpenGene/AfterQC)<br>Automatic Filtering, Trimming, Error Removing and Quality Control for fastq data<br>`adapter-trimming`, `bioinformatics`, `error`, `fastq`, `filtering`, `ngs`, `overlap`, `qc`, `quality-control`, `sequencing`, `trimming`<br>stars: 203, forks: 50<br>Python<br>MIT License<br>2020-05-14 07:15:54|
|89|[**Bayer-Group/etcd-aws-cluster**](https://github.com/Bayer-Group/etcd-aws-cluster)<br>A container to assist in managing a etcd2 cluster from an Amazon auto scaling group<br>stars: 202, forks: 102<br>Shell<br>BSD 3-Clause "New" or "Revised" License<br>2017-02-01 01:09:05|
|89|[**modernatx/seqlike**](https://github.com/modernatx/seqlike)<br>Unified biological sequence manipulation in Python<br>`biological-sequences`, `biopython`, `machine-learning`, `sequence`<br>stars: 202, forks: 18<br>Python<br>Apache License 2.0<br>2024-02-16 13:13:05|
|89|[**scverse/scirpy**](https://github.com/scverse/scirpy)<br>A scanpy extension to analyse single-cell TCR and BCR data.<br>stars: 202, forks: 31<br>Python<br>BSD 3-Clause "New" or "Revised" License<br>2024-06-06 06:21:35|
|90|[**lh3/gfatools**](https://github.com/lh3/gfatools)<br>Tools for manipulating sequence graphs in the GFA and rGFA formats<br>`bioinformatics`, `genome-graph`, `genomics`<br>stars: 201, forks: 18<br>C<br>2024-02-20 15:29:14|
|90|[**scverse/muon**](https://github.com/scverse/muon)<br>muon is a multimodal omics Python framework<br>`anndata`, `cite-seq`, `mudata`, `multi-omics`, `multimodal-data`, `multimodal-omics-analysis`, `muon`, `scanpy`, `scatac-seq`, `scrna-seq`, `scverse`<br>stars: 201, forks: 28<br>Python<br>BSD 3-Clause "New" or "Revised" License<br>2024-05-30 21:21:35|
|91|[**aws-samples/aws-batch-genomics**](https://github.com/aws-samples/aws-batch-genomics)<br>Software sets up and runs an genome sequencing analysis workflow using AWS Batch and AWS Step Functions.<br>stars: 199, forks: 75, watchers: 39<br>Python<br>Apache-2.0 license<br>2018-11-29 18:40:42|
|92|[**rdkit/mmpdb**](https://github.com/rdkit/mmpdb)<br>A package to identify matched molecular pairs and use them to predict property changes.<br>stars: 195, forks: 53<br>Python<br>Other<br>2024-04-30 10:55:30|
|93|[**Acellera/moleculekit**](https://github.com/Acellera/moleculekit)<br>MoleculeKit: Your favorite molecule manipulation kit<br>`drug-discovery`, `machine-learning`, `molecular-modeling`, `molecular-simulation`, `molecule`, `proteins`<br>stars: 193, forks: 35<br>Python<br>Other<br>2024-06-04 13:53:30|
|94|[**bioinform/somaticseq**](https://github.com/bioinform/somaticseq)<br>An ensemble approach to accurately detect somatic mutations using SomaticSeq<br>`cancer-genomics`, `somatic-variants`<br>stars: 189, forks: 53<br>Python<br>BSD 2-Clause "Simplified" License<br>2024-05-30 07:55:34|
|95|[**MolecularAI/Chemformer**](https://github.com/MolecularAI/Chemformer)<br>stars: 188, forks: 34<br>Python<br>Apache License 2.0<br>2024-05-29 14:43:33|
|96|[**owkin/FLamby**](https://github.com/owkin/FLamby)<br>Cross-silo Federated Learning playground in Python. Discover 7 real-world federated datasets to test your new FL strategies and try to beat the leaderboard.<br>`dataset`, `deep-learning`, `differential-privacy`, `federated-learning`, `healthcare`, `machine-learning`, `python`<br>stars: 187, forks: 22<br>Python<br>MIT License<br>2024-06-03 12:18:27|
|96|[**ome/openmicroscopy**](https://github.com/ome/openmicroscopy)<br>OME (Open Microscopy Environment) develops open-source software and data format standards for the storage and manipulation of biological light microscopy data. A joint project between universities, research establishments and industry in Europe and the USA, OME has over 20 active researchers with strong links to the microscopy community. Funded …<br>`database`, `image`, `java`, `omero`, `python`, `server`<br>stars: 187, forks: 100<br>Java<br>GNU General Public License v2.0<br>2024-06-08 00:39:30|
|97|[**AstraZeneca-NGS/VarDict**](https://github.com/AstraZeneca-NGS/VarDict)<br>VarDict<br>stars: 186, forks: 60<br>Perl<br>MIT License<br>2024-01-05 14:06:13|
|97|[**scverse/spatialdata**](https://github.com/scverse/spatialdata)<br>An open and interoperable data framework for spatial omics data<br>stars: 186, forks: 34<br>Python<br>BSD 3-Clause "New" or "Revised" License<br>2024-06-08 00:23:48|
|98|[**haowenz/chromap**](https://github.com/haowenz/chromap)<br>Fast alignment and preprocessing of chromatin profiles<br>`bioinformatics`, `chromatin-profiles`, `genomics`, `sequence-analysis`<br>stars: 184, forks: 18, watchers: 7<br>C++<br>MIT license<br>2024-02-06 15:29:20|
|99|[**chao1224/MoleculeSTM**](https://github.com/chao1224/MoleculeSTM)<br>Multi-modal Molecule Structure-text Model for Text-based Editing and Retrieval, Nat Mach Intell 2023 (https://www.nature.com/articles/s42256-023-00759-6)<br>`clip`, `computation-chemistry`, `drug-discovery`, `editing`, `foundation-model`, `molecule-editing`, `moleculeclip`, `moleculestm`, `pretraining`, `retrieval`<br>stars: 182, forks: 17, watchers: 4<br>Python<br>specific<br>2024-04-19 05:25:24|
|100|[**openpharma/visR**](https://github.com/openpharma/visR)<br>A package to wrap functionality for plots, tables and diagrams adhering to graphical principles.<br>stars: 179, forks: 32<br>R<br>Other<br>2024-06-04 13:48:59|
|100|[**chembl/ChEMBL_Structure_Pipeline**](https://github.com/chembl/ChEMBL_Structure_Pipeline)<br>ChEMBL database structure pipelines<br>stars: 179, forks: 38<br>Python<br>MIT License<br>2023-10-25 15:20:47|
|101|[**AstraZeneca/awesome-drug-discovery-knowledge-graphs**](https://github.com/AstraZeneca/awesome-drug-discovery-knowledge-graphs)<br>A collection of research papers, datasets and software related to knowledge graphs for drug discovery. Accompanies the paper "A review of biomedical datasets relating to drug discovery: a knowledge graph perspective" (Briefings in Bioinformatics, 2022)<br>`awesome-list`, `drug-discovery`, `drug-discovery-knowledge-graph`, `knowledge-graph`<br>stars: 177, forks: 19<br>Apache License 2.0<br>2023-09-10 16:33:40|
|102|[**lh3/biofast**](https://github.com/lh3/biofast)<br>Benchmarking programming languages/implementations for common tasks in Bioinformatics<br>`bioinformatics`<br>stars: 175, forks: 26<br>C<br>2021-12-09 14:10:44|
|103|[**shenwei356/kmcp**](https://github.com/shenwei356/kmcp)<br>Accurate metagenomic profiling && Fast large-scale sequence/genome searching<br>`bigsi`, `cobs`, `fracminhash`, `kmer`, `metagenomics`, `scaled-minhash`, `searching`, `sketch`, `sketching`, `syncmers`, `taxonomic-classification`, `taxonomic-profiling`, `virome`<br>stars: 173, forks: 13, watchers: 6<br>Go<br>MIT license<br>2023-09-22 04:09:54|
|104|[**rgcgithub/regenie**](https://github.com/rgcgithub/regenie)<br>regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.<br>stars: 172, forks: 49<br>C++<br>Other<br>2024-04-03 13:52:31|
|105|[**soedinglab/metaeuk**](https://github.com/soedinglab/metaeuk)
MetaEuk - sensitive, high-throughput gene discovery and annotation for large-scale eukaryotic metagenomics
`bioinformatics`, `eukaryotes`, `gene-discovery`, `gene-prediction`, `metagenomics`171
24
C
GNU General Public License v3.0
2024-05-30 09:04:06 |
|106|[**recursionpharma/gflownet**](https://github.com/recursionpharma/gflownet)
GFlowNet library specialized for graph & molecular data
`deep-learning`, `gflownet`, `graph-neural-network`, `pytorch`168
34
Python
MIT License
2024-06-06 13:29:06 |
|106|[**scverse/scanpy-tutorials**](https://github.com/scverse/scanpy-tutorials)
Scanpy Tutorials.168
113
Jupyter Notebook
2024-06-03 19:42:01 |
|107|[**bioinform/neusomatic**](https://github.com/bioinform/neusomatic)
NeuSomatic: Deep convolutional neural networks for accurate somatic mutation detection
`convolutional-neural-networks`, `deep-learning`, `genomics`, `somatic-variants`167
50
Python
Other
2021-12-23 10:41:50 |
|108|[**lh3/readfq**](https://github.com/lh3/readfq)
Fast multi-line FASTA/Q reader in several programming languages
`bioinformatics`, `sequence-analysis`166
60
C
2021-06-06 07:27:15 |
|109|[**insightsengineering/teal**](https://github.com/insightsengineering/teal)
Exploratory Web Apps for Analyzing Clinical Trial Data
`clinical-trials`, `nest`, `r`, `shiny`, `webapp`164
29
R
Other
2024-06-07 12:49:26 |
|110|[**lh3/cgranges**](https://github.com/lh3/cgranges)
A C/C++ library for fast interval overlap queries (with a "bedtools coverage" example)
`algorithm`, `bioinformatics`, `genomics`161
18
C
MIT License
2024-05-28 21:47:37 |
|110|[**lh3/kmer-cnt**](https://github.com/lh3/kmer-cnt)
Code examples of fast and simple k-mer counters for tutorial purposes
`bioinformatics`, `genomics`, `k-mer-counting`161
13
C++
MIT License
2020-03-10 16:24:06 |
|111|[**greenelab/tybalt**](https://github.com/greenelab/tybalt)
Training and evaluating a variational autoencoder for pan-cancer gene expression data
`analysis`, `autoencoder`, `cancer`, `cancer-genomics`, `deep-learning`, `gene-expression`, `script`, `tool`, `unsupervised-learning`, `variational-autoencoder`, `variational-autoencoders`159
62
10
HTML
BSD-3-Clause license
2017-11-13 13:38:42 |
|112|[**aqlaboratory/genie**](https://github.com/aqlaboratory/genie)
De Novo Protein Design by Equivariantly Diffusing Oriented Residue Clouds
`diffusion-models`, `protein-design`154
18
Python
Apache License 2.0
2024-04-21 13:48:25 |
|113|[**DeepGraphLearning/ConfGF**](https://github.com/DeepGraphLearning/ConfGF)
Implementation of Learning Gradient Fields for Molecular Conformation Generation (ICML 2021).153
34
10
Python
MIT license |
|114|[**benevolentAI/DeeplyTough**](https://github.com/benevolentAI/DeeplyTough)
DeeplyTough: Learning Structural Comparison of Protein Binding Sites
`3d-models`, `deep-learning`, `drug-discovery`, `metric-learning`, `protein-structure`151
39
Python
Other
2023-04-07 09:33:44 |
|115|[**chao1224/GraphMVP**](https://github.com/chao1224/GraphMVP)
Pre-training Molecular Graph Representation with 3D Geometry, ICLR'22 (https://openreview.net/forum?id=xQUe1pOKPam)
`contrastive-learning`, `generative-model`, `geometry`, `graph`, `molecule`, `pretraining`, `self-supervised`, `self-supervised-learning`150
20
5
Python
MIT license
2022-09-20 14:29:48 |
|116|[**OpenGene/MutScan**](https://github.com/OpenGene/MutScan)
Detect and visualize target mutations by scanning FastQ files directly
`bioinformatics`, `cancer`, `detection`, `fastq`, `mutation`, `ngs`, `somatic`, `validation`, `variant`, `visualization`146
38
C
MIT License
2022-02-10 01:52:44 |
|117|[**MolecularAI/ReinventCommunity**](https://github.com/MolecularAI/ReinventCommunity)
`astrazeneca`, `cheminformatics`, `denovo-design`, `jupyter-notebook`, `neural-networks`, `reinforcement-learning`, `transfer-learning`145
57
Jupyter Notebook
MIT License
2022-04-22 16:44:35 |
|117|[**lh3/psmc**](https://github.com/lh3/psmc)
Implementation of the Pairwise Sequentially Markovian Coalescent (PSMC) model
`bioinformatics`, `genomics`, `population-genetics`145
60
C
Other
2022-11-21 04:39:31 |
|117|[**tencent-ailab/DrugOOD**](https://github.com/tencent-ailab/DrugOOD)
OOD Dataset Curator and Benchmark for AI-aided Drug Discovery145
19
6
Python
specific |
|118|[**ome/ome-zarr-py**](https://github.com/ome/ome-zarr-py)
Implementation of next-generation file format (NGFF) specifications for storing bioimaging data in the cloud.
`ngff`, `ome`, `ome-zarr`, `zarr`143
51
Python
Other
2024-06-06 12:51:57 |
|119|[**Novartis/tidymodules**](https://github.com/Novartis/tidymodules)
An Object-Oriented approach to Shiny modules
`communication`, `inheritance`, `oop`, `r`, `shiny`, `shiny-modules`, `tidy-operators`141
11
R
Other
2023-02-23 15:04:31 |
|120|[**aws-samples/aws-genomics-workflows**](https://github.com/aws-samples/aws-genomics-workflows)
Genomics Workflows on AWS
`aws`, `batch`, `genomics`, `step-functions`, `workflows`140
106
19
Shell
MIT-0 license
2022-03-30 21:38:09 |
|121|[**MolecularAI/deep-molecular-optimization**](https://github.com/MolecularAI/deep-molecular-optimization)
Molecular optimization by capturing chemist’s intuition using the Seq2Seq with attention and the Transformer
`molecular-optimization`, `multi-property-optimization`, `seq2seq`, `transformer`139
36
Python
Apache License 2.0
2023-03-16 07:05:06 |
|122|[**AstraZeneca/SubTab**](https://github.com/AstraZeneca/SubTab)
The official implementation of the paper, "SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning"
`contrastive-learning`, `multi-view-learning`, `representation-learning`, `self-supervised-learning`, `tabular-data`138
20
Python
Apache License 2.0
2022-07-01 09:03:38 |
|122|[**johnsonandjohnson/Bodiless-JS**](https://github.com/johnsonandjohnson/Bodiless-JS)
Framework for building editable websites on the JAMStack138
59
TypeScript
Apache License 2.0
2024-01-24 03:00:32 |
|123|[**Benson-Genomics-Lab/TRF**](https://github.com/Benson-Genomics-Lab/TRF)
Tandem Repeats Finder: a program to analyze DNA sequences137
24
C
GNU Affero General Public License v3.0
2023-01-16 20:44:26 |
|124|[**lh3/pangene**](https://github.com/lh3/pangene)
Constructing a pangenome gene graph
`bioinformatics`, `pangenome`136
7
C
2024-05-29 00:13:01 |
|125|[**owkin/HistoSSLscaling**](https://github.com/owkin/HistoSSLscaling)
Code associated with the publication: Scaling self-supervised learning for histopathology with masked image modeling, A. Filiot et al., MedRxiv (2023). We publicly release Phikon 🚀
`computational-pathology`135
11
Jupyter Notebook
Other
2024-01-29 22:35:32 |
|126|[**AstraZeneca/awesome-shapley-value**](https://github.com/AstraZeneca/awesome-shapley-value)
Reading list for "The Shapley Value in Machine Learning" (IJCAI 2022)
`artificial-intelligence`, `data-science`, `deep-learning`, `explainability`, `explainable`, `explainable-ai`, `explainable-artificial-intelligence`, `explainable-ml`, `lime`, `machine-learning`, `owen-value`, `shap`, `shapley`, `shapley-additive-explanations`, `shapley-decomposition`, `shapley-q-value`, `shapley-value`, `xai`134
10
Apache License 2.0
2022-08-08 08:53:10 |
|127|[**lh3/bedtk**](https://github.com/lh3/bedtk)
A simple toolset for BED files (warning: CLI may change before bedtk becomes stable)
`bioinformatics`132
15
C
MIT License
2024-05-28 21:48:28 |
|128|[**Bioconductor/Contributions**](https://github.com/Bioconductor/Contributions)
Contribute Packages to Bioconductor
`bioconductor`131
33
2023-09-12 18:32:10 |
|129|[**Merck/BioPhi**](https://github.com/Merck/BioPhi)
BioPhi is an open-source antibody design platform. It features methods for automated antibody humanization (Sapiens), humanness evaluation (OASis) and an interface for computer-assisted antibody sequence design.
`antibody`, `humanization`, `humanness`, `oasis`, `sapiens`129
44
Python
MIT License
2024-06-03 07:17:18 |
|129|[**soedinglab/plass**](https://github.com/soedinglab/plass)
sensitive and precise assembly of short sequencing reads
`bioinformatics`, `metagenomics`, `metatranscriptomics`, `opensource`, `proteins`, `proteomics`, `sequence-assembler`129
14
C
GNU General Public License v3.0
2024-04-16 20:44:12 |
|130|[**benevolentAI/guacamol_baselines**](https://github.com/benevolentAI/guacamol_baselines)
Baseline models for GuacaMol benchmarks128
33
Python
MIT License
2024-02-16 09:40:42 |
|131|[**AstraZeneca-NGS/VarDictJava**](https://github.com/AstraZeneca-NGS/VarDictJava)
VarDict Java port127
52
Java
MIT License
2024-01-05 14:03:51 |
|132|[**lh3/ksw2**](https://github.com/lh3/ksw2)
Global alignment and alignment extension
`bioinformatics`, `sequence-alignment`124
24
C
Other
2023-06-27 17:21:12 |
|132|[**chao1224/ChatDrug**](https://github.com/chao1224/ChatDrug)
LLM for Drug Editing, ICLR 2024
`chatgpt`, `chatgpt3`, `conversation`, `domain-feedback`, `drug`, `drug-discovery`, `drug-editing`, `editing`, `llm`, `molecule`, `motif`, `peptide`, `protein`, `retrieval`, `secondary-structure`, `small-molecule`, `structure`124
8
3
Python
2024-05-28 19:44:44 |
|133|[**rdkit/rdkit-js**](https://github.com/rdkit/rdkit-js)
A powerful cheminformatics and molecule rendering toolbelt for JavaScript, powered by RDKit.
`cheminformatics`, `drug-discovery`, `javascript`, `molecule`, `molecule-viewer`, `molecule-visualization`, `node-js`, `npm`, `rdkit`, `react`, `wasm`123
35
Dockerfile
BSD 3-Clause "New" or "Revised" License
2024-06-01 09:54:52 |
|133|[**blazerye/DrugAssist**](https://github.com/blazerye/DrugAssist)
DrugAssist: A Large Language Model for Molecule Optimization
`ai-for-science`, `drug-discovery`, `instruction-datasets`, `instruction-tuning`, `large-language-models`, `molecule-generation`, `molecule-optimization`123
10
3
Python |
|134|[**bigdatagenomics/mango**](https://github.com/bigdatagenomics/mango)
A scalable genome browser. Apache 2 licensed.122
30
Scala
Apache License 2.0
2022-12-02 22:21:57 |
|135|[**OpenGene/repaq**](https://github.com/OpenGene/repaq)
A fast lossless FASTQ compressor with ultra-high compression ratio120
20
C
MIT License
2023-09-22 02:48:34 |
|136|[**Bioconductor/BiocStickers**](https://github.com/Bioconductor/BiocStickers)
Stickers for some Bioconductor packages - feel free to contribute and/or modify.
`bioconductor`, `stickers`119
86
R
Other
2024-05-10 05:58:21 |
|136|[**greenelab/pancancer**](https://github.com/greenelab/pancancer)
Building classifiers using cancer transcriptomes across 33 different cancer-types
`analysis`, `cancer`, `classifier`, `gene-expression`, `machine-learning`, `methodology`, `pancancer`, `tcga`, `tool`, `transcriptome`119
58
10
Jupyter Notebook
BSD-3-Clause license
2018-03-01 15:38:33 |
|137|[**Roche/BalancedLossNLP**](https://github.com/Roche/BalancedLossNLP)118
23
Jupyter Notebook
Other
2023-06-12 21:51:15 |
|138|[**Merck/deepbgc**](https://github.com/Merck/deepbgc)
BGC Detection and Classification Using Deep Learning
`bidirectional-lstm`, `biosynthetic-gene-clusters`, `deep-learning`, `deepbgc`, `natural-products`, `pfam2vec`, `python`, `synthetic-biology`117
26
Jupyter Notebook
MIT License
2023-11-11 12:48:56 |
|138|[**benevolentAI/MolBERT**](https://github.com/benevolentAI/MolBERT)117
35
Python
MIT License
2021-06-06 10:28:35 |
|139|[**genentech/equifold**](https://github.com/genentech/equifold)
Official code repository for EquiFold: Protein Structure Prediction with a Novel Coarse-Grained Structure Representation
`machine-learning`, `proteins`, `structural-biology`, `structure-prediction`116
15
Python
Apache License 2.0
2023-01-08 19:51:30 |
|140|[**OpenGene/GeneFuse**](https://github.com/OpenGene/GeneFuse)
Gene fusion detection and visualization
`alk`, `bioinformatics`, `cancer`, `cosmic`, `eml4`, `fusion`, `gene`, `ret`, `ros1`114
62
C
MIT License
2022-02-21 08:07:06 |
|141|[**biosustain/cameo**](https://github.com/biosustain/cameo)
cameo - computer aided metabolic engineering & optimization113
42
Python
Apache License 2.0
2022-11-07 14:54:19 |
|142|[**EBI-Metagenomics/emg-viral-pipeline**](https://github.com/EBI-Metagenomics/emg-viral-pipeline)
VIRify: detection of phages and eukaryotic viruses from metagenomic and metatranscriptomic assemblies
`cwl`, `nextflow`, `pipeline`, `viruses`, `workflow`109
13
Python
Apache License 2.0
2024-05-08 20:10:03 |
|142|[**OpenGene/gencore**](https://github.com/OpenGene/gencore)
Generate duplex/single consensus reads to reduce sequencing noise and remove duplicates
`bioinformatics`, `consensus`, `deduplication`, `deep-sequencing`, `duplex`, `duplex-sequencing`, `duplication`, `ngs`, `sequencing`, `sequencing-error`, `sequencing-noise`, `somatic`109
32
C++
MIT License
2023-10-27 06:19:21 |
|142|[**OpenGene/fastv**](https://github.com/OpenGene/fastv)
An ultra-fast tool for identification of SARS-CoV-2 and other microbes from sequencing data. This tool can be used to detect viral infectious diseases, like COVID-19.
`2019-ncov`, `bioinformatics`, `coronavirus`, `covid`, `covid-19`, `hcov`, `meta-genomics`, `microbial-sequences`, `mngs`, `ngs`, `sars-cov-2`, `sequencing`, `viral`, `viral-infectious-diseases`, `virus`, `visualization`109
24
C++
MIT License
2023-10-27 06:16:38 |
|143|[**lh3/yak**](https://github.com/lh3/yak)
Yet another k-mer analyzer
`bioinformatics`, `k-mer`108
8
C
MIT License
2024-04-01 21:39:44 |
|143|[**lh3/fermikit**](https://github.com/lh3/fermikit)
De novo assembly based variant calling pipeline for Illumina short reads
`bioinformatics`, `denovo-assembly`, `genomics`, `variant-calling`108
23
TeX
Other
2020-11-30 22:57:56 |
|144|[**Merck/Halyard**](https://github.com/Merck/Halyard)
Halyard is an extremely horizontally scalable Triplestore with support for Named Graphs, designed for integration of extremely large Semantic Data Models, and for storage and SPARQL 1.1 querying of snapshots of the whole Linked Data universe.107
17
Java
Apache License 2.0
2023-01-23 16:59:32 |
|144|[**ome/ngff**](https://github.com/ome/ngff)
Next-generation file format (NGFF) specifications for storing bioimaging data in the cloud.
`bioimaging`, `cloud`, `data-science`, `file-formats`, `spec`107
38
Bikeshed
Other
2024-06-02 06:26:47 |
|144|[**soedinglab/CCMpred**](https://github.com/soedinglab/CCMpred)
Protein Residue-Residue Contacts from Correlated Mutations predicted quickly and accurately.107
25
C
GNU Affero General Public License v3.0
2023-11-08 07:51:35 |
|145|[**lh3/minimap**](https://github.com/lh3/minimap)
This repo is DEPRECATED. Please use minimap2, the successor of minimap.106
29
C
MIT License
2017-09-20 14:15:02 |
|146|[**chao1224/Geom3D**](https://github.com/chao1224/Geom3D)
Geom3D: Geometric Modeling on 3D Structures, NeurIPS 2023
`3d`, `3d-structures`, `ai4science`, `biology`, `chemistry`, `crystals`, `drugs`, `equivariance`, `geometry`, `group`, `invariance`, `material`, `molecules`, `physics`, `proteins`, `symmetry`105
9
2
Python
MIT license
2024-06-05 03:18:58 |
|147|[**phuse-org/phuse-scripts**](https://github.com/phuse-org/phuse-scripts)
Delivery standard industry analyses, built upon CDISC standards for analysis data104
88
SAS
MIT License
2023-08-01 15:21:20 |
|147|[**chembl/FPSim2**](https://github.com/chembl/FPSim2)
Simple package for fast molecular similarity searches
`cheminformatics`, `chemistry`, `gpu`, `python`, `similarity-search`104
17
Python
MIT License
2024-02-15 11:13:05 |
|148|[**bayer-science-for-a-better-life/Img2Mol**](https://github.com/bayer-science-for-a-better-life/Img2Mol)103
41
Jupyter Notebook
Apache License 2.0
2023-03-24 18:07:41 |
|149|[**Biogen-Inc/tidyCDISC**](https://github.com/Biogen-Inc/tidyCDISC)
Demo the app here: https://bit.ly/tidyCDISC_app
`pharma`, `r`, `rinpharma`, `rstats`102
38
R
GNU Affero General Public License v3.0
2023-09-22 15:18:20 |
|150|[**openpharma/mmrm**](https://github.com/openpharma/mmrm)
Mixed Models for Repeated Measures (MMRM) in R.100
17
R
Other
2024-06-03 18:02:15 |
|150|[**MolecularAI/DockStream**](https://github.com/MolecularAI/DockStream)
DockStream: A Docking Wrapper to Enhance De Novo Molecular Design
`astrazeneca`, `chemoinformatics`, `denovo-design`, `jupyter-notebook`, `molecular-docking`, `reinforcement-learning`100
30
Python
Apache License 2.0
2023-03-16 07:07:10 |
|150|[**Bayer-Group/paquo**](https://github.com/Bayer-Group/paquo)
PAthological QUpath Obsession - QuPath and Python conversations
`digital-pathology`, `python`, `qupath`100
16
Python
GNU General Public License v3.0
2024-06-02 18:21:27 |
|151|[**genentech/gReLU**](https://github.com/genentech/gReLU)
gReLU is a python library to train, interpret, and apply deep learning models to DNA sequences.99
5
Python
MIT License
2024-06-07 20:29:13 |
|152|[**lh3/hickit**](https://github.com/lh3/hickit)
TAD calling, phase imputation, 3D modeling and more for diploid single-cell Hi-C (Dip-C) and general Hi-C
`bioinformatics`, `genomics`, `hi-c`98
11
C
2021-02-04 01:47:43 |
|153|[**aqlaboratory/rgn2**](https://github.com/aqlaboratory/rgn2)97
28
Python
2023-11-28 17:16:23 |
|154|[**lh3/bgt**](https://github.com/lh3/bgt)
Flexible genotype query among 30,000+ samples whole-genome
`bioinformatics`, `genomics`96
10
C
MIT License
2019-09-04 19:43:27 |
|154|[**scverse/rapids_singlecell**](https://github.com/scverse/rapids_singlecell)
Rapids_singlecell: A GPU-accelerated tool for scRNA analysis. Offers seamless scverse compatibility for efficient single-cell data processing and analysis.
`anndata`, `bioinformatics`, `gpu`, `scverse`, `single-cell`96
18
Python
MIT License
2024-06-03 18:07:06 |
|154|[**shenwei356/bio_scripts**](https://github.com/shenwei356/bio_scripts)
Practical, reusable scripts for bioinformatics
`bioinformatics`, `perl`, `python`, `reusable`, `script`96
65
Perl
MIT License
2019-02-12 13:21:47 |
|155|[**EBISPOT/OLS**](https://github.com/EBISPOT/OLS)
Ontology Lookup Service from SPOT at EBI
`java`, `neo4j`, `obofoundry`, `owl`, `owl-api`95
40
JavaScript
Apache License 2.0
2023-04-28 20:09:19 |
|156|[**Sanofi-Public/CodonBERT**](https://github.com/Sanofi-Public/CodonBERT)
Repository for mRNA Paper and CodonBERT publication.94
14
Python
Other
2024-05-03 19:24:06 |
|156|[**OpenGene/scrnapip**](https://github.com/OpenGene/scrnapip)
A Systematic and Dynamic Pipeline for Single-Cell RNA Sequencing Analysis94
14
HTML
2023-10-16 01:24:06 |
|157|[**EBI-Metagenomics/genomes-catalogue-pipeline**](https://github.com/EBI-Metagenomics/genomes-catalogue-pipeline)
MGnify genome analysis pipeline93
21
Python
Other
2024-06-06 09:44:21 |
|158|[**samtools/tabix**](https://github.com/samtools/tabix)
Note: tabix and bgzip binaries are now part of the HTSlib project.92
40
C
2021-08-03 14:29:38 |
|158|[**shenwei356/BlackheartedHospital**](https://github.com/shenwei356/BlackheartedHospital) (forked from: [open-power-workgroup/Hospital](https://github.com/open-power-workgroup/Hospital))
A list of Putian-affiliated hospitals circulating online; updates welcome92
15
2016-05-03 07:06:09 |
|159|[**AbSciBio/unlocking-de-novo-antibody-design**](https://github.com/AbSciBio/unlocking-de-novo-antibody-design)91
14
Other
2024-01-09 17:36:19 |
|159|[**schrodinger/gpusimilarity**](https://github.com/schrodinger/gpusimilarity)
A Cuda/Thrust implementation of fingerprint similarity searching
`cheminformatics`, `chemistry`, `gpu`, `similarity-analysis`91
26
C++
BSD 3-Clause "New" or "Revised" License
2024-01-24 19:08:08 |
|159|[**lh3/dipcall**](https://github.com/lh3/dipcall)
Reference-based variant calling pipeline for a pair of phased haplotype assemblies91
9
JavaScript
MIT License
2021-06-06 20:36:10 |
|160|[**Bioconductor/CSAMA**](https://github.com/Bioconductor/CSAMA)
Course material for CSAMA: Statistical Data Analysis for Genome Scale Biology89
45
HTML
2024-06-06 12:04:08 |
|160|[**AstraZeneca/onto_merger**](https://github.com/AstraZeneca/onto_merger)
OntoMerger is an ontology alignment library for deduplicating knowledge graph nodes that represent the same domain.
`algorithm`, `alignment`, `biological-networks`, `biology`, `graph`, `kg`, `knowledge`, `knowledge-graph`, `mapping`, `ontology`, `ontology-alignment`89
5
HTML
Apache License 2.0
2024-01-11 19:22:08 |
|160|[**hoelzer-lab/rnaflow**](https://github.com/hoelzer-lab/rnaflow)
A simple RNA-Seq differential gene expression pipeline using Nextflow89
19
HTML
GNU General Public License v3.0
2024-02-26 20:45:37 |
|160|[**shenwei356/perfect-bioinformatic-tools**](https://github.com/shenwei356/perfect-bioinformatic-tools)
What should perfect bioinformatic tools be like?
`bioinformatics`, `cli`, `usability`89
1
Creative Commons Zero v1.0 Universal
2024-03-19 10:22:54 |
|161|[**Sanofi-IADC/whispr**](https://github.com/Sanofi-IADC/whispr)
Open source event, comment and alert processing hub created by Sanofi IADC88
8
TypeScript
MIT License
2024-06-04 12:01:03 |
|161|[**calico/scBasset**](https://github.com/calico/scBasset)
Sequence-based Modeling of single-cell ATAC-seq using Convolutional Neural Networks.88
11
Jupyter Notebook
Apache License 2.0
2024-02-08 19:20:16 |
|161|[**shenwei356/bio**](https://github.com/shenwei356/bio)
A lightweight and high-performance bioinformatics package in Golang
`bioinformatics`, `golang`, `minimizer`, `package`, `scaled-minhash`, `sequence`, `syncmer`, `taxdump`, `taxonomy`88
9
7
Go
MIT license
2024-03-11 09:41:44 |
|162|[**owkin/HE2RNA_code**](https://github.com/owkin/HE2RNA_code)
Train a model to predict gene expression from histology slides.87
39
Python
GNU General Public License v3.0
2022-07-06 20:53:24 |
|162|[**scverse/pertpy**](https://github.com/scverse/pertpy)
Perturbation Analysis in the scverse ecosystem.
`perturbation`, `scverse`, `single-cell`87
19
Python
MIT License
2024-06-08 08:07:34 |
[Next page](Results/README-2.md)
[^1]: This page was generated with the [topgh](https://github.com/HubTou/topgh) open source software on 2024-06-09