Projects in Awesome Lists by AstraZeneca
A curated list of projects in awesome lists by AstraZeneca.
https://github.com/astrazeneca/chemicalx
A PyTorch and TorchDrug based deep learning library for drug pair scoring. (KDD 2022)
biology chemistry deep-chemistry deep-learning drug drug-discovery drug-interaction drug-pair geometric-deep-learning geometry graph-neural-network machine-learning pharma polypharmacy pytorch smiles smiles-strings torch torchdrug
Last synced: 16 May 2025
https://github.com/AstraZeneca/chemicalx
A PyTorch and TorchDrug based deep learning library for drug pair scoring. (KDD 2022)
biology chemistry deep-chemistry deep-learning drug drug-discovery drug-interaction drug-pair geometric-deep-learning geometry graph-neural-network machine-learning pharma polypharmacy pytorch smiles smiles-strings torch torchdrug
Last synced: 27 Mar 2025
https://github.com/AstraZeneca/rexmex
A general purpose recommender metrics library for fair evaluation.
coverage deep-learning evaluation machine-learning metric metrics mrr personalization precision rank ranking recall recommender recommender-system recsys rsquared
Last synced: 27 Mar 2025
https://github.com/astrazeneca/rexmex
A general purpose recommender metrics library for fair evaluation.
coverage deep-learning evaluation machine-learning metric metrics mrr personalization precision rank ranking recall recommender recommender-system recsys rsquared
Last synced: 04 Apr 2025
https://github.com/AstraZeneca/SubTab
The official implementation of the paper, "SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning"
contrastive-learning multi-view-learning representation-learning self-supervised-learning tabular-data
Last synced: 01 May 2025
https://github.com/astrazeneca/subtab
The official implementation of the paper, "SubTab: Subsetting Features of Tabular Data for Self-Supervised Representation Learning"
contrastive-learning multi-view-learning representation-learning self-supervised-learning tabular-data
Last synced: 08 May 2025
https://github.com/astrazeneca/awesome-shapley-value
Reading list for "The Shapley Value in Machine Learning" (IJCAI 2022)
artificial-intelligence data-science deep-learning explainability explainable explainable-ai explainable-artificial-intelligence explainable-ml lime machine-learning owen-value shap shapley shapley-additive-explanations shapley-decomposition shapley-q-value shapley-value xai
Last synced: 26 Dec 2025
https://github.com/astrazeneca/biology-for-ai
learning biology syllabus, geared for machine learning folks
Last synced: 26 Dec 2025
https://github.com/astrazeneca/awesome-drug-pair-scoring
Readings for "A Unified View of Relational Deep Learning for Drug Pair Scoring." (IJCAI 2022)
chemistry ddi decagon deep-chemistry deep-learning drug drug-combination drug-design drug-drug-interaction drug-repurposing drug-synergy drug-target-interactions gcn gnn graph-neural-network knowledge-graph machine-learning polypharmacy relational-learning synergy-prediction
Last synced: 11 Feb 2026
https://github.com/astrazeneca/onto_merger
OntoMerger is an ontology alignment library for deduplicating knowledge graph nodes that represent the same domain.
algorithm alignment biological-networks biology graph kg knowledge knowledge-graph mapping ontology ontology-alignment
Last synced: 20 Feb 2026
https://github.com/AstraZeneca/onto_merger
OntoMerger is an ontology alignment library for deduplicating knowledge graph nodes that represent the same domain.
algorithm alignment biological-networks biology graph kg knowledge knowledge-graph mapping ontology ontology-alignment
Last synced: 09 May 2025
https://github.com/AstraZeneca/biology-for-ai
learning biology syllabus, geared for machine learning folks
Last synced: 28 Sep 2025
https://github.com/astrazeneca/kazu
Fast, world class biomedical NER
biomedical-text-mining natural-language-processing nlp
Last synced: 04 Apr 2025
https://github.com/astrazeneca/jazzy
Fast calculation of hydrogen-bond strengths and free energy of hydration of small molecules.
Last synced: 05 Apr 2025
https://github.com/AstraZeneca/jazzy
Fast calculation of hydrogen-bond strengths and free energy of hydration of small molecules.
Last synced: 28 Sep 2025
https://github.com/AstraZeneca/KAZU
Fast, world class biomedical NER
biomedical-text-mining natural-language-processing nlp
Last synced: 28 Sep 2025
https://github.com/astrazeneca/kallisto
Efficiently calculate 3D-features for quantitative structure-activity relationship approaches.
chemistry computational-chemistry machinelearning quantum-chemistry
Last synced: 23 Apr 2025
https://github.com/AstraZeneca/kallisto
Efficiently calculate 3D-features for quantitative structure-activity relationship approaches.
chemistry computational-chemistry machinelearning quantum-chemistry
Last synced: 28 Sep 2025
https://github.com/astrazeneca/judgyprophet
Forecasting for knowable future events using Bayesian informative priors (forecasting with judgmental-adjustment).
ai bayesian data-science forecasting machine-learning python statistics
Last synced: 08 May 2025
https://github.com/AstraZeneca/judgyprophet
Forecasting for knowable future events using Bayesian informative priors (forecasting with judgmental-adjustment).
ai bayesian data-science forecasting machine-learning python statistics
Last synced: 28 Sep 2025
https://github.com/AstraZeneca/skywalkR
code for Gogleva et al manuscript
drug-discovery knowledge-graph recommender-system shiny-apps ui
Last synced: 28 Sep 2025
https://github.com/astrazeneca/skywalkr
code for Gogleva et al manuscript
drug-discovery knowledge-graph recommender-system shiny-apps ui
Last synced: 08 May 2025
https://github.com/astrazeneca/peptide-tools
Programs to calculate phys-chem properties of synthetic peptides and proteins: isoelectric point and extinction coefficients.
Last synced: 08 May 2025
https://github.com/astrazeneca/stargazer
StarGazer is a tool designed for rapidly assessing drug repositioning opportunities. It combines multi-source, multi-omics data with a novel target prioritization scoring system in an interactive Python-based Streamlit dashboard. StarGazer displays target prioritization scores for genes associated with 1844 phenotypic traits.
Last synced: 08 May 2025
https://github.com/AstraZeneca/StarGazer
StarGazer is a tool designed for rapidly assessing drug repositioning opportunities. It combines multi-source, multi-omics data with a novel target prioritization scoring system in an interactive Python-based Streamlit dashboard. StarGazer displays target prioritization scores for genes associated with 1844 phenotypic traits.
Last synced: 28 Sep 2025
https://github.com/AstraZeneca/kgem-in-drug-discovery
Code to accompany the "Understanding the Performance of Knowledge Graph Embeddings in Drug Discovery" manuscript (Artificial Intelligence in the Life Sciences, 2022)
drug-discovery drug-discovery-knowledge-graph knowledge-graph knowledge-graph-embedding-models knowledge-graph-embeddings target-prediction
Last synced: 28 Sep 2025
https://github.com/astrazeneca/kgem-in-drug-discovery
Code to accompany the "Understanding the Performance of Knowledge Graph Embeddings in Drug Discovery" manuscript (Artificial Intelligence in the Life Sciences, 2022)
drug-discovery drug-discovery-knowledge-graph knowledge-graph knowledge-graph-embedding-models knowledge-graph-embeddings target-prediction
Last synced: 08 May 2025
https://github.com/AstraZeneca/Omicsfold
Multi-omics data normalisation, model fitting and visualisation.
Last synced: 28 Sep 2025
https://github.com/AstraZeneca/VecNER
A library of tools for dictionary-based Named Entity Recognition (NER), based on word vector representations to expand dictionary terms.
dictionary-based-ner entity-extraction natural-language-processing ner nlp
Last synced: 28 Sep 2025
https://github.com/astrazeneca/vecner
A library of tools for dictionary-based Named Entity Recognition (NER), based on word vector representations to expand dictionary terms.
dictionary-based-ner entity-extraction natural-language-processing ner nlp
Last synced: 08 May 2025
https://github.com/astrazeneca/omicsfold
Multi-omics data normalisation, model fitting and visualisation.
Last synced: 08 May 2025
https://github.com/astrazeneca/diffabxl
The official implementation of DiffAbXL benchmarked in the paper "Benchmarking Generative Models for Antibody Design".
antibody-design binding-affinity diffusion-models generative-ai graph-neural-networks in-silico-design llm-models log-likelihood
Last synced: 04 Sep 2025
https://github.com/astrazeneca/napari-wsi
A plugin to read whole slide images within napari.
Last synced: 27 Oct 2025
https://github.com/astrazeneca/ness
Official implementation of "NESS: Node Embeddings from Static Subgraphs"
contrastive-learning graph graph-auto-encoder link-prediction node-embedding self-supervised-learning subgraph
Last synced: 08 May 2025
https://github.com/astrazeneca/biomedical-kg-topological-imbalance
Code to accompany the "Implications of Topological Imbalance for Representation Learning on Biomedical Knowledge Graphs" (Briefings in Bioinformatics, 2022)
drug-discovery drug-discovery-knowledge-graph knowledge-graph knowledge-graph-completion target-prediction
Last synced: 08 May 2025
https://github.com/astrazeneca/hsqc_structure_elucidation
Implementation of the SGNN graph neural network for 1H and 13C NMR prediction and a tool for distinguishing different molecules based on HSQC simulations
Last synced: 08 May 2025
https://github.com/astrazeneca/mcpl
Official implementation for "An image is worth multiple words: discovering object level concepts using multi-concepts prompts learning" [ICML 2024]
Last synced: 08 May 2025
https://github.com/astrazeneca/ibd-interpret
We trained high performing open source models on image scans of tissue biopsies to predict endoscopic categories in inflammatory bowel disease. These predictive models can help us better understand the disease pathology and represent a step towards automated clinical recruitment strategies.
Last synced: 08 May 2025
https://github.com/astrazeneca/machine-learning-for-predicting-targeted-protein-degradation
The code was developed for training diverse ML and DL models to predict PROTAC degradation. Data cleaning for two public datasets, PROTAC-DB and PROTACpedia, is also included. PROTACs are of high interest for all disease areas of AZ and thus predicting their degradation is of general interest.
Last synced: 03 Jul 2025
https://github.com/astrazeneca/multimodal-python-course
The purpose of the code is to facilitate a comprehensive understanding of multimodal data science applications within the medical domain. The code supports the delivery of a workshop designed to introduce researchers to the rapidly evolving field of multimodal data science.
Last synced: 13 Mar 2026
https://github.com/astrazeneca/skywalkr-graph-features
Example notebooks that illustrate how to generate knowledge-based features. Features can be used in a variety of ML models, including recommender systems.
knowledge-graph recommender-system
Last synced: 08 May 2025
https://github.com/astrazeneca/detectis
A pipeline to rapidly detect exogenous DNA integration sites using DNA or RNA paired-end sequencing data
Last synced: 19 Jun 2025
https://github.com/astrazeneca/selfpad
The official implementation of "Improving Antibody Humanness Prediction using Patent Data".
antibody antibody-design antibody-sequence attention contrastive-learning humanness immunogenicity-prediction patent-data transformer
Last synced: 08 May 2025
https://github.com/astrazeneca/siamese-regression-pairing
Siamese Neural Networks for Regression: Similarity-Based Pairing and Uncertainty Quantification
Last synced: 08 May 2025
https://github.com/astrazeneca/tendril
This repository contains R package code for calculating tendril plots.
Last synced: 10 Mar 2026
https://github.com/astrazeneca/ctelc-patient-attrition-model
The Clinical Trial Enrollment Life Cycle (CTELC) modeling project aims to leverage "industry-wide" data to understand key drivers and build predictive models. Patient attrition, also referred to as dropout or patient withdrawal, occurs when patients enrolled in a clinical trial either withdraw or are lost to follow-up by the clinical site and trial sponsor.
Last synced: 04 Oct 2025
https://github.com/astrazeneca/multimodal_nsclc
Multi-omics data integration helps improve patient survival prediction. We provide a pipeline allowing for early integration of multiple omics plus clinical modalities in order to predict patient survival for NSCLC. The pipeline utilizes autoencoders and helps identify the main driving factors in survival prediction.
Last synced: 08 May 2025
https://github.com/astrazeneca/mvda_exploration_tools
Multivariate data analysis (MVDA) exploration tool is a Python library utilizing the scikit-learn library for partial least squares (PLS) and principal components analysis (PCA).
Last synced: 08 May 2025
https://github.com/astrazeneca/maraca
R package for the creation of "maraca" plots
Last synced: 08 May 2025
https://github.com/astrazeneca/convcaps-dr
Tensorflow-Keras implementation of deep Convolutional Capsule Networks with Dynamic Routing algorithm
Last synced: 19 Jul 2025
https://github.com/astrazeneca/magnus-extensions
Extension packages for magnus
Last synced: 09 Oct 2025
https://github.com/astrazeneca/oct_publication
This repository contains the source code for the image analysis of optical coherence tomography images, as described in the publication "Volumetric wound healing by machine learning and optical coherence tomography in type 2 diabetes".
Last synced: 23 Jan 2026
https://github.com/astrazeneca/persist
This provides the necessary code and a short tutorial to run PersiST, an exploratory tool for spatial 'omics datasets.
Last synced: 07 Mar 2026
https://github.com/astrazeneca/survextrap-excesshazards
Demonstration of excess hazard and excess hazard cure models for survival extrapolation
Last synced: 28 Jun 2025
https://github.com/astrazeneca/trim21-bioprotac
Bioinformatics data analyses - Fletcher A. et al., Nature Communications 2023, doi: 10.1038/s41467-023-42546-2
Last synced: 19 Mar 2026
https://github.com/astrazeneca/qscheck
An R library to perform assertions and decisions on input arguments.
Last synced: 02 Jul 2025
https://github.com/astrazeneca/multitask_impute
Supplementary code for 'Deep Learning Imputation for Multi Task Learning'
Last synced: 21 Apr 2026
https://github.com/astrazeneca/gim
gene interaction matrices, a novel approach to using ConvNets on gene expression data
Last synced: 13 Apr 2026
https://github.com/astrazeneca/dpp_imp
Improved clinical data imputation via classical and quantum determinantal point processes
Last synced: 20 Apr 2026
https://github.com/astrazeneca/inspectumours
This is a shiny tool to classify and analyse pre-clinical tumour data automatically.
Last synced: 26 Oct 2025
https://github.com/astrazeneca/qsuse
R library that provides an import mechanism like Python's to import local source files. It is not meant to replace library() or doublecolon:: prefixing.
Last synced: 24 Jul 2025
https://github.com/astrazeneca/cellatria
An Agentic AI Framework for Ingestion and Standardization of Single-Cell RNA-seq Data Analysis
Last synced: 10 Aug 2025
https://github.com/astrazeneca/arrayedcrisprscreener
The goal of arrayedCRISPRscreener is to simulate arrayed CRISPR screening data for the purpose of benchmarking data analysis tools as well as power calculation.
Last synced: 06 Oct 2025
https://github.com/astrazeneca/gonogo
Implement Go/No-Go policies using multiple endpoints, and simulate the outcome under different scenarios.
Last synced: 20 Apr 2026