https://github.com/ffstghc/caco2ml
Main code chunks used for models in the publication "Exploring the Potential of Adaptive, Local Machine Learning (ML) in Comparison to the Prediction Performance of Global Models: A Case Study from Bayer's Caco-2 Permeability Database"
caco-2 local-models machine-learning pharmacokinetics scikit-learn
- Host: GitHub
- URL: https://github.com/ffstghc/caco2ml
- Owner: ffstghc
- Created: 2024-10-16T21:21:26.000Z (4 months ago)
- Default Branch: main
- Last Pushed: 2024-11-21T10:11:09.000Z (3 months ago)
- Last Synced: 2024-12-19T02:23:10.626Z (about 2 months ago)
- Topics: caco-2, local-models, machine-learning, pharmacokinetics, scikit-learn
- Language: Python
- Homepage:
- Size: 27.3 KB
- Stars: 1
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
# "Exploring the Potential of Adaptive, Local Machine Learning (ML) in Comparison to the Prediction Performance of Global Models: A Case Study from Bayer's Caco-2 Permeability Database"
## _American Chemical Society (ACS): Journal of Chemical Information and Modeling (JCIM)_
### **_Frank Filip Steinbauer, Thorsten Lehr, Andreas Reichel_**
### http://pubs.acs.org/doi/abs/10.1021/acs.jcim.4c01083

Repository for archiving the main code chunks used for the local and global machine learning models in the publication **_"Exploring the Potential of Adaptive, Local Machine Learning (ML) in Comparison to the Prediction Performance of Global Models: A Case Study from Bayer's Caco-2 Permeability Database"_**, published in 2024 in **_ACS Journal of Chemical Information and Modeling (JCIM)_** as the first publication of my doctoral studies at Bayer.
The five different included files contain the main code chunks for:
1. Data preparation (SMILES/molecule object standardization; PaDEL descriptor calculation)
2. Global models (including other descriptor calculations and recursive feature elimination with cross-validation as well as external TDC benchmarking[1])
3. Local model (training data selection via fixed Tanimoto similarity criteria)
4. Local model (training data selection via fixed amounts of most similar structures)
5. Local model (training data selection via kNN[2] as control/proof of superiority of the chosen Tanimoto similarity approach)

If you have further questions or need additional parts of the utilized code for your own studies, feel free to contact [email protected].
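The two similarity-based selection strategies in items 3 and 4 can be illustrated with a minimal, dependency-free sketch. It is not the repository's code: fingerprints are represented as plain Python sets of "on" bit indices (an assumption standing in for real molecular fingerprints), and the function names, compound names, and the 0.7 threshold are hypothetical values chosen only for illustration.

```python
from typing import Dict, List, Set, Tuple

# Assumption: a fingerprint is the set of "on" bit indices.
# A real workflow would derive these from molecular structures.
Fingerprint = Set[int]


def tanimoto(a: Fingerprint, b: Fingerprint) -> float:
    """Tanimoto similarity: shared bits divided by bits set in either fingerprint."""
    union = len(a | b)
    return len(a & b) / union if union else 0.0


def rank_by_similarity(
    query: Fingerprint, pool: Dict[str, Fingerprint]
) -> List[Tuple[str, float]]:
    """Score every pool compound against the query, most similar first."""
    scored = [(name, tanimoto(query, fp)) for name, fp in pool.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


def select_by_threshold(
    query: Fingerprint, pool: Dict[str, Fingerprint], threshold: float
) -> List[str]:
    """Item 3: keep every compound whose similarity meets a fixed cutoff."""
    return [name for name, sim in rank_by_similarity(query, pool) if sim >= threshold]


def select_top_n(
    query: Fingerprint, pool: Dict[str, Fingerprint], n: int
) -> List[str]:
    """Item 4: keep a fixed number of the most similar compounds."""
    return [name for name, _ in rank_by_similarity(query, pool)[:n]]


# Hypothetical toy data, not taken from the publication.
pool = {
    "cpd_a": {1, 2, 3, 4},
    "cpd_b": {1, 2, 3, 9},
    "cpd_c": {7, 8, 9},
    "cpd_d": {1, 2, 3, 4, 5},
}
query = {1, 2, 3, 4}

by_threshold = select_by_threshold(query, pool, threshold=0.7)
top_three = select_top_n(query, pool, n=3)
```

With a fixed threshold the size of the local training set varies per query compound, whereas the fixed-count variant always returns the same number of neighbours regardless of how similar they actually are.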
[1]: https://tdcommons.ai/single_pred_tasks/adme#caco-2-cell-effective-permeability-wang-et-al
[2]: https://scikit-learn.org/dev/modules/generated/sklearn.neighbors.KNeighborsClassifier.html