Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/rdkit/CheTo

CheTo - Chemical Topic Modeling
https://github.com/rdkit/CheTo

Last synced: 1 day ago
JSON representation

CheTo - Chemical Topic Modeling

Awesome Lists containing this project

README

        

CheTo - RC(=O)R
--------

CheTo (ChemicalTopic) allows to apply topic modeling, a method developed in the text-mining field, to chemical data. Please see our recent publication for detailed information:

Schneider, N.; Fechner, N.; Landrum, G. A.; Stiefl, N. *Chemical Topic Modeling: Exploring Molecular Data Sets Using a Common Text-Mining Approach*. J. Chem. Inf. Model. 2017, [http://pubs.acs.org/doi/10.1021/acs.jcim.7b00249](http://pubs.acs.org/doi/10.1021/acs.jcim.7b00249)

The [supplementary](http://pubs.acs.org/doi/suppl/10.1021/acs.jcim.7b00249) of the paper contains exemplary data sets extracted from the [ChEMBL database](https://www.ebi.ac.uk/chembl/) and Jupyter notebooks to run the experiments described in the paper.

An interactive web page showing an exemplary topic model of data set A from our paper can be found here [http://www.t5informatics.com/Papers/InteractiveTopicModelDatasetA.html](http://www.t5informatics.com/Papers/InteractiveTopicModelDatasetA.html)

**Installation**

To install CheTo using Conda, simply run:

`conda install -c rdkit cheto`

**Further reading**

Using CheTo in KNIME: [http://rdkit.blogspot.ch/2017/08/chemical-topic-modeling-with-rdkit-and.html](http://rdkit.blogspot.ch/2017/08/chemical-topic-modeling-with-rdkit-and.html)

After publication of our article we were made aware that applying topic modeling to chemical data was also suggested by Rajarshi Guha in 2012 in his blog ([http://blog.rguha.net/?p=997](http://blog.rguha.net/?p=997)).