https://github.com/theochem/selector
Python library of algorithms for selecting diverse subsets of data for machine-learning. Webserver is hosted at https://huggingface.co/spaces/QCDevs/Selector.
- Host: GitHub
- URL: https://github.com/theochem/selector
- Owner: theochem
- License: gpl-3.0
- Created: 2022-01-25T18:52:07.000Z (about 4 years ago)
- Default Branch: main
- Last Pushed: 2025-02-05T05:35:33.000Z (12 months ago)
- Last Synced: 2025-03-31T08:12:08.758Z (10 months ago)
- Topics: chemical-diversity, chemical-library-design, compound-acquisition, compound-selection, maximum-dissimilarity-search, maximum-diversity-molecule, variable-selection
- Language: Jupyter Notebook
- Homepage: https://selector.qcdevs.org
- Size: 24 MB
- Stars: 22
- Watchers: 9
- Forks: 22
- Open Issues: 12
Metadata Files:
- Readme: README.md
- Contributing: .github/CONTRIBUTING.md
- License: LICENSE
- Code of conduct: .github/CODE_OF_CONDUCT.md
The `Selector` library provides methods for selecting a diverse subset of a (molecular) dataset.
## Citation
Please use the following citation in any publication using the `selector` library:
```bibtex
@article{
TO BE ADDED LATER
}
```
## Web Server
We have a web server for the `selector` library at https://huggingface.co/spaces/QCDevs/Selector.
For small or prototype datasets, you can use the web server to select a diverse subset of your
dataset, compute its diversity metrics, and download both the selected subset and the computed
metrics.
## Installation
It is recommended to install `selector` within a virtual environment. To create a virtual
environment, we can use the `venv` module (Python 3.3+,
https://docs.python.org/3/tutorial/venv.html), `miniconda` (https://conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html), or
`pipenv` (https://pipenv.pypa.io/en/latest/).
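As a minimal sketch of the `venv` route, the commands below create and activate an environment before installing anything. They assume a Unix-like shell and a `python3` executable on the `PATH`. The directory name `.venv` is an arbitrary choice for illustration, not something the project prescribes.

```bash
# create a virtual environment in the directory .venv (name is an assumption)
python3 -m venv .venv
# activate it in the current shell (POSIX shells; Windows uses .venv\Scripts\activate)
. .venv/bin/activate
# print the interpreter prefix, which should now point inside .venv
python -c "import sys; print(sys.prefix)"
```

Once the environment is active, any of the `pip install` commands in the following sections install into `.venv` rather than the system Python.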
### Installing from PyPI
The latest stable release of `selector` can be installed from the Python Package Index (PyPI) with `pip`:
```bash
# install the stable release.
pip install qc-selector
```
### Installing from the Prebuilt Wheel Files
To download the prebuilt wheel files, visit the [PyPI page](https://pypi.org/project/qc-selector/)
or the [GitHub releases](https://github.com/theochem/Selector/tags).
```bash
# download the wheel file first to your local machine
# then install the wheel file
pip install file_path/qc_selector-0.0.2b12-py3-none-any.whl
```
### Installing from the Source Code
In addition, we can install the latest development version from the GitHub repository as follows:
```bash
# install the latest development version
pip install git+https://github.com/theochem/Selector.git
```
We can also clone the repository to access the latest development version, test it and install it as follows:
```bash
# clone the repository
git clone git@github.com:theochem/Selector.git
# change into the working directory
cd Selector
# run the tests
python -m pytest .
# install the package
pip install .
```
## More
See https://selector.qcdevs.org for full details.