https://github.com/ctuavastlab/datasets
Datasets, currently containing: Mutagenesis
- Host: GitHub
- URL: https://github.com/ctuavastlab/datasets
- Owner: CTUAvastLab
- License: cc0-1.0
- Created: 2021-10-21T10:41:58.000Z (over 4 years ago)
- Default Branch: main
- Last Pushed: 2021-10-21T11:22:56.000Z (over 4 years ago)
- Last Synced: 2025-02-05T19:12:28.583Z (over 1 year ago)
- Homepage:
- Size: 49.8 KB
- Stars: 0
- Watchers: 2
- Forks: 0
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
README
# Datasets
This repository contains various datasets used in CTUAvastLab.
Currently it contains only the Mutagenesis dataset.
## Mutagenesis dataset
### Summary
The dataset comprises 230 molecules tested for mutagenicity on Salmonella typhimurium. A subset of 188 molecules
is learnable using linear regression. This subset was later termed the "regression friendly" dataset. The remaining
subset of 42 molecules is named the "regression unfriendly" dataset.
(Summary taken from [relational.fit.cvut.cz/](https://relational.fit.cvut.cz/dataset/Mutagenesis).)
Currently, this repository contains only the regression friendly subset, `Mutagenesis_188`.
### Website
The original data is hosted as an SQL database at
[relational.fit.cvut.cz/](https://relational.fit.cvut.cz/dataset/Mutagenesis).
[Original source](http://www.cs.ox.ac.uk/activities/machlearn/mutagenesis.html)
### [License](LICENSE)
See the separate [LICENSE](LICENSE) file.
### Data structure
[mutagenesis/data.json](mutagenesis/data.json) contains the Mutagenesis_188 dataset as a JSON list of 188 structures,
each representing one molecule.
[mutagenesis/meta.json](mutagenesis/meta.json) contains metadata about the dataset, also as JSON.
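
As a minimal sketch of how the two files described above could be read, the following Python snippet loads them with the standard `json` module. It assumes a local checkout of this repository with the files at `mutagenesis/data.json` and `mutagenesis/meta.json`; the helper names `load_json` and `load_mutagenesis` are hypothetical and not part of this repository. The internal layout of each molecule structure is not described here, so the code only checks that the top level of `data.json` is a list.

```python
import json
from pathlib import Path


def load_json(path):
    """Read a JSON file and return the parsed Python object."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


def load_mutagenesis(repo_root="."):
    """Return (molecules, metadata) from a local checkout.

    Assumption: the files sit at mutagenesis/data.json and
    mutagenesis/meta.json relative to repo_root.
    """
    base = Path(repo_root) / "mutagenesis"
    molecules = load_json(base / "data.json")
    metadata = load_json(base / "meta.json")
    # data.json is described as a list with one entry per molecule.
    if not isinstance(molecules, list):
        raise ValueError("expected data.json to hold a JSON list of molecules")
    return molecules, metadata


if __name__ == "__main__":
    # Only attempt to load when run from the repository root.
    if (Path("mutagenesis") / "data.json").exists():
        molecules, metadata = load_mutagenesis()
        print(f"loaded {len(molecules)} molecules")
```

If the description above holds, `len(molecules)` should equal 188 for the current contents of the repository.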