https://github.com/gregorybchris/surv
Dynamic survey generator
- Host: GitHub
- URL: https://github.com/gregorybchris/surv
- Owner: gregorybchris
- Created: 2024-08-22T22:22:58.000Z (5 months ago)
- Default Branch: main
- Last Pushed: 2024-12-24T10:33:49.000Z (17 days ago)
- Last Synced: 2024-12-24T11:43:25.036Z (17 days ago)
- Topics: algorithm, decision, gain, generator, id3, information, question, survey, theory, tree
- Language: Python
- Homepage:
- Size: 105 KB
- Stars: 0
- Watchers: 1
- Forks: 0
- Open Issues: 0
# Surv
Dynamic survey generator
## About
Based on your answers to survey questions, Surv will select the next question that maximizes expected information gain. This means that in most cases you don't need to fill out every question in the survey before your result is known to a high degree of confidence.
To use Surv, you must have a target feature you are trying to predict, because information gain is always measured relative to that target. You can use Surv to collect a dataset, but if you don't already have a seed dataset, each participant will need to take the survey in full.
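
To make the idea concrete, here is a minimal sketch of choosing the next question by information gain. It is not Surv's actual implementation: the function names (`entropy`, `information_gain`, `next_question`) and the tiny seed dataset are hypothetical and exist only for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * math.log2(n / total) for n in Counter(labels).values())


def information_gain(rows, question, target):
    """Expected drop in target entropy from learning the answer to `question`."""
    before = entropy([row[target] for row in rows])
    after = 0.0
    for answer in {row[question] for row in rows}:
        subset = [row[target] for row in rows if row[question] == answer]
        # Weight each answer's entropy by how often that answer occurs.
        after += len(subset) / len(rows) * entropy(subset)
    return before - after


def next_question(rows, unanswered, target):
    """Pick the unanswered question with the highest information gain."""
    return max(unanswered, key=lambda q: information_gain(rows, q, target))


# Hypothetical seed dataset: column names and values are made up.
seed = [
    {"garage": "yes", "mold": "no", "value": "high"},
    {"garage": "yes", "mold": "yes", "value": "low"},
    {"garage": "no", "mold": "no", "value": "high"},
    {"garage": "no", "mold": "yes", "value": "low"},
]

best = next_question(seed, ["garage", "mold"], target="value")
print(best)

# After the participant answers, keep only the matching rows and repeat.
remaining = [row for row in seed if row[best] == "no"]
print(entropy([row["value"] for row in remaining]))
```

In this toy data, the mold question separates the two value classes completely while the garage question tells you nothing, so a single answer is enough to settle the result. This also shows why a seed dataset and a target are needed: without them there is nothing to compute the gain against.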
## Installation
[Poetry](https://python-poetry.org/) is required.
```bash
poetry install
```

## CLI usage
This command will start up an interactive session where you can fill out a survey and see how the entropy decreases as Surv becomes more certain about your most likely classification.
```bash
export SURV_DATA_DIRPATH=""
poetry run surv run --info
```

### Example
The housing market example is loosely based on the [Boston housing dataset](https://scikit-learn.org/0.16/modules/generated/sklearn.datasets.load_boston.html). As you fill out information about a hypothetical house (such as the yard size, garage status, and presence of mold), Surv decreases its uncertainty about the value of the house.
```bash
export SURV_DATA_DIRPATH="/surv/tests/algo/data"
poetry run surv run house --info
```

## Dataset representation
Surv also comes with a dataset feature metadata system. Feature types and feature purposes help you to avoid common bugs when processing structured data with heterogeneous features.
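
As a rough illustration of what this kind of metadata could look like, the sketch below tags each feature with a type, an optional subtype, and a purpose, then filters by purpose. The class and function names (`FeatureType`, `FeatureSubtype`, `FeaturePurpose`, `Feature`, `with_purpose`) are assumptions made for this example and are not Surv's actual API; only a subset of the purposes listed in this section is included.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class FeatureType(Enum):
    CATEGORICAL = "categorical"
    NUMERIC = "numeric"
    DATETIME = "datetime"
    TEXT = "text"


class FeatureSubtype(Enum):
    BINARY = "binary"
    MULTICLASS = "multiclass"
    ORDINAL = "ordinal"
    INTERVAL = "interval"
    RATIO = "ratio"


class FeaturePurpose(Enum):
    # Subset of the purposes in this section, kept short for the example.
    TRAINING = "training"
    IDENTIFIER = "identifier"
    TARGET = "target"
    METADATA = "metadata"


@dataclass(frozen=True)
class Feature:
    """Hypothetical metadata record for one dataset column."""
    name: str
    feature_type: FeatureType
    purpose: FeaturePurpose
    subtype: Optional[FeatureSubtype] = None


def with_purpose(features: List[Feature], purpose: FeaturePurpose) -> List[Feature]:
    """Return only the features tagged with the given purpose."""
    return [f for f in features if f.purpose is purpose]


# Made-up columns for a housing dataset.
features = [
    Feature("house_id", FeatureType.TEXT, FeaturePurpose.IDENTIFIER),
    Feature("yard_size", FeatureType.NUMERIC, FeaturePurpose.TRAINING, FeatureSubtype.RATIO),
    Feature("has_garage", FeatureType.CATEGORICAL, FeaturePurpose.TRAINING, FeatureSubtype.BINARY),
    Feature("value", FeatureType.NUMERIC, FeaturePurpose.TARGET, FeatureSubtype.RATIO),
]

training_names = [f.name for f in with_purpose(features, FeaturePurpose.TRAINING)]
print(training_names)
```

Keeping the purpose on the feature itself means an identifier or target column cannot slip into the training set by accident, which is the kind of bug this metadata is meant to prevent.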
### Feature types
Track metadata like the cardinality of categorical features, whether features are continuous or discrete, and whether values should be treated like integers or floating point numbers.
| Feature Type | Feature Subtype |
| ------------ | --------------- |
| Categorical | Binary |
| Categorical | Multiclass |
| Numeric | Ordinal |
| Numeric | Interval |
| Numeric | Ratio |
| Datetime | - |
| Text | - |

### Feature purposes
Tagging the purpose of features can help you filter down features for certain use cases like training or identifying a unique sample across multiple datasets.
| Feature Purpose |
| --------------- |
| Training |
| Identifier |
| Target |
| Metadata |
| SampleWeight |
| Evaluation |
| Stratification |
| SubjectWise |
| Grouping |
| Sensitive |

## Running tests
```bash
pytest tests
```