https://github.com/datasets/cervical-cancer
Cervical cancer occurrences
- Host: GitHub
- URL: https://github.com/datasets/cervical-cancer
- Owner: datasets
- Created: 2018-01-03T15:42:56.000Z (over 8 years ago)
- Default Branch: main
- Last Pushed: 2025-02-03T16:19:50.000Z (over 1 year ago)
- Last Synced: 2025-04-12T02:37:09.906Z (about 1 year ago)
- Language: Python
- Homepage: https://datahub.io/machine-learning/cervical-cancer
- Size: 20.5 KB
- Stars: 6
- Watchers: 7
- Forks: 9
- Open Issues: 0
Metadata Files:
- Readme: README.md
README
This dataset covers cervical cancer occurrences. Cervical cancer is
one of the most frequent cancers in women. The dataset records
several factors that might influence the risk of cervical cancer.
## Data
This dataset comes from the UCI Machine Learning Repository, where it is listed as the [Cervical cancer (Risk Factors) Data Set](https://archive.ics.uci.edu/ml/datasets/Cervical+cancer+%28Risk+Factors%29#).
The dataset was collected at 'Hospital Universitario de Caracas' in Caracas, Venezuela.
The dataset comprises demographic information, habits, and historic medical records of
858 patients. Several patients decided not to answer some of the questions because of
privacy concerns (missing values).
* 835 instances
* 36 attributes
* Missing values: yes
The output data is located in the `data` directory:
`data/cervical-cancer.csv`
The attributes are the same as in the input data.
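Because the file contains missing values, it can help to check how many cells are empty in each column before any analysis. The following is a minimal sketch using only the Python standard library. The function name `count_missing` and the two-column sample are hypothetical and for illustration only; they are not part of this repository.

```python
import csv
import io


def count_missing(csv_file):
    """Return (row_count, {column_name: number_of_empty_cells})."""
    reader = csv.DictReader(csv_file)
    missing = {name: 0 for name in (reader.fieldnames or [])}
    rows = 0
    for row in reader:
        rows += 1
        for name in missing:
            # An empty string (or a cell absent from a short row) counts as missing.
            if not row.get(name):
                missing[name] += 1
    return rows, missing


# Illustrative in-memory sample (made-up values, not taken from the dataset).
sample = io.StringIO("Age,Smokes\n18,\n15,1\n")
rows, missing = count_missing(sample)
print(rows, missing)
```

To run it on the real file, one would open `data/cervical-cancer.csv` with `open(path, newline="")` and pass the file object to `count_missing`.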
## Preparation
To produce the output data, the following change is applied to the input data:
* missing values marked with "?" are replaced with an empty string ("")
The Python script is located in the `scripts` directory:
`scripts/main.py`
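The replacement step described above can be sketched as follows. This is an illustrative approximation and not the contents of `scripts/main.py`; the function name `replace_missing` and the sample rows are assumptions made for the example.

```python
import csv
import io


def replace_missing(src, dst, marker="?"):
    """Copy CSV rows from src to dst, turning cells equal to marker into empty strings."""
    reader = csv.reader(src)
    writer = csv.writer(dst, lineterminator="\n")
    for row in reader:
        # Surrounding whitespace is ignored when comparing against the marker.
        writer.writerow(["" if cell.strip() == marker else cell for cell in row])


# Illustrative in-memory input (made-up values, not taken from the dataset).
source = io.StringIO("Age,Smokes\n18,?\n?,1\n")
target = io.StringIO()
replace_missing(source, target)
print(target.getvalue())
```

Working on file-like objects keeps the sketch independent of any particular input path; for real files, both would be opened with `newline=""` as the `csv` module recommends.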
## License
Licensed under the [Public Domain Dedication and License][pddl] (assuming
either no rights or public domain license in source data).
[pddl]: http://opendatacommons.org/licenses/pddl/1.0/