https://github.com/inseefrlab/ckm
- Host: GitHub
- URL: https://github.com/inseefrlab/ckm
- Owner: InseeFrLab
- License: other
- Created: 2025-05-28T10:00:02.000Z (10 months ago)
- Default Branch: main
- Last Pushed: 2025-10-21T15:07:14.000Z (5 months ago)
- Last Synced: 2025-10-21T17:18:02.973Z (5 months ago)
- Topics: data-protection-management, statistical-disclosure-control
- Language: R
- Homepage: https://inseefrlab.github.io/ckm/
- Size: 20.6 MB
- Stars: 1
- Watchers: 0
- Forks: 0
- Open Issues: 1
Metadata Files:
- Readme: README.md
- Changelog: NEWS.md
- License: LICENSE
# Applying the Cell Key Method
## Overview
The Cell Key Method (CKM) is a statistical disclosure control technique that protects the confidentiality of tabular data by perturbing every cell. This package provides tools to apply the CKM in `R`, helping users choose a suitable set of parameters and generate perturbed frequency (count) tables from microdata.
For more information on the Cell Key Method, see section 5.4 of the [Handbook on Statistical Disclosure Control](https://sdctools.github.io/HandbookSDC/05-frequency-tables.html#sec-CKM_freq).
For the moment, the package perturbs frequency tables only.
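To make the idea concrete, here is a minimal base R sketch of the general mechanism: each record carries a uniform random key, the key of a cell is the fractional part of the sum of its records' keys, and that cell key selects a noise value from a fixed distribution. This is an illustration under simplifying assumptions, not the package's implementation: the microdata, the `region` variable, and the noise distribution are hypothetical, and the noise here does not depend on the original count as a real transition matrix would.

```r
# Toy illustration of the Cell Key Method in base R (not the ckm implementation).
set.seed(123)

# Hypothetical microdata: one row per individual, each with a uniform record key.
micro <- data.frame(
  region = sample(c("A", "B", "C"), size = 200, replace = TRUE),
  rkey   = runif(200)
)

# Original counts, and cell keys defined as the fractional part of the sum
# of the record keys of the individuals contributing to each cell.
counts    <- tapply(micro$rkey, micro$region, length)
cell_keys <- tapply(micro$rkey, micro$region, function(k) sum(k) %% 1)

# Hypothetical noise distribution: symmetric, zero mean, maximum deviation 2.
noise_values <- c(-2, -1, 0, 1, 2)
noise_probs  <- c(0.05, 0.20, 0.50, 0.20, 0.05)

# Map each cell key to a noise value through the cumulative distribution:
# interval i starts at the cumulative probability of the values before it.
lower_bounds <- c(0, head(cumsum(noise_probs), -1))
noise_index  <- findInterval(as.vector(cell_keys), lower_bounds)

result <- data.frame(
  region    = names(counts),
  original  = as.vector(counts),
  cell_key  = as.vector(cell_keys),
  noise     = noise_values[noise_index],
  perturbed = as.vector(counts) + noise_values[noise_index]
)
print(result)
```

Because the noise is a deterministic function of the cell key, a cell made up of the same individuals receives the same perturbation in every table where it appears.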
## Documentation
For detailed documentation, please refer to the [package vignette](https://inseefrlab.github.io/ckm/articles/quickstart-ckm.html).
The transition matrices are built using the [`ptable`](https://cran.r-project.org/web/packages/ptable/index.html) package.
French readers can also consult a [methodological document](https://www.insee.fr/fr/statistiques/fichier/2838097/12-fiche_methodologique_ckm.pdf) (in French) for more information on the Cell Key Method.
## Package Installation
```r
# install.packages("remotes")
remotes::install_github("inseefrlab/ckm", dependencies = TRUE)
```
## Applying the Cell Key Method Step by Step
### Assigning a Random Key to the Microdata
```r
library(ckm)
data("dtest", package = "ckm")
set.seed(4081789) # Ensure reproducibility
dtest_with_keys <- build_individual_keys(dtest)
hist(dtest_with_keys$rkey)
```
### Generating the Counting Table with Cell Keys
```r
tab_before <- tabulate_cnt_micro_data(
  df = dtest_with_keys,
  cat_vars = c("DIPLOME", "SEXE", "AGE"),
  hrc_vars = list(GEO = c("REG", "DEP")),
  marge_label = "Total"
)
```
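As a rough picture of what this step involves, the base R sketch below tabulates a hypothetical two-level geography and attaches a cell key to every cell, margins included. It assumes that a margin's cell key is built from the record keys of all its contributors, as in the general method; the data and the helper `summarise_cells()` are made up for illustration, and the structure of the table returned by `tabulate_cnt_micro_data()` may differ.

```r
# Toy tabulation with cell keys at several hierarchical levels (base R only).
set.seed(42)

# Hypothetical microdata: regions (REG) split into departments (DEP).
micro <- data.frame(
  REG  = sample(c("R1", "R2"), size = 100, replace = TRUE),
  rkey = runif(100)
)
micro$DEP <- paste0(micro$REG, "_", sample(1:2, size = 100, replace = TRUE))

# Hypothetical helper: count records and compute cell keys for one level.
# With by = NULL, all records fall into a single "Total" margin.
summarise_cells <- function(df, by = NULL) {
  groups <- if (is.null(by)) rep("Total", nrow(df)) else df[[by]]
  n      <- tapply(df$rkey, groups, length)
  ckey   <- tapply(df$rkey, groups, function(k) sum(k) %% 1)
  data.frame(
    level = if (is.null(by)) "Total" else by,
    cell  = names(n),
    n     = as.vector(n),
    ckey  = as.vector(ckey)
  )
}

tab <- rbind(
  summarise_cells(micro, "DEP"),
  summarise_cells(micro, "REG"),
  summarise_cells(micro)
)
print(tab)
```

Each margin gets its own cell key, so it is perturbed separately from its components; this is why a table perturbed with the Cell Key Method is in general not additive.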
### Applying the Perturbation
```r
res_ckm <- apply_ckm(tab_before, D = 5, V = 2)
```
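The call above does not say what `D` and `V` mean. Assuming they carry the same meaning as in the `ptable` package (`D` the maximum deviation of the noise, `V` its variance), the base R sketch below shows how these two quantities are computed for a given noise distribution. The distribution is hypothetical and much smaller than one with `D = 5`; it is not produced by `ptable` or `ckm`.

```r
# Hypothetical noise distribution, used only to illustrate the two quantities.
noise_values <- -2:2
noise_probs  <- c(0.05, 0.20, 0.50, 0.20, 0.05)

# An unbiased perturbation has zero mean.
noise_mean <- sum(noise_values * noise_probs)

# Variance of the noise (the role assumed here for V).
noise_var <- sum((noise_values - noise_mean)^2 * noise_probs)

# Largest absolute deviation with non-zero probability (the role assumed for D).
max_dev <- max(abs(noise_values[noise_probs > 0]))

print(c(mean = noise_mean, variance = noise_var, max_deviation = max_dev))
```

Under that reading, a larger `D` widens the range of possible deviations, while `V` controls how spread out the noise is within that range.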
## Applying the Cell Key Method in One Step
After generating the individual key on your dataset, you can directly build the perturbed table:
```r
res_ckm <- tabulate_and_apply_ckm(
  df = dtest_with_keys,
  cat_vars = c("DIPLOME", "SEXE", "AGE"),
  hrc_vars = list(GEO = c("REG", "DEP")),
  marge_label = "Total",
  D = 5, V = 2
)
```