Ecosyste.ms: Awesome

An open API service indexing awesome lists of open source software.

Awesome Lists | Featured Topics | Projects

https://github.com/lextuga007/eventregistrations

Cleaning data from Eventbrite registration extracts
https://github.com/lextuga007/eventregistrations

Last synced: about 2 months ago
JSON representation

Cleaning data from Eventbrite registration extracts

Awesome Lists containing this project

README

        

---
output: github_document
---

```{r, include = FALSE}
knitr::opts_chunk$set(
collapse = TRUE,
comment = "#>",
fig.path = "man/figures/README-",
out.width = "100%"
)
```

# eventRegistrations

eventRegistrations is an R package with functions created to clean and help
filter Eventbrite registration data for healthcare events.

## Installation

You can install the development version of eventRegistrations from [GitHub](https://github.com/) with:

```{r}
# install.packages("pak")
pak::pak("Lextuga007/eventRegistrations")
```

## Functions

`clean_company()` contains cleaning code for health and social care organisations
as were entered by people registering for the 2024 HACA conference.
The cleaning was only done for tickets related to the in person event and not
online which had a greater number of typing errors as the field was freetext.

```{r}
library(eventRegistrations)
library(tidyverse)

# Getting sample data based on the Eventbrite export
reg_data <- sample_data()

# Clean the company column - note this is specific to health and care
# organisations in the UK
reg_data <- clean_company(data = reg_data)
```

`filter_data()` helps to filter by various scenarios which are useful
particularly if multiple options are available on the tickets and some are
used but not all.

For Eventbrite all tickets start out as `Attending` meaning they are registered,
and if scanned in or used through a linked in online service, become
`Checked In`.
The default filter for `checked_in` is to return all tickets, to filter just by
those tickets used the parameter `checked_in = "Y"` is needed.

The 2024 HACA event was a 2 day in person and online event which also had the
option of a Networking evening and all options were available through one
registration.

```{r}
# Renaming the longer readable ticket names
day_1 <- "In-person Ticket Day 1 - 23rd July 2024"
day_2 <- "In-person Ticket Day 2 - 24th July 2024"

# Tickets from the first day that were used
filter_data(
data = reg_data,
tickets = day_1,
checked_in = "Y"
) |> # this returns the email
left_join(reg_data, join_by(email)) # join back to data for all the information

# Tickets from the second day
filter_data(
data = reg_data,
tickets = day_2,
checked_in = "Y"
) |>
left_join(reg_data, join_by(email))

# Tickets from both days (and return just unique people)
filter_data(
data = reg_data,
tickets = c(day_1, day_2),
checked_in = "Y"
) |>
left_join(reg_data, join_by(email))

```

Countries can be added and this is free text and can be any country
name that is in the data:

```{r}
filter_data(data = reg_data,
tickets = day_2,
country = "Wales")
```

The `add_company` parameter will return the company_cleaned column (which is
generated from the `clean_company()` function that is specific to health and
care UK organisations):

```{r}
filter_data(data = reg_data,
tickets = day_2,
add_company = TRUE,
country = "Wales")

```

The `mlcsu` parameter default is `FALSE` and will leave the data unaffected.
It is used when looking to combine multiple organisations to be counted under
MLCSU as it also includes: The Health Economics Unit, The Strategy Unit,
NHS Transformation Unit as well as MLCSU.

```{r}
filter_data(data = reg_data,
tickets = c(day_1, day_2),
add_company = TRUE,
mlcsu = TRUE,
checked_in = "Y") |>
left_join(reg_data |>
select(email,
original_company = company))
```