https://github.com/palday/freiburg2022

Materials for FIAS Statistics Workshop "Modelling Diversity in Language and Cognition"

Host: GitHub
URL: https://github.com/palday/freiburg2022
Owner: palday
License: mit
Created: 2022-04-29T15:38:48.000Z (about 3 years ago)
Default Branch: main
Last Pushed: 2022-09-07T23:01:59.000Z (over 2 years ago)
Last Synced: 2025-01-12T09:21:38.102Z (5 months ago)
Language: Julia
Homepage: https://palday.github.io/freiburg2022/
Size: 1.16 MB
Stars: 1
Watchers: 3
Forks: 1
Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE


# Statistics Workshop "Modelling Diversity in Language and Cognition"

> Real-world data present a number of challenges for data analysis.
> Simulation provides a way to examine the impact of the quirks of real-world data on your analysis.
> In this tutorial, I will introduce simulation with mixed models as a tool for planning your analysis (e.g., power analysis) and as a way to re-consider your plan after contact with real-world data (e.g., what impact does imbalance in my sample have on my inferences?).
> Simulation and real-world data analysis should not be rivals, but rather partners in inference.

## Simulation introduction / review of mixed models

- How do you simulate some data based on your hypotheses and assumptions about the world? (This will also involve a moderate amount of programming to create the fake data -- these are the same manipulations that go into wrangling real data, so they should be familiar to most attendees.)
- What are reasonable assumptions from a statistical perspective?
- How do you check alternative assumptions?
- Contrast coding review - how do you map your hypothesized effects to a set of numbers in the model?
- "degrees of freedom" and the whole matter of p-values in MixedModels
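
The simulation and contrast-coding points above can be sketched in a few lines. This is a minimal illustration in plain Python (the workshop materials themselves are in Julia), and every name and value in it (`simulate`, the baseline of 500, the effect of 20, the standard deviations) is a made-up assumption, not something taken from the workshop.

```python
import random

def simulate(n_subj, n_item, beta0=500.0, beta1=20.0,
             sd_subj=30.0, sd_item=15.0, sd_resid=50.0, seed=1):
    """Simulate a crossed subjects-by-items design with random intercepts.

    All parameter values are hypothetical "ground truth" chosen for illustration.
    """
    rng = random.Random(seed)
    subj_eff = [rng.gauss(0.0, sd_subj) for _ in range(n_subj)]
    item_eff = [rng.gauss(0.0, sd_item) for _ in range(n_item)]
    rows = []
    for s in range(n_subj):
        for i in range(n_item):
            # Contrast coding: the two conditions are mapped to -0.5 and +0.5,
            # so beta1 is the difference between conditions and, in a balanced
            # design, beta0 is the grand mean.
            cond = -0.5 if i % 2 == 0 else 0.5
            y = (beta0 + beta1 * cond
                 + subj_eff[s] + item_eff[i]
                 + rng.gauss(0.0, sd_resid))
            rows.append({"subj": s, "item": i, "cond": cond, "y": y})
    return rows

data = simulate(n_subj=20, n_item=10)
print(len(data), "simulated observations")
```

Checking an alternative assumption then amounts to changing one argument (for example a larger `sd_item`) and re-running the same analysis on the new data.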

## Power analysis

- How do you take this "ground truth" and use it to simulate data of different sample sizes?
- Does varying the number of items or subjects matter more?
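
A rough sketch of simulation-based power, under loudly stated simplifications: instead of fitting a mixed model, each simulated data set is reduced to one condition difference per subject and tested with a one-sample t-statistic, using |t| > 2 as an approximate significance criterion. Item effects are left out, and the function names and parameter values are hypothetical.

```python
import random
import statistics

def one_run(rng, n_subj, n_item, beta1, sd_slope, sd_resid):
    """Simulate one experiment; return True if |t| > 2 for the condition effect."""
    half = n_item // 2  # trials per condition for each subject
    diffs = []
    for _ in range(n_subj):
        # Each subject has their own true effect (by-subject random slope).
        effect = beta1 + rng.gauss(0.0, sd_slope)
        a = [rng.gauss(-0.5 * effect, sd_resid) for _ in range(half)]
        b = [rng.gauss(+0.5 * effect, sd_resid) for _ in range(half)]
        # Subject intercepts would cancel in this within-subject difference.
        diffs.append(statistics.fmean(b) - statistics.fmean(a))
    se = statistics.stdev(diffs) / n_subj ** 0.5
    return abs(statistics.fmean(diffs) / se) > 2.0

def power(n_subj, n_item, beta1=20.0, sd_slope=20.0, sd_resid=50.0,
          n_sim=300, seed=1):
    """Estimate power as the share of simulated experiments with |t| > 2."""
    rng = random.Random(seed)
    hits = sum(one_run(rng, n_subj, n_item, beta1, sd_slope, sd_resid)
               for _ in range(n_sim))
    return hits / n_sim

# Same total number of observations, split differently.
for n_subj, n_item in [(10, 40), (40, 10)]:
    print(n_subj, "subjects,", n_item, "items:", power(n_subj, n_item))
```

Calling `power` over a grid of `n_subj` and `n_item` values is one way to explore the subjects-versus-items question; the answer depends on the assumed variance components, which is why they are explicit arguments.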

## Imbalance, collinearity and rank deficiency in the fixed effects

(This comes after power analysis because the biggest practical impact of
imbalance is on statistical power.)

- How do you simulate imbalance? What are the impacts of imbalance on the data?

- How do you simulate collinearity? What are the impacts of collinearity? (Variance inflation factors.)

- Sometimes the state of the world means that you remain uncertain -- statistics can't save you from some fundamental uncertainties related to the structure of the world / the world as represented in your data.
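
For the collinearity point above, a minimal sketch of simulating two correlated predictors and computing a variance inflation factor. With exactly two predictors, the VIF reduces to 1 / (1 - r^2), where r is their correlation; the helper names and the choice of `rho` are assumptions made for illustration.

```python
import random

def pearson_r(x, y):
    """Sample Pearson correlation between two equal-length lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def simulate_collinear(n, rho, seed=1):
    """Draw two standard-normal predictors with population correlation rho."""
    rng = random.Random(seed)
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rho * a + (1.0 - rho ** 2) ** 0.5 * rng.gauss(0.0, 1.0) for a in x1]
    return x1, x2

def vif_two_predictors(x1, x2):
    """VIF for either predictor when the model has exactly two predictors."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r * r)

for rho in (0.0, 0.5, 0.9):
    x1, x2 = simulate_collinear(2000, rho)
    print("rho =", rho, "VIF =", round(vif_two_predictors(x1, x2), 2))
```

The VIF is the factor by which the sampling variance of a coefficient is inflated relative to the uncorrelated case, so its square root scales the standard error.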

## Rank deficiency in the random effects and random effects selection

- What does it mean for the model to be rank deficient?
- Practical impacts of rank deficiency for various tests/convergence warnings
- Variance-bias tradeoff; over- vs. underfitting and what that means for your inference.
- How to deal with rank deficiency
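
One way to make rank deficiency in the random effects concrete: the covariance matrix of the random effects has (numerically) a zero eigenvalue, which happens when a variance is zero or a correlation is exactly +1 or -1. The sketch below checks this for a hypothetical 2x2 intercept/slope covariance matrix; the function names and the tolerance are assumptions.

```python
def cov_2x2(sd_int, sd_slope, corr):
    """Covariance matrix for a random intercept and a random slope."""
    c = corr * sd_int * sd_slope
    return [[sd_int ** 2, c], [c, sd_slope ** 2]]

def eigenvalues_sym_2x2(m):
    """Closed-form eigenvalues (smallest, largest) of a symmetric 2x2 matrix."""
    a, b, d = m[0][0], m[0][1], m[1][1]
    mean = (a + d) / 2.0
    half_gap = (((a - d) / 2.0) ** 2 + b * b) ** 0.5
    return mean - half_gap, mean + half_gap

def is_rank_deficient(m, tol=1e-8):
    """True if the smallest eigenvalue is negligible relative to the largest."""
    lo, hi = eigenvalues_sym_2x2(m)
    return hi <= 0.0 or lo / hi < tol

print(is_rank_deficient(cov_2x2(30.0, 10.0, 0.3)))  # full rank
print(is_rank_deficient(cov_2x2(30.0, 10.0, 1.0)))  # perfect correlation
print(is_rank_deficient(cov_2x2(30.0, 0.0, 0.0)))   # zero slope variance
```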

## Extending this all to GLMM

- Simulating GLMM data
- Interpreting contrasts in GLMMs
- Why GLMMs generally have lower power
- Why we usually don't worry about degrees of freedom in GLMMs
- Multicollinearity transfers directly
- Random effects transfer directly
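
Simulating GLMM data follows the same recipe as the linear case, with one extra step: the linear predictor is passed through an inverse link function and the response is drawn from the matching distribution. A minimal sketch for binary responses with a logit link and by-subject random intercepts, where all names and parameter values are hypothetical:

```python
import math
import random

def inv_logit(eta):
    """Map a value on the log-odds scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))

def simulate_binary(n_subj, n_trial, beta0=0.0, beta1=0.5, sd_subj=1.0, seed=1):
    """Simulate Bernoulli responses with a by-subject random intercept."""
    rng = random.Random(seed)
    rows = []
    for s in range(n_subj):
        u = rng.gauss(0.0, sd_subj)  # subject deviation on the log-odds scale
        for t in range(n_trial):
            cond = -0.5 if t % 2 == 0 else 0.5
            # beta1 is a difference in log-odds, not in probabilities.
            p = inv_logit(beta0 + beta1 * cond + u)
            y = 1 if rng.random() < p else 0
            rows.append({"subj": s, "cond": cond, "y": y})
    return rows

binary_data = simulate_binary(n_subj=20, n_trial=10)
print(sum(r["y"] for r in binary_data), "successes out of", len(binary_data))
```

Because the coefficients live on the log-odds scale, the same `beta1` corresponds to different changes in probability depending on the baseline, which matters when interpreting contrasts.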
