https://github.com/palday/freiburg2022
Materials for FIAS Statistics Workshop "Modelling Diversity in Language and Cognition"
- Host: GitHub
- URL: https://github.com/palday/freiburg2022
- Owner: palday
- License: mit
- Created: 2022-04-29T15:38:48.000Z (about 3 years ago)
- Default Branch: main
- Last Pushed: 2022-09-07T23:01:59.000Z (over 2 years ago)
- Last Synced: 2025-01-12T09:21:38.102Z (5 months ago)
- Language: Julia
- Homepage: https://palday.github.io/freiburg2022/
- Size: 1.16 MB
- Stars: 1
- Watchers: 3
- Forks: 1
- Open Issues: 0
Metadata Files:
- Readme: README.md
- License: LICENSE
# Statistics Workshop "Modelling Diversity in Language and Cognition"
> Real-world data present a number of challenges for data analysis.
> Simulation provides a way to examine the impact of the quirks of real-world data on your analysis.
> In this tutorial, I will introduce simulation with mixed models as a tool for planning your analysis (e.g., power analysis) and as a way to re-consider your plan after contact with real-world data (e.g., what impact does imbalance in my sample have on my inferences?).
> Simulation and real-world data analysis should not be rivals, but rather partners in inference.

## Simulation introduction / review of mixed models
- How do you simulate data based on your hypotheses and assumptions about the world? (This will also involve a moderate amount of programming to create the fake data. These are the same manipulations that go into wrangling real data, so they should be familiar to most attendees.)
- What are reasonable assumptions from a statistical perspective?
- How do you check alternative assumptions?
- Contrast coding review: how do you map your hypothesized effects to a set of numbers in the model?
- "degrees of freedom" and the whole matter of p-values in MixedModels

## Power analysis
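As a rough illustration of simulation-based power analysis, the sketch below uses only the Python standard library. It is not the workshop's own code and it does not fit a mixed model: each simulated dataset is reduced to one condition difference per subject and tested with a one-sample t statistic, using |t| > 2 as a crude cutoff rather than an exact critical value. Items are treated as exchangeable trials with no by-item random effects. The function names (`simulate_subject_diffs`, `estimate_power`) and all parameter values are assumptions chosen for illustration.

```python
import math
import random
import statistics


def simulate_subject_diffs(n_subj, n_item, beta, sd_slope, sd_resid, rng):
    """Simulate one dataset and return one condition difference (B - A) per subject."""
    diffs = []
    for _ in range(n_subj):
        intercept = rng.gauss(0.0, 1.0)          # by-subject intercept (cancels in B - A)
        slope = beta + rng.gauss(0.0, sd_slope)  # by-subject effect of condition
        a = [intercept + rng.gauss(0.0, sd_resid) for _ in range(n_item)]
        b = [intercept + slope + rng.gauss(0.0, sd_resid) for _ in range(n_item)]
        diffs.append(statistics.fmean(b) - statistics.fmean(a))
    return diffs


def estimate_power(n_subj, n_item, n_sim=200, beta=0.5,
                   sd_slope=0.5, sd_resid=1.0, seed=1):
    """Proportion of simulated datasets in which |t| exceeds the crude cutoff of 2."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        d = simulate_subject_diffs(n_subj, n_item, beta, sd_slope, sd_resid, rng)
        t = statistics.fmean(d) / (statistics.stdev(d) / math.sqrt(n_subj))
        hits += abs(t) > 2.0  # assumed cutoff, not an exact critical value
    return hits / n_sim


# Same assumed "ground truth", different sample sizes.
power_small = estimate_power(n_subj=10, n_item=10)
power_large = estimate_power(n_subj=40, n_item=10)
power_more_items = estimate_power(n_subj=10, n_item=40)
print(power_small, power_large, power_more_items)
```

Rerunning `estimate_power` over a grid of `n_subj` and `n_item` values is one way to explore whether adding subjects or adding items buys more power under a given set of assumptions.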
- How to take this "ground truth" and use it to simulate data of different sample sizes?
- Does varying the number of items or subjects matter more?

## Imbalance, collinearity and rank deficiency in the fixed effects
(This comes after power analysis because the biggest practical impact of imbalance is on statistical power.)

- How do you simulate imbalance? What are the impacts of imbalance on the data?
- How do you simulate collinearity? What are the impacts of collinearity? Variance inflation factors.
- Sometimes the state of the world means that you remain uncertain -- statistics can't save you from some fundamental uncertainties related to the structure of the world / the world as represented in your data.
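
One way to make the collinearity questions above concrete is to simulate two predictors with a chosen correlation and compute the variance inflation factor. The sketch below is plain Python (standard library only), not the workshop's own code. It relies on the fact that with exactly two predictors the VIF reduces to 1 / (1 - r²), where r is their sample correlation; with more predictors, r² is replaced by the R² from regressing one predictor on all the others. The helper names and the values of `rho` are assumptions for illustration.

```python
import math
import random


def pearson_r(x, y):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def simulate_predictors(n, rho, rng):
    """Draw two standard-normal predictors with population correlation rho."""
    x1 = [rng.gauss(0.0, 1.0) for _ in range(n)]
    x2 = [rho * a + math.sqrt(1.0 - rho ** 2) * rng.gauss(0.0, 1.0) for a in x1]
    return x1, x2


def vif_two_predictors(x1, x2):
    """VIF for the two-predictor case: 1 / (1 - r^2)."""
    r = pearson_r(x1, x2)
    return 1.0 / (1.0 - r ** 2)


rng = random.Random(2022)
vif_low = vif_two_predictors(*simulate_predictors(2000, 0.0, rng))   # uncorrelated
vif_high = vif_two_predictors(*simulate_predictors(2000, 0.9, rng))  # strongly correlated
print(round(vif_low, 2), round(vif_high, 2))
```

For `rho = 0.9` the population VIF is 1 / (1 - 0.81), roughly 5.3, meaning the sampling variance of each coefficient is inflated about fivefold relative to uncorrelated predictors. The simulated value will scatter around that figure.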
## Rank deficiency in the random effects and random effects selection
- What does it mean for the model to be rank deficient?
- Practical impacts of rank deficiency for various tests/convergence warnings
- Variance-bias tradeoff; over- vs. underfitting and what that means for your inference.
- How to deal with rank deficiency

## Extending this all to GLMM
- Simulating GLMM data
- Interpreting contrasts in GLMMs
- Why GLMMs generally have lower power
- Why we usually don't worry about degrees of freedom in GLMMs
- Multicollinearity transfers directly
- Random effects transfer directly
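
As a minimal sketch of simulating GLMM-style data, the code below generates binary responses from a logistic model with a by-subject random intercept and a two-level condition coded as -0.5 / +0.5. It is plain Python (standard library only), not the workshop's own code, and it only simulates data without fitting a model. The function name `simulate_binary_data`, the column names and all parameter values are assumptions for illustration.

```python
import math
import random


def inv_logit(eta):
    """Map a linear predictor on the log-odds scale to a probability."""
    return 1.0 / (1.0 + math.exp(-eta))


def simulate_binary_data(n_subj=30, n_trial=20, b0=0.0, beta=1.0,
                         sd_subj=1.0, seed=42):
    """Return a list of trial-level rows with keys subj, cond and y (0 or 1)."""
    rng = random.Random(seed)
    contrast = {"A": -0.5, "B": 0.5}  # assumed +/-0.5 contrast coding
    rows = []
    for subj in range(n_subj):
        u = rng.gauss(0.0, sd_subj)   # by-subject random intercept (log-odds scale)
        for cond, code in contrast.items():
            p = inv_logit(b0 + beta * code + u)
            for _ in range(n_trial):
                y = 1 if rng.random() < p else 0  # Bernoulli draw
                rows.append({"subj": subj, "cond": cond, "y": y})
    return rows


rows = simulate_binary_data()
prop = {
    c: sum(r["y"] for r in rows if r["cond"] == c)
       / sum(1 for r in rows if r["cond"] == c)
    for c in ("A", "B")
}
print(prop)
```

With this coding, `beta` is the difference in log-odds between conditions B and A for a given subject. The difference in observed proportions lives on the probability scale and is not the same number, which is part of why contrasts in GLMMs need more care to interpret.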