Statistical Analysis Cheat Sheet: Formulas and Methods
Posted by Anonymous and classified in Mathematics
Sample Size Calculation
To find what n must be:
n ≥ (z_{1-α/2} / MOE)^2 * p(1 - p)
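As a rough illustration of the formula above, the following sketch computes the smallest integer n. The function name, the default confidence level, and the conservative default p = 0.5 are assumptions made for this example, not part of the original notes.

```python
import math
from statistics import NormalDist

def sample_size_for_proportion(moe, confidence=0.95, p=0.5):
    """Smallest integer n with n >= (z_{1-alpha/2} / MOE)^2 * p * (1 - p)."""
    alpha = 1 - confidence
    # Two-sided critical value of the standard normal distribution
    z = NormalDist().inv_cdf(1 - alpha / 2)
    return math.ceil((z / moe) ** 2 * p * (1 - p))

# Assumed inputs: 3% margin of error, 95% confidence, p unknown (use 0.5)
print(sample_size_for_proportion(moe=0.03))
```

Rounding up with ceil keeps the inequality n ≥ ... satisfied.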
Wald Confidence Interval
SE^ = sqrt((p^ * (1 - p^)) / n)
CI: p^ ± z_{1-α/2} * SE^
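A minimal sketch of the Wald interval, assuming the usual form p^ ± z * SE^ with a standard normal critical value. The function name and the counts in the usage line are hypothetical.

```python
import math
from statistics import NormalDist

def wald_ci(successes, n, confidence=0.95):
    """Wald confidence interval for a proportion: p_hat +/- z * SE_hat."""
    p_hat = successes / n
    se_hat = math.sqrt(p_hat * (1 - p_hat) / n)  # estimated standard error
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    return p_hat - z * se_hat, p_hat + z * se_hat

# Hypothetical data: 40 successes out of 100 trials
low, high = wald_ci(successes=40, n=100)
print(round(low, 3), round(high, 3))
```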
Odds Ratio Analysis
Calculated by the criss-cross (cross-product) of a 2x2 table: with first-row cells a, b and second-row cells c, d, OR = (a * d) / (b * c).
Example: The odds of lung cancer for someone who smokes ≥ 5 cigarettes daily are 3.40 times the odds for someone who smokes < 5 cigarettes daily.
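The cross-product can be sketched as below. The counts are made up for illustration and are not the data behind the 3.40 figure in the example.

```python
def odds_ratio(a, b, c, d):
    """Cross-product odds ratio for the 2x2 table [[a, b], [c, d]]."""
    return (a * d) / (b * c)

# Hypothetical counts (not from the source):
#                  cancer   no cancer
# >= 5 per day       30        70
# <  5 per day       10        90
print(round(odds_ratio(30, 70, 10, 90), 2))
```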
Fisher's Exact Test
Used when expected cell frequencies are < 5 in a 2x2 table, where the chi-square approximation is unreliable. Tests for independence.
- H0: Odds ratio = 1
- HA: Odds ratio ≠ 1
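One way to see what the test computes is the sketch below. It assumes the common two-sided convention of summing the hypergeometric probabilities of all tables (with the same margins) that are no more likely than the observed one. The function name and the small tolerance are choices made for this example.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Add up every table that is no more likely than the observed one
    # (small relative tolerance guards against floating-point ties)
    return sum(prob(x) for x in range(lo, hi + 1)
               if prob(x) <= p_obs * (1 + 1e-7))

# Hypothetical small-count table
print(round(fisher_exact_two_sided(3, 1, 1, 3), 4))
```

A small p-value is evidence against H0: odds ratio = 1.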
Chi-Square Test of Independence
- H0: The two variables are independent.
- HA: The two variables are not independent.
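The test statistic compares observed counts with the counts expected under independence. The sketch below computes only the statistic and its degrees of freedom, without a continuity correction; the p-value would then come from a chi-square distribution with that many degrees of freedom. The table is hypothetical.

```python
def chi_square_independence(table):
    """Pearson chi-square statistic and degrees of freedom for an r x c table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under H0 (independence)
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    df = (len(table) - 1) * (len(table[0]) - 1)
    return stat, df

# Hypothetical 2x2 table of counts
stat, df = chi_square_independence([[30, 70], [10, 90]])
print(stat, df)
```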
Logit Model
To find the number of odds used in the model, find all combinations of levels (e.g., a 2x3 model has 6 combinations).
exp(estimate) = odds ratio for that factor level relative to the reference level (exp(intercept) = odds at the reference level).
Interpretation template: for a specific FACTOR level, the odds of being in the CONTEXT are ODDS times the odds for the REFERENCE LEVEL, while adjusting for OTHER FACTORS.
Model Formulas
- Predicted probability of success: P(x) = exp(b0 + b1x) / (1 + exp(b0 + b1x))
- Odds of success: P(x) / (1 - P(x)) = exp(b0 + b1x)
- Odds ratio: odds(a) / odds(b) = exp(b1(a - b))
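The three formulas above are tied together in the short sketch below. The coefficient values are hypothetical, chosen so that the linear predictor is 0 at x = 2, and the function names are illustrative.

```python
import math

def predicted_probability(b0, b1, x):
    """P(x) = exp(b0 + b1*x) / (1 + exp(b0 + b1*x))."""
    eta = b0 + b1 * x
    return math.exp(eta) / (1 + math.exp(eta))

def odds(b0, b1, x):
    """Odds of success: P(x) / (1 - P(x)) = exp(b0 + b1*x)."""
    return math.exp(b0 + b1 * x)

def logit_odds_ratio(b1, a, b):
    """Odds ratio comparing x = a with x = b: exp(b1 * (a - b))."""
    return math.exp(b1 * (a - b))

b0, b1 = -1.0, 0.5  # hypothetical fitted coefficients
p = predicted_probability(b0, b1, 2.0)
print(p, odds(b0, b1, 2.0), logit_odds_ratio(b1, 3.0, 1.0))
```

Note that the intercept b0 cancels in the odds ratio, which is why it depends only on b1 and the difference a - b.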
Simple Linear Regression
Degrees of freedom in the complete model:
DFc = n − number of estimated parameters
DFc = n − 2 (For the model Y = b0 + b1x + E)
Experimental Design
One-Way ANOVA
Research Question: Is the mean for the variable equal among the different groups observed?
- H0: μ1 = μ2 = μ3
- HA: At least one μ is different
Complex Model: Yij = μi + Eij
Reduced Model: Yij = μ + Eij
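The comparison of the two models can be sketched as an F statistic: the drop in error sum of squares from the reduced to the complex model, scaled by the degrees of freedom. The function name and the toy data are assumptions for illustration; the p-value would come from an F distribution with (k - 1, n - k) degrees of freedom.

```python
def one_way_anova_f(groups):
    """F statistic comparing Yij = mu_i + Eij (complex) with Yij = mu + Eij (reduced)."""
    n = sum(len(g) for g in groups)
    k = len(groups)
    grand_mean = sum(sum(g) for g in groups) / n
    # Reduced model: every observation is predicted by the grand mean
    sse_reduced = sum((y - grand_mean) ** 2 for g in groups for y in g)
    # Complex model: every observation is predicted by its own group mean
    sse_complex = sum((y - sum(g) / len(g)) ** 2 for g in groups for y in g)
    df_num, df_den = k - 1, n - k
    f_stat = ((sse_reduced - sse_complex) / df_num) / (sse_complex / df_den)
    return f_stat, df_num, df_den

# Hypothetical data: three groups of three observations
print(one_way_anova_f([[1, 2, 3], [2, 3, 4], [5, 6, 7]]))
```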
Assumptions
- Independence of observations: Cannot be proved through data; must be ensured during experimental design.
- Normality of residuals: Errors are normally distributed, N(0, σ^2). Check via a Q-Q plot (points should fall close to the reference line). The Shapiro-Wilk test gives a formal check: a small p-value is evidence against normality.
- Equal variances: Variance must be equal across groups. Check via box plots (similar spreads/IQRs) or Levene's test.
Violated Assumptions
- Kruskal-Wallis Test: Rank-based alternative to one-way ANOVA. H0: All medians are equal. HA: At least one median is not equal.
- Randomization/Permutation: Shuffle labels to see if grouping has a real effect.
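The label-shuffling idea can be sketched for the two-group case as follows. The choice of test statistic (absolute difference in means), the number of shuffles, the seed, and the data are all assumptions made for this example.

```python
import random

def permutation_test(group_a, group_b, n_shuffles=2000, seed=0):
    """Approximate two-sided permutation p-value for a difference in means."""
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible

    def mean(values):
        return sum(values) / len(values)

    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_shuffles):
        rng.shuffle(pooled)  # reassign group labels at random
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed - 1e-12:
            hits += 1
    # Add 1 to numerator and denominator so the estimate is never exactly 0
    return (hits + 1) / (n_shuffles + 1)

# Hypothetical, clearly separated groups
print(permutation_test([1, 2, 3, 4, 5], [11, 12, 13, 14, 15]))
```

If grouping had no real effect, shuffled labels would produce differences as large as the observed one fairly often, and the p-value would be large.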
Effects Model
Assumes groups have a fixed (non-random) effect on the response variable. Differences between group means are attributed to the factor; the remaining variability goes to the error term Eij.
- Complex: Yij = μ + ai + Eij
- Reduced: Yij = μ + Eij
- H0: ai = 0 for all i
- HA: At least one ai ≠ 0
Random Effects Model
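Used when the observed groups are a random sample from a larger population of groups, so the interest is in variability among populations rather than in specific treatment effects. Here Ai is a random variable rather than a fixed constant.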
- Complex: Yij = μ + Ai + Eij
- Reduced: Yij = μ + Eij