Statistical Inference & Hypothesis Testing Concepts

Parametric Inference Fundamentals

In parametric inference, the probability distribution of the population under study is assumed to be known except for a finite number of parameters, and the goal is to estimate those parameters. Examples include the t-test and ANOVA.
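
As a minimal illustration (not part of the original notes), a two-sample t-test with SciPy on made-up data:

```python
# Minimal sketch of a parametric test: Welch's two-sample t-test with SciPy.
# The two samples below are invented purely for illustration.
from scipy import stats

group_a = [5.1, 4.9, 5.4, 5.0, 5.2]
group_b = [5.6, 5.8, 5.5, 5.9, 5.7]

# H0: both groups have the same population mean.
t_stat, p_value = stats.ttest_ind(group_a, group_b, equal_var=False)
print(f"t = {t_stat:.3f}, p = {p_value:.4f}")
```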

Non-Parametric Inference Basics

The distribution of the population is not assumed to have a known form. Non-parametric methods are also used to test the assumptions of parametric methods, for example, to check whether the population distribution is normal.
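
For example, the normality assumption can be checked with the Shapiro-Wilk test; a minimal sketch with NumPy and SciPy on simulated data:

```python
# Minimal sketch: checking the normality assumption with the Shapiro-Wilk test.
# The sample is simulated; seed and parameters are arbitrary.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
sample = rng.normal(loc=10.0, scale=2.0, size=50)

# H0: the sample comes from a normal distribution.
w_stat, p_value = stats.shapiro(sample)
print(f"W = {w_stat:.3f}, p = {p_value:.4f}")
# A small p-value casts doubt on the normality assumption of a parametric method.
```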

What is a Statistic?

A random variable that is a function of the sample and does not depend on any unknown parameter.

Understanding Estimators

A statistic whose values are suitable approximations of an unknown parameter, i.e. a statistic used to estimate that parameter.
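
A standard example: the sample mean $\bar{X} = \frac{1}{n}\sum_{i=1}^{n} X_i$ is a statistic (it depends only on the observed sample) and is the usual estimator of the population mean $\mu$.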

Unbiasedness in Estimation

An unbiased estimator does not systematically overestimate or underestimate the parameter: its bias is zero, meaning its expected value equals the true parameter value.
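
In the usual notation, an estimator $\hat{\theta}$ of $\theta$ is unbiased when $E[\hat{\theta}] = \theta$, i.e. $\operatorname{Bias}(\hat{\theta}) = E[\hat{\theta}] - \theta = 0$.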

Efficiency of Estimators

An efficient estimator has a low risk of deviating far from the true parameter value; among unbiased estimators, the more efficient one is the one with the smaller variance.
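
In the usual formulation, among unbiased estimators $\hat{\theta}_1$ is more efficient than $\hat{\theta}_2$ when $\operatorname{Var}(\hat{\theta}_1) < \operatorname{Var}(\hat{\theta}_2)$.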

Consistency in Statistical Estimation

As the sample size increases, the estimate should become more accurate, since more information is available.
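
In the standard formulation, an estimator $\hat{\theta}_n$ is consistent when it converges in probability to the true value: $P(|\hat{\theta}_n - \theta| > \varepsilon) \to 0$ as $n \to \infty$, for every $\varepsilon > 0$.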

Defining Confidence Intervals

To provide an approximation of the parameter under study together with a measure of its precision, we use a confidence interval: a range of values that contains the true parameter with a specified confidence level (for example, 95%).
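
A textbook example (not from the original notes): for a normal population with known variance $\sigma^2$, a $(1-\alpha)$ confidence interval for the mean is $\bar{x} \pm z_{\alpha/2}\,\sigma/\sqrt{n}$; with $\alpha = 0.05$, $z_{\alpha/2} \approx 1.96$.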

Hypothesis Testing Explained

Hypothesis testing attempts to decide whether a specific hypothesis about the distribution under study is supported or rejected by the sample data.

Null Hypothesis (H0)

H0 is the hypothesis being tested. It is called the null hypothesis because it is the hypothesis we consider true unless the sample data clearly show the opposite.

Alternative Hypothesis (H1)

If H1 is accepted, it is because the sample data clearly indicate that H0 is not true; thus, H1 is the opposite of H0.

Type I Error (Alpha Level)

A Type I error occurs when we reject H0 although it is actually true. Its probability is the significance level, α.
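
In symbols: $\alpha = P(\text{reject } H_0 \mid H_0 \text{ is true})$.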

Type II Error (Beta Level)

A Type II error occurs when we fail to reject H0 although it is actually false. Its probability is denoted β, and the statistical power of the test is 1 - β.
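
In symbols: $\beta = P(\text{do not reject } H_0 \mid H_0 \text{ is false})$, so the power is $1 - \beta = P(\text{reject } H_0 \mid H_0 \text{ is false})$.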

Defining the Critical Region

The set of values of the test statistic for which we reject H0.
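
For example, in a two-sided z-test at significance level $\alpha = 0.05$ the critical region is $|Z| > z_{\alpha/2} \approx 1.96$.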

ANOVA: Main Objective

To test whether there are differences in the mean response between the different factor levels.

ANOVA: Core Problem Addressed

Given n elements differing only in one factor, a continuous feature (the response variable) is observed, which randomly varies from one element to another. We want to know if there is a relationship between the mean value of this feature and the factor.
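
A minimal sketch (with made-up data) of a one-way ANOVA in Python with SciPy:

```python
# Minimal sketch of a one-way ANOVA: are the mean responses equal across
# the three factor levels? The samples are invented for illustration.
from scipy import stats

level_1 = [12.1, 11.8, 12.5, 12.0]
level_2 = [13.0, 13.4, 12.9, 13.2]
level_3 = [11.5, 11.9, 11.7, 11.6]

# H0: all factor levels have the same mean response.
f_stat, p_value = stats.f_oneway(level_1, level_2, level_3)
print(f"F = {f_stat:.3f}, p = {p_value:.4f}")
```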

Regression Analysis Purpose

To estimate an unknown model relating a response variable to explanatory variables and to check whether it is appropriate. This typically involves assumptions such as normality, homoscedasticity, independence (randomness) of the errors, and linearity.
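
A minimal sketch of simple linear regression with SciPy (made-up data); the residuals computed at the end are what the assumption checks are applied to:

```python
# Minimal sketch of simple linear regression and its residuals.
from scipy import stats

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]

fit = stats.linregress(x, y)
print(f"y ~ {fit.intercept:.2f} + {fit.slope:.2f} x, R^2 = {fit.rvalue ** 2:.3f}")

# Residuals = observed - predicted; used to check normality,
# homoscedasticity, and linearity of the fitted model.
residuals = [yi - (fit.intercept + fit.slope * xi) for xi, yi in zip(x, y)]
print(residuals)
```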

ANOVA vs. Kruskal-Wallis Test

ANOVA is parametric: it assumes the samples are random draws from normally distributed populations whose unknown parameters (the means) are being compared. The Kruskal-Wallis test is its non-parametric counterpart and does not require these distributional assumptions.
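
A minimal sketch of the non-parametric alternative, on the same kind of made-up data as the ANOVA example above:

```python
# Minimal sketch of the Kruskal-Wallis test: a rank-based comparison of
# several groups that does not assume normality. Data are invented.
from scipy import stats

level_1 = [12.1, 11.8, 12.5, 12.0]
level_2 = [13.0, 13.4, 12.9, 13.2]
level_3 = [11.5, 11.9, 11.7, 11.6]

h_stat, p_value = stats.kruskal(level_1, level_2, level_3)
print(f"H = {h_stat:.3f}, p = {p_value:.4f}")
```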

Understanding Residuals

The difference between the observed and predicted values in a statistical model.
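
In symbols: $e_i = y_i - \hat{y}_i$, the observed value minus the value fitted by the model.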

SC-Inter (SSTR): Between-Treatment Variability

Represents the variability of the treatment means around the global mean, i.e. the part of the variability explained by the treatment.
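
In the usual one-way ANOVA notation, with $k$ treatments, $n_i$ observations in treatment $i$, treatment means $\bar{y}_{i\cdot}$ and global mean $\bar{y}_{\cdot\cdot}$: $SSTR = \sum_{i=1}^{k} n_i\,(\bar{y}_{i\cdot} - \bar{y}_{\cdot\cdot})^2$.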

SC-Intra (SSE): Within-Treatment Variability

Represents the part of the variability not explained by the treatment.
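
In the same notation: $SSE = \sum_{i=1}^{k} \sum_{j=1}^{n_i} (y_{ij} - \bar{y}_{i\cdot})^2$, and the total variability decomposes as $SST = SSTR + SSE$.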
