Understanding Random Variables and Probability Distributions
Random Variables
A random variable is a numerical quantity that takes on different values depending on chance. There are two primary types:
- Discrete (described by a PMF): A countable set of possible outcomes (e.g., the number of cases in a simple random sample (SRS) from the population).
- Continuous (described by a PDF): An unbroken continuum of possible outcomes (e.g., the average weight of an SRS of newborns selected from the population).
Key Statistical Definitions
- Population: The set of all possible values for a random variable.
- Event: An outcome or a set of outcomes.
- Probability: The proportion of times an event is expected to occur in the population.
Note: Ideas about probability are founded on relative frequencies (proportions) in populations.
Probability Calculation Example
In a given year, there were 42,636 traffic fatalities in a population of N = 293,655,000. If you randomly select a person from this population, what is the probability they will experience a traffic fatality by the end of that year?
Answer: The relative frequency of this event is 42,636 / 293,655,000 = 0.0001452. Thus, Pr(traffic fatality) = 0.0001452 (about 1 in 6,887).
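The arithmetic above can be checked with a few lines of Python. This is only a sketch: the variable names are illustrative and are not part of the original notes.

```python
# Relative-frequency estimate of a probability, using the figures quoted above.
fatalities = 42_636         # traffic fatalities in the year
population = 293_655_000    # population size N

pr_fatality = fatalities / population   # proportion of the population affected
one_in = population / fatalities        # same probability expressed as "1 in k"

print(f"Pr(traffic fatality) = {pr_fatality:.7f}")
print(f"about 1 in {one_in:,.0f}")
```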
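Example: Consider a population in which 20% of observations are positive, so the probability of a positive observation is 0.20.
- In the long run, the observed proportion of positives approaches the true probability.
- The larger the sample, the closer the sample proportion tends to be to the true probability.
- In this sense, probability describes what happens over many repetitions of a random process.
- Probability can also be used to quantify a level of belief.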
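The long-run behavior described above can be illustrated with a small simulation. The sketch below assumes independent draws, each positive with probability 0.20. The function name `sample_proportion` and the fixed seed are illustrative choices, not part of the original notes.

```python
import random

def sample_proportion(p, n, seed=0):
    """Draw n independent observations, each positive with probability p,
    and return the proportion of positives observed."""
    rng = random.Random(seed)   # fixed seed so the run is reproducible
    positives = sum(rng.random() < p for _ in range(n))
    return positives / n

p_true = 0.20   # true probability of a positive observation

# Larger samples should tend to give proportions closer to p_true.
for n in (10, 100, 10_000, 100_000):
    print(n, sample_proportion(p_true, n))
```

Small samples can land far from 0.20, while the larger ones should typically sit close to it.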
Probability Mass Function (PMF)
A Probability Mass Function (PMF) is a mathematical relation that assigns probabilities to all possible outcomes for discrete random variables.
Example: 4 patients are treated with an intervention that is successful 75% of the time. Let X = the number of successes in this experiment.
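One way to obtain the PMF for this example is to list every success/failure pattern for the 4 patients and add up the probabilities of the patterns that give each value of X. The sketch below does this by brute force. It assumes the patients respond independently of one another, which the example implies but does not state, and the variable names are illustrative.

```python
from itertools import product

p_success = 0.75   # probability that a single treatment succeeds
n_patients = 4

# pmf[x] accumulates Pr(X = x), where X is the number of successes.
pmf = {x: 0.0 for x in range(n_patients + 1)}

# Enumerate every success/failure pattern across the patients (2**4 = 16).
for pattern in product([True, False], repeat=n_patients):
    prob = 1.0
    for success in pattern:
        prob *= p_success if success else (1 - p_success)
    pmf[sum(pattern)] += prob   # sum(pattern) counts the successes

for x, pr in pmf.items():
    print(f"Pr(X = {x}) = {pr:.4f}")
```

The resulting probabilities are 0.0039, 0.0469, 0.2109, 0.4219, and 0.3164 for X = 0 through 4.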
Rules of Probability
- All probabilities are between 0 and 1.
- The sum of all probabilities must be 1.
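- Complement rule: the probability that event A does not occur is 1 minus the probability of A, i.e., Pr(not A) = 1 − Pr(A).
- Addition rule for disjoint events: if A and B cannot both occur, the probability of their union is the sum of their probabilities, i.e., Pr(A or B) = Pr(A) + Pr(B).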
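These rules can be checked against the 4-patient example (75% success rate). The sketch below writes that PMF as exact fractions so the checks involve no rounding. The choice of events ("all 4 succeed", "at least 3 succeed") is illustrative.

```python
from fractions import Fraction

# PMF of X = number of successes among 4 patients with a 75% success rate,
# written as exact fractions (denominator 4**4 = 256).
pmf = {0: Fraction(1, 256), 1: Fraction(12, 256), 2: Fraction(54, 256),
       3: Fraction(108, 256), 4: Fraction(81, 256)}

# Rule 1: every probability lies between 0 and 1.
assert all(0 <= pr <= 1 for pr in pmf.values())

# Rule 2: the probabilities sum to 1.
assert sum(pmf.values()) == 1

# Rule 3 (complement): Pr(not A) = 1 - Pr(A), with A = "all 4 succeed".
pr_not_all_succeed = 1 - pmf[4]

# Rule 4 (disjoint events): Pr(X = 3 or X = 4) = Pr(X = 3) + Pr(X = 4).
pr_at_least_3 = pmf[3] + pmf[4]

print(pr_not_all_succeed, pr_at_least_3)
```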
Characterizing PMF Location and Spread
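Location: Look at the mean (μ), the probability-weighted average of the outcomes.
Spread: Look at the variance (σ²), the probability-weighted average of the squared distances from the mean.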
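As a worked illustration, the mean and variance of the 4-patient PMF can be computed directly from the standard definitions μ = Σ x·Pr(X = x) and σ² = Σ (x − μ)²·Pr(X = x). This is a sketch with the PMF values typed in by hand, and the variable names are illustrative.

```python
# PMF from the 4-patient example (n = 4, success probability 0.75).
pmf = {0: 0.00390625, 1: 0.046875, 2: 0.2109375, 3: 0.421875, 4: 0.31640625}

# Mean: probability-weighted average of the outcomes.
mean = sum(x * pr for x, pr in pmf.items())

# Variance: probability-weighted average squared distance from the mean.
variance = sum((x - mean) ** 2 * pr for x, pr in pmf.items())

std_dev = variance ** 0.5   # standard deviation, in the same units as X

print(mean, variance, round(std_dev, 4))
```

For this PMF the mean is 3 successes and the variance is 0.75.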
Area Under the Curve (AUC)
- When a PMF is drawn as a histogram with bars of width 1, the area of each bar (AUC) corresponds to probability.
- For the 4-patient example, Pr(X = 2) = area of shaded region = height × base = 0.2109 × 1.0 = 0.2109.
Cumulative Probability
- The probability of observing a given value or less (notation: Pr(X ≤ x)).
- Corresponds to the AUC to the left of and including that value (the left tail).
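For a discrete variable, cumulative probabilities are running totals of the PMF. The sketch below applies this to the 4-patient example, with the PMF values typed in by hand and illustrative variable names.

```python
from itertools import accumulate

# pmf[x] = Pr(X = x) for the 4-patient example; the list index is x.
pmf = [0.00390625, 0.046875, 0.2109375, 0.421875, 0.31640625]

# cdf[x] = Pr(X <= x): the running total of the PMF up to and including x.
cdf = list(accumulate(pmf))

for x, pr in enumerate(cdf):
    print(f"Pr(X <= {x}) = {pr:.4f}")
```

For instance, Pr(X ≤ 2) = 0.0039 + 0.0469 + 0.2109 ≈ 0.2617, and the last cumulative probability is 1.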
Continuous Random Variables
- Defined by a continuum of possible values (e.g., a spinner generating random numbers between 0 and 1).
Probability Density Function (PDF)
- A mathematical relation that assigns probabilities to ranges of outcomes for a continuous random variable. The probability of any single exact value is 0.
- The area under the curve over an interval represents probability (e.g., for the spinner, Pr(0 ≤ X ≤ 0.5) = 0.5).
- PDFs obey all rules of probability and come in many shapes, the most common being the Normal distribution.
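The spinner example can be reproduced numerically. The sketch below assumes the spinner is equally likely to land anywhere between 0 and 1 (a uniform density of height 1) and approximates the area under the curve with a midpoint Riemann sum. The function names `uniform_pdf` and `area_under_curve` are illustrative.

```python
def uniform_pdf(x):
    """Density of a spinner equally likely to land anywhere in [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0

def area_under_curve(pdf, a, b, steps=10_000):
    """Approximate the area under pdf between a and b (midpoint Riemann sum)."""
    width = (b - a) / steps
    return sum(pdf(a + (i + 0.5) * width) for i in range(steps)) * width

pr_lower_half = area_under_curve(uniform_pdf, 0.0, 0.5)   # Pr(0 <= X <= 0.5)
total_area = area_under_curve(uniform_pdf, 0.0, 1.0)      # should be close to 1

print(round(pr_lower_half, 4), round(total_area, 4))
```

The same `area_under_curve` helper would work for any other density function passed in as `pdf`, at the cost of some numerical approximation error.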
AUC in PDFs
Like PMFs, PDFs display probability with the AUC. For example, in a histogram of ages, the shaded bars corresponding to ages greater than or equal to 9 make up 40% of the total area. The corresponding shaded AUC on the Normal PDF curve fitted to the same data is also about 40% of the total.
Bayes' Theorem
Binomial Distributions
Binomial: A family of discrete random variables.
Binomial Random Variables: The random number of successes in n independent Bernoulli trials.
- Parameters: n = number of trials; p = probability of success for each trial.
- Bernoulli trials: Trials with two possible outcomes (success or failure), where the probability of success is the same on every trial.
Binomial Probabilities
A binomial probability is the probability of exactly x successes in n independent trials of an experiment with two possible outcomes.
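The standard formula for this probability is Pr(X = x) = C(n, x) · p^x · (1 − p)^(n − x), where C(n, x) = n! / (x!(n − x)!) counts the ways to choose which x of the n trials are successes. A minimal sketch follows. The function name `binomial_pmf` is illustrative, and `math.comb` requires Python 3.8 or later.

```python
from math import comb

def binomial_pmf(x, n, p):
    """Pr(X = x) for a binomial variable with n trials and success probability p."""
    return comb(n, x) * p**x * (1 - p) ** (n - x)

# 4 patients, 75% success rate: probability of exactly 2 successes.
print(round(binomial_pmf(2, 4, 0.75), 4))

# The probabilities over x = 0..n should add up to 1.
print(sum(binomial_pmf(x, 4, 0.75) for x in range(5)))
```

With n = 4 and p = 0.75, the first line gives 0.2109, matching the shaded-bar value for Pr(X = 2).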