Optimal Estimators, Dice Posterior & Statistical Problems
Combine Independent Unbiased Estimators
Let d1 and d2 be independent unbiased estimators of θ with variances σ1^2 and σ2^2, respectively:
- E[di] = θ for i = 1, 2.
- Var(di) = σi^2.
Any estimator of the form d = λ d1 + (1 - λ) d2 is also unbiased for any constant λ.
The variance (mean square error for an unbiased estimator) is
Var(d) = λ^2 σ1^2 + (1 - λ)^2 σ2^2.
To minimize Var(d) with respect to λ, differentiate and set to zero:
d/dλ Var(d) = 2λ σ1^2 - 2(1 - λ) σ2^2 = 0.
Solving gives the optimal weight
λ* = σ2^2 / (σ1^2 + σ2^2).
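As a sanity check, the closed-form optimum can be compared against a brute-force grid search over λ. The variances σ1^2 = 4 and σ2^2 = 1 below are illustrative values, not taken from the text:

```python
# Numeric sanity check of the optimal weight lambda* = sigma2^2 / (sigma1^2 + sigma2^2).
# The variances 4 and 1 are illustrative assumptions.

def combined_variance(lam, s1sq, s2sq):
    """Variance of d = lam*d1 + (1-lam)*d2 for independent d1, d2."""
    return lam**2 * s1sq + (1 - lam)**2 * s2sq

s1sq, s2sq = 4.0, 1.0
lam_star = s2sq / (s1sq + s2sq)  # closed-form optimum: 1/5

# Scan a grid of weights: none should beat the closed-form optimum.
grid = [i / 1000 for i in range(1001)]
best = min(grid, key=lambda lam: combined_variance(lam, s1sq, s2sq))

print(lam_star)                                  # 0.2
print(combined_variance(lam_star, s1sq, s2sq))   # 0.8 = sigma1^2 sigma2^2 / (sigma1^2 + sigma2^2)
```

Note that the minimized variance equals σ1^2 σ2^2 / (σ1^2 + σ2^2), which is smaller than either individual variance.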
Question 1: Posterior PMF for a Third Dice Roll
Assume there are five dice with sides {4, 6, 8, 12, 20}. One of these five dice is selected uniformly at random (probability 1/5) and rolled twice. The two observed results are 5 and 9. What is the posterior probability mass function (pmf) for the outcome of a third roll?
Since only the 12-sided and 20-sided dice can produce both results 5 and 9, the posterior probabilities for the chosen die D are:
- P(D = 12 | 5,9) = 25/34
- P(D = 20 | 5,9) = 9/34
- P(D = k | 5,9) = 0 for k in {4,6,8}.
Therefore the pmf of a third roll X3 given the two observations is, for integer x:
- For x in {1,2,...,12}:
P(X3 = x | 5,9) = (25/34) * (1/12) + (9/34) * (1/20) = 19/255.
- For x in {13,14,...,20}:
P(X3 = x | 5,9) = (9/34) * (1/20) = 9/680.
- Otherwise: P(X3 = x | 5,9) = 0.
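The posterior and predictive pmf above can be reproduced exactly with Python's `fractions` module (uniform 1/5 prior; since the prior is uniform it cancels in the posterior ratio):

```python
from fractions import Fraction

# Exact posterior for the die and predictive pmf of a third roll,
# given observations 5 and 9 and a uniform prior over the five dice.

dice = [4, 6, 8, 12, 20]
obs = [5, 9]

# Likelihood of the observations for each die (0 if any observation exceeds the die).
lik = {d: Fraction(1, d) ** len(obs) if all(x <= d for x in obs) else Fraction(0)
       for d in dice}
total = sum(lik.values())
posterior = {d: lik[d] / total for d in dice}

# Predictive pmf of a third roll X3.
def pmf_x3(x):
    return sum(posterior[d] * Fraction(1, d) for d in dice if 1 <= x <= d)

print(posterior[12], posterior[20])  # 25/34 9/34
print(pmf_x3(5), pmf_x3(15))         # 19/255 9/680
```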
Question 2: Sum of 1000 Fair Dice Rolls (CLT Expression)
Consider a fair six-sided die with faces 1 through 6. Let S be the sum of 1000 independent tosses. Using the Central Limit Theorem, express the approximate probability P(3000 ≤ S ≤ 4000) as an integral over the standard normal density.
For a single fair die:
- Mean per roll: μ = 3.5.
- Variance per roll: σ^2 = Var(X) = 35/12.
For S = sum of 1000 iid rolls: E[S] = 1000μ = 3500 and Var(S) = 1000 * (35/12). Let σS = sqrt(1000 * 35/12).
By the CLT, approximately
P(3000 ≤ S ≤ 4000) ≈ integral from a to b of phi(z) dz, where
- a = (3000 - 3500) / σS,
- b = (4000 - 3500) / σS,
- and phi(z) is the standard normal density phi(z) = (1 / sqrt(2π)) e^{-z^2/2}.
Equivalently:
P(3000 ≤ S ≤ 4000) ≈ ∫ from (3000-3500)/σS to (4000-3500)/σS of (1 / √(2π)) e^{-z^2/2} dz.
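To put a number on this expression, the standard normal CDF Φ can be written via `math.erf`; the limits a and b turn out to lie so far in the tails that the approximation is essentially 1:

```python
import math

# Evaluate the CLT approximation numerically.
# Phi is the standard normal CDF, expressed via math.erf (no external libraries).

def Phi(z):
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu_S = 1000 * 3.5                      # E[S] = 3500
sigma_S = math.sqrt(1000 * 35 / 12)    # sd of S, about 54.01

a = (3000 - mu_S) / sigma_S            # about -9.26
b = (4000 - mu_S) / sigma_S            # about +9.26
approx = Phi(b) - Phi(a)

print(round(sigma_S, 2))  # 54.01
print(approx)             # 1.0 to double precision
```

Since both limits are more than nine standard deviations from the mean, the event is effectively certain under the normal approximation.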
Question 3: Convex Combination of Two Unbiased Estimators
Let T1 and T2 be two independent unbiased estimators of the mean μ (i.e., E[T1] = E[T2] = μ).
- Show that for any 0 ≤ λ ≤ 1, the estimator T = λ T1 + (1 - λ) T2 is unbiased.
- Assume Var(T1) = σ1^2 and Var(T2) = σ2^2. Find the value of λ that minimizes the mean square error (equivalently, the variance) of T.
Solution sketch:
- Unbiasedness: E[T] = λ E[T1] + (1-λ) E[T2] = λμ + (1-λ)μ = μ.
- Variance: Var(T) = λ^2 σ1^2 + (1-λ)^2 σ2^2. Minimize with respect to λ to obtain
λ = σ2^2 / (σ1^2 + σ2^2).
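A quick Monte Carlo check of the unbiasedness and variance formula, assuming (purely for illustration) that T1 and T2 are normal with mean μ = 10 and standard deviations 3 and 2:

```python
import random

# Monte Carlo check of E[T] = mu and Var(T) = lam^2 s1^2 + (1-lam)^2 s2^2
# for T = lam*T1 + (1-lam)*T2.  The normal distributions and the values
# mu = 10, s1 = 3, s2 = 2 are illustrative assumptions, not from the text.

random.seed(0)
mu, s1, s2 = 10.0, 3.0, 2.0
lam = s2**2 / (s1**2 + s2**2)        # optimal weight 4/13

n = 200_000
samples = [lam * random.gauss(mu, s1) + (1 - lam) * random.gauss(mu, s2)
           for _ in range(n)]
mean = sum(samples) / n
var = sum((t - mean) ** 2 for t in samples) / n

predicted = lam**2 * s1**2 + (1 - lam)**2 * s2**2   # = 36/13, about 2.77
print(round(mean, 2), round(var, 2))
```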
Question 4: Hypothesis Test for a Fair Coin
A null hypothesis tests whether a coin is fair: H0: ψ = 1/2. The rejection rule is: in 5 independent flips, reject H0 if at least 4 outcomes are of the same type (i.e., 4 or 5 heads, or 4 or 5 tails).
- Significance level (α):
Under H0 (p = 1/2),
α = 2 * [C(5,4) (1/2)^5 + C(5,5) (1/2)^5] = 2 * (5/32 + 1/32) = 12/32 = 3/8 = 0.375.
- Power when HA: ψ = 2/3:
Let p = P(heads) = 2/3. The power is P(reject H0 | p = 2/3) = P(4 heads) + P(5 heads) + P(4 tails) + P(5 tails).
Explicitly:
P = 5 (2/3)^4 (1/3) + (2/3)^5 + 5 (1/3)^4 (2/3) + (1/3)^5.
This simplifies to 123/243 = 41/81 ≈ 0.50617.
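Both the significance level and the power can be computed exactly with binomial probabilities:

```python
from fractions import Fraction
from math import comb

# Exact significance level and power for the 5-flip test
# (reject H0 when at least 4 flips agree).

def binom_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

def reject_prob(p):
    # P(4 or 5 heads) + P(4 or 5 tails), with p = P(heads).
    # The two events are disjoint with only 5 flips.
    return (sum(binom_pmf(k, 5, p) for k in (4, 5))
            + sum(binom_pmf(k, 5, 1 - p) for k in (4, 5)))

alpha = reject_prob(Fraction(1, 2))   # 3/8
power = reject_prob(Fraction(2, 3))   # 41/81
print(alpha, power, float(power))     # 3/8 41/81 0.5061728395061729
```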
Question 5: Location PDF f(x) = 0.5 e^{-|x - θ|}
Assume the probability density function is the Laplace (double exponential) location family:
f(x) = 0.5 e^{-|x - θ|}, for x in R and parameter θ (location).
- Mean: Because the distribution is symmetric about θ, E[X] = θ.
- MSE for an estimator given N iid samples:
The maximum likelihood estimator (and the minimum-mean-absolute-deviation estimator) for the location parameter is the sample median. For symmetric distributions, the sample median is an unbiased estimator of θ (for odd N) and the MSE is Var(median) (since bias = 0). A closed-form variance depends on N and the underlying density; in general
MSE(median) = Var(median) = 1 / (4 N f(θ)^2) + o(1/N) asymptotically, where f(θ) is the pdf evaluated at the true location (here f(θ) = 0.5, so the asymptotic MSE is 1/N).
- N = 3 and samples 1, 3, 5:
The sample median is 3, so the MLE is θ_MLE = 3. The MSE of this estimator as a random quantity depends on the true θ, but the point estimate is 3.
- General MLE for N samples:
The MLE of θ given iid Laplace samples is any median of the sample (for odd N the unique sample median). For even N, any value between the two middle order statistics maximizes the likelihood.
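The claim that the sample median maximizes the Laplace likelihood can be checked directly: up to the constant N log 2, the negative log-likelihood is the total absolute deviation Σ |x_i - θ|, minimized here on a grid for the sample (1, 3, 5) from the text:

```python
# For Laplace samples the negative log-likelihood is N*log(2) + sum |x_i - theta|,
# so the MLE minimizes the total absolute deviation -- i.e. it is the sample median.

samples = [1.0, 3.0, 5.0]

def neg_log_lik(theta, xs):
    # Total absolute deviation; the constant N*log(2) is dropped.
    return sum(abs(x - theta) for x in xs)

grid = [i / 100 for i in range(0, 601)]   # theta in [0, 6]
theta_hat = min(grid, key=lambda t: neg_log_lik(t, samples))

print(theta_hat)                  # 3.0 (the sample median)
print(neg_log_lik(3.0, samples))  # 4.0
```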
Question 6: Combining Two Measurements to Estimate Building Height
We measure a building height parameter β using two measurements at distances with known functions a(ρ). For simplicity denote a1 = a(ρ1) and a2 = a(ρ2). Observations are:
- y1 = a1 β + ε1,
- y2 = a2 β + ε2,
where ε1, ε2 are independent N(0,1) errors.
An engineer forms the weighted sum c1 y1 + c2 y2 to estimate β. We seek weights that minimize MSE.
- Minimize MSE under unbiasedness (BLUE):
For unbiasedness require c1 a1 + c2 a2 = 1. Minimize Var(c1 y1 + c2 y2) = c1^2 + c2^2 subject to the constraint. The solution (Gauss–Markov / least squares) is:
c1 = a1 / (a1^2 + a2^2),
c2 = a2 / (a1^2 + a2^2).
This estimator equals (a1 y1 + a2 y2) / (a1^2 + a2^2).
- Maximum Likelihood Estimate (MLE):
Because the errors are Gaussian with equal known variance and independent, the MLE for β is identical to the BLUE above. Thus the MLE uses the same weights c1, c2 as given.
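A minimal sketch of the BLUE weights, with illustrative regressor values a1 = 2, a2 = 3 (assumptions, not from the text):

```python
# Closed-form BLUE / least-squares weights from the constrained minimization above.
# The regressor values a1 = 2, a2 = 3 are illustrative assumptions.

def blue_weights(a1, a2):
    denom = a1**2 + a2**2
    return a1 / denom, a2 / denom

a1, a2 = 2.0, 3.0
c1, c2 = blue_weights(a1, a2)

print(c1, c2)                      # 2/13 and 3/13
print(round(c1 * a1 + c2 * a2, 9)) # 1.0: the unbiasedness constraint holds
print(c1**2 + c2**2)               # 1/13: the minimized variance
```

Because the denominator a1^2 + a2^2 grows with the regressors, a measurement taken with a larger |a_i| (a more informative geometry) receives proportionally more weight and the variance 1/(a1^2 + a2^2) shrinks.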