## AM-GM-HM Inequality: A Statistical Point of View

In this post we shall give another proof of the famous AM-GM-HM inequality:

If $x_1,\cdots,x_n$ are positive real numbers, then AM $\geq$ GM $\geq$ HM; precisely,

$\displaystyle \frac{1}{n} \sum_{i=1}^n x_i\geq \sqrt[n]{\prod_{i=1}^n x_i} \geq \frac{n}{\sum_{i=1}^n x_i^{-1}}$.

I remember that in high school the AM-GM inequality was proved by direct induction or calculus. Later I learned proofs using backward induction, convexity, Jensen’s inequality, and so on (see the Wikipedia article on the AM-GM inequality for details). The following proof, which I learned in a statistics class, comes from Stefanski (1996). The only prerequisite is 100-200 level calculus.

We first present the proof in purely mathematical terms. After that, we describe its statistical meaning.

Proof. Fix $x_1,\cdots,x_n>0$. Consider a map $L_1: \mathbb{R}_{++}^n \rightarrow \mathbb{R}$ given by

$\displaystyle L_1 (\lambda_1,\cdots,\lambda_n) = \left(\prod_{i=1}^n \lambda_i \right) \exp \left( - \sum_{i=1}^n \lambda_i x_i\right)$,

where $\mathbb{R}_{++} \triangleq \{\lambda\in \mathbb{R}: \lambda > 0\}$.

By standard differentiation techniques from calculus,

$\displaystyle \max_{(\lambda_1,\cdots,\lambda_n) \in \mathbb{R}_{++}^n} L_1(\lambda_1,\cdots,\lambda_n) = \left(\prod_{i=1}^n x_i \right)^{-1} e^{-n}$

and

$\displaystyle \max_{\lambda>0} L_1(\lambda,\cdots,\lambda) = \left(\frac{1}{n} \sum_{i=1}^n x_i \right)^{-n} e^{-n}$.
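For completeness, both maxima follow from routine first-order conditions ($\log L_1$ is strictly concave in each $\lambda_i$, so the critical point is the maximizer):

```latex
\frac{\partial}{\partial\lambda_i}\log L_1=\frac{1}{\lambda_i}-x_i=0
\;\Longrightarrow\;\lambda_i^\ast=\frac{1}{x_i},
\qquad
L_1(\lambda_1^\ast,\cdots,\lambda_n^\ast)
=\prod_{i=1}^n\frac{e^{-1}}{x_i}
=\Bigl(\prod_{i=1}^n x_i\Bigr)^{-1}e^{-n};
```

```latex
\frac{d}{d\lambda}\log L_1(\lambda,\cdots,\lambda)
=\frac{n}{\lambda}-\sum_{i=1}^n x_i=0
\;\Longrightarrow\;\lambda^\ast=\frac{n}{\sum_{i=1}^n x_i},
\qquad
L_1(\lambda^\ast,\cdots,\lambda^\ast)
=\Bigl(\frac{1}{n}\sum_{i=1}^n x_i\Bigr)^{-n}e^{-n}.
```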

Since the diagonal $\{(\lambda,\cdots,\lambda):\lambda>0\}$ is a subset of $\mathbb{R}_{++}^n$, we have $\displaystyle \max_{\lambda>0} L_1(\lambda,\cdots,\lambda)\leq \max_{(\lambda_1,\cdots,\lambda_n) \in \mathbb{R}_{++}^n} L_1(\lambda_1,\cdots,\lambda_n)$, that is, $\displaystyle \left(\frac{1}{n}\sum_{i=1}^n x_i\right)^{-n} \leq \left(\prod_{i=1}^n x_i\right)^{-1}$. Raising both sides to the power $-1/n$ (which reverses the inequality) gives the AM-GM inequality.
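A quick numerical sanity check of this step (a standalone sketch, not part of the proof; the sample `x` is arbitrary): evaluating the two closed-form maxima of $L_1$ confirms that the diagonal maximum is dominated by the unrestricted one, which is exactly AM $\geq$ GM.

```python
import math
import random

random.seed(0)
x = [random.uniform(0.1, 10.0) for _ in range(5)]  # arbitrary positive sample
n = len(x)

# Unrestricted maximum of L1, attained at lambda_i = 1/x_i:
max_full = math.exp(-n) / math.prod(x)

# Maximum of L1 along the diagonal, attained at lambda = n / sum(x):
am = sum(x) / n
max_diag = am ** (-n) * math.exp(-n)

gm = math.prod(x) ** (1 / n)
assert max_diag <= max_full  # diagonal max cannot exceed unrestricted max
assert gm <= am              # the equivalent AM-GM statement
```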

Next, we consider $L_2: \mathbb{R}_{++}^n \rightarrow \mathbb{R}$ given by

$\displaystyle L_2 (\lambda_1,\cdots,\lambda_n) = \frac{\prod_{i=1}^n \lambda_i}{\left(\prod_{i=1}^n x_i\right)^2}\exp\left( -\sum_{i=1}^n \frac{\lambda_i}{x_i}\right)$.

Then

$\displaystyle \max_{(\lambda_1,\cdots,\lambda_n) \in \mathbb{R}_{++}^n} L_2(\lambda_1,\cdots,\lambda_n)= \left(\prod_{i=1}^n x_i\right)^{-1}e^{-n}$

and

$\displaystyle \max_{\lambda>0} L_2(\lambda,\cdots,\lambda)=\frac{n^n}{\left(\sum_{i=1}^n x_i^{-1}\right)^n \left( \prod_{i=1}^n x_i\right)^2}e^{-n}$.
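These maxima again follow from first-order conditions; only the critical points change:

```latex
\frac{\partial}{\partial\lambda_i}\log L_2=\frac{1}{\lambda_i}-\frac{1}{x_i}=0
\;\Longrightarrow\;\lambda_i^\ast=x_i,
\qquad
L_2(\lambda_1^\ast,\cdots,\lambda_n^\ast)
=\frac{\prod_{i=1}^n x_i}{\bigl(\prod_{i=1}^n x_i\bigr)^2}\,e^{-n}
=\Bigl(\prod_{i=1}^n x_i\Bigr)^{-1}e^{-n};
```

```latex
\frac{d}{d\lambda}\log L_2(\lambda,\cdots,\lambda)
=\frac{n}{\lambda}-\sum_{i=1}^n\frac{1}{x_i}=0
\;\Longrightarrow\;\lambda^\ast=\frac{n}{\sum_{i=1}^n x_i^{-1}}.
```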

Again the diagonal maximum cannot exceed the unrestricted maximum: $\displaystyle \max_{\lambda>0} L_2(\lambda,\cdots,\lambda)\leq \max_{(\lambda_1,\cdots,\lambda_n) \in \mathbb{R}_{++}^n} L_2(\lambda_1,\cdots,\lambda_n)$, which simplifies to $\displaystyle \left(\frac{n}{\sum_{i=1}^n x_i^{-1}}\right)^n \leq \prod_{i=1}^n x_i$. Taking $n$-th roots gives the GM-HM inequality, and the proof is complete.          $\square$
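The same numerical check for $L_2$ (again a standalone sketch with an arbitrary sample): the closed-form maxima reproduce GM $\geq$ HM, and together the two comparisons verify the whole chain.

```python
import math
import random

random.seed(1)
x = [random.uniform(0.1, 10.0) for _ in range(6)]  # arbitrary positive sample
n = len(x)

am = sum(x) / n
gm = math.prod(x) ** (1 / n)
hm = n / sum(1 / xi for xi in x)

# Closed-form maxima of L2 from the proof:
max_full = math.exp(-n) / math.prod(x)                 # at lambda_i = x_i
max_diag = hm ** n / math.prod(x) ** 2 * math.exp(-n)  # at lambda = n / sum(1/x_i)

assert max_diag <= max_full  # equivalent to (HM)^n <= (GM)^n
assert am >= gm >= hm        # the full AM-GM-HM chain
```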

The function $L_1$ is precisely the joint probability density function of independent random variables $X_i\sim$ Exponential($\lambda_i$), each with density $x_i \mapsto \lambda_i e^{-\lambda_i x_i}\,\mathbf{1}_{(0,+\infty)}(x_i)$, where $\mathbf{1}$ denotes the indicator function. Viewed as a function of the parameters with the data fixed, the joint density is called the likelihood. Then for a fixed $(x_1,\cdots,x_n)$,

$\displaystyle \Lambda_1 \triangleq \frac{\max_{\lambda>0} L_1(\lambda,\cdots,\lambda)}{\max_{(\lambda_1,\cdots,\lambda_n) \in \mathbb{R}_{++}^n} L_1(\lambda_1,\cdots,\lambda_n)}$

is the likelihood ratio test statistic for testing the null hypothesis $H_0:\lambda_1=\cdots=\lambda_n$ against the alternative hypothesis $H_1:$ the $\lambda_i$ are not all equal. Clearly $0< \Lambda_1\leq 1$. The test rejects the null hypothesis if and only if $\Lambda_1$ is too small: the numerator is the best likelihood attainable under $H_0$, so a small ratio means the observed data are far less plausible under $H_0$ than under the unrestricted model.

The function $L_2$ is the likelihood of independent random variables $X_i^{-1}$ where $X_i\sim$ Exponential($\lambda_i$). Then given $(x_1,\cdots,x_n)$,
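To see where $L_2$ comes from: if $X\sim$ Exponential($\lambda$), the density of $X^{-1}$ follows from a one-line change of variables,

```latex
P(X^{-1}\le y)=P(X\ge 1/y)=e^{-\lambda/y}\quad(y>0),
\qquad
f_{X^{-1}}(y)=\frac{d}{dy}\,e^{-\lambda/y}=\frac{\lambda}{y^2}\,e^{-\lambda/y},
```

and taking the product over $i$ at the observed values $x_i$ gives exactly $L_2$.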

$\displaystyle \Lambda_2 \triangleq \frac{\max_{\lambda>0} L_2(\lambda,\cdots,\lambda)}{\max_{(\lambda_1,\cdots,\lambda_n) \in \mathbb{R}_{++}^n} L_2(\lambda_1,\cdots,\lambda_n)}$

is the likelihood ratio test statistic for the same hypotheses $H_0:\lambda_1=\cdots=\lambda_n$ versus $H_1:$ the $\lambda_i$ are not all equal; again, the test rejects the null hypothesis if and only if $\Lambda_2$ is too small.
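Dividing the maxima computed in the proof, both test statistics have closed forms: $\Lambda_1 = (\mathrm{GM}/\mathrm{AM})^n$ and $\Lambda_2 = (\mathrm{HM}/\mathrm{GM})^n$. The inequality chain is therefore literally the statement that each statistic lies in $(0,1]$, with $\Lambda=1$ exactly when all the $x_i$ are equal. A small standalone check (the sample below is arbitrary):

```python
import math
import random

random.seed(2)
x = [random.uniform(0.1, 10.0) for _ in range(8)]  # arbitrary positive sample
n = len(x)

am = sum(x) / n
gm = math.prod(x) ** (1 / n)
hm = n / sum(1 / xi for xi in x)

lam1 = (gm / am) ** n  # ratio of diagonal to unrestricted maximum of L1
lam2 = (hm / gm) ** n  # same ratio for L2

assert 0 < lam1 <= 1 and 0 < lam2 <= 1

# Lambda = 1 iff all observations are equal (up to floating-point error):
y = [3.0] * n
lam_eq = (math.prod(y) ** (1 / n) / (sum(y) / n)) ** n
assert abs(lam_eq - 1.0) < 1e-9
```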