# An interlacing property of order statistics

In this post, we present an interlacing property of order statistics.

Suppose we have a distribution ${F}$ on the real line ${\mathbb{R}}$ and ${\theta_i \overset{\text{iid}}{\sim} F}$ for ${i=1,\dots, n}$, where ${\overset{\text{iid}}{\sim}}$ means “independently and identically distributed as”. We denote by ${\theta_{(i)}^n}$ the ${(n-i+1)}$-th largest of the ${n}$ many ${\theta_i}$s. So ${\theta_{(n)}^n}$ is the largest value and ${\theta_{(1)}^n}$ is the smallest of the ${n}$ many iid ${\theta}$s. Note here the superscript ${n}$ in ${\theta^n_{(i)}}$ does not mean taking the ${n}$-th power; it records how many ${\theta_i}$s we are considering.

The ordered variables ${\{\theta^{n}_{(i)}\}_{i=1}^n}$ are called order statistics in Statistics. They are sufficient statistics for the distribution ${F}$ when the ${\theta_i}$s are independent and all follow ${F}$. We now present an interlacing property of order statistics.
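Before stating the result, here is a small numerical illustration of the notation (my own sketch using NumPy, with ${F}$ taken to be the standard normal):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
theta = rng.normal(size=n)      # theta_1, ..., theta_n iid from F = N(0, 1)
order_stats = np.sort(theta)    # order_stats[i - 1] is theta_{(i)}^n

# theta_{(n)}^n is the largest value and theta_{(1)}^n is the smallest
assert order_stats[-1] == theta.max()
assert order_stats[0] == theta.min()
```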

Theorem 1 For any ${n,m>0}$, ${1\leq l\leq n}$, ${m-n+1\leq i\leq m}$, and ${x\in \mathbb{R}}$, we have

\displaystyle \begin{aligned} \mathbb{P}( \theta_{(n+i)}^{n+m}\leq x)\leq \mathbb{P} (\theta^n_{(n-m+i)}\leq x)\text{ and } \mathbb{P}( \theta_{(l)}^n\leq x) \leq \mathbb{P}(\theta_{(l)}^{n+m}\leq x) \end{aligned} \ \ \ \ \ (1)

If ${\theta_i}$ has finite mean, then we have

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n+i)}^{n+m})\geq \mathbb{E} (\theta^n_{(n-m+i)} )\text{ and } \mathbb{E}( \theta_{(l)}^n) \geq \mathbb{E}(\theta_{(l)}^{n+m}) \end{aligned} \ \ \ \ \ (2)

The right inequalities of (1) and (2) say that, fixing the rank ${l}$ from the bottom, the larger the sample size, the (stochastically) smaller the ${l}$-th smallest value. The left inequalities of (1) and (2) say that, fixing the rank from the top (both ${\theta_{(n+i)}^{n+m}}$ and ${\theta^n_{(n-m+i)}}$ are the ${(m-i+1)}$-th largest value of their respective samples), the larger the sample size, the larger the order statistic.

The first inequality chain is the stochastic dominance version of the second, and it is the stronger property: (1) implies (2).
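The dominance relations in (1) can be sanity-checked by Monte Carlo. Below is a sketch (my own, using NumPy); the distribution ${F = \mathrm{Exp}(1)}$ and the values ${n=4}$, ${m=2}$, ${l=3}$, ${i=0}$, ${x=1}$ are arbitrary choices satisfying the conditions of Theorem 1. Because the two samples are coupled (the sub-sample is the first ${n}$ columns of the full sample), the empirical probabilities satisfy the inequalities exactly, not just up to Monte Carlo error:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, l, i = 4, 2, 3, 0    # satisfies l <= n and m - n + 1 <= i <= m
reps = 20000
x = 1.0                    # an arbitrary threshold

samples = rng.exponential(size=(reps, n + m))  # iid draws from F = Exp(1)
full = np.sort(samples, axis=1)                # order statistics of all n + m draws
sub = np.sort(samples[:, :n], axis=1)          # order statistics of the first n draws

# Left inequality of (1): P(theta_{(n+i)}^{n+m} <= x) <= P(theta_{(n-m+i)}^{n} <= x)
p_left_big = (full[:, n + i - 1] <= x).mean()
p_left_small = (sub[:, n - m + i - 1] <= x).mean()
assert p_left_big <= p_left_small

# Right inequality of (1): P(theta_{(l)}^{n} <= x) <= P(theta_{(l)}^{n+m} <= x)
p_right_small = (sub[:, l - 1] <= x).mean()
p_right_big = (full[:, l - 1] <= x).mean()
assert p_right_small <= p_right_big
```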

Let us now explain why we call Theorem 1 an interlacing property. Consider the case that ${n\geq 2}$ and take ${m=1}$, ${l =n}$, and ${i=0}$ in Theorem 1. The inequality (2) gives that

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n)}^n) \geq \mathbb{E}(\theta_{(n)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-1)}). \end{aligned} \ \ \ \ \ (3)

If ${n\geq 3}$, then we can take ${l=n-1}$, ${m=1}$, and ${i=-1}$. Then inequality (2) gives

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n-1)}^{n}) \geq \mathbb{E}(\theta_{(n-1)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-2)}). \end{aligned} \ \ \ \ \ (4)

Thus from (3) and (4) we have

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n)}^n) \geq \mathbb{E}(\theta_{(n)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-1)})\geq \mathbb{E}(\theta_{(n-1)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-2)}). \end{aligned} \ \ \ \ \ (5)

Continuing to decrease ${l}$ and ${i}$ in this way while fixing ${m=1}$, we see that

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n+1)}^{n+1})\geq \mathbb{E}( \theta_{(n)}^n) \geq \mathbb{E}(\theta_{(n)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-1)})\geq\dots \geq\mathbb{E} (\theta^{n+1}_{(2)})\geq \mathbb{E} (\theta^{n}_{(1)})\geq \mathbb{E} (\theta^{n+1}_{(1)}) . \end{aligned} \ \ \ \ \ (6)

Thus we see the quantities ${\{\theta_{(i)}^n\}_{i=1}^n}$ and ${\{\theta_{(i)}^{n+1}\}_{i=1}^{n+1}}$ are interlacing. The main idea of the proof is to put ${\theta_{(l)}^n}$, ${\theta_{(l)}^{n+m}}$, and ${\theta_{(l-m)}^n}$ in the same probability space and argue that the inequality actually holds almost surely, which is stronger than stochastic dominance or comparison in expectation. This technique is known as coupling.
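The chain (6) can also be checked numerically. The following sketch (my own illustration; NumPy, with ${F}$ standard normal and ${n=4}$) estimates both sets of expectations from coupled samples and verifies that the merged sequence is decreasing:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 4, 20000

samples = rng.normal(size=(reps, n + 1))              # iid N(0, 1); F and n are my choices
big = np.sort(samples, axis=1).mean(axis=0)           # E theta_{(i)}^{n+1}, i = 1..n+1
small = np.sort(samples[:, :n], axis=1).mean(axis=0)  # E theta_{(i)}^{n},   i = 1..n

# Merge the two sequences as in (6):
# E theta_{(n+1)}^{n+1} >= E theta_{(n)}^{n} >= E theta_{(n)}^{n+1} >= ... >= E theta_{(1)}^{n+1}
chain = [big[n]]
for i in range(n, 0, -1):
    chain += [small[i - 1], big[i - 1]]
assert all(a >= b for a, b in zip(chain, chain[1:]))
```

Because the two samples share their first ${n}$ columns, the interlacing holds for the empirical averages exactly, not merely in the limit.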

Proof: Let us first prove the right inequality. Suppose we first generate ${n}$ many iid ${\theta_i}$s following ${F}$ and order them to obtain the ${\theta_{(i)}^n}$s. Now generate, independently, ${m}$ many more variables ${\theta_{n+i}\sim F}$ with ${i=1,\dots,m}$. Considering all the ${\theta}$s we generated, we find ${\theta_{(l)}^{n+m}}$ is at most ${\theta_{(l)}^n}$ since we have

$\displaystyle \theta_{(l)}^{n+m} = \theta_{(l)}^n \text{ if } \theta_{i+n} \geq \theta_{(l)}^n,\forall 1\leq i\leq m$

and

$\displaystyle \theta_{(l)}^{n+m} < \theta_{(l)}^n \text{ if } \exists i \text{ s.t. } \theta_{i+n}< \theta_{(l)}^n.$

Since we have almost sure inequality, we certainly have stochastic dominance and inequality in expectation. Now consider the left inequality (take ${l=n+i}$, so that ${n-m+i=l-m}$). We still generate the ${\theta_i}$s as before. After we have the ${\theta_{(i)}^n}$s, the ${(n-(l-m)+1)}$-th largest value ${\theta_{(l-m)}^{n}}$ of the first ${n}$ draws is always at most the ${(n-(l-m)+1)}$-th largest value ${\theta_{(l)}^{n+m}}$ of all the draws since

$\displaystyle \theta_{(l)}^{n+m} = \theta_{(l-m)}^{n} \text{ if } \theta_{i+n} \leq \theta_{(l-m)}^n,\forall 1\leq i\leq m$

and

$\displaystyle \theta_{(l)}^{n+m} > \theta_{(l-m)}^{n} \text{ if } \exists i \text{ s.t. } \theta_{i+n} >\theta_{(l-m)}^n.$

Thus we have shown stochastic dominance for the random variables we are considering. The proof here is a bit lengthy but elementary; one may draw a picture of the data-generating process and compare the ${\theta_i}$s to see the correctness. $\Box$
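The coupling argument above can be replayed directly in code: generate the first ${n}$ draws, add ${m}$ more, and check the almost sure inequalities sample by sample. A sketch (the choices of ${F}$, ${n}$, ${m}$, and ${l}$ are mine):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, l = 6, 3, 5                  # arbitrary choices with m < l <= n

for _ in range(1000):
    first = rng.normal(size=n)     # the first n iid draws from F = N(0, 1)
    extra = rng.normal(size=m)     # m more iid draws, generated independently
    both = np.concatenate([first, extra])

    sub = np.sort(first)           # theta_{(i)}^n
    full = np.sort(both)           # theta_{(i)}^{n+m}

    # Almost sure inequalities from the coupling argument:
    assert full[l - 1] <= sub[l - 1]        # theta_{(l)}^{n+m} <= theta_{(l)}^n
    assert full[l - 1] >= sub[l - m - 1]    # theta_{(l)}^{n+m} >= theta_{(l-m)}^n
```

No assertion ever fires: the inequalities hold for every realization, which is exactly what "almost sure" buys over plain stochastic dominance.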

Interestingly, the above proof did not actually use the assumption that the ${\theta}$s are iid. This means we get the same conclusion for any joint distribution. We state this result explicitly below.

Theorem 2 Let ${\theta_1,\dots, \theta_{n+m}}$ have joint distribution ${F_{n+m}}$. Denote by ${\theta_{(1)}\leq \theta_{(2)}\leq \dots \leq \theta_{(n)} \leq \dots \leq \theta_{(n+m)}}$ the order statistics. Now consider only the first ${n}$ many ${\theta_i}$s, i.e., ${\theta_1,\dots,\theta_n}$, and denote their marginal distribution by ${F_{n}}$. We denote the order statistics of these ${n}$ many ${\theta}$s by ${\tilde{\theta}_{(1)} \leq \dots \leq \tilde{\theta}_{(n)}}$. Then for any ${n,m>0}$, ${m< l\leq n}$, ${\max(l+m-n,1)\leq i\leq m}$, and ${x\in \mathbb{R}}$, we have

\displaystyle \begin{aligned} \mathbb{P}( \theta_{(n+i)}\leq x)\leq \mathbb{P}( \tilde{\theta}_{(l)}\leq x) \leq \mathbb{P}(\theta_{(l)}\leq x)\leq \mathbb{P} (\tilde{\theta}_{(l-m)}\leq x). \end{aligned} \ \ \ \ \ (8)

If each ${\theta_i}$ has finite mean, then we have

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n+i)})\geq \mathbb{E}( \tilde{\theta}_{(l)}) \geq\mathbb{E}(\theta_{(l)})\geq\mathbb{E} (\tilde{\theta}_{(l-m)}). \end{aligned} \ \ \ \ \ (9)

Proof: The proof is a repetition of the previous argument. Let us first prove the middle inequality of (8) and (9). Suppose we first generate the ${n}$ many ${\theta_i}$s following ${F_n}$ and order them to obtain the ${\tilde{\theta}_{(i)}}$s. Now generate the extra ${m}$ variables ${\theta_{n+1},\dots,\theta_{n+m}}$ from the conditional distribution determined by ${F_{n+m}}$ given the first ${n}$ many ${\theta}$s. Specifically, the conditional distribution is

$\displaystyle P(\theta_{n+1}\leq x_1,\dots, \theta_{n+m} \leq x_m \mid \theta_1,\dots,\theta_n ).$

We now consider all the ${\theta}$s we generated. Note that the unconditional distribution of all the ${\theta}$s is just ${F_{n+m}}$. We find ${\theta_{(l)}}$ is at most ${\tilde{\theta}_{(l)}}$ since we have

$\displaystyle \theta_{(l)}= \tilde{\theta}_{(l)} \text{ if } \theta_{n+i} \geq \tilde{\theta}_{(l)},\forall 1\leq i\leq m$

and

$\displaystyle \theta_{(l)} < \tilde{\theta}_{(l)}, \text{ if } \exists i \text{ s.t. } \theta_{n+i}< \tilde{\theta}_{(l)}.$

Since we have almost sure inequality, we certainly have stochastic dominance and inequality in expectation. Now consider the right-most inequality of (8) and (9). We still generate the ${\theta_i}$s as before. After we have the ${\tilde{\theta}_{(i)}}$s, the ${(n-(l-m)+1)}$-th largest value ${\tilde{\theta}_{(l-m)}}$ of the first ${n}$ many ${\theta}$s is always at most the ${(n-(l-m)+1)}$-th largest value ${\theta_{(l)}}$ of all the ${\theta}$s since

$\displaystyle \theta_{(l)} = \tilde{\theta}_{(l-m)}\text{ if } \theta_{n+i} \leq \tilde{\theta}_{(l-m)},\forall 1\leq i\leq m$

and

$\displaystyle \theta_{(l)} > \tilde{\theta}_{(l-m)} \text{ if } \exists i \text{ s.t. } \theta_{n+i} >\tilde{\theta}_{(l-m)}.$

Thus we have shown stochastic dominance for the random variables we are considering. For the left-most inequality of (8) and (9), note that at most ${m}$ of the ${n+i}$ values that are at most ${\theta_{(n+i)}}$ come from the newly generated ${\theta}$s, so at least ${n+i-m}$ of the first ${n}$ many ${\theta}$s are at most ${\theta_{(n+i)}}$. Hence ${\theta_{(n+i)}\geq \tilde{\theta}_{(n+i-m)}\geq \tilde{\theta}_{(l)}}$ almost surely, since ${l\leq n+i-m}$. $\Box$
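Since Theorem 2 drops the independence assumption, we can test it on a correlated joint distribution. The sketch below (my own illustration; equicorrelated Gaussians, with ${n=4}$, ${m=2}$, ${l=4}$, ${i=2}$ chosen so that ${l\leq n+i-m}$, which the left-most comparison needs) checks the pointwise inequalities behind (8):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, reps = 4, 2, 20000
l, i = 4, 2    # m < l <= n and l + m - n <= i <= m

# A non-iid joint distribution: equicorrelated Gaussians (correlation 0.5, my choice)
cov = 0.5 * np.ones((n + m, n + m)) + 0.5 * np.eye(n + m)
samples = rng.multivariate_normal(np.zeros(n + m), cov, size=reps)

full = np.sort(samples, axis=1)        # theta_{(1)} <= ... <= theta_{(n+m)}
sub = np.sort(samples[:, :n], axis=1)  # tilde theta_{(1)} <= ... <= tilde theta_{(n)}

# The almost sure inequalities behind (8) hold sample by sample:
assert np.all(full[:, n + i - 1] >= sub[:, l - 1])  # theta_{(n+i)} >= tilde theta_{(l)}
assert np.all(sub[:, l - 1] >= full[:, l - 1])      # tilde theta_{(l)} >= theta_{(l)}
assert np.all(full[:, l - 1] >= sub[:, l - m - 1])  # theta_{(l)} >= tilde theta_{(l-m)}
```

Here sorting the first ${n}$ columns plays the role of the ${\tilde{\theta}_{(i)}}$s, and the correlation does not matter: the inequalities are deterministic facts about sorting.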

I found this interlacing result while taking a take-home exam on game theory. It turned out not to be useful for the exam, though. I would also like to thank Yueyang Liu for catching a mistake in a previous version of this post.