# An interlacing property of order statistics

In this post, we present an interlacing property of order statistics.

Suppose we have a distribution ${F}$ on the real line ${\mathbb{R}}$ and ${\theta_i \overset{\text{iid}}{\sim} F}$ for ${i=1,\dots, n}$, where ${\overset{\text{iid}}{\sim}}$ means “independently and identically distributed as”. We denote by ${\theta_{(i)}^n}$ the ${(n-i+1)}$-th largest of the ${n}$ many ${\theta_i}$s. So ${\theta_{(n)}^n}$ is the largest value and ${\theta_{(1)}^n}$ is the smallest of the ${n}$ many iid ${\theta}$s. Note here the superscript ${n}$ in ${\theta^n_{(i)}}$ does not mean taking the ${n}$-th power; it records how many ${\theta_i}$s we are considering.

The ordered variables ${\{\theta^{n}_{(i)}\}_{i=1}^n}$ are called order statistics in Statistics. They are sufficient statistics for the distribution ${F}$ when the ${\theta_i}$s are independent and all follow ${F}$. We now present an interlacing property of order statistics.
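Before stating the result, here is a small numerical illustration of the notation (my own sketch using NumPy, with ${F}$ taken to be the standard normal):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
theta = rng.normal(size=n)      # theta_1, ..., theta_n iid from F = N(0, 1)
order_stats = np.sort(theta)    # order_stats[i - 1] is theta_{(i)}^n

# theta_{(n)}^n is the largest value and theta_{(1)}^n is the smallest
assert order_stats[-1] == theta.max()
assert order_stats[0] == theta.min()
```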

Theorem 1 For any ${n,m>0}$, ${1\leq l\leq n}$, ${m-n+1\leq i\leq m}$, and ${x\in \mathbb{R}}$, we have

\displaystyle \begin{aligned} \mathbb{P}( \theta_{(n+i)}^{n+m}\leq x)\leq \mathbb{P} (\theta^n_{(n-m+i)}\leq x)\text{ and } \mathbb{P}( \theta_{(l)}^n\leq x) \leq \mathbb{P}(\theta_{(l)}^{n+m}\leq x) \end{aligned} \ \ \ \ \ (1)

If ${\theta_i}$ has finite mean, then we have

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n+i)}^{n+m})\geq \mathbb{E} (\theta^n_{(n-m+i)} )\text{ and } \mathbb{E}( \theta_{(l)}^n) \geq \mathbb{E}(\theta_{(l)}^{n+m}) \end{aligned} \ \ \ \ \ (2)

The right inequalities of (1) and (2) say that, fixing the rank ${l}$ from the bottom, the larger the sample size, the (stochastically) smaller the ${l}$-th smallest value. The left inequalities of (1) and (2) say that, fixing the rank from the top (both ${\theta_{(n+i)}^{n+m}}$ and ${\theta^n_{(n-m+i)}}$ are the ${(m-i+1)}$-th largest value of their respective samples), the larger the sample size, the larger the order statistic.

The first inequality chain is the stochastic dominance version of the second, and it is the stronger property: (1) implies (2).
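The dominance relations in (1) can be sanity-checked by Monte Carlo. Below is a sketch (my own, using NumPy); the distribution ${F = \mathrm{Exp}(1)}$ and the values ${n=4}$, ${m=2}$, ${l=3}$, ${i=0}$, ${x=1}$ are arbitrary choices satisfying the conditions of Theorem 1. Because the two samples are coupled (the sub-sample is the first ${n}$ columns of the full sample), the empirical probabilities satisfy the inequalities exactly, not just up to Monte Carlo error:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, l, i = 4, 2, 3, 0    # satisfies l <= n and m - n + 1 <= i <= m
reps = 20000
x = 1.0                    # an arbitrary threshold

samples = rng.exponential(size=(reps, n + m))  # iid draws from F = Exp(1)
full = np.sort(samples, axis=1)                # order statistics of all n + m draws
sub = np.sort(samples[:, :n], axis=1)          # order statistics of the first n draws

# Left inequality of (1): P(theta_{(n+i)}^{n+m} <= x) <= P(theta_{(n-m+i)}^{n} <= x)
p_left_big = (full[:, n + i - 1] <= x).mean()
p_left_small = (sub[:, n - m + i - 1] <= x).mean()
assert p_left_big <= p_left_small

# Right inequality of (1): P(theta_{(l)}^{n} <= x) <= P(theta_{(l)}^{n+m} <= x)
p_right_small = (sub[:, l - 1] <= x).mean()
p_right_big = (full[:, l - 1] <= x).mean()
assert p_right_small <= p_right_big
```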

Let us now explain why we call Theorem 1 an interlacing property. Consider the case that ${n\geq 2}$ and take ${m=1}$, ${l =n}$, and ${i=0}$ in Theorem 1. The inequality (2) gives that

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n)}^n) \geq \mathbb{E}(\theta_{(n)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-1)}). \end{aligned} \ \ \ \ \ (3)

If ${n\geq 3}$, then we can take ${l=n-1}$, ${m=1}$, and ${i=-1}$. Then inequality (2) gives

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n-1)}^{n}) \geq \mathbb{E}(\theta_{(n-1)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-2)}). \end{aligned} \ \ \ \ \ (4)

Thus from (3) and (4) we have

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n)}^n) \geq \mathbb{E}(\theta_{(n)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-1)})\geq \mathbb{E}(\theta_{(n-1)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-2)}). \end{aligned} \ \ \ \ \ (5)

Continuing to decrease ${l}$ and ${i}$ in this way while fixing ${m=1}$, we see that

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n+1)}^{n+1})\geq \mathbb{E}( \theta_{(n)}^n) \geq \mathbb{E}(\theta_{(n)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-1)})\geq\dots \geq\mathbb{E} (\theta^{n+1}_{(2)})\geq \mathbb{E} (\theta^{n}_{(1)})\geq \mathbb{E} (\theta^{n+1}_{(1)}) . \end{aligned} \ \ \ \ \ (6)

Thus we see the quantities ${\{\theta_{(i)}^n\}_{i=1}^n}$ and ${\{\theta_{(i)}^{n+1}\}_{i=1}^{n+1}}$ are interlacing. The main idea of the proof is to put ${\theta_{(l)}^n}$, ${\theta_{(l)}^{n+m}}$, and ${\theta_{(l-m)}^n}$ in the same probability space and argue that the inequality actually holds almost surely, which is stronger than stochastic dominance or comparison in expectation. This technique is known as coupling.
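The chain (6) can also be checked numerically. The following sketch (my own illustration; NumPy, with ${F}$ standard normal and ${n=4}$) estimates both sets of expectations from coupled samples and verifies that the merged sequence is decreasing:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 4, 20000

samples = rng.normal(size=(reps, n + 1))              # iid N(0, 1); F and n are my choices
big = np.sort(samples, axis=1).mean(axis=0)           # E theta_{(i)}^{n+1}, i = 1..n+1
small = np.sort(samples[:, :n], axis=1).mean(axis=0)  # E theta_{(i)}^{n},   i = 1..n

# Merge the two sequences as in (6):
# E theta_{(n+1)}^{n+1} >= E theta_{(n)}^{n} >= E theta_{(n)}^{n+1} >= ... >= E theta_{(1)}^{n+1}
chain = [big[n]]
for i in range(n, 0, -1):
    chain += [small[i - 1], big[i - 1]]
assert all(a >= b for a, b in zip(chain, chain[1:]))
```

Because the two samples share their first ${n}$ columns, the interlacing holds for the empirical averages exactly, not merely in the limit.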

Proof: Let us first prove the right inequality. Suppose we first generate ${n}$ many iid ${\theta_i}$s following ${F}$ and order them to obtain the ${\theta_{(i)}^n}$s. Now generate, independently, ${m}$ many more variables ${\theta_{n+i}\sim F}$ with ${i=1,\dots,m}$. Considering all the ${\theta}$s we generated, we find ${\theta_{(l)}^{n+m}}$ is at most ${\theta_{(l)}^n}$ since we have

$\displaystyle \theta_{(l)}^{n+m} = \theta_{(l)}^n \text{ if } \theta_{i+n} \geq \theta_{(l)}^n,\forall 1\leq i\leq m$

and

$\displaystyle \theta_{(l)}^{n+m} < \theta_{(l)}^n \text{ if } \exists i \text{ s.t. } \theta_{i+n}< \theta_{(l)}^n.$

Since we have almost sure inequality, we certainly have stochastic dominance and inequality in expectation. Now consider the left inequality (take ${l=n+i}$, so that ${n-m+i=l-m}$). We still generate the ${\theta_i}$s as before. After we have the ${\theta_{(i)}^n}$s, the ${(n-(l-m)+1)}$-th largest value ${\theta_{(l-m)}^{n}}$ of the first ${n}$ draws is always at most the ${(n-(l-m)+1)}$-th largest value ${\theta_{(l)}^{n+m}}$ of all the draws since

$\displaystyle \theta_{(l)}^{n+m} = \theta_{(l-m)}^{n} \text{ if } \theta_{i+n} \leq \theta_{(l-m)}^n,\forall 1\leq i\leq m$

and

$\displaystyle \theta_{(l)}^{n+m} > \theta_{(l-m)}^{n} \text{ if } \exists i \text{ s.t. } \theta_{i+n} >\theta_{(l-m)}^n.$

Thus we have shown stochastic dominance for the random variables we are considering. The proof here is a bit lengthy but elementary; one may draw a picture of the data-generating process and compare the ${\theta_i}$s to see the correctness. $\Box$
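The coupling argument above can be replayed directly in code: generate the first ${n}$ draws, add ${m}$ more, and check the almost sure inequalities sample by sample. A sketch (the choices of ${F}$, ${n}$, ${m}$, and ${l}$ are mine):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, l = 6, 3, 5                  # arbitrary choices with m < l <= n

for _ in range(1000):
    first = rng.normal(size=n)     # the first n iid draws from F = N(0, 1)
    extra = rng.normal(size=m)     # m more iid draws, generated independently
    both = np.concatenate([first, extra])

    sub = np.sort(first)           # theta_{(i)}^n
    full = np.sort(both)           # theta_{(i)}^{n+m}

    # Almost sure inequalities from the coupling argument:
    assert full[l - 1] <= sub[l - 1]        # theta_{(l)}^{n+m} <= theta_{(l)}^n
    assert full[l - 1] >= sub[l - m - 1]    # theta_{(l)}^{n+m} >= theta_{(l-m)}^n
```

No assertion ever fires: the inequalities hold for every realization, which is exactly what "almost sure" buys over plain stochastic dominance.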

Interestingly, the above proof did not actually use the assumption that the ${\theta}$s are iid. This means we get the same conclusion for any joint distribution. We state this result explicitly below.

Theorem 2 Let ${\theta_1,\dots, \theta_{n+m}}$ have joint distribution ${F_{n+m}}$. Denote by ${\theta_{(1)}\leq \theta_{(2)}\leq \dots \leq \theta_{(n)} \leq \dots \leq \theta_{(n+m)}}$ the order statistics. Now consider only the first ${n}$ many ${\theta_i}$s, i.e., ${\theta_1,\dots,\theta_n}$, and denote their marginal distribution by ${F_{n}}$. We denote the order statistics of these ${n}$ many ${\theta}$s by ${\tilde{\theta}_{(1)} \leq \dots \leq \tilde{\theta}_{(n)}}$. Then for any ${n,m>0}$, ${m< l\leq n}$, ${\max(l+m-n,1)\leq i\leq m}$, and ${x\in \mathbb{R}}$, we have

\displaystyle \begin{aligned} \mathbb{P}( \theta_{(n+i)}\leq x)\leq \mathbb{P}( \tilde{\theta}_{(l)}\leq x) \leq \mathbb{P}(\theta_{(l)}\leq x)\leq \mathbb{P} (\tilde{\theta}_{(l-m)}\leq x). \end{aligned} \ \ \ \ \ (8)

If each ${\theta_i}$ has finite mean, then we have

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n+i)})\geq \mathbb{E}( \tilde{\theta}_{(l)}) \geq\mathbb{E}(\theta_{(l)})\geq\mathbb{E} (\tilde{\theta}_{(l-m)}). \end{aligned} \ \ \ \ \ (9)

Proof: The proof is a repetition of the previous argument. Let us first prove the middle inequality of (8) and (9). Suppose we first generate the ${n}$ many ${\theta_i}$s following ${F_n}$ and order them to obtain the ${\tilde{\theta}_{(i)}}$s. Now generate the extra ${m}$ variables ${\theta_{n+1},\dots,\theta_{n+m}}$ from the conditional distribution determined by ${F_{n+m}}$ given the first ${n}$ many ${\theta}$s. Specifically, the conditional distribution is

$\displaystyle P(\theta_{n+1}\leq x_1,\dots, \theta_{n+m} \leq x_m \mid \theta_1,\dots,\theta_n ).$

We now consider all the ${\theta}$s we generated. Note that the unconditional distribution of all the ${\theta}$s is just ${F_{n+m}}$. We find ${\theta_{(l)}}$ is at most ${\tilde{\theta}_{(l)}}$ since we have

$\displaystyle \theta_{(l)}= \tilde{\theta}_{(l)} \text{ if } \theta_{n+i} \geq \tilde{\theta}_{(l)},\forall 1\leq i\leq m$

and

$\displaystyle \theta_{(l)} < \tilde{\theta}_{(l)}, \text{ if } \exists i \text{ s.t. } \theta_{n+i}< \tilde{\theta}_{(l)}.$

Since we have almost sure inequality, we certainly have stochastic dominance and inequality in expectation. Now consider the right-most inequality of (8) and (9). We still generate the ${\theta_i}$s as before. After we have the ${\tilde{\theta}_{(i)}}$s, the ${(n-(l-m)+1)}$-th largest value ${\tilde{\theta}_{(l-m)}}$ of the first ${n}$ many ${\theta}$s is always at most the ${(n-(l-m)+1)}$-th largest value ${\theta_{(l)}}$ of all the ${\theta}$s since

$\displaystyle \theta_{(l)} = \tilde{\theta}_{(l-m)}\text{ if } \theta_{n+i} \leq \tilde{\theta}_{(l-m)},\forall 1\leq i\leq m$

and

$\displaystyle \theta_{(l)} > \tilde{\theta}_{(l-m)} \text{ if } \exists i \text{ s.t. } \theta_{n+i} >\tilde{\theta}_{(l-m)}.$

Thus we have shown stochastic dominance for the random variables we are considering. For the left-most inequality of (8) and (9), note that at most ${m}$ of the ${n+i}$ values that are at most ${\theta_{(n+i)}}$ come from the newly generated ${\theta}$s, so at least ${n+i-m}$ of the first ${n}$ many ${\theta}$s are at most ${\theta_{(n+i)}}$. Hence ${\theta_{(n+i)}\geq \tilde{\theta}_{(n+i-m)}\geq \tilde{\theta}_{(l)}}$ almost surely, since ${l\leq n+i-m}$. $\Box$
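Since Theorem 2 drops the independence assumption, we can test it on a correlated joint distribution. The sketch below (my own illustration; equicorrelated Gaussians, with ${n=4}$, ${m=2}$, ${l=4}$, ${i=2}$ chosen so that ${l\leq n+i-m}$, which the left-most comparison needs) checks the pointwise inequalities behind (8):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, reps = 4, 2, 20000
l, i = 4, 2    # m < l <= n and l + m - n <= i <= m

# A non-iid joint distribution: equicorrelated Gaussians (correlation 0.5, my choice)
cov = 0.5 * np.ones((n + m, n + m)) + 0.5 * np.eye(n + m)
samples = rng.multivariate_normal(np.zeros(n + m), cov, size=reps)

full = np.sort(samples, axis=1)        # theta_{(1)} <= ... <= theta_{(n+m)}
sub = np.sort(samples[:, :n], axis=1)  # tilde theta_{(1)} <= ... <= tilde theta_{(n)}

# The almost sure inequalities behind (8) hold sample by sample:
assert np.all(full[:, n + i - 1] >= sub[:, l - 1])  # theta_{(n+i)} >= tilde theta_{(l)}
assert np.all(sub[:, l - 1] >= full[:, l - 1])      # tilde theta_{(l)} >= theta_{(l)}
assert np.all(full[:, l - 1] >= sub[:, l - m - 1])  # theta_{(l)} >= tilde theta_{(l-m)}
```

Here sorting the first ${n}$ columns plays the role of the ${\tilde{\theta}_{(i)}}$s, and the correlation does not matter: the inequalities are deterministic facts about sorting.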

I found this interlacing result while taking a take-home exam on game theory. It turned out not to be useful for the exam, though. I would also like to thank Yueyang Liu for catching a mistake in a previous version of this post.