# An interlacing property of order statistics

In this post, we present an interlacing property of order statistics.

Suppose we have a distribution ${F}$ on the real line ${\mathbb{R}}$ and ${\theta_i \overset{\text{iid}}{\sim} F}$ for ${i=1,\dots, n}$, where ${\overset{\text{iid}}{\sim}}$ means “independently and identically distributed as”. We denote by ${\theta_{(i)}^n}$ the ${(n-i+1)}$-th largest of the ${n}$ many ${\theta_i}$s. So ${\theta_{(n)}^n}$ is the largest value and ${\theta_{(1)}^n}$ is the smallest of the ${n}$ many iid ${\theta}$s. Note here the superscript ${n}$ in ${\theta^n_{(i)}}$ does not mean taking the ${n}$-th power; it records how many ${\theta_i}$s we are considering.

The ordered variables ${\{\theta^{n}_{(i)}\}_{i=1}^n}$ are called order statistics in Statistics. They are sufficient statistics for the distribution ${F}$ when the ${\theta_i}$s are independent and all follow ${F}$. We now present an interlacing property of order statistics.
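Before stating the result, here is a small numerical illustration of the notation (my own sketch using NumPy, with ${F}$ taken to be the standard normal):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5
theta = rng.normal(size=n)      # theta_1, ..., theta_n iid from F = N(0, 1)
order_stats = np.sort(theta)    # order_stats[i - 1] is theta_{(i)}^n

# theta_{(n)}^n is the largest value and theta_{(1)}^n is the smallest
assert order_stats[-1] == theta.max()
assert order_stats[0] == theta.min()
```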

Theorem 1 For any ${n,m>0}$, ${1\leq l\leq n}$, ${m-n+1\leq i\leq m}$, and ${x\in \mathbb{R}}$, we have

\displaystyle \begin{aligned} \mathbb{P}( \theta_{(n+i)}^{n+m}\leq x)\leq \mathbb{P} (\theta^n_{(n-m+i)}\leq x)\text{ and } \mathbb{P}( \theta_{(l)}^n\leq x) \leq \mathbb{P}(\theta_{(l)}^{n+m}\leq x) \end{aligned} \ \ \ \ \ (1)

If ${\theta_i}$ has finite mean, then we have

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n+i)}^{n+m})\geq \mathbb{E} (\theta^n_{(n-m+i)} )\text{ and } \mathbb{E}( \theta_{(l)}^n) \geq \mathbb{E}(\theta_{(l)}^{n+m}) \end{aligned} \ \ \ \ \ (2)

The right inequalities of (1) and (2) say that, fixing the rank ${l}$ from the bottom, the larger the sample size, the (stochastically) smaller the ${l}$-th smallest value. The left inequalities of (1) and (2) say that, fixing the rank from the top (both ${\theta_{(n+i)}^{n+m}}$ and ${\theta^n_{(n-m+i)}}$ are the ${(m-i+1)}$-th largest value of their respective samples), the larger the sample size, the larger the order statistic.

The first inequality chain is the stochastic dominance version of the second, and it is the stronger property: (1) implies (2).
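The dominance relations in (1) can be sanity-checked by Monte Carlo. Below is a sketch (my own, using NumPy); the distribution ${F = \mathrm{Exp}(1)}$ and the values ${n=4}$, ${m=2}$, ${l=3}$, ${i=0}$, ${x=1}$ are arbitrary choices satisfying the conditions of Theorem 1. Because the two samples are coupled (the sub-sample is the first ${n}$ columns of the full sample), the empirical probabilities satisfy the inequalities exactly, not just up to Monte Carlo error:

```python
import numpy as np

rng = np.random.default_rng(1)
n, m, l, i = 4, 2, 3, 0    # satisfies l <= n and m - n + 1 <= i <= m
reps = 20000
x = 1.0                    # an arbitrary threshold

samples = rng.exponential(size=(reps, n + m))  # iid draws from F = Exp(1)
full = np.sort(samples, axis=1)                # order statistics of all n + m draws
sub = np.sort(samples[:, :n], axis=1)          # order statistics of the first n draws

# Left inequality of (1): P(theta_{(n+i)}^{n+m} <= x) <= P(theta_{(n-m+i)}^{n} <= x)
p_left_big = (full[:, n + i - 1] <= x).mean()
p_left_small = (sub[:, n - m + i - 1] <= x).mean()
assert p_left_big <= p_left_small

# Right inequality of (1): P(theta_{(l)}^{n} <= x) <= P(theta_{(l)}^{n+m} <= x)
p_right_small = (sub[:, l - 1] <= x).mean()
p_right_big = (full[:, l - 1] <= x).mean()
assert p_right_small <= p_right_big
```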

Let us now explain why we call Theorem 1 an interlacing property. Consider the case that ${n\geq 2}$ and take ${m=1}$, ${l =n}$, and ${i=0}$ in Theorem 1. The inequality (2) gives that

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n)}^n) \geq \mathbb{E}(\theta_{(n)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-1)}). \end{aligned} \ \ \ \ \ (3)

If ${n\geq 3}$, then we can take ${l=n-1}$, ${m=1}$, and ${i=-1}$. Then inequality (2) gives

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n-1)}^{n}) \geq \mathbb{E}(\theta_{(n-1)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-2)}). \end{aligned} \ \ \ \ \ (4)

Thus from (3) and (4) we have

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n)}^n) \geq \mathbb{E}(\theta_{(n)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-1)})\geq \mathbb{E}(\theta_{(n-1)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-2)}). \end{aligned} \ \ \ \ \ (5)

Continuing to decrease ${l}$ and ${i}$ in this way while fixing ${m=1}$, we see that

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n+1)}^{n+1})\geq \mathbb{E}( \theta_{(n)}^n) \geq \mathbb{E}(\theta_{(n)}^{n+1})\geq \mathbb{E} (\theta^n_{(n-1)})\geq\dots \geq\mathbb{E} (\theta^{n+1}_{(2)})\geq \mathbb{E} (\theta^{n}_{(1)})\geq \mathbb{E} (\theta^{n+1}_{(1)}) . \end{aligned} \ \ \ \ \ (6)

Thus we see the quantities ${\{\theta_{(i)}^n\}_{i=1}^n}$ and ${\{\theta_{(i)}^{n+1}\}_{i=1}^{n+1}}$ are interlacing. The main idea of the proof is to put ${\theta_{(l)}^n}$, ${\theta_{(l)}^{n+m}}$, and ${\theta_{(l-m)}^n}$ in the same probability space and argue that the inequality actually holds almost surely, which is stronger than stochastic dominance or comparison in expectation. This technique is known as coupling.
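The chain (6) can also be checked numerically. The following sketch (my own illustration; NumPy, with ${F}$ standard normal and ${n=4}$) estimates both sets of expectations from coupled samples and verifies that the merged sequence is decreasing:

```python
import numpy as np

rng = np.random.default_rng(2)
n, reps = 4, 20000

samples = rng.normal(size=(reps, n + 1))              # iid N(0, 1); F and n are my choices
big = np.sort(samples, axis=1).mean(axis=0)           # E theta_{(i)}^{n+1}, i = 1..n+1
small = np.sort(samples[:, :n], axis=1).mean(axis=0)  # E theta_{(i)}^{n},   i = 1..n

# Merge the two sequences as in (6):
# E theta_{(n+1)}^{n+1} >= E theta_{(n)}^{n} >= E theta_{(n)}^{n+1} >= ... >= E theta_{(1)}^{n+1}
chain = [big[n]]
for i in range(n, 0, -1):
    chain += [small[i - 1], big[i - 1]]
assert all(a >= b for a, b in zip(chain, chain[1:]))
```

Because the two samples share their first ${n}$ columns, the interlacing holds for the empirical averages exactly, not merely in the limit.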

Proof: Let us first prove the right inequality. Suppose we first generate ${n}$ many iid ${\theta_i}$s following ${F}$ and order them to obtain the ${\theta_{(i)}^n}$s. Now generate, independently, ${m}$ many more variables ${\theta_{n+i}\sim F}$ with ${i=1,\dots,m}$. Considering all the ${\theta}$s we generated, we find ${\theta_{(l)}^{n+m}}$ is at most ${\theta_{(l)}^n}$ since we have

$\displaystyle \theta_{(l)}^{n+m} = \theta_{(l)}^n \text{ if } \theta_{i+n} \geq \theta_{(l)}^n,\forall 1\leq i\leq m$

and

$\displaystyle \theta_{(l)}^{n+m} < \theta_{(l)}^n \text{ if } \exists i \text{ s.t. } \theta_{i+n}< \theta_{(l)}^n.$

Since we have almost sure inequality, we certainly have stochastic dominance and inequality in expectation. Now consider the left inequality (take ${l=n+i}$, so that ${n-m+i=l-m}$). We still generate the ${\theta_i}$s as before. After we have the ${\theta_{(i)}^n}$s, the ${(n-(l-m)+1)}$-th largest value ${\theta_{(l-m)}^{n}}$ of the first ${n}$ draws is always at most the ${(n-(l-m)+1)}$-th largest value ${\theta_{(l)}^{n+m}}$ of all the draws since

$\displaystyle \theta_{(l)}^{n+m} = \theta_{(l-m)}^{n} \text{ if } \theta_{i+n} \leq \theta_{(l-m)}^n,\forall 1\leq i\leq m$

and

$\displaystyle \theta_{(l)}^{n+m} > \theta_{(l-m)}^{n} \text{ if } \exists i \text{ s.t. } \theta_{i+n} >\theta_{(l-m)}^n.$

Thus we have shown stochastic dominance for the random variables we are considering. The proof here is a bit lengthy but elementary; one may draw a picture of the data-generating process and compare the ${\theta_i}$s to see the correctness. $\Box$
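The coupling argument above can be replayed directly in code: generate the first ${n}$ draws, add ${m}$ more, and check the almost sure inequalities sample by sample. A sketch (the choices of ${F}$, ${n}$, ${m}$, and ${l}$ are mine):

```python
import numpy as np

rng = np.random.default_rng(3)
n, m, l = 6, 3, 5                  # arbitrary choices with m < l <= n

for _ in range(1000):
    first = rng.normal(size=n)     # the first n iid draws from F = N(0, 1)
    extra = rng.normal(size=m)     # m more iid draws, generated independently
    both = np.concatenate([first, extra])

    sub = np.sort(first)           # theta_{(i)}^n
    full = np.sort(both)           # theta_{(i)}^{n+m}

    # Almost sure inequalities from the coupling argument:
    assert full[l - 1] <= sub[l - 1]        # theta_{(l)}^{n+m} <= theta_{(l)}^n
    assert full[l - 1] >= sub[l - m - 1]    # theta_{(l)}^{n+m} >= theta_{(l-m)}^n
```

No assertion ever fires: the inequalities hold for every realization, which is exactly what "almost sure" buys over plain stochastic dominance.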

Interestingly, the above proof did not actually use the assumption that the ${\theta}$s are iid. This means we get the same conclusion for any joint distribution. We state this result explicitly below.

Theorem 2 Let ${\theta_1,\dots, \theta_{n+m}}$ have joint distribution ${F_{n+m}}$. Denote by ${\theta_{(1)}\leq \theta_{(2)}\leq \dots \leq \theta_{(n)} \leq \dots \leq \theta_{(n+m)}}$ the order statistics. Now consider only the first ${n}$ many ${\theta_i}$s, i.e., ${\theta_1,\dots,\theta_n}$, and denote their marginal distribution by ${F_{n}}$. We denote the order statistics of these ${n}$ many ${\theta}$s by ${\tilde{\theta}_{(1)} \leq \dots \leq \tilde{\theta}_{(n)}}$. Then for any ${n,m>0}$, ${m< l\leq n}$, ${\max(l+m-n,1)\leq i\leq m}$, and ${x\in \mathbb{R}}$, we have

\displaystyle \begin{aligned} \mathbb{P}( \theta_{(n+i)}\leq x)\leq \mathbb{P}( \tilde{\theta}_{(l)}\leq x) \leq \mathbb{P}(\theta_{(l)}\leq x)\leq \mathbb{P} (\tilde{\theta}_{(l-m)}\leq x). \end{aligned} \ \ \ \ \ (8)

If each ${\theta_i}$ has finite mean, then we have

\displaystyle \begin{aligned} \mathbb{E}( \theta_{(n+i)})\geq \mathbb{E}( \tilde{\theta}_{(l)}) \geq\mathbb{E}(\theta_{(l)})\geq\mathbb{E} (\tilde{\theta}_{(l-m)}). \end{aligned} \ \ \ \ \ (9)

Proof: The proof is a repetition of the previous argument. Let us first prove the middle inequality of (8) and (9). Suppose we first generate the ${n}$ many ${\theta_i}$s following ${F_n}$ and order them to obtain the ${\tilde{\theta}_{(i)}}$s. Now generate the extra ${m}$ variables ${\theta_{n+1},\dots,\theta_{n+m}}$ from the conditional distribution determined by ${F_{n+m}}$ given the first ${n}$ many ${\theta}$s. Specifically, the conditional distribution is

$\displaystyle P(\theta_{n+1}\leq x_1,\dots, \theta_{n+m} \leq x_m \mid \theta_1,\dots,\theta_n ).$

We now consider all the ${\theta}$s we generated. Note that the unconditional distribution of all the ${\theta}$s is just ${F_{n+m}}$. We find ${\theta_{(l)}}$ is at most ${\tilde{\theta}_{(l)}}$ since we have

$\displaystyle \theta_{(l)}= \tilde{\theta}_{(l)} \text{ if } \theta_{n+i} \geq \tilde{\theta}_{(l)},\forall 1\leq i\leq m$

and

$\displaystyle \theta_{(l)} < \tilde{\theta}_{(l)}, \text{ if } \exists i \text{ s.t. } \theta_{n+i}< \tilde{\theta}_{(l)}.$

Since we have almost sure inequality, we certainly have stochastic dominance and inequality in expectation. Now consider the right-most inequality of (8) and (9). We still generate the ${\theta_i}$s as before. After we have the ${\tilde{\theta}_{(i)}}$s, the ${(n-(l-m)+1)}$-th largest value ${\tilde{\theta}_{(l-m)}}$ of the first ${n}$ many ${\theta}$s is always at most the ${(n-(l-m)+1)}$-th largest value ${\theta_{(l)}}$ of all the ${\theta}$s since

$\displaystyle \theta_{(l)} = \tilde{\theta}_{(l-m)}\text{ if } \theta_{n+i} \leq \tilde{\theta}_{(l-m)},\forall 1\leq i\leq m$

and

$\displaystyle \theta_{(l)} > \tilde{\theta}_{(l-m)} \text{ if } \exists i \text{ s.t. } \theta_{n+i} >\tilde{\theta}_{(l-m)}.$

Thus we have shown stochastic dominance for the random variables we are considering. For the left-most inequality of (8) and (9), note that at most ${m}$ of the ${n+i}$ values that are at most ${\theta_{(n+i)}}$ come from the newly generated ${\theta}$s, so at least ${n+i-m}$ of the first ${n}$ many ${\theta}$s are at most ${\theta_{(n+i)}}$. Hence ${\theta_{(n+i)}\geq \tilde{\theta}_{(n+i-m)}\geq \tilde{\theta}_{(l)}}$ almost surely, since ${l\leq n+i-m}$. $\Box$
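Since Theorem 2 drops the independence assumption, we can test it on a correlated joint distribution. The sketch below (my own illustration; equicorrelated Gaussians, with ${n=4}$, ${m=2}$, ${l=4}$, ${i=2}$ chosen so that ${l\leq n+i-m}$, which the left-most comparison needs) checks the pointwise inequalities behind (8):

```python
import numpy as np

rng = np.random.default_rng(4)
n, m, reps = 4, 2, 20000
l, i = 4, 2    # m < l <= n and l + m - n <= i <= m

# A non-iid joint distribution: equicorrelated Gaussians (correlation 0.5, my choice)
cov = 0.5 * np.ones((n + m, n + m)) + 0.5 * np.eye(n + m)
samples = rng.multivariate_normal(np.zeros(n + m), cov, size=reps)

full = np.sort(samples, axis=1)        # theta_{(1)} <= ... <= theta_{(n+m)}
sub = np.sort(samples[:, :n], axis=1)  # tilde theta_{(1)} <= ... <= tilde theta_{(n)}

# The almost sure inequalities behind (8) hold sample by sample:
assert np.all(full[:, n + i - 1] >= sub[:, l - 1])  # theta_{(n+i)} >= tilde theta_{(l)}
assert np.all(sub[:, l - 1] >= full[:, l - 1])      # tilde theta_{(l)} >= theta_{(l)}
assert np.all(full[:, l - 1] >= sub[:, l - m - 1])  # theta_{(l)} >= tilde theta_{(l-m)}
```

Here sorting the first ${n}$ columns plays the role of the ${\tilde{\theta}_{(i)}}$s, and the correlation does not matter: the inequalities are deterministic facts about sorting.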

I found this interlacing result while taking a take-home exam on game theory. It turned out not to be useful for the exam, though. I would also like to thank Yueyang Liu for catching a mistake in a previous version of this post.