Suppose you have a sequence of independent data $X_1,\dots,X_n \in \mathbb{R}^d$. How do you test that the $X_i$ all come from the same distribution, i.e., how do you test homogeneity of the data?

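To make the problem more precise, suppose we have a distribution family indexed by $\theta \in \Theta$, i.e., a set

$$\mathcal{F} = \{F_\theta \mid \theta \in \Theta\},$$

and each $X_i$ follows the distribution $F_{\theta_i}$ for some $\theta_i \in \Theta$. Our problem is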

Is $F_{\theta_1} = F_{\theta_2} = \dots = F_{\theta_n}$?

If we have that $\theta \neq \theta' \implies F_\theta \neq F_{\theta'}$ (known as the identifiability of $\theta$), then our question becomes

Is $\theta_1 = \theta_2 = \dots = \theta_n$?

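Now suppose further that each $F_\theta$ has a density $f_\theta$ (so that we can write down the likelihood). The likelihood of seeing the independent sequence $X_1,\dots,X_n$ is

$$L(\theta_1,\dots,\theta_n) = \prod_{i=1}^n f_{\theta_i}(X_i).$$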

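To test our question in a statistical way, we use hypothesis testing. Our null hypothesis is

$$H_0: \theta_1 = \theta_2 = \dots = \theta_n,$$

and our alternative hypothesis is

$$H_1: \theta_i \neq \theta_j \text{ for some } i \neq j.$$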

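Further denote the space of the null as $\Theta_0 = \{(\theta_1,\dots,\theta_n) \in \Theta^n \mid \theta_1 = \dots = \theta_n\}$ and the space of the alternative as $\Theta_1 = \Theta^n \setminus \Theta_0$. A popular and natural approach is the likelihood ratio test. We construct the test statistic, which is called the **likelihood ratio**, as

$$R = \frac{\sup_{(\theta_1,\dots,\theta_n) \in \Theta_0} L(\theta_1,\dots,\theta_n)}{\sup_{(\theta_1,\dots,\theta_n) \in \Theta_1} L(\theta_1,\dots,\theta_n)}.$$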

Intuitively, if our null hypothesis is indeed true, i.e., there is some $\theta_0 \in \Theta$ such that $\theta_1 = \dots = \theta_n = \theta_0$ and each $X_i$ follows $f_{\theta_0}$, then the ratio $R$ should be large and we have confidence that our null hypothesis is true. This means we should reject our null hypothesis if we find that $R$ is small. Thus if we want a test of our null hypothesis at significance level $\alpha$, we should reject the null hypothesis when $R < c$, where $c$ satisfies

$$\mathbb{P}(R < c \mid H_0) \leq \alpha.$$

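However, the main issue is that we don't know the distribution of $R$ under $H_0$, even if we know how to sample from each $f_\theta$ and know the functional form of $f_\theta$ for each $\theta$. The reason is that $H_0$ does not specify which $\theta_0$ (the common value of $\theta_1 = \dots = \theta_n$) generates the data. So the distribution of $R$ may depend on $\theta_0$ as well, and what we actually need from $c$ is

$$\sup_{\theta_0 \in \Theta} \mathbb{P}_{\theta_0}(R < c) \leq \alpha,$$

where $\mathbb{P}_{\theta_0}$ denotes the probability when every $X_i$ follows $f_{\theta_0}$.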

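Thus even if we only want to approximate the supremum $\sup_{\theta_0 \in \Theta} \mathbb{P}_{\theta_0}(R < c)$ through computational methods, we have to simulate the distribution of $R$ for each $\theta_0 \in \Theta$. As $\Theta_0$ could be rather large (in fact as large as $\Theta$), the approximation can be time consuming as well.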

Fortunately, if $\mathcal{F}$ is a so-called location-scale family, the distribution of $R$ is independent of $\theta_0$ and we are free to choose whichever $\theta_0$ we like. Let us define what a location-scale family is, then state the theorem and prove it.

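**Definition 1** Suppose we have a family of probability densities $\mathcal{F}$ on $\mathbb{R}^d$ indexed by $\theta = (\mu, \Sigma)$, where $\mu \in \mathbb{R}^d$ and $\Sigma \in \mathbb{S}^d_{++}$, the set of symmetric positive definite matrices in $\mathbb{R}^{d \times d}$. The family $\mathcal{F}$ is a location-scale family if there is a family member $f$ (called the *pivot*) such that for any other $f_\theta$ with $\theta = (\mu, \Sigma)$,

$$f_\theta(x) = \frac{1}{\sqrt{\det \Sigma}}\, f\left(\Sigma^{-1/2}(x - \mu)\right).$$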

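Thus if $Z$ follows $f$, then $X = \Sigma^{1/2} Z + \mu$ has probability density $f_\theta$. Indeed, for any Borel set $B \subset \mathbb{R}^d$,

$$\mathbb{P}(X \in B) = \mathbb{P}\left(Z \in \Sigma^{-1/2}(B - \mu)\right) = \int_{\Sigma^{-1/2}(B - \mu)} f(z)\, dz = \int_B \frac{1}{\sqrt{\det \Sigma}}\, f\left(\Sigma^{-1/2}(x - \mu)\right) dx,$$

where we use the change of variable $x = \Sigma^{1/2} z + \mu$ in the last equality, and the last equality shows that $X$ follows $f_\theta$. We are now ready to state the theorem and prove it.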
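As a numerical sanity check of the construction $X = \Sigma^{1/2} Z + \mu$, here is a minimal sketch that assumes the pivot is the standard normal density on $\mathbb{R}^2$. That choice of pivot, the helper names (`sym_sqrt`, `pivot_pdf`, `family_pdf`, `gaussian_pdf`) and the particular values of `mu` and `Sigma` are all assumptions made for illustration, not something fixed by the text. It compares the density built from the pivot with the textbook Gaussian density and looks at the sample moments of the transformed draws.

```python
import numpy as np

def sym_sqrt(S):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(w)) @ V.T

def pivot_pdf(z):
    """Standard normal density on R^d (the pivot assumed in this sketch)."""
    d = z.shape[-1]
    return np.exp(-0.5 * np.sum(z**2, axis=-1)) / (2 * np.pi) ** (d / 2)

def family_pdf(x, mu, Sigma):
    """det(Sigma)^{-1/2} * f(Sigma^{-1/2} (x - mu)), evaluated row by row."""
    S_inv_half = np.linalg.inv(sym_sqrt(Sigma))
    z = (x - mu) @ S_inv_half  # S_inv_half is symmetric, so rows are Sigma^{-1/2}(x - mu)
    return pivot_pdf(z) / np.sqrt(np.linalg.det(Sigma))

def gaussian_pdf(x, mu, Sigma):
    """Textbook N(mu, Sigma) density, used only as an independent reference."""
    d = len(mu)
    diff = x - mu
    quad = np.sum((diff @ np.linalg.inv(Sigma)) * diff, axis=-1)
    return np.exp(-0.5 * quad) / np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))

rng = np.random.default_rng(0)
mu = np.array([1.0, -2.0])                   # illustrative location
Sigma = np.array([[2.0, 0.6], [0.6, 1.0]])   # illustrative scale matrix

Z = rng.standard_normal((50_000, 2))         # draws from the pivot
X = Z @ sym_sqrt(Sigma) + mu                 # X = Sigma^{1/2} Z + mu, row by row

formula_matches = np.allclose(family_pdf(X[:100], mu, Sigma),
                              gaussian_pdf(X[:100], mu, Sigma))
sample_mean = X.mean(axis=0)
sample_cov = np.cov(X, rowvar=False)
```

Under these assumptions, `formula_matches` should be true, and `sample_mean` and `sample_cov` should be close to `mu` and `Sigma`.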

**Theorem 2** Suppose our family of distributions $\mathcal{F}$ is a location-scale family. Then under the null hypothesis, there is a $\theta_0 = (\mu_0, \Sigma_0)$ such that each $X_i$ follows $f_{\theta_0}$, and the distribution of $R$ is independent of $\theta_0$.

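Since the distribution of $R$ is independent of $\theta_0$ under the null, for any $\theta_0 \in \Theta$ and any $c$ we have

$$\sup_{\theta \in \Theta} \mathbb{P}_{\theta}(R < c) = \mathbb{P}_{\theta_0}(R < c).$$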

Thus we can choose any family member of $\mathcal{F}$ to sample $X_1,\dots,X_n$ from, and approximate the distribution of $R$ by its empirical distribution, as long as $\mathcal{F}$ is a location-scale family!
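The sampling recipe can be sketched as follows. This is only an illustration under simplifying assumptions that are not part of the text: the family is the one-dimensional Gaussian family, and the alternative allows one parameter per block of observations (two blocks, split at a hypothetical index `n1`) rather than one per observation, so that both suprema have closed forms. The function names `log_lr` and `critical_value` are made up for this sketch. It estimates the critical value $c$ on the log scale by simulating from two different members of the family.

```python
import numpy as np

def log_lr(x, n1):
    """log R for 1-D Gaussian data.

    Null: all observations share one (mu, sigma).
    Alternative (assumed here): x[:n1] and x[n1:] each have their own (mu, sigma).
    The maximized Gaussian likelihood is (2*pi*e*v)^(-m/2) for m points with
    MLE variance v, so the constants cancel in the ratio.
    """
    n = len(x)
    n2 = n - n1
    v0 = np.var(x)        # MLE variance under the null (ddof=0)
    v1 = np.var(x[:n1])   # MLE variance of the first block
    v2 = np.var(x[n1:])   # MLE variance of the second block
    return 0.5 * (n1 * np.log(v1) + n2 * np.log(v2) - n * np.log(v0))

def critical_value(alpha, n, n1, mu0, sigma0, reps, seed):
    """Empirical alpha-quantile of log R when every X_i ~ N(mu0, sigma0^2)."""
    rng = np.random.default_rng(seed)
    stats = np.empty(reps)
    for r in range(reps):
        z = rng.standard_normal(n)               # draws from the pivot N(0, 1)
        stats[r] = log_lr(mu0 + sigma0 * z, n1)  # X = sigma0 * Z + mu0
    return np.quantile(stats, alpha)

n, n1, alpha, reps = 40, 20, 0.05, 4000
c_pivot = critical_value(alpha, n, n1, mu0=0.0, sigma0=1.0, reps=reps, seed=1)
c_other = critical_value(alpha, n, n1, mu0=7.0, sigma0=3.0, reps=reps, seed=1)
```

Because the same seed is reused, both calls see the same pivot draws, and the two estimated critical values should agree up to floating-point rounding. In this simplified setting that is the theorem at work: simulating from the pivot alone is enough.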

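*Proof:* We need to show that the ratio $R$ has a distribution independent of $\theta_0$. Since $X_i \sim f_{\theta_0}$ and $\mathcal{F}$ is a location-scale family, we can assume the data are generated via $X_i = \Sigma_0^{1/2} Z_i + \mu_0$, where each $Z_i$ follows the pivot $f$ and $\theta_0 = (\mu_0, \Sigma_0)$. Then the likelihood of $X_1,\dots,X_n$ at $\theta_i = (\mu_i, \Sigma_i)$, $i = 1,\dots,n$, is

$$\prod_{i=1}^n f_{\theta_i}(X_i) = \prod_{i=1}^n \frac{1}{\sqrt{\det \Sigma_i}}\, f\left(\Sigma_i^{-1/2}\left(\Sigma_0^{1/2} Z_i + \mu_0 - \mu_i\right)\right).$$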

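Thus the likelihood ratio reduces to

$$R = \frac{\sup_{(\mu, \Sigma) \in \Theta} \prod_{i=1}^n \frac{1}{\sqrt{\det \Sigma}}\, f\left(\Sigma^{-1/2}\left(\Sigma_0^{1/2} Z_i + \mu_0 - \mu\right)\right)}{\sup_{((\mu_1, \Sigma_1),\dots,(\mu_n, \Sigma_n)) \in \Theta_1} \prod_{i=1}^n \frac{1}{\sqrt{\det \Sigma_i}}\, f\left(\Sigma_i^{-1/2}\left(\Sigma_0^{1/2} Z_i + \mu_0 - \mu_i\right)\right)}.$$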

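Now let's define $\hat{\mu} = \Sigma_0^{-1/2}(\mu - \mu_0)$ and $\hat{\Sigma} = \Sigma_0^{-1/2} \Sigma \Sigma_0^{-1/2}$, and likewise $\hat{\mu}_i$ and $\hat{\Sigma}_i$ from $(\mu_i, \Sigma_i)$. Note that since $\Sigma_0$ is invertible, as $\mu$ varies over all of $\mathbb{R}^d$ so does $\hat{\mu}$, and as $\Sigma$ varies over all of $\mathbb{S}^d_{++}$ so does $\hat{\Sigma}$. Moreover, $\det \Sigma = \det \Sigma_0 \det \hat{\Sigma}$, and $\Sigma^{-1/2} \Sigma_0^{1/2}$ equals $\hat{\Sigma}^{-1/2}$ up to an orthogonal factor on the left, which the pivot ignores when $d = 1$ or when $f$ is rotation invariant (for example, Gaussian). The factors $\det(\Sigma_0)^{-1/2}$ appear $n$ times in both the numerator and the denominator and cancel, so the ratio above can be rewritten as

$$R = \frac{\sup_{(\hat{\mu}, \hat{\Sigma}) \in \Theta} \prod_{i=1}^n \frac{1}{\sqrt{\det \hat{\Sigma}}}\, f\left(\hat{\Sigma}^{-1/2}(Z_i - \hat{\mu})\right)}{\sup_{((\hat{\mu}_1, \hat{\Sigma}_1),\dots,(\hat{\mu}_n, \hat{\Sigma}_n)) \in \Theta_1} \prod_{i=1}^n \frac{1}{\sqrt{\det \hat{\Sigma}_i}}\, f\left(\hat{\Sigma}_i^{-1/2}(Z_i - \hat{\mu}_i)\right)}.$$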

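As we just argued, $(\hat{\mu}, \hat{\Sigma})$ and each $(\hat{\mu}_i, \hat{\Sigma}_i)$ can vary all over the space $\Theta$, and since the map $(\mu, \Sigma) \mapsto (\hat{\mu}, \hat{\Sigma})$ is a bijection, the transformed parameters are all equal exactly when the original ones are. The suprema in the numerator and the denominator thus do not depend on the choice of $\mu_0$ and $\Sigma_0$ at all, and $R$ is a function of $Z_1,\dots,Z_n$ only. So our theorem is proved.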
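The change of variables at the heart of the proof can be checked numerically. The sketch below assumes a Gaussian pivot in dimension two and uses illustrative names (`mu_hat`, `Sigma_hat`, `family_pdf`, `sym_sqrt`) and arbitrary parameter values that do not come from the text. It verifies that evaluating $f_\theta$ at $x = \Sigma_0^{1/2} z + \mu_0$ gives the same number as evaluating the density with transformed parameters $\hat{\mu} = \Sigma_0^{-1/2}(\mu - \mu_0)$, $\hat{\Sigma} = \Sigma_0^{-1/2} \Sigma \Sigma_0^{-1/2}$ at $z$, up to the constant factor $\det(\Sigma_0)^{-1/2}$.

```python
import numpy as np

def sym_sqrt(S):
    """Symmetric square root of a symmetric positive definite matrix."""
    w, V = np.linalg.eigh(S)
    return (V * np.sqrt(w)) @ V.T

def family_pdf(x, mu, Sigma):
    """Gaussian location-scale density: det(Sigma)^{-1/2} f(Sigma^{-1/2}(x - mu))."""
    d = len(mu)
    y = np.linalg.solve(sym_sqrt(Sigma), x - mu)   # y = Sigma^{-1/2}(x - mu)
    return np.exp(-0.5 * (y @ y)) / np.sqrt((2 * np.pi) ** d * np.linalg.det(Sigma))

rng = np.random.default_rng(2)

# theta_0: the (unknown) parameter that generated the data; values are illustrative
mu0 = np.array([3.0, -1.0])
Sigma0 = np.array([[4.0, 1.0], [1.0, 2.0]])

# theta: an arbitrary candidate parameter inside one of the suprema
mu = np.array([0.5, 2.0])
Sigma = np.array([[1.5, -0.4], [-0.4, 0.8]])

S0 = sym_sqrt(Sigma0)
S0_inv = np.linalg.inv(S0)
mu_hat = S0_inv @ (mu - mu0)             # transformed location
Sigma_hat = S0_inv @ Sigma @ S0_inv      # transformed scale matrix

z = rng.standard_normal(2)               # a draw from the pivot
x = S0 @ z + mu0                         # the corresponding observation

lhs = family_pdf(x, mu, Sigma)
rhs = family_pdf(z, mu_hat, Sigma_hat) / np.sqrt(np.linalg.det(Sigma0))
identity_holds = bool(np.isclose(lhs, rhs))
```

Since the extra factor depends only on $\Sigma_0$, it is the same for every candidate parameter and drops out of the ratio.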