We consider the following minimization problem:
The weights are all positive and
. We call the optimziation problem (1) the least absolute deviation problem. Note the optimziation problem (1) has a single variable
. If
are all equal, then it is well-known that least absolute deviation has median of
as its minimizer. In the case of unequal weights
, we are going to show the median defined by the weights
is again the minimizer.
To make the presentation simpler, we assume the weights are normalized:
Without loss of generality, we assume the values are ordered and distinct:
Next, we give the definition of median of with respect to the weights
. For each
, we denote
,
, and
be the set of indices
such that
,
, and
respectively. Note that
is either a single index or an empty set.
Definition 1 We say an is a median of
with respect to the weights
if
and
, where
if
is empty.
Note that under this definition, the median is acutally not always unique.
We now state the theorem that asserts the optimal solution to (1) is the weighted median.
Theorem 2 Under assumption (2) and (3), the optimal solution to (1) is any median in the sense of Definition 1.
The key of the proof is to note that (1) is convex and a point is optimal if and only if
, where
is the subdifferential of
at
, a generalization of derivative at
.
Proof: For any , the subdifferential at
is
where if
is nonempty and
if
is empty. The addition in (4) should be understood as set addition:
.
Since the function is a convex function with whole domain
. Hence, an
is optimal if and only if
From the the definition of median and the assumption that the weights added up to 1, it is immediate that the median satisfies the above condition and any solution satisfies the above condition is a median.
From the Theorem 2, we know that we only need to compute the median of w.r.t.
. This can be computed first by ordering the
so that condition (3) is satisfied (combining same quantities). This procedure only takes
comparisons plus
operations in creating a new set of values
and their weights
of combining terms and adding weights up for the same
. Here
. Then we add
up until the
threshold is reached to determine
. This procedure again only takes
additions.