One-dimensional least absolute deviation problem

We consider the following {\ell_1} minimization problem:

\displaystyle \begin{aligned} & \underset{x \in \mathbb{R} }{\text{minimize}} & &f(x):\,= \sum_{i=1}^n w_i | x - b_i|. \end{aligned} \ \ \ \ \ (1)

The weights {w_i} are all positive and {b_i \in \mathbb{R}}. We call the optimization problem (1) the least absolute deviation problem. Note that the optimization problem (1) has a single variable {x}. If the {w_i} are all equal, then it is well known that the least absolute deviation problem has the median of {\{b_i\}_{i=1}^n} as a minimizer. In the case of unequal weights {w_i}, we are going to show that the median defined by the weights {w_i}, the weighted median, is again a minimizer.
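As a quick sanity check of this claim, here is a minimal Python sketch (the values of {b_i} and {w_i} are chosen purely for illustration) that evaluates {f} on a fine grid; the grid minimizer lands at one of the data points, {b_3 = 3}, which is precisely the weighted median defined below:

import numpy as np

# Illustrative data: positive weights summing to 1.
b = np.array([0.0, 1.0, 3.0, 7.0])
w = np.array([0.1, 0.2, 0.4, 0.3])

def f(x):
    # Objective of (1): weighted sum of absolute deviations.
    return np.sum(w * np.abs(x - b))

grid = np.linspace(-1.0, 8.0, 9001)            # step 0.001, contains 3.0
x_star = grid[np.argmin([f(x) for x in grid])]
print(x_star)                                   # prints 3.0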

To make the presentation simpler, we assume the weights are normalized:

\displaystyle \sum_{i=1}^n w_i = 1. \ \ \ \ \ (2)

Without loss of generality, we assume the values {b_i} are ordered and distinct:

\displaystyle b_1 <b_2<\dots < b_n. \ \ \ \ \ (3)

Next, we give the definition of the median of {\{b_i\}_{i=1}^n} with respect to the weights {\{w_i\}_{i=1}^n}. For each {x\in \mathbb{R}}, we denote by {I_x^-}, {I_x^=}, and {I_x^+} the sets of indices {i} such that {x<b_i}, {x=b_i}, and {x>b_i}, respectively. Note that, since the {b_i} are distinct, {I_x^=} contains either a single index or no index at all.

Definition 1 We say that {x} is a median of {\{b_i\}_{i=1}^n} with respect to the weights {\{w_i\}_{i=1}^n} if {\sum_{i\in I^+_x} w_i \leq 0.5} and {\sum_{i\in I^+_x} w_i + w_{I^=_x} \geq 0.5}, where {w_{I^=_x}} is the weight of the single index in {I^=_x} and {w_{I^=_x} =0} if {I^=_x} is empty.

Note that under this definition, the median is actually not always unique: if {\sum_{i\in I^+_x} w_i} equals exactly {0.5} for some {x} strictly between two consecutive points {b_k < b_{k+1}}, then every point of {[b_k, b_{k+1}]} is a median.
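To make Definition 1 concrete, here is a small Python check of the two inequalities (the function name is_weighted_median is ours); with four equally weighted points it also exhibits the non-uniqueness just mentioned:

def is_weighted_median(x, b, w):
    # The two inequalities of Definition 1; the weights are assumed to sum to 1.
    below = sum(wi for bi, wi in zip(b, w) if bi < x)   # weight over I_x^+
    at = sum(wi for bi, wi in zip(b, w) if bi == x)     # w_{I_x^=}, 0 if I_x^= is empty
    return below <= 0.5 and below + at >= 0.5

b = [1.0, 2.0, 3.0, 4.0]
w = [0.25, 0.25, 0.25, 0.25]
print([is_weighted_median(x, b, w) for x in (2.0, 2.5, 3.0, 3.5)])
# [True, True, True, False]: every x in [2, 3] is a median, but 3.5 is not.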

We now state the theorem that characterizes the optimal solutions to (1) as the weighted medians.

Theorem 2 Under assumptions (2) and (3), a point {x} is an optimal solution to (1) if and only if it is a median in the sense of Definition 1.

The key to the proof is to note that (1) is convex and that a point {x} is optimal if and only if { 0 \in \partial f(x)}, where {\partial f(x)} is the subdifferential of {f} at {x}, a generalization of the derivative at {x}.

Proof: For any {x\in \mathbb{R}}, the subdifferential at {x} is

\displaystyle \partial f(x) = - \sum_{i\in I^-_x} w_i + \sum_{i\in I^+_x} w_i + w_{I_x^=} \times [-1,1], \ \ \ \ \ (4)

where {w_{I_x^=} \times [-1,1] = [-w_{I_x^=}, w_{I_x^=}]} if {I_x^=} is nonempty and {w_{I_x^=} \times [-1,1] = \{0\}} if {I_x^=} is empty. The addition in (4) should be understood as set addition: {A+B = \{a+b \mid a\in A, b\in B\}}.

Since the function {f} is convex with full domain {\mathbb{R}}, a point {x} is optimal if and only if

\displaystyle 0 \in \partial f(x) = - \sum_{i\in I^-_x} w_i + \sum_{i\in I^+_x} w_i + w_{I_x^=} \times [-1,1]. \ \ \ \ \ (5)

By assumption (2), the weights add up to {1}, so {\sum_{i\in I^-_x} w_i = 1 - \sum_{i\in I^+_x} w_i - w_{I_x^=}}. Substituting this into (5), the endpoints of the interval in (5) become {2\sum_{i\in I^+_x} w_i - 1} and {2\sum_{i\in I^+_x} w_i + 2 w_{I_x^=} - 1}, so (5) holds if and only if {\sum_{i\in I^+_x} w_i \leq 0.5} and {\sum_{i\in I^+_x} w_i + w_{I_x^=} \geq 0.5}, which are exactly the conditions of Definition 1. Hence every median satisfies the optimality condition, and any point satisfying it is a median. \Box
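As a concrete check of the optimality condition, the following Python sketch computes the endpoints of the interval in (4) for the illustrative data used earlier and checks whether {0} lies in it at a median and at a non-median point:

def subdifferential(x, b, w):
    # Endpoints (lo, hi) of the interval \partial f(x) in (4).
    neg = sum(wi for bi, wi in zip(b, w) if x < bi)   # weight over I_x^-
    pos = sum(wi for bi, wi in zip(b, w) if x > bi)   # weight over I_x^+
    at = sum(wi for bi, wi in zip(b, w) if x == bi)   # w_{I_x^=}, 0 if I_x^= is empty
    return (-neg + pos - at, -neg + pos + at)

b = [0.0, 1.0, 3.0, 7.0]
w = [0.1, 0.2, 0.4, 0.3]
lo, hi = subdifferential(3.0, b, w)   # interval [-0.4, 0.4]
print(lo <= 0.0 <= hi)                # True: x = 3 satisfies (5), so it is optimal
lo, hi = subdifferential(1.0, b, w)   # interval [-0.8, -0.4]
print(lo <= 0.0 <= hi)                # False: x = 1 is not optimal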

From Theorem 2, we know that we only need to compute the median of {\{b_i\}_{i=1}^n} with respect to {\{w_i\}_{i=1}^n}. This can be done by first sorting the {b_i} so that condition (3) is satisfied, combining equal values by adding up their weights. This step takes {\mathcal{O}(n\log n)} comparisons plus {\mathcal{O}(n)} operations to create the new set of values {\{b'_i\}_{i=1}^{n'}} and weights {\{w_i'\}_{i=1}^{n'}}, where {n'\leq n}. We then add up the {w_i'} in order until the {0.5} threshold is reached, which determines the median index {i}; this scan takes only {\mathcal{O}(n)} additions.
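A minimal Python sketch of this procedure (the function name weighted_median is ours; the input need not be sorted, normalized, or free of duplicates):

from collections import defaultdict

def weighted_median(b, w):
    # Sort-and-scan procedure described above: merge duplicate values of b by
    # adding their weights, normalize so that (2) holds, then accumulate the
    # weights in sorted order until the 0.5 threshold is reached.
    total = sum(w)
    merged = defaultdict(float)
    for bi, wi in zip(b, w):
        merged[bi] += wi / total
    cum = 0.0
    for bi in sorted(merged):          # O(n log n) comparisons
        cum += merged[bi]
        if cum >= 0.5:                 # first point reaching the threshold
            return bi

print(weighted_median([3.0, 1.0, 2.0, 1.0], [0.2, 0.3, 0.3, 0.2]))   # 1.0

Returning the first point at which the cumulative weight reaches {0.5} gives a median in the sense of Definition 1, since the weight strictly below that point is less than {0.5} while the weight up to and including it is at least {0.5}. (In this example {2.0} is also a median.)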
