# Sandwich inequality of optimal gap and distance to solutions of LP

Suppose we have a general optimization problem.

\displaystyle \begin{aligned} & \underset{x \in \mathbb{R}^n}{\text{minimize}} & & f(x)\\ & {\text{subject to}} & & x \in \Omega. \\ \end{aligned} \ \ \ \ \ (1)

Also, suppose problem (1) has a minimum and the minimum is achieved by a unique minimizer ${x^*}$.

Now if I have a point ${x}$ such that ${f(x) - f(x^*) }$ is very small, how small is the distance ${\|x-x^*\|}$? We might expect that ${f(x) - f(x^*) \rightarrow 0}$ implies ${x \rightarrow x^*}$. This is true if ${\Omega}$ is compact and ${f}$ is continuous. But it does not tell us the quantitative relationship between the optimal gap, i.e., ${f(x) -f(x^*)}$, and the distance to the solution, i.e., ${\|x-x^*\|}$.
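
To see that no linear relationship can hold in general, here is a minimal sketch on a hypothetical example, ${f(x) = x^4}$ on ${[-1,1]}$ with minimizer ${x^* = 0}$: the gap ${x^4}$ shrinks much faster than the distance ${|x|}$.

```python
# Hypothetical example: f(x) = x^4 on [-1, 1], minimizer x* = 0.
# The gap f(x) - f(x*) = x^4 vanishes much faster than |x - x*| = |x|,
# so gap and distance are not comparable up to a constant here.
f = lambda x: x**4
x_star = 0.0
for x in [0.5, 0.1, 0.01]:
    gap = f(x) - f(x_star)
    dist = abs(x - x_star)
    print(x, gap, dist, gap / dist)  # the ratio gap/dist equals x^3
```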

In this post, I am going to show that for linear programming (LP), the optimal gap and distance to solutions are the same up to a multiplicative constant which only depends on the problem data.

To start, consider an LP in the standard form, i.e.,

\displaystyle \begin{aligned} & \underset{x \in \mathbb{R}^n}{\text{minimize}} & & c^Tx\\ & {\text{subject to}} & & Ax =b \\ & & & x\geq 0. \end{aligned} \ \ \ \ \ (2)

where the decision variable is ${x}$ and the problem data are ${A\in \mathbb{R}^{m\times n},b\in \mathbb{R}^m,c\in \mathbb{R}^n}$. The constraint ${x\geq 0}$ means each coordinate of ${x}$ is nonnegative.
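
To make the setup concrete, here is a sketch solving a tiny standard-form LP with SciPy's `linprog` (the data ${A,b,c}$ are hypothetical, and SciPy is assumed to be available):

```python
# A tiny standard-form LP solved with SciPy's linprog:
#   minimize c^T x  subject to  A x = b, x >= 0.
import numpy as np
from scipy.optimize import linprog

A = np.array([[1.0, 1.0, 1.0]])   # one equality constraint: x1 + x2 + x3 = 1
b = np.array([1.0])
c = np.array([1.0, 2.0, 3.0])

res = linprog(c, A_eq=A, b_eq=b, bounds=[(0, None)] * 3)
print(res.x, res.fun)             # solution (1, 0, 0) with optimal value 1
```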

Denote the solution set of problem (2) by ${X^* = \arg \min\{ c^Tx| Ax = b, x\geq 0\} }$, and the distance to the solution set ${X^*}$ by ${\text{dist}(x,X^*) = \inf_{x^* \in X^*} \|x-x^*\|}$. Note that the norm here is arbitrary, not necessarily the Euclidean norm.

We are now ready to state the theorem.

Theorem 1 (Sandwich inequality of optimal gap and distance to solutions of LP) For problem (2), there exist constants ${C_0, c_0>0}$ which depend only on ${A,b,c}$ and ${\|\cdot\|}$ such that for all feasible ${x}$, i.e., ${Ax= b, x\geq 0}$,

$\displaystyle C_0 \text{dist}(x,X^*)\geq c^Tx - c^Tx^* \geq c_0 \text{dist}(x,X^*).$

The above theorem shows that the optimal gap, i.e., ${c^Tx - c^Tx^*}$, and the distance to the solution set, i.e., ${\text{dist}(x,X^*)}$, play the same role up to multiplicative constants. The right inequality of the theorem is usually referred to as linear growth in the optimization literature.
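
As a quick illustration on a hypothetical instance where everything is explicit, take the LP of minimizing ${x_1}$ subject to ${x_1+x_2=1}$, ${x\geq 0}$, whose unique solution is ${x^*=(0,1)}$. The feasible points are ${x_t = (t, 1-t)}$, the gap is ${t}$, and the Euclidean distance to ${x^*}$ is ${t\sqrt{2}}$, so both inequalities hold with ${c_0 = C_0 = 1/\sqrt{2}}$:

```python
# Gap vs. distance for: minimize x1 s.t. x1 + x2 = 1, x >= 0.
# Unique solution x* = (0, 1); feasible points are x_t = (t, 1 - t).
import math

for t in [0.9, 0.5, 0.01]:
    gap = t                  # c^T x_t - c^T x* with c = (1, 0)
    dist = t * math.sqrt(2)  # Euclidean distance from x_t to x*
    assert abs(gap / dist - 1 / math.sqrt(2)) < 1e-12
print("gap/dist is constantly 1/sqrt(2)")
```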

The proof below is constructive: we can in fact take ${c_0 = \frac{\epsilon}{\max\{2B,1\}}}$ and ${C_0 = \|c\|_*}$, where ${ B = \max_{ 1\leq i \leq l} \|x_i\|}$, ${\epsilon = \min_{ m+1\leq i\leq l,\, n+1\leq j \leq s} \{ c^Tx_i-c^Tx^*, c^T\gamma_j\}}$ for any ${x^*\in X^*}$, and ${\| y\|_* = \sup \{ u^Ty\mid \|u\|\leq 1\}}$ is the dual norm. Here ${x_i}$ and ${\gamma_j}$ are the extreme points and extreme rays of the feasible region: we assume there are ${l}$ extreme points and ${s}$ extreme rays, with the first ${m}$ of the ${x_i}$s and the first ${n}$ of the ${\gamma_j}$s lying in the optimal set ${X^*}$ (these ${m}$ and ${n}$ are unrelated to the dimensions of ${A}$). See the proof for more detail.

The idea of the proof mainly relies on the extreme-point and extreme-ray representation of the feasible region ${\{x : Ax = b, x\geq 0\}}$ and of the optimal set ${X^*}$.

Proof: The feasible region can be written as

$\displaystyle \{x|Ax=b,x\geq 0\} = \{ \sum_{i=1}^l \alpha_i x_i + \sum_{j=1}^s \beta_j \gamma_j | \sum_i \alpha_i = 1, \alpha_i\geq 0, \beta_j\geq 0\}.$

Here ${x_i}$s are extreme points of the feasible region and ${\gamma_j}$s are extreme rays. By scaling the extreme rays, we can assume that ${\|\gamma_j\| =1}$ for all ${j}$.
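
For small instances, the extreme points (equivalently, the basic feasible solutions) can be enumerated directly. The sketch below does this by brute force over column subsets on hypothetical data; this enumeration is exponential in general and is meant only as an illustration:

```python
# Brute-force enumeration of the extreme points (basic feasible
# solutions) of {x : Ax = b, x >= 0} over all column subsets.
import itertools
import numpy as np

A = np.array([[1.0, 1.0, 1.0]])
b = np.array([1.0])
m, n = A.shape

vertices = []
for cols in itertools.combinations(range(n), m):
    basis = A[:, list(cols)]
    if abs(np.linalg.det(basis)) < 1e-12:  # skip singular bases
        continue
    xB = np.linalg.solve(basis, b)
    if (xB >= -1e-12).all():               # keep only feasible basic solutions
        x = np.zeros(n)
        x[list(cols)] = xB
        vertices.append(x)

print(vertices)                            # the unit vectors e1, e2, e3
```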

The optimal set can also be written as

$\displaystyle X^* = \{ \sum_{i=1}^m \alpha_i x_i + \sum_{j=1}^n \beta_j \gamma_j | \sum_i \alpha_i = 1, \alpha_i\geq 0, \beta_j\geq 0\}.$

For notational simplicity, we assume here that the first ${m}$ of the ${x_i}$s and the first ${n}$ of the ${\gamma_j}$s are in the optimal set and the rest are not (this overloads ${m}$ and ${n}$; they are unrelated to the dimensions of ${A}$).

We denote ${B = \max_{ 1\leq i \leq l} \|x_i\|}$ and ${\epsilon = \min_{ m+1\leq i\leq l, n+1\leq j \leq s} \{ c^Tx_i-c^Tx^*, c^T\gamma_j\}}$ where ${x^*\in X^*}$. Note ${\epsilon>0}$: an extreme ray ${\gamma_j}$ not in the optimal set must satisfy ${c^T\gamma_j>0}$ (if ${c^T\gamma_j= 0}$ it would belong to ${X^*}$, and ${c^T\gamma_j<0}$ would contradict the existence of a minimum), and similarly ${c^Tx_i > c^Tx^*}$ for extreme points not in ${X^*}$.
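
To make the constants concrete, consider a hypothetical LP: minimize ${c^Tx}$ with ${c=(1,2,3)}$ over the simplex ${\{x\geq 0 : \sum_i x_i = 1\}}$, whose extreme points are the unit vectors ${e_1,e_2,e_3}$, which has no extreme rays, and for which ${X^* = \{e_1\}}$. The sketch computes ${B}$, ${\epsilon}$, and the resulting ${c_0}$ in the Euclidean norm:

```python
# Constants B, eps, and c0 for the hypothetical simplex LP with
# c = (1, 2, 3): extreme points e1, e2, e3, no extreme rays, X* = {e1}.
import numpy as np

c = np.array([1.0, 2.0, 3.0])
exts = [np.eye(3)[i] for i in range(3)]               # e1, e2, e3
opt_val = min(c @ v for v in exts)                    # = 1, attained at e1

B = max(np.linalg.norm(v) for v in exts)              # = 1 (Euclidean norm)
eps = min(c @ v - opt_val for v in exts if c @ v > opt_val)  # = 1
c0 = eps / max(2 * B, 1)                              # = 1/2
print(B, eps, c0)
```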

We first prove the second inequality, i.e., ${c^Tx - c^Tx^* \geq c_0 \text{dist}(x,X^*)}$.

Now take an arbitrary feasible ${x}$; it can be written as

$\displaystyle x = \sum_{i=1}^m a_i x_i + \sum_{i=m+1}^l a_i x_i + \sum _{j=1}^n b_j \gamma_j + \sum_{j=n+1}^s b_j \gamma_j$

for some ${a_i \geq 0, \sum_i a_i =1}$ and ${b_j\geq 0}$.

The objective value of ${x}$ is then

$\displaystyle c^Tx = \sum_{i=1}^m a_i c^T x_i + \sum_{i=m+1}^l a_i c^T x_i +\sum_{j=n+1}^s b_j c^T \gamma_j.$

We use the fact that ${c^T\gamma_j=0}$ for all ${j\leq n}$ here.

Subtract the above by ${c^Tx^*}$. We have

\displaystyle \begin{aligned} c^Tx -c^Tx^*& = \left(\sum_{i=1}^m a_i-1\right)c^T x^* + \sum_{i=m+1}^l a_i c^T x_i +\sum_{j=n+1}^s b_j c^T \gamma_j \\ & = \left(-\sum_{i=m+1}^l a_i\right)c^T x^* + \sum_{i=m+1}^l a_i c^T x_i +\sum_{j=n+1}^s b_j c^T \gamma_j \\ & = \sum_{i=m+1}^l a_i \left(c^T x_i - c^Tx^*\right) +\sum_{j=n+1}^s b_j c^T \gamma_j \\ &\geq \left(\sum_{i=m+1}^l a_i\right) \epsilon + \left(\sum_{j=n+1}^s b_j \right)\epsilon. \end{aligned} \ \ \ \ \ (3)

The first equality uses ${c^Tx_i = c^Tx^*}$ for ${i\leq m}$, the second equality is due to ${\sum_i a_i =1}$, and the final inequality follows from the definition of ${\epsilon}$ together with ${a_i, b_j\geq 0}$.

The distance between ${x}$ and ${X^*}$ is the infimum, over ${\alpha_i\geq 0}$ with ${\sum_{i=1}^m \alpha_i = 1}$ and ${\beta_j\geq 0}$, of

$\displaystyle \| \sum_{i=1}^m( a_i-\alpha_i)x_i + \sum_{i=m+1}^l a_i x_i + \sum _{j=1}^n (b_j-\beta_j) \gamma_j + \sum_{j=n+1}^s b_j \gamma_j \|.$

By taking ${\alpha_i \geq a_i}$ with ${\sum_{i=1}^m \alpha_i =1}$ (which is possible since ${\sum_{i=1}^m a_i \leq 1}$) and ${\beta_j = b_j}$, and applying the triangle inequality to the above quantity, we have

\displaystyle \begin{aligned} &\| \sum_{i=1}^m( a_i-\alpha_i)x_i + \sum_{i=m+1}^l a_i x_i + \sum _{j=1}^n (b_j-\beta_j) \gamma_j + \sum_{j=n+1}^s b_j \gamma_j \| \\ &\leq \sum_{i=1}^m( \alpha_i-a_i)\|x_i \|+ \sum_{i=m+1}^l a_i\|x_i \|+\sum_{j=n+1}^s b_j \|\gamma_j \|\\ &\leq \sum_{i=1}^m (\alpha_i -a_i) B + \sum_{i=m+1}^l a_i B + \sum_{j=n+1}^s b_j\\ &= (1-\sum_{i=1}^m a_i)B + (\sum_{i=m+1}^l a_i)B + \sum_{j=n+1}^s b_j\\ &= 2B (\sum_{i=m+1}^l a_i) +\sum_{j=n+1}^s b_j. \end{aligned} \ \ \ \ \ (4)

The first inequality is the triangle inequality together with ${\alpha_i\geq a_i, a_i\geq 0, b_j\geq 0}$. The second inequality applies the definition of ${B}$ and ${\|\gamma_j\|=1}$. The first equality is due to ${\sum_i \alpha_i =1}$ and the second equality is due to ${\sum_i a_i =1}$.

Thus the distance between ${x}$ and ${X^*}$ is bounded above by

$\displaystyle \text{dist}(x,X^*) \leq 2B(\sum_{i=m+1}^l a_i) +\sum_{j=n+1}^s b_j.$

Since ${c^Tx -c^Tx^* \geq( \sum_{i = m+1}^l a_i + \sum_{j=n+1}^s b_j)\epsilon}$ by (3), and ${\text{dist}(x,X^*) \leq \max\{2B,1\}( \sum_{i = m+1}^l a_i + \sum_{j=n+1}^s b_j)}$ by the bound above, we see that setting

$\displaystyle c_0 = \frac{\epsilon}{\max\{2B,1\}}$

gives us

$\displaystyle c^Tx - c^Tx^* \geq c_0 \text{dist}(x,X^*).$
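
A quick numerical check of this linear-growth bound on a hypothetical instance: minimize ${c^Tx}$ with ${c=(1,2,3)}$ over the simplex ${\{x\geq 0 : \sum_i x_i = 1\}}$, where ${X^* = \{e_1\}}$, ${B=1}$, ${\epsilon=1}$, and hence ${c_0 = 1/2}$ in the Euclidean norm:

```python
# Check c^T x - c^T x* >= c0 * dist(x, X*) on random feasible points of
# the hypothetical simplex LP with c = (1, 2, 3), X* = {e1}, c0 = 1/2.
import numpy as np

rng = np.random.default_rng(0)
c = np.array([1.0, 2.0, 3.0])
x_star = np.array([1.0, 0.0, 0.0])
c0 = 0.5
for _ in range(1000):
    a = rng.random(3)
    x = a / a.sum()                      # random feasible point on the simplex
    gap = c @ x - c @ x_star
    dist = np.linalg.norm(x - x_star)    # Euclidean distance to X* = {e1}
    assert gap >= c0 * dist - 1e-12
print("linear-growth bound held on all samples")
```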

We now prove the inequality

$\displaystyle C_0 \text{dist}(x,X^*)\geq c^Tx - c^Tx^*.$

Note that the infimum in ${\text{dist}(x,X^*) = \inf_{x^* \in X^*} \|x-x^*\|}$ is actually achieved by some ${x^*}$. The reason is that we can first pick an ${x'\in X^*}$; then

$\displaystyle \inf_{x^* \in X^*} \|x-x^*\| = \inf_{x^* \in X^*, \|x-x^*\| \leq \|x-x'\|} \|x-x^*\|.$

But the set ${X^* \cap \{x^* | \|x-x^*\| \leq \|x-x'\| \}}$ is bounded and closed (${X^*}$ is closed, being the sum of the convex hull of finitely many points and the conic hull of finitely many extreme rays), so the Weierstrass theorem tells us that the infimum is actually achieved by some ${x^*\in X^*}$.

Now take ${x^*}$ such that ${\|x-x^*\| =\text{dist}(x,X^*)}$. We have

$\displaystyle c^Tx -c^Tx^* \leq \|c\|_* \|x-x^*\|=\|c\|_*\text{dist}(x,X^*)$

where ${\|\cdot\|_*}$ is the dual norm of ${\|\cdot\|}$. Thus letting ${C_0 = \|c\|_*}$ finishes the proof. $\Box$
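
The last step is the generalized Cauchy–Schwarz (Hölder-type) inequality ${c^Ty \leq \|c\|_* \|y\|}$. As a quick sanity check on random data, take ${\|\cdot\|}$ to be the 1-norm, whose dual norm is the ${\infty}$-norm:

```python
# Random check of c^T y <= ||c||_inf * ||y||_1, the duality pairing of
# the 1-norm and the infinity-norm (random hypothetical data).
import numpy as np

rng = np.random.default_rng(1)
for _ in range(1000):
    c = rng.standard_normal(5)
    y = rng.standard_normal(5)  # plays the role of x - x*
    assert c @ y <= np.abs(c).max() * np.abs(y).sum() + 1e-12
print("c^T y <= ||c||_inf * ||y||_1 held on all samples")
```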

From the proof, we see that two possible choices of ${c_0}$ and ${C_0}$ are ${c_0 = \frac{\epsilon}{\max\{2B,1\}}}$, where ${ B = \max_{ 1\leq i \leq l} \|x_i\|}$ and ${\epsilon = \min_{ m+1\leq i\leq l, n+1\leq j \leq s} \{ c^Tx_i-c^Tx^*, c^T\gamma_j\}}$ for ${x^*\in X^*}$, and ${C_0 = \|c\|_*}$. These are not optimal and can be sharpened; I will probably give sharper constants in a future post.