In this problem, you will give an alternate proof of convergence of IPM for linear programs. Define the feasible region $P = \{x \in \mathbb{R}^n : Ax \le b\}$, where $A \in \mathbb{R}^{m \times n}$ and $b \in \mathbb{R}^m$. Our goal is to solve the following problem: given a cost vector $c \in \mathbb{R}^n$,
\begin{equation}\label{eq:lp}
\min_{x \in P} \; c^\top x.
\end{equation}
Define the slack function for the $i$th constraint:
\[
s_i(x) = b_i - a_i^\top x,
\]
and the IPM objective
\begin{equation}\label{eq:ipm}
f_\eta(x) = \eta\, c^\top x - \sum_{i=1}^{m} \ln s_i(x).
\end{equation}
Define also the central path:
\[
x^\star(\eta) = \arg\min_{x} f_\eta(x).
\]
Note that we have changed the weighting so that $\eta \to \infty$ corresponds to solving the original problem \eqref{eq:lp}, i.e., $x^\star(\eta)$ approaches an optimal solution of \eqref{eq:lp} as $\eta \to \infty$. The second term in \eqref{eq:ipm} is the barrier function that ensures we have $x \in P$: it blows up as any slack $s_i(x)$ approaches zero, so the iterates stay strictly feasible.
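As a toy illustration of the central path (not part of the assignment, and using the definitions above), take $n = 1$ with constraints $-x \le 0$ and $x \le 1$, so $P = [0,1]$, and cost $c = 1$. Then
\[
f_\eta(x) = \eta x - \ln x - \ln(1 - x),
\]
and setting $f_\eta'(x) = \eta - \tfrac{1}{x} + \tfrac{1}{1-x} = 0$ shows that $x^\star(\eta)$ lies strictly inside $(0,1)$ for every finite $\eta$ (it equals $1/2$ at $\eta = 0$), while $x^\star(\eta) \to 0$, the minimizer of the LP, as $\eta \to \infty$.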
Let us calculate the gradient and Hessian of \eqref{eq:ipm}:
\[
\nabla f_\eta(x) = \eta\, c + \sum_{i=1}^{m} \frac{a_i}{s_i(x)},
\qquad
\nabla^2 f_\eta(x) = \sum_{i=1}^{m} \frac{a_i a_i^\top}{s_i(x)^2},
\]
where $a_i^\top$ is the $i$th row of $A$.
Definition (Local ellipsoid): Define
\[
E_x = \bigl\{ y : (y - x)^\top \nabla^2 f_\eta(x)\,(y - x) \le 1 \bigr\}.
\]
This is the $1$-ball around $x$ in the local norm $\|v\|_x := \sqrt{v^\top \nabla^2 f_\eta(x)\, v}$.
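For a quick numerical sanity check of these formulas (illustrative only; it assumes the definitions above, and the helper-function names are ours), the following NumPy sketch evaluates the slacks, gradient, Hessian, local norm, and the membership test for $E_x$:
\begin{verbatim}
import numpy as np

# Numerical sketch of the quantities defined above (illustrative only):
# feasible region {x : A x <= b}, objective
# f_eta(x) = eta * c^T x - sum_i log(b_i - a_i^T x).

def slacks(A, b, x):
    """s_i(x) = b_i - a_i^T x for each constraint i."""
    return b - A @ x

def grad_f(A, b, c, eta, x):
    """Gradient: eta * c + sum_i a_i / s_i(x)."""
    return eta * c + A.T @ (1.0 / slacks(A, b, x))

def hess_f(A, b, x):
    """Hessian: sum_i a_i a_i^T / s_i(x)^2 (the linear term drops out)."""
    s = slacks(A, b, x)
    return A.T @ np.diag(1.0 / s**2) @ A

def local_norm(A, b, x, v):
    """||v||_x = sqrt(v^T (Hess f_eta)(x) v), the local norm at x."""
    return float(np.sqrt(v @ hess_f(A, b, x) @ v))

def in_local_ellipsoid(A, b, x, y):
    """Membership test for E_x: ||y - x||_x <= 1."""
    return local_norm(A, b, x, y - x) <= 1.0
\end{verbatim}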
[2 points] Show that
[4 points] Prove that for any $y \in E_x$, we have $y \in P$. Therefore if we sit at some feasible $x$, it is safe to move to any $y \in E_x$.
[12 points] Fix a value of $\eta$. The goal of this part is to prove that for any and , if , then
This means that if we are close to $x^\star(\eta)$, then taking a Newton step gets us even closer (in terms of the local norms).
[12 points] You will now show that we can increase $\eta$ by a factor of .
[4 points] Combine the results of parts 3 and 4 to show that , where , and , and where is the result of a Newton step.
[4 points] Prove that for any and ,
[Hint: Start with D(ii).]
[4 points] Combine all these results to conclude a bound on the iteration complexity of IPM to obtain a point within $\varepsilon$ of the optimum, in terms of $m$, $\varepsilon$, and $\eta_0$, where $\eta_0$ is the initial value of $\eta$. (I.e., you may assume that you are given $x^\star(\eta_0)$ to start.)
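For reference, the overall procedure analyzed in parts 3--5 is a path-following loop: take a Newton step on $f_\eta$, then increase $\eta$ multiplicatively, and repeat. The NumPy sketch below is only meant to make that loop concrete; the increase factor and iteration count are placeholders, not the constants you are asked to derive.
\begin{verbatim}
import numpy as np

# Illustrative path-following loop for min c^T x subject to A x <= b,
# using f_eta(x) = eta * c^T x - sum_i log(b_i - a_i^T x).
# The increase factor alpha and the iteration count are placeholders,
# not the constants the problem asks you to derive.

def newton_step(A, b, c, eta, x):
    """One Newton step on f_eta: x <- x - (Hess f_eta)^{-1} grad f_eta."""
    s = b - A @ x                              # slacks, assumed positive
    grad = eta * c + A.T @ (1.0 / s)
    hess = A.T @ np.diag(1.0 / s**2) @ A       # assumes A has full column rank
    return x - np.linalg.solve(hess, grad)

def path_following(A, b, c, x0, eta0, alpha=1.1, num_iters=200):
    """Alternate Newton steps with multiplicative increases of eta."""
    x, eta = x0, eta0                          # x0 must be strictly feasible
    for _ in range(num_iters):
        x = newton_step(A, b, c, eta, x)
        eta *= alpha                           # placeholder increase factor
    return x
\end{verbatim}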
Let $S \in \mathbb{R}^{k \times n}$ be a random matrix where the entries are i.i.d. random variables. In this problem, you may assume that for $k \ge C\varepsilon^{-2}\log(1/\delta)$ (for a sufficiently large constant $C$), it holds that with probability at least $1 - \delta$, for any fixed vector $x \in \mathbb{R}^n$, we have
\[
(1 - \varepsilon)\,\|x\|_2^2 \;\le\; \|Sx\|_2^2 \;\le\; (1 + \varepsilon)\,\|x\|_2^2.
\]
[10 points] Suppose we maintain a vector $v \in \mathbb{R}^n$ as follows. Initially, $v = 0$. At the $t$th step, we update
\[
v \leftarrow v + c_t\, e_{i_t}
\]
for some index $i_t \in \{1, \dots, n\}$ and scalar $c_t$, where $e_1, \dots, e_n$ are the standard basis vectors.
Suppose that you have a memory that only allows you to store $k$ numbers as the updates arrive one by one. Show how to maintain an estimate for $\|v\|_2^2$ so that by the final iteration, the estimate is correct up to a $(1 \pm \varepsilon)$ multiplicative error with probability at least $1 - \delta$.
[10 points] In the same “online” setting as the previous part, suppose you are told that the final vector $v$ has some index $i^\star$ such that . Show that by using only bits of memory, we can find the index $i^\star$ with probability at least . (Note: You may assume that each number uses only bits of precision.)
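To make the streaming model of the last two parts concrete, here is a minimal NumPy sketch of the coordinate-update stream together with a linear sketch $y = Sv$ that is updated using only its $k$ stored entries. The specific distribution of $S$ (i.i.d. $\mathcal{N}(0, 1/k)$ entries, so that $\mathbb{E}\|Sv\|_2^2 = \|v\|_2^2$) and the toy stream are assumptions made for illustration; in particular, a genuinely memory-bounded solution would not store $S$ explicitly.
\begin{verbatim}
import numpy as np

# Illustration of the streaming model above: coordinate updates
# v <- v + c_t * e_{i_t}, together with a linear sketch y = S v that is
# updated using only its k stored entries. The N(0, 1/k) entries (so that
# E||S v||^2 = ||v||^2) and the toy stream are assumptions for this
# illustration; a genuinely memory-bounded solution would not store S
# explicitly.

rng = np.random.default_rng(0)
n, k = 1000, 400

S = rng.normal(scale=1.0 / np.sqrt(k), size=(k, n))

v = np.zeros(n)   # the full vector (kept here only for comparison)
y = np.zeros(k)   # the k-number sketch y = S v

stream = [(rng.integers(n), rng.normal()) for _ in range(5000)]  # toy stream

for i_t, c_t in stream:
    v[i_t] += c_t            # the update described in the problem
    y += c_t * S[:, i_t]     # sketch update: touches only k numbers

# ||y||^2 concentrates around ||v||^2.
print(np.dot(v, v), np.dot(y, y))
\end{verbatim}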