Suppose that \(A\) is an \(m \times n\) matrix and recall that \(\sigma_i(A) = \langle a_i, (A^{\top} A)^{-1} a_i\rangle\) is the \(i\)th leverage score.
Show that if \(A = U S V^{\top}\) is the SVD of \(A\), then \(\sigma_i(A) = \sum_{j=1}^n U_{ij}^2\).
Show if \(s_1,\ldots,s_n\) are the singular values and \(v_1\ldots,v_n\) are the rows of \(V^{\top}\), then
\[\sigma_i(A) = \sum_{j=1}^n \frac{\langle v_j, a_i\rangle^2}{s_j^2}\]Suppose that \(G=(V,E,w)\) is a weighted (undirected) graph, where \(w : E \to \R_+\) is a nonnegative weight on edges. A weighted graph \(\tilde{G}=(V,E,\tilde{w})\) is an \(\e\)-approximate cut sparsifier for \(G\) if, for every \(S \subseteq V\), it holds that
\[\left|\sum_{e \in E(S,\bar{S})} (w_e - \tilde{w}_e)\right| \leq \e \sum_{e \in E(S,\bar{S})} w_e\,,\]where \(E(S,\bar{S}) \subseteq E\) is the set of edges with one endpoint on each side of the cut \((S,\bar{S})\). The number of edges in the sparsifier is \(\#\{ e \in E : \tilde{w}_e > 0 \}\).
Given a weighted graph \(G=(V,E,w)\), the corresponding graph Laplacian is the \(\abs{V} \times \abs{V}\) matrix \(L_G\) that acts on a vector \(x \in \R^V\) via
\[(L_G x)_u = w_u x_u - \sum_{v \in V} w_{uv}\, x_v\,,\quad u \in V\,,\]where we define \(w_u \seteq \sum_{v : uv \in E} w_{uv}\).
It may help to note that
\[L_G = \sum_{uv \in E} w_{uv} \chi_{uv} \chi_{uv}^{\top}\,,\]where \(\chi_{uv} = e_u - e_v\) is the vector with a \(1\) in the \(u\) coordinate and a \(-1\) in the \(v\) coordinate.
A weighted graph \(\tilde{G}=(V,E,\tilde{w})\) is an \(\e\)-approximate spectral sparsifier for \(G\) if holds that
\[(1-\e) L_{\tilde{G}} \preceq L_G \preceq (1+\e) L_{\tilde{G}}\,,\]where \(A \preceq B\) is the Loewner ordering on matrices, i.e., \(A \preceq B\) if and only if \(B-A\) is positive semidefinite.
Recall that if \(\{X_t : t \in T \}\) is a centered subgaussian process with respect to the metric \((T,d)\), then Dudley’s entropy bound asserts that
\[\begin{equation}\label{eq:newdud} \E \max_{t \in T} X_t \lesssim \sum_{h \geq 0} 2^{h/2} e_h(T,d)\,, \end{equation}\]where \(\{ e_h(T,d) \}\) are the entropy numbers of \((T,d)\).
Define \(\mathcal{N}(T, d, r)\) as the minimum number of balls of radius \(r\) in \((T,d)\) that are needed to cover \(T\). These numbers are dual to the entropy numbers.
The classical formulation of Dudley’s bound asserts that
\[\begin{equation}\label{eq:old-dud} \E \max_{t \in T} X_t \lesssim \int \sqrt{\log \mathcal{N}(T,d,r)}\,dr\,. \end{equation}\]Prove that \eqref{eq:newdud} and \eqref{eq:old-dud} are equivalent upper bounds, i.e., that the right hand sides are related to each other by universal constants.
Let \(\|\cdot\|\) be a norm on \(\R^n\) and define its unit ball by \(B \seteq \{ x \in \R^n : \|x\| \leq 1 \}\).
Use a volume argument to show that \(B\) can be covered by at most \((C/\e)^n\) translates of \(\e B\) for every \(\e > 0\), where \(C \geq 1\) is some universal constant.
Use this to give an upper bound of \(\log \mathcal{N}(T,d,r)\) when \(T \subseteq \R^n\) and \(d(x,y) = \|x-y\|\).
Use this to show that if \(T \subseteq \R^n\), then there is a constant \(C \geq 1\) such that
\[\sum_{h \geq 0} 2^{h/2} e_h(T,d) \lesssim \sum_{h=0}^{C \log n} 2^{h/2} e_h(T,d)\,.\][Hint: For some sufficiently large \(h_0\) and \(h \geq h_0\), bound \(e_h(T,d)\) by first covering \((T,d)\) with balls of radius \(e_{h_0}(T,d)\), and then covering those balls by smaller balls.]