Out: Saturday, Feb 18
Due: Saturday, Feb 25 at midnight (in the dropbox)
You should use the guidelines for HW#3, stated in slide 2 here. In particular, you may work with a partner on this homework.
See the proof of Theorem 2 (analysis of the power method) in these notes. Now suppose that the matrix \(A\) has eigenvalues \(\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n\), and that \(\lambda_1=\lambda_2=\cdots=\lambda_k\), but \(\lambda_{k+1} < \lambda_k\). Give a formal analysis of the power method in this setting, analogous to the statement of Theorem 2. Be careful to specify what replaces the vector \(\mathbf{v}_1\). One cannot specify a single vector ahead of time. Make sure you explain why.
Suppose that \(A=X^T X\) is an \(n\times n\) symmetric matrix with real entries formed from an \(m \times n\)-dimensional matrix of data points \(X\). Let \(\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n\) denote the eigenvalues of \(A\).
Show that all the eigenvalues of \(A\) are non-negative.
In class, we argued that if we have an algorithm to find the top component \(\mathbf{v}_1\) of \(A\), then we can use this recursively to find the top \(k\) components. State formally an algorithm that achieves this (assuming a function that computes the top component).
Under the assumption that all the eigenvalues of \(A\) are distinct, use induction to prove that your algorithm works.
You should do only Part I from this homework assignment.
If you attempt the bonus questions, I will also assign some extra credit value to your answers.
Resources: The data set from the 1000 genomes project.