For the problems in this homework, turn in all your plots, answers, and discussion. The corresponding code should be placed at the end of each part.
Consider the graphs given below. For this part, take \(n=7\). For each graph, write down the Laplacian matrix \(L=D-A\) where \(D\) is the diagonal degree matrix and \(A\) is the adjacency matrix. (Actually write out or print out the matrices.)
For each of the graphs in part (a), compute the eigenvectors and eigenvalues of the Laplacian matrix \(L\) and the adjacency matrix \(A\) when there are \(n=150\) vertices. Describe the smallest and the second-smallest eigenvalues (for both \(L\) and \(A\)) and plot the corresponding eigenvectors. Describe the largest and second-largest eigenvalues (for both \(L\) and \(A\)), and plot the corresponding eigenvectors.
Include four plots for each graph and clearly label the plots. When plotting an eigenvector \(\mathbf{v}\), the \(x\)-axis ranges from \(1\) to \(n\) and the \(i\)th vertex is plotted at location \((i,\mathbf{v}(i))\). In light of the fact that \(\langle \mathbf{v}, L \mathbf{v}\rangle = \sum_{\{i,j\} \in E} (v_i-v_j)^2\), explain why these eigenvectors make sense.
In addition to your plots and discussion, include the code you used to generate the graphs and plots.
Consider again the four graphs from part (a), again with \(n=150\). For each graph, plot the embedding of the graph using the eigenvectors corresponding to the 2nd and 3rd smallest eigenvalues. Overlay the edges of the graph, i.e., for every edge \(\{i,j\} \in E\), draw an edge between \((\mathbf{v}_2(i), \mathbf{v}_3(i))\) and \((\mathbf{v}_2(j),\mathbf{v}_3(j))\).
Include the four spectral embedding plots, along with the code you wrote to generate them.
Pick 600 random points in the unit square by independently choosing their \(x\) and \(y\) coordinates from the interval \([0,2]\). Form a graph by adding an edge between every pair of points whose Euclidean distance is at most \(1/2\). Compute the eigenvectors of the Laplacian of this graph and plot the embedding of the graph onto the 2nd and 3rd smallest eigenvectors. This time, do not overlay the edges of the graph, just plot the vertices. For all points in the original graph with \(x\) and \(y\) coordinates both less than \(1\), plot their images in a different color. What do you observe? Why does this make sense?
Include a single plot (with two colors) and your discussion.
Consider the 100 x 100 grid graph whose 10,000 vertices are \(\{ (i,j) : 1 \leq i \leq 100, 1 \leq j \leq 100 \}\) and which has an edge \(\{(i,j),(i',j')\}\) whenever \(|i-i'| \leq 1, |j-j'| \leq 1\). Draw the spectral embedding (with edges present) as in part (c) using the second and third eigenvectors. Now remove 100 random vertices from this graph and again draw the spectral embedding. What do you observe?
Include two plots of the spectral embedding and your answer.
In this part, you will analyze part of the (anonymized) Faceboook friend network.
The file friends.csv
contains a graph with 1495 vertices
and 61796 rows each containing a pair \(i,j\) and representing the fact
that person \(i\) is friends with person \(j\).
Compute the smallest 12 eigenvalues and corresponding eigenvectors of the Laplacian of the friendship graph. Print a list of the 12 smallest eigenvalues.
How many connected components does this graph have? Justify your answer using the eigenvalues you computed in part (a). (Keep in mind that your numerical linear algebra package might have some rounding errors and a value less than \(10^{-12}\) should probably be regarded as \(0\).) Using the eigenvectors, how can you tell which nodes are in which components?
Include your code for this part in the writeup.
The conductance of a set of nodes in a graph is a natural measure of how tightly knit/insular that set it, with a lower conductance indicating a more tightly-knit set. Given a graph \(G=(V,E)\) with adjacency matrix \(A\) and a subset of the nodes \(S \subseteq V\), the conductance of \(S\) is defined to be
\[\mathrm{cond}(S) = \frac{\sum_{u \in S} \sum_{v \in V \setminus S} A_{uv}}{\min(A(S),A(V\setminus S))},\]where \(A(S)\) is the sum of the degrees of vertices inside \(S\).
In the context of a friend graph, the conductance of a set of individuals \(S\) corresponds to the ratio of the number of friendships between \(S\) and the outside world to the total number of friendships involving \(S\). So looking for tightly-knit communities corresponds to looking for sets \(S\) with small conductance. If \(\mathrm{cond}(S)=0\), then the set \(S\) is disconnected from the rest of the graph, and if \(\mathrm{cond}(S)=1\), then there are no internal friendships within \(S\).
Find at least 3 sets \(S_1,S_2,S_3\) of people in the friendship graph such that each set has at least \(200\) people and conductance at most \(0.1\). The three sets should be as close to disjoint as you can manage. For each step, report its size, 10 of its members, and the conductance of that set.
Explain how you found each set. If you used eigenvectors (hint!), please include a plot of those eigenvectors, the index of the eigenvector (i.e., “12” if it corresponds to the 12th smallest eigenvalue of the Laplacian), and explain how you identified the set of nodes in the component from that eigenvector.
Now select a random set of \(200\) nodes and compute the conductance of that set. Do the sets you found in part (d) seem tight-knit compared to a random set of people?