Question: Problem 1 (20 Points): Lloyd's Method. Given a dataset with seven data points $\{x_1, \cdots, x_7\}$

Problem 1 (20 Points): Lloyd's Method
Given a dataset with seven data points $\{x_1, \cdots, x_7\}$, the distances between all pairs of data points are given in
the following table.
Assume the number of clusters is $k = 2$ and the cluster centers are initialized to $x_3$ and $x_6$.
5 Points. What are the two clusters formed at the end of the first iteration of Lloyd's algorithm?
5 Points. What are the two clusters formed at the end of the second iteration of Lloyd's algorithm?
10 Points. What are the two clusters formed when Lloyd's algorithm converges?
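The iterations asked about in Problem 1 can be traced with a short script. The following is only a sketch under stated assumptions: because the problem supplies pairwise distances rather than coordinates, it assumes each center is restricted to a data point and is updated to the cluster member with the smallest summed squared distance to the other members. The function name `lloyd_on_distances` and the 1-D points are hypothetical stand-ins, since the distance table is not reproduced here.

```python
import numpy as np

def lloyd_on_distances(dist, init_centers, max_iter=100):
    """Lloyd-style iterations using only a pairwise distance matrix.

    Assumption: centers are restricted to data points, and each update picks
    the cluster member with the smallest summed squared distance to the rest.
    Returns the label vector after every iteration and the final center indices.
    """
    dist = np.asarray(dist, dtype=float)
    centers = list(init_centers)
    history = []
    for _ in range(max_iter):
        # Assignment step: each point joins its nearest current center.
        labels = np.argmin(dist[:, centers], axis=1)
        history.append(labels.copy())
        # Update step: pick the member minimizing the summed squared distance.
        new_centers = []
        for j in range(len(centers)):
            members = np.flatnonzero(labels == j)
            cost = (dist[np.ix_(members, members)] ** 2).sum(axis=0)
            new_centers.append(int(members[np.argmin(cost)]))
        if new_centers == centers:
            break  # centers unchanged: converged
        centers = new_centers
    return history, centers

# Hypothetical 1-D points standing in for the distance table (NOT the real data).
pts = np.array([0.0, 1.0, 2.0, 3.0, 10.0, 11.0, 12.0])
D = np.abs(pts[:, None] - pts[None, :])

# Initial centers x3 and x6 correspond to 0-based indices 2 and 5.
history, centers = lloyd_on_distances(D, init_centers=[2, 5])
for it, labels in enumerate(history, start=1):
    print(f"iteration {it}: {labels.tolist()}")
```

Replacing `D` with the actual 7 × 7 table would give the clusters after the first iteration (`history[0]`), the second (`history[1]`), and at convergence (`history[-1]`), provided the update rule assumed here matches the one intended by the problem.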
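Problem 2 (15 Points): Gaussian Mixture Model (GMM): Latent Variable View
Consider a GMM in which the marginal distribution $p(z)$ for the latent variable $z$ is given by
$$p(z) = \prod_{k=1}^{K} \pi_k^{z_k},$$
where $\sum_{k=1}^{K} \pi_k = 1$, $z = [z_1, z_2, \cdots, z_K]$, and each $z_k$ satisfies $z_k \in \{0, 1\}$ and $\sum_{k=1}^{K} z_k = 1$.
Moreover, the conditional distribution $p(x \mid z)$ for the observed variable $x$ is given by
$$p(x \mid z) = \prod_{k=1}^{K} \mathcal{N}(x \mid \mu_k, \Sigma_k)^{z_k}.$$
Prove that $p(x)$, obtained by summing $p(z)\,p(x \mid z)$ over all possible values of $z$, is a GMM. That is,
$$p(x) = \sum_{z} p(z)\,p(x \mid z) = \sum_{k=1}^{K} \pi_k\, \mathcal{N}(x \mid \mu_k, \Sigma_k).$$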
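A numerical sanity check can make the claim in Problem 2 concrete before attempting the proof. The sketch below assumes the standard latent-variable form, with $p(z) = \prod_k \pi_k^{z_k}$ and $p(x \mid z) = \prod_k \mathcal{N}(x \mid \mu_k, \Sigma_k)^{z_k}$, and uses 1-D Gaussians with made-up parameters. It enumerates the K one-hot vectors z and compares the sum of $p(z)\,p(x \mid z)$ with the mixture density. It illustrates the identity and is not a substitute for the proof.

```python
import numpy as np

def normal_pdf(x, mu, var):
    """Density of a 1-D Gaussian with mean mu and variance var."""
    return np.exp(-0.5 * (x - mu) ** 2 / var) / np.sqrt(2.0 * np.pi * var)

# Hypothetical parameters for K = 3 components (illustrative only).
pi = np.array([0.2, 0.5, 0.3])
mu = np.array([-2.0, 0.0, 3.0])
var = np.array([1.0, 0.5, 2.0])
K = len(pi)
x = 0.7

# Sum p(z) p(x|z) over the K admissible one-hot vectors z.
marginal = 0.0
for z in np.eye(K):
    p_z = np.prod(pi ** z)                              # prod_k pi_k^{z_k}
    p_x_given_z = np.prod(normal_pdf(x, mu, var) ** z)  # prod_k N(x|mu_k, var_k)^{z_k}
    marginal += p_z * p_x_given_z

# Direct mixture density: sum_k pi_k N(x | mu_k, var_k).
mixture = np.sum(pi * normal_pdf(x, mu, var))

print(np.isclose(marginal, mixture))
```

The key observation the code relies on is that a one-hot z with $z_j = 1$ reduces both products to their j-th factor, so each term of the sum over z is exactly $\pi_j\, \mathcal{N}(x \mid \mu_j, \Sigma_j)$.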
Problem 3 (20 Points): Generating GMMs
In this problem, you will write code to generate mixtures of 3 Gaussians, each satisfying one of the following
requirements. Please specify the mean vector and covariance matrix of each Gaussian in your answer.
6 Points. Draw a data set where a mixture of 3 spherical Gaussians (where the covariance matrix is
the identity matrix times some positive scalar) can model the data well, but K-means cannot.
6 Points. Draw a data set where a mixture of 3 diagonal Gaussians (where the covariance matrix can
have non-zero values on the diagonal, and zeros elsewhere) can model the data well, but K-means and a
mixture of spherical Gaussians cannot.
8 Points. Draw a data set where a mixture of 3 Gaussians with unrestricted covariance matrices can
model the data well, but K-means and a mixture of diagonal Gaussians cannot.
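All three parts of this problem come down to sampling from a mixture with a chosen covariance family. The following is a minimal sampling sketch using NumPy. The helper name `sample_gmm` and every mean, weight, and covariance value are hypothetical placeholders: they show how the spherical, diagonal, and unrestricted families differ, but whether a particular choice actually defeats K-means (or the more restricted mixture) still has to be confirmed by plotting and fitting.

```python
import numpy as np

def sample_gmm(weights, means, covs, n, seed=0):
    """Draw n points from a Gaussian mixture; returns (points, component ids)."""
    rng = np.random.default_rng(seed)
    # Pick a component for every sample, then draw from that component.
    comp = rng.choice(len(weights), size=n, p=weights)
    pts = np.empty((n, len(means[0])))
    for k in range(len(weights)):
        idx = np.flatnonzero(comp == k)
        pts[idx] = rng.multivariate_normal(means[k], covs[k], size=len(idx))
    return pts, comp

# Placeholder mixture weights and means (assumptions, not a verified answer).
weights = [1 / 3, 1 / 3, 1 / 3]
means = [[0.0, 0.0], [4.0, 0.0], [2.0, 4.0]]

# Spherical: identity times a positive scalar, with very different scales.
spherical = [s * np.eye(2) for s in (0.1, 1.0, 4.0)]
# Diagonal: axis-aligned ellipses with unequal variances per axis.
diagonal = [np.diag(d) for d in ([4.0, 0.1], [0.1, 4.0], [1.0, 1.0])]
# Unrestricted: off-diagonal terms give tilted ellipses.
full = [np.array(c) for c in ([[2.0, 1.8], [1.8, 2.0]],
                              [[2.0, -1.8], [-1.8, 2.0]],
                              [[1.0, 0.0], [0.0, 1.0]])]

X_sph, comp_sph = sample_gmm(weights, means, spherical, n=300)
X_diag, comp_diag = sample_gmm(weights, means, diagonal, n=300)
X_full, comp_full = sample_gmm(weights, means, full, n=300)
```

A scatter plot of each data set colored by the component ids gives the reference picture against which the K-means and restricted-mixture fits can be compared.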
Problem 4 (45 Points): Implementing K-Means and Spectral Clustering
You are given 7 toy datasets (in toydata.zip). Each dataset contains a number of clusters, as shown in
Figure 1, and your task is to find these clusters using your own implementations of the clustering algorithms.
20 Points. Implementing Lloyd's K-means: Your submitted function should be function [label]
= my_kmeans(data, K), where label returns the N-dimensional clustering result and N is the total
number of data points. data has size N × d and K is the number of (known) clusters. To initialize,
randomly select K samples as your cluster centroids. Iterate your algorithm until convergence.
Use Euclidean distance as the distance measure. Name your file my_kmeans.py.
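A minimal sketch of Lloyd's K-means following the requirements above: K randomly chosen samples as initial centroids, Euclidean distance, and iteration until the assignments stop changing. The `max_iter` and `seed` parameters are additions for safety and reproducibility and are not part of the required signature. Keeping the old centroid when a cluster becomes empty is one possible convention, not something the assignment specifies.

```python
import numpy as np

def my_kmeans(data, K, max_iter=300, seed=None):
    """Lloyd's K-means with Euclidean distance; returns a length-N label array."""
    data = np.asarray(data, dtype=float)
    rng = np.random.default_rng(seed)
    # Initialize the centroids with K distinct, randomly chosen samples.
    centroids = data[rng.choice(len(data), size=K, replace=False)]
    labels = None
    for _ in range(max_iter):
        # Assignment step: squared Euclidean distance to every centroid.
        d2 = ((data[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments unchanged: converged
        labels = new_labels
        # Update step: mean of each cluster; an empty cluster keeps its centroid.
        for k in range(K):
            if np.any(labels == k):
                centroids[k] = data[labels == k].mean(axis=0)
    return labels
```

Because the result depends on the random initialization, running the function with several seeds and keeping the labeling with the lowest within-cluster sum of squares is a common safeguard.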
20 Points. Implementing Spectral Clustering: Your submitted function should be function
[label] = my_spectralclusting(data, K, sigma), where label, data and K are the same as above
and sigma is the bandwidth of the Gaussian kernel used in spectral clustering. You will find that sigma is
important for your clustering performance. Adjust it case by case for every toy dataset to output the
best results. Name your file my_spectralclusting.py.
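The assignment does not say which spectral clustering variant to use, so the sketch below assumes one common recipe: normalized spectral clustering in the style of Ng, Jordan, and Weiss. It builds a Gaussian-kernel affinity matrix with bandwidth sigma, normalizes it symmetrically, takes the eigenvectors of the K largest eigenvalues, normalizes each row to unit length, and runs K-means on the rows. The helper `_kmeans_rows` and the `seed` parameter are hypothetical additions that keep the block self-contained.

```python
import numpy as np

def _kmeans_rows(U, K, rng, max_iter=300):
    """Plain Lloyd's K-means on the rows of U (hypothetical helper)."""
    centroids = U[rng.choice(len(U), size=K, replace=False)]
    labels = None
    for _ in range(max_iter):
        d2 = ((U[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for k in range(K):
            if np.any(labels == k):
                centroids[k] = U[labels == k].mean(axis=0)
    return labels

def my_spectralclusting(data, K, sigma, seed=0):
    """Normalized spectral clustering (assumed variant); returns N labels."""
    data = np.asarray(data, dtype=float)
    # Gaussian-kernel affinity matrix with a zero diagonal.
    sq = ((data[:, None, :] - data[None, :, :]) ** 2).sum(axis=2)
    W = np.exp(-sq / (2.0 * sigma ** 2))
    np.fill_diagonal(W, 0.0)
    # Symmetric normalization D^{-1/2} W D^{-1/2}; epsilon guards isolated points.
    d_inv_sqrt = 1.0 / np.sqrt(W.sum(axis=1) + 1e-12)
    M = d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    # eigh returns eigenvalues in ascending order, so keep the last K columns.
    _, vecs = np.linalg.eigh(M)
    U = vecs[:, -K:]
    # Row-normalize the spectral embedding, then cluster its rows.
    U = U / (np.linalg.norm(U, axis=1, keepdims=True) + 1e-12)
    return _kmeans_rows(U, K, np.random.default_rng(seed))
```

The dense N × N affinity matrix is fine for toy data but grows quadratically with N. A sigma that is too small disconnects points within a cluster, while one that is too large connects everything, which is why it needs tuning per dataset.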
5 Points. Compare your spectral clustering results with k-means. It is natural that on certain hard
toy examples, neither method will generate perfect results. In your report, briefly analyze the
advantages and disadvantages of spectral clustering compared with k-means. Why is this the case? (You do not need to
prove it mathematically; just give answers in your own words.)
Remarks: Write a file named Run_clustering.py at the top level to produce the clustering results. Your
code should:
Load all the mat file data and generate clustering labels with your k-means and spectral clust
In the mat files, "D" is the data matrix and "L" is the ground truth label matrix. To run your
algorithms, you cannot use "L", as it serves only as a reference.
2. Show your clustering results in the solution (if you use word or latex) and indicate which
(k-means or spectral