Question:

import numpy as np
import matplotlib.pyplot as plt
from matplotlib.offsetbox import OffsetImage, AnnotationBbox

# tsne_results, train_images and subset_size are assumed to be defined in earlier cells
np.random.seed(0)
plt.figure(figsize=(40, 40))
# Scatter plot to help with positioning images
plt.scatter(tsne_results[:, 0], tsne_results[:, 1], alpha=0.5)
# Loop over each image in the subset and plot it at the corresponding t-SNE position
for i in range(subset_size):
    # Get the image corresponding to this t-SNE point
    image = train_images[i].reshape(32, 32, 3)
    # Create an OffsetImage object
    imagebox = OffsetImage(image, zoom=0.7)
    # Create the annotation box with the image
    ab = AnnotationBbox(imagebox, (tsne_results[i, 0], tsne_results[i, 1]), frameon=False)
    # Add it to the plot
    plt.gca().add_artist(ab)
# Title and labels
plt.title('t-SNE Visualization with Images')
plt.xlabel('t-SNE Component 1')
plt.ylabel('t-SNE Component 2')
# Show the plot
plt.show()

Q: Based on the last part, how many clusters do you think would be good for kMeans? Why?
T: Set up KMeans with some reasonable k based on the previous part (no worries: there is no single best answer).
Tip: to speed things up, import and use MiniBatchKMeans instead of KMeans. Browse the documentation to see how it differs.
[]
#...
T: Fit kMeans with your data.
[]
#...
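One possible way to approach the two tasks above is sketched below. It is only an illustration under stated assumptions: the random array `X_subset` stands in for the flattened image subset, and `k = 10`, `batch_size=256` and `n_init=3` are arbitrary example values, not recommended settings. MiniBatchKMeans updates the centers from small random batches instead of the whole dataset at each step, which usually trades a little cluster quality for a large speed-up.

```python
import numpy as np
from sklearn.cluster import MiniBatchKMeans

# Stand-in data (assumption): 500 flattened 32x32x3 images with values in [0, 1].
rng = np.random.default_rng(0)
X_subset = rng.random((500, 32 * 32 * 3))

# k = 10 is only an illustrative choice; pick yours from the t-SNE plot.
k = 10
kmeans = MiniBatchKMeans(n_clusters=k, batch_size=256, n_init=3, random_state=0)

# fit() learns the cluster centers; labels_ then holds one cluster index per row.
kmeans.fit(X_subset)
cluster_labels = kmeans.labels_

print(cluster_labels.shape)           # one label per image
print(kmeans.cluster_centers_.shape)  # one center per cluster, in pixel space
```

With real data, replace `X_subset` with the same flattened images that were passed to tSNE, so that `cluster_labels[i]` and `tsne_results[i]` refer to the same image.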
Plot tSNE embedding using clusters' labels
Let's see to what extent the kMeans clusters resemble the structure of the tSNE output.
T: Plot the tSNE embedding again -- but this time assign colors corresponding to the kMeans cluster of each image.
Q: Can you see significant groups of points with the same color (label)? (If not, something is wrong.) Roughly how many do you see?
[]
#...
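A minimal sketch of such a plot follows. The helper name `plot_embedding_by_cluster` and the random demo arrays are assumptions made for illustration; with real data you would pass `tsne_results` and the kMeans labels for the same images. The `tab10` colormap has ten distinct colors, so a different colormap may be a better fit if k is larger.

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_embedding_by_cluster(embedding, labels):
    """Scatter a 2-D embedding, coloring each point by its cluster label."""
    fig, ax = plt.subplots(figsize=(8, 8))
    points = ax.scatter(embedding[:, 0], embedding[:, 1],
                        c=labels, cmap="tab10", s=10, alpha=0.7)
    ax.set_title("t-SNE embedding colored by k-means cluster")
    ax.set_xlabel("t-SNE Component 1")
    ax.set_ylabel("t-SNE Component 2")
    fig.colorbar(points, ax=ax, label="cluster index")
    return fig, ax

# Stand-in data (assumption): random 2-D points and random labels in 0..9.
rng = np.random.default_rng(0)
demo_embedding = rng.normal(size=(300, 2))
demo_labels = rng.integers(0, 10, size=300)
fig, ax = plot_embedding_by_cluster(demo_embedding, demo_labels)
```

In a notebook the figure is displayed automatically; in a script, call `plt.show()` afterwards.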
T: Repeat the plot above, but define the color of each point as the mean color of the images in the cluster to which the image belongs.
Hint: You should see some blue and orange parts, and also some almost white and quite dark parts. If so, good.
If you don't see them, something is likely wrong. Maybe too few iterations? If everything is gray, something is very wrong: maybe far too few clusters (k). Tune the parameters until you are happy.
Do you see why we wanted to use the faster, approximate version of k-means? Data analysis is often done iteratively and interactively, so efficient algorithms save you time.
[]
#...
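The mean cluster color can be computed with plain NumPy, as in the sketch below. The function name `mean_color_per_cluster` is hypothetical, and the code assumes pixel values in [0, 1] and images stored either as (n, 32, 32, 3) or flattened in that same order; if your values are in 0..255, divide by 255 first.

```python
import numpy as np

def mean_color_per_cluster(images, labels, n_clusters):
    """Return an (n_clusters, 3) array with the mean RGB color of each cluster.

    images: (n, 32, 32, 3) or flattened (n, 3072) array, values in [0, 1].
    labels: (n,) integer cluster index per image.
    """
    pixels = images.reshape(len(images), -1, 3)   # (n, 1024, 3)
    per_image = pixels.mean(axis=1)               # mean RGB of each image
    colors = np.zeros((n_clusters, 3))
    for c in range(n_clusters):
        members = per_image[labels == c]
        if len(members) > 0:                      # empty clusters stay black
            colors[c] = members.mean(axis=0)
    return colors

# Tiny demo (assumption): two pure-red images in cluster 0, two pure-blue in cluster 1.
demo_images = np.zeros((4, 32, 32, 3))
demo_images[:2, :, :, 0] = 1.0
demo_images[2:, :, :, 2] = 1.0
demo_labels = np.array([0, 0, 1, 1])

cluster_colors = mean_color_per_cluster(demo_images, demo_labels, n_clusters=2)
point_colors = cluster_colors[demo_labels]        # one RGB triple per point
print(cluster_colors)
```

The `point_colors` array can then be passed as `c=point_colors` to `plt.scatter`, which accepts one RGB triple per point.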
If you're not satisfied with the quality you can tune the parameters some more.
T: If everything looks acceptable, rerun kMeans on the full dataset -- something which we couldn't realistically do with tSNE!
[]
#...
Let's verify whether the clusters we got on the entire dataset are reasonable.
T: For each cluster center, plot, say, the 10 images closest to it in the Euclidean metric.
Q: Looks good? Or maybe you see something suspicious?
For example: if any cluster center looks like a single image in the dataset, you likely chose too many clusters!
[]
#...
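Finding the nearest images to each center needs only a distance computation and a sort. The sketch below uses a hypothetical helper `closest_to_centers` and a toy 2-D dataset; with real data, `X` would be the flattened images and `centers` the `cluster_centers_` attribute of the fitted model.

```python
import numpy as np

def closest_to_centers(X, centers, n_closest=10):
    """For each center, return indices of the n_closest rows of X (Euclidean)."""
    nearest = []
    for center in centers:
        # Euclidean distance from every row of X to this center
        dists = np.linalg.norm(X - center, axis=1)
        nearest.append(np.argsort(dists)[:n_closest])
    return np.array(nearest)

# Toy data (assumption): two tight groups plus one point in between.
X_demo = np.array([[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0], [5.0, 5.0]])
centers_demo = np.array([[0.0, 0.0], [10.0, 10.0]])

closest = closest_to_centers(X_demo, centers_demo, n_closest=2)
print(closest)
```

Each row of the result lists image indices for one cluster; every such image can be reshaped with `.reshape(32, 32, 3)` and displayed with `plt.imshow` in a row of subplots.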
