In Python, implement k-means clustering, a process by which data is organized into a small number (k) of clusters of similar values. Your algorithm will process a set of 2-dimensional points, assigning each of them to a cluster. The functions to visualize them are defined in the starter code below.
The program needs to:
- Set the number of clusters (i.e., the value of k in k-means clustering). We'll use 4.
- Choose 4 points at random from the input data set to be your centroids.
- For each point in the data set, find the closest centroid and assign the point to it. Once every point is assigned to a centroid, we have our clusters!
- Draw the centroids and the clusters using the functions defined in the starter code.
- Determine which centroid is closest using Euclidean distance. In case of a tie (i.e., two or more centroids are equidistant from a given point), choose any one of them.
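The selection and assignment steps above can be sketched as follows. This is a minimal illustration, not a reference solution: the helper names (`euclidean_distance`, `choose_centroids`, `assign_points`) and the sample `data` list are assumptions made for the example, since the question does not show the real input set. The assignment is built as a list of `[data_index, centroid_index]` pairs because that is the format the starter code's `draw_assignment` docstring describes.

```python
import math
import random

K = 4  # number of clusters, as specified in the assignment


def euclidean_distance(p, q):
    """Return the straight-line distance between two 2-D points."""
    return math.sqrt((p[0] - q[0]) ** 2 + (p[1] - q[1]) ** 2)


def choose_centroids(data, k):
    """Pick k points from distinct positions in data, at random."""
    return random.sample(data, k)


def assign_points(data, centroids):
    """Pair each data index with the index of its nearest centroid.

    Returns a list such as [[0, 1], [1, 3], ...].
    """
    assignment = []
    for data_index, point in enumerate(data):
        distances = [euclidean_distance(point, c) for c in centroids]
        # On a tie, index() picks the first minimum; any choice is allowed.
        cent_index = distances.index(min(distances))
        assignment.append([data_index, cent_index])
    return assignment


# Hypothetical sample data; the real input set is not shown in the question.
data = [[-100, -100], [-90, -110], [100, 100], [110, 90],
        [-100, 100], [-110, 90], [100, -100], [90, -110]]
centroids = choose_centroids(data, K)
assignment = assign_points(data, centroids)
```

With a list of four color strings, `centroids` and `assignment` should then be in the shape that `draw_centroids(centroids, colors)` and `draw_assignment(centroids, data, assignment, colors)` from the starter code take as input.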
The only output should be the Turtle graphics that plot your centroids and data points. I should see a slightly different plot on each run, because the centroids are randomly chosen.
Each centroid must have a unique color, and all of its corresponding data points should be drawn with that color. In our example, we used red, green, blue, and purple, but you can choose any four turtle-friendly colors you like.
The starter-code functions will not change, and you cannot add your own functions to the starter code.
Starter Code:
import turtle

NUM_COORDINATES = 2


def draw_centroids(centroids, colors):
    '''
    Function draw_centroids
    Input: Centroids (a list of lists), and
           List of colors (strings)
    Returns: nothing
    Does: Iterates through the list of centroids. Each list in centroids
          is expected to be a 2-dimensional point. For example, centroids
          might contain [[0, 0], [1, 2]], and this function will draw a
          point at coordinates (0, 0) and a point at coordinates (1, 2).
          Repeatedly calls helper function draw_centroid, once per point.
          There should be as many centroids as there are colors (each
          centroid gets a unique color on the screen). If the input
          format is not as expected, no error is reported, it just
          stops drawing.
    '''
    if len(centroids) != len(colors):
        return
    for i in range(len(centroids)):
        if len(centroids[i]) == NUM_COORDINATES:
            draw_centroid(colors[i], centroids[i][0], centroids[i][1])
    turtle.hideturtle()


def draw_assignment(centroids, data, assignment, colors):
    '''
    Function draw_assignment
    Input: centroids (a list of lists), data (a list of lists),
           assignment (a list of lists, organized by index), and
           a list of colors (strings).
    Returns: Nothing
    Does: Uses the nested list assignment to find the indices of each
          data point in data, and its corresponding centroid in centroids.
          Once the correct point is found, we draw it using the
          corresponding centroid's color.
          For example, we expect assignment to have the format
          [[0, 1], [1, 3], ...] indicating that data[0] maps to
          centroids[1], and data[1] maps to centroids[3]. Therefore,
          we draw the point data[0] with color colors[1]. And we draw
          the point data[1] with the color colors[3].
          There should be as many centroids as there are colors (each
          cluster gets a unique color corresponding with its centroid).
          If the input format is not as expected, no error is reported,
          it just stops drawing.
    '''
    if len(centroids) != len(colors):
        return
    for lst in assignment:
        # The length check and the two index lookups were garbled in the
        # source; they are reconstructed here from the docstring, which
        # describes each entry as [data_index, cent_index].
        if len(lst) < 2:
            return
        data_index = lst[0]
        cent_index = lst[1]
        if data_index >= len(data) or cent_index >= len(centroids):
            return
        draw_point(colors[cent_index], data[data_index][0],
                   data[data_index][1])


def draw_centroid(color, x, y):
    '''
    Function draw_centroid
    Input: a color (string), an x-coord (float) and a y-coord (float)
    Returns: nothing
    Does: Draws a cross-shape, of size 10, with center at
          the (x,y) indicated.
    '''
    turtle.color(color)
    turtle.penup()
    turtle.goto(x, y-10)
    turtle.pendown()
    turtle.goto(x, y+10)
    turtle.penup()
    turtle.goto(x-10, y)
    turtle.pendown()
    turtle.goto(x+10, y)


def draw_point(color, x, y):
    '''
    Function draw_point
    Input: a color (string), an x-coord (float) and a y-coord (float)
    Returns: nothing
    Does: Draws a single point at the (x,y) indicated.
    '''
    turtle.color(color)
    turtle.penup()
    turtle.goto(x, y)
    turtle.pendown()
    turtle.dot()
