Question:

Dataset Generation: First we are going to generate the data to be used in our experimentation. We assume that the data are samples taken from three different Gaussian distributions. Please follow these steps:

i) Take the three mean values (3, 70), (7, 150) and (13, 250) as the means of three different Gaussian distributions, and generate 100 random data samples for each mean. Use a standard deviation of 3 in each dimension for each distribution. (Hint: numpy.random.normal) Stacking all the samples together should give a 2x300 matrix, where each feature vector has dimension 2 and the total number of feature samples is 300. (Remember: when you stack all the feature vectors together in a matrix, you already know the order in which you stacked them, so you will always know which feature vector came from which distribution.)

ii) Now generate 300 samples of a Gaussian distribution with mean (0, 0) and standard deviation 1 in each dimension. This should also give a 2x300 matrix of Gaussian noise. Add it to the feature vector matrix generated in step (i). The sum is our data, which we are going to use for clustering.

K-means Algorithm Deployment: Using the data generated in the 'Dataset Generation' step, perform k-means clustering. Take k=3, as we are required to make three groups of experts for relief efforts. You are required to code the k-means clustering algorithm in Python, making sure that you do the following:

i) Do not use any library that already implements the k-means algorithm. Use only mathematical equations in your code to implement the algorithm iteratively.

ii) Draw a figure at each step so that the evolution of your code is visible. Show the data as 'o' empty circles. Show the cluster centers as [marker not given]. Show the cluster center history as '+'. Use the colors red, green and blue for the three clusters.

iii) Make only one figure and keep updating it. DO NOT make multiple figures.

iv) Once the algorithm converges, paste the final cluster center values on the figure as well.
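One possible way to approach both parts is sketched below. It is not the hidden expert answer, only a minimal illustration under stated assumptions: it uses NumPy's `default_rng` generator instead of the legacy `numpy.random.normal` call from the hint, initialises the centers from three randomly chosen samples, stops when the centers move less than a small tolerance, and draws the current centers as a black 'x' because the question text does not say which marker to use. The names `generate_data`, `assign`, `update`, `kmeans` and `plot_step` are illustrative, not taken from the assignment.

```python
import numpy as np

MEANS = np.array([[3.0, 70.0], [7.0, 150.0], [13.0, 250.0]])
COLORS = ["red", "green", "blue"]


def generate_data(rng, n_per_cluster=100):
    """Return (data, labels); data is 2 x 300, labels records the source distribution."""
    # Step (i): 100 samples per mean, standard deviation 3 in each dimension.
    blocks = [rng.normal(loc=m, scale=3.0, size=(n_per_cluster, 2)).T for m in MEANS]
    features = np.hstack(blocks)  # shape (2, 300), stacked in the order of MEANS
    labels = np.repeat(np.arange(len(MEANS)), n_per_cluster)
    # Step (ii): zero-mean Gaussian noise with standard deviation 1, same shape.
    noise = rng.normal(loc=0.0, scale=1.0, size=features.shape)
    return features + noise, labels


def assign(data, centers):
    """Index of the nearest center (squared Euclidean distance) for every column of data."""
    d2 = ((data[:, None, :] - centers[:, :, None]) ** 2).sum(axis=0)  # shape (k, n)
    return d2.argmin(axis=0)


def update(data, assignment, centers):
    """Move each center to the mean of the samples currently assigned to it."""
    new_centers = centers.copy()
    for j in range(centers.shape[1]):
        members = data[:, assignment == j]
        if members.shape[1] > 0:  # an empty cluster keeps its previous center
            new_centers[:, j] = members.mean(axis=1)
    return new_centers


def kmeans(data, k=3, rng=None, max_iter=100, tol=1e-6, on_step=None):
    rng = np.random.default_rng() if rng is None else rng
    # Assumption: start from k distinct samples picked at random.
    centers = data[:, rng.choice(data.shape[1], size=k, replace=False)].copy()
    history = [centers.copy()]
    for _ in range(max_iter):
        assignment = assign(data, centers)
        new_centers = update(data, assignment, centers)
        history.append(new_centers.copy())
        if on_step is not None:
            on_step(data, assignment, history)
        converged = np.linalg.norm(new_centers - centers) < tol
        centers = new_centers
        if converged:
            break
    return centers, assign(data, centers), history


def plot_step(data, assignment, history, final=False):
    """Redraw the single figure: data as empty circles, center history as '+'."""
    import matplotlib.pyplot as plt  # imported lazily; only needed for plotting

    plt.figure(1)  # always reuse figure 1 instead of opening a new one
    plt.clf()
    for j, color in enumerate(COLORS):
        pts = data[:, assignment == j]
        plt.scatter(pts[0], pts[1], marker="o", facecolors="none", edgecolors=color)
        trail = np.array([c[:, j] for c in history])  # shape (steps, 2)
        plt.plot(trail[:, 0], trail[:, 1], "+", color=color)
        cx, cy = history[-1][:, j]
        plt.plot(cx, cy, "x", color="black", markersize=12)  # 'x' is an assumption
        if final:
            plt.text(cx, cy, f"  ({cx:.2f}, {cy:.2f})")
    plt.pause(0.3)
```

With these helpers, `data, labels = generate_data(np.random.default_rng(0))` builds the noisy 2x300 matrix, `kmeans(data, on_step=plot_step)` animates the iterations in one figure, and a last call to `plot_step(data, assignment, history, final=True)` writes the converged center values next to the centers. Because the second coordinate spans a much larger range than the first, the clusters separate almost entirely along that axis, and a random initialisation can still end in a poor local optimum, so comparing the result with `labels` is a useful sanity check.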

