
Google Colab Clustering Assignment Instructions

Objective: Perform unsupervised learning on a vehicular dataset using k-means clustering to identify cluster centroids for three different ECU signatures, namely steering, speed, and RPM.

Note: This data set was obtained from three sedan vehicles of a single make (Nissan). It has been pre-processed to obtain the columns relevant to the signatures you will need to use as inputs. These columns are ECU 300 (steering), ECU 1 9 (tachometer), and ECU 280 (speed). They contain physical (or actual) values of these signatures at different time instants. For those interested, the units of speed and tachometer are miles per hour (mph) and revolutions per minute (RPM), respectively.

Instructions:

1) Navigate to colab.research.google.com in your browser and open the ECU Clustering.ipynb Python notebook file using Google Colab (File > Open Notebook (or ctrl + )).

2) Execute cells individually by clicking on the Run cell icon. Alternatively, after you select a cell, you can press Ctrl+Enter to execute it.

3) The notebook has been divided into three sections: Sections 1, 2, and 3 contain the k-means clustering implementations for the ECU signatures speed, tachometer, and steering, respectively.

4) The following are the cells where you need to make modifications to complete the tables.

a) Code cells 3, 7, and 11 need to be modified to accommodate a MinMax scaling function to normalize the input data (ECU signatures). Use the same scaling function for all the ECU signatures.

b) Identify the optimal number of clusters (k) and the sum of squared errors (SSE) using the elbow method for each of the three ECU signatures.

c) Verify your choice of clusters by comparing the results of a clustering metric called the Calinski-Harabasz (CH) score for different numbers of clusters.

d) Provide descriptive statistics, that is, the minimum, maximum, and mean, for each cluster and for each ECU signature. Use subscripts to designate the statistics for that particular cluster. For instance, the mean value for cluster 1 could be written as Mean_1.
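
The scaling and elbow-method steps above could be sketched roughly as follows. This is only an illustration under stated assumptions: the data is synthetic (the real notebook loads the ECU columns from the provided dataset), the variable names are hypothetical, and scikit-learn's `MinMaxScaler` and `KMeans` are assumed to be the tools the notebook uses.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.preprocessing import MinMaxScaler

# Synthetic stand-in for one ECU signature (assumption: the real notebook
# reads the speed column from the provided dataset instead).
rng = np.random.default_rng(0)
speed_mph = np.concatenate([
    rng.normal(10, 2, 200),   # low-speed driving
    rng.normal(35, 3, 200),   # city driving
    rng.normal(65, 4, 200),   # highway driving
]).reshape(-1, 1)             # scikit-learn expects a 2-D array

# MinMax scaling maps the feature to [0, 1]: (x - min) / (max - min).
scaler = MinMaxScaler()
speed_scaled = scaler.fit_transform(speed_mph)

# Elbow method: fit k-means for several k and record the SSE,
# which scikit-learn exposes as the fitted model's inertia_ attribute.
sse = {}
for k in range(1, 8):
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0)
    model.fit(speed_scaled)
    sse[k] = model.inertia_

for k, value in sse.items():
    print(f"k = {k}: SSE = {value:.4f}")
```

Plotting the SSE against k and looking for the point where the curve stops dropping sharply gives the "elbow". The same scaler class would be applied separately to the tachometer and steering columns.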

Populate the following three tables with your observations.

Table I: Evaluating number of clusters for speed ECU signature

| Number of Clusters (k) | SSE (Elbow Method) | CH Score | Min, Max, Mean |
|---|---|---|---|
| k = 3 | | | |
| k = 4 | | | |
| k = 5 | | | |

Table II: Evaluating number of clusters for RPM ECU signature

| Number of Clusters (k) | SSE (Elbow Method) | CH Score | Min, Max, Mean |
|---|---|---|---|
| k = 3 | | | |
| k = 4 | | | |
| k = 5 | | | |

Table III: Evaluating number of clusters for steering ECU signature

| Number of Clusters (k) | SSE (Elbow Method) | CH Score | Min, Max, Mean |
|---|---|---|---|
| k = 3 | | | |
| k = 4 | | | |
| k = 5 | | | |
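
One possible way to compute the entries of a table like those above is sketched below. It assumes synthetic RPM-like values in place of the real dataset, hypothetical variable names, and scikit-learn's `calinski_harabasz_score` as the CH metric. Reporting the per-cluster statistics in the original units rather than the scaled ones is a choice made here for readability, not something the assignment specifies.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.metrics import calinski_harabasz_score
from sklearn.preprocessing import MinMaxScaler

# Synthetic RPM-like values (assumption; substitute the notebook's real column).
rng = np.random.default_rng(1)
rpm = np.concatenate([
    rng.normal(800, 50, 150),
    rng.normal(2000, 150, 150),
    rng.normal(3500, 200, 150),
])

scaled = MinMaxScaler().fit_transform(rpm.reshape(-1, 1))

rows = []
for k in (3, 4, 5):
    model = KMeans(n_clusters=k, init="k-means++", n_init=10, random_state=0)
    labels = model.fit_predict(scaled)
    ch = calinski_harabasz_score(scaled, labels)
    # Per-cluster min, max, and mean, reported in the original units.
    stats = (
        pd.DataFrame({"rpm": rpm, "cluster": labels})
        .groupby("cluster")["rpm"]
        .agg(["min", "max", "mean"])
    )
    rows.append({"k": k, "SSE": model.inertia_, "CH": ch, "stats": stats})

for row in rows:
    print(f"k = {row['k']}: SSE = {row['SSE']:.4f}, CH = {row['CH']:.1f}")
    print(row["stats"].round(1))
```

Each row of `stats` corresponds to one cluster, so its mean would be written as Mean_1, Mean_2, and so on in the table.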

Answer the following questions based on your findings.

1. How does the CH score change as the number of clusters (k) is increased? Provide a justification for your answer.

2. Why can't metrics such as precision or recall be used to evaluate the performance of clustering algorithms like k-means++?

3. What is the optimal number of clusters (k) that shows consensus between the elbow evaluation method and the CH score?
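
As background for question 2, the short sketch below illustrates one property of cluster labels that is relevant there: the cluster IDs are arbitrary. The reference labels are hypothetical (the ECU dataset, being unlabeled, provides none), and the `precision` helper is written here purely for illustration.

```python
def precision(y_true, y_pred, positive):
    """Fraction of items predicted as `positive` that truly are `positive`."""
    predicted = [t for t, p in zip(y_true, y_pred) if p == positive]
    return sum(t == positive for t in predicted) / len(predicted)

# Hypothetical reference labels; an unlabeled dataset has nothing like this.
reference = [0, 0, 0, 1, 1, 1]

# Two clustering runs that find the same grouping but number the clusters differently.
run_a = [0, 0, 0, 1, 1, 1]
run_b = [1, 1, 1, 0, 0, 0]

print(precision(reference, run_a, positive=0))  # 1.0
print(precision(reference, run_b, positive=0))  # 0.0
```

Both runs partition the data identically, yet a label-based metric scores them very differently, which is one reason internal metrics such as SSE or the CH score are used instead.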
