Question: 14.9 Project 5: Machine Learning

Overview

In this project, we will explore some basic concepts in artificial intelligence. Using the concepts you have learned thus far in the course, you will design a machine learning method that identifies a flower based on four characteristics:

- Sepal length
- Sepal width
- Petal length
- Petal width

Your program will differentiate between three types of iris flowers:

- Iris-setosa
- Iris-versicolor
- Iris-virginica

You must write these 9 functions, but you can write more if you wish:

- readData: read data from data files
- display: display the loaded data
- mean: calculate the average across an array of values
- stddev: calculate the standard deviation across an array of values
- stats: display mean and standard deviation of each characteristic
- distance: how similar two flowers are based on Euclidean distance
- nearestNeighbor: find the flower most similar to another
- accuracy: calculate how accurate your machine learning method is
- main: the main function

These functions are discussed in more detail in the following sections. Functions can and should make use of each other. For example, the stddev function would call mean as part of calculating the standard deviation.

Command-line Arguments

The program will accept three command-line arguments:

- training data filename
- testing data filename
- action, one of: display, stats, accuracy, classify

Example:

    ./a.out train.data test.data display

If the number of arguments is less than or greater than expected, print the following message and terminate:

    Usage: ./a train_filename test_filename [display|stats|accuracy|classify]

If the action is invalid, print the following message and terminate:

    Invalid action
    Usage: ./a train_filename test_filename [display|stats|accuracy|classify]
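The two argument checks above can be sketched in C roughly as follows. This is a minimal sketch, not a reference solution: the `Action` enum and the `parseAction` and `checkArguments` helpers are hypothetical names, and the code assumes the messages go to standard output with "Invalid action" and the usage text on separate lines.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical action codes; these names do not come from the assignment. */
enum Action { ACTION_INVALID = -1, ACTION_DISPLAY, ACTION_STATS,
              ACTION_ACCURACY, ACTION_CLASSIFY };

/* Map an action string to its code, or ACTION_INVALID if it is not recognized. */
enum Action parseAction(const char *action) {
    static const char *names[] = { "display", "stats", "accuracy", "classify" };
    for (int i = 0; i < 4; i++) {
        if (strcmp(action, names[i]) == 0) {
            return (enum Action)i;
        }
    }
    return ACTION_INVALID;
}

/* Check the argument count and the action, printing the required message
 * on failure. ASSUMPTION: messages are written to stdout, one per line. */
enum Action checkArguments(int argc, char *argv[]) {
    const char *usage =
        "Usage: ./a train_filename test_filename [display|stats|accuracy|classify]";
    if (argc != 4) {                 /* program name + three arguments */
        printf("%s\n", usage);
        return ACTION_INVALID;
    }
    enum Action action = parseAction(argv[3]);
    if (action == ACTION_INVALID) {
        printf("Invalid action\n%s\n", usage);
    }
    return action;
}
```

A main function could call `checkArguments(argc, argv)` first and return immediately when the result is `ACTION_INVALID`.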

Read Data

You have been given two example files: train.data and test.data. For now, we will focus on train.data. Your first task will be to read the data in this file. There will be up to 1,000 entries (inclusive) in this file. The first four columns correspond to the four characteristics mentioned in the overview. The fifth column is the flower type these characteristics describe, called the 'label'.

You will write a function to read the data in this file into five arrays. The first four arrays are for the four characteristics, and the fifth array stores the flower type. The function definition should be:

    int readData(char filename[], double sepal_lengths[], double sepal_widths[], double petal_lengths[], double petal_widths[], int labels[], int *length);

You will notice the labels array is an array of integers. This is because in machine learning, it's common to number each label, as numbers are easier to work with than strings. When reading the file, store Iris-setosa as 0, Iris-versicolor as 1, and Iris-virginica as 2.

The number of records read from the file is returned through the final reference parameter, length. For the example files, train.data would be 120 and test.data would be 30. If the file does not exist, return a value of 0; otherwise return 1.
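A reader along these lines might look like the sketch below. The file layout is not spelled out in the text, so this assumes each line is comma-separated with the label name last (for example `5.1,3.5,1.4,0.2,Iris-setosa`), as in the commonly distributed Iris dataset. The `labelToInt` helper and the `MAX_FLOWERS` constant are hypothetical names introduced for illustration.

```c
#include <stdio.h>
#include <string.h>

#define MAX_FLOWERS 1000   /* the text says up to 1,000 entries */

/* Hypothetical helper: convert a label name to its number.
 * ASSUMPTION: unknown names map to -1; the text does not cover that case. */
int labelToInt(const char *name) {
    if (strcmp(name, "Iris-setosa") == 0) return 0;
    if (strcmp(name, "Iris-versicolor") == 0) return 1;
    if (strcmp(name, "Iris-virginica") == 0) return 2;
    return -1;
}

/* Returns 0 if the file cannot be opened, 1 otherwise.
 * The number of records read is stored in *length. */
int readData(char filename[], double sepal_lengths[], double sepal_widths[],
             double petal_lengths[], double petal_widths[], int labels[],
             int *length) {
    *length = 0;
    FILE *fp = fopen(filename, "r");
    if (fp == NULL) {
        return 0;
    }

    char name[64];
    int n = 0;
    /* ASSUMPTION: four comma-separated numbers followed by the label name. */
    while (n < MAX_FLOWERS &&
           fscanf(fp, " %lf,%lf,%lf,%lf,%63s",
                  &sepal_lengths[n], &sepal_widths[n],
                  &petal_lengths[n], &petal_widths[n], name) == 5) {
        labels[n] = labelToInt(name);
        n++;
    }

    fclose(fp);
    *length = n;
    return 1;
}
```

If the real files use a different separator, only the `fscanf` format string should need to change.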

In your main function, you should read the data for both files before doing anything else. If either call returns 0, immediately print the following error and terminate:

    Unable to open file FILENAME

where FILENAME is the filename passed to the function. If neither file can be opened, print only the training data error.

Examples:

    ./a.out not__file.txt another_fake_file.txt display
    Unable to open file not__file.txt

    ./a.out train.data another_fake_file.txt display
    Unable to open file another_fake_file.txt

Display Data

To ensure the data was loaded properly, you will write a function to print out all the stored values. The display function will iterate over each flower and print its sepal length, sepal width, petal length, petal width, and label, formatted as:

    (sepal length, sepal width, petal length, petal width) => label

The function definition should be:

    display(double sepal_lengths[], double sepal_widths[], double petal_lengths[], double petal_widths[], int labels[], int length);

where the last parameter, length, is how many flowers there are (the length of the arrays).

Example (first three lines when calling display on the train.data data):

    (5.100000, 3.500000, 1.400000, 0.200000) => 0
    (4.900000, 3.000000, 1.400000, 0.200000) => 0
    (4.700000, 3.200000, 1.300000, 0.200000) => 0
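One possible shape for display is sketched below. The six decimal places in the example output match the default precision of `%f`, so no explicit precision is given. The function definition in the text omits a return type, so `void` is an assumption here, and `formatFlower` is a hypothetical helper added only to keep the formatting in one place.

```c
#include <stdio.h>

/* Hypothetical helper: write one flower into buf as
 * "(sepal length, sepal width, petal length, petal width) => label". */
void formatFlower(char *buf, size_t size, double sepal_length,
                  double sepal_width, double petal_length,
                  double petal_width, int label) {
    snprintf(buf, size, "(%f, %f, %f, %f) => %d",
             sepal_length, sepal_width, petal_length, petal_width, label);
}

/* ASSUMPTION: the return type is void; the text does not state one. */
void display(double sepal_lengths[], double sepal_widths[],
             double petal_lengths[], double petal_widths[],
             int labels[], int length) {
    char line[128];
    for (int i = 0; i < length; i++) {
        formatFlower(line, sizeof line, sepal_lengths[i], sepal_widths[i],
                     petal_lengths[i], petal_widths[i], labels[i]);
        puts(line);   /* one flower per line */
    }
}
```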

Statistics

When working on a machine learning project, it's always important for the data scientist to become familiar with their data. One way to do this is to look at the statistics of your dataset. In this case, we are interested in the mean and standard deviation of each characteristic for each flower type.

Mean

    double mean(double values[], int labels[], int filter, int length);

The mean function takes an array of values and an array of labels. However, we want to know the mean for a specific flower type. The desired flower type will be passed as filter.
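A filtered mean with this signature could be sketched as follows: only entries whose label equals filter contribute to the sum and the count. Returning 0.0 when no entry matches is an assumption made to avoid dividing by zero; the text does not say what should happen in that case.

```c
/* Average of values[i] over all i where labels[i] == filter. */
double mean(double values[], int labels[], int filter, int length) {
    double sum = 0.0;
    int count = 0;
    for (int i = 0; i < length; i++) {
        if (labels[i] == filter) {
            sum += values[i];
            count++;
        }
    }
    /* ASSUMPTION: return 0.0 when no flower has the requested label. */
    return count > 0 ? sum / count : 0.0;
}
```

For example, calling `mean(sepal_lengths, labels, 0, length)` would give the mean sepal length of the Iris-setosa entries.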
