Question: 14.9 Project 5: Machine Learning
Overview

In this project, we will explore some basic concepts in artificial intelligence. Using the concepts you have learned thus far in the course, you will design a machine learning method that identifies a flower based on four characteristics:

- Sepal length
- Sepal width
- Petal length
- Petal width

Your program will differentiate between three types of iris flowers:

- Iris setosa
- Iris versicolor
- Iris virginica

You must write these functions, but you can write more if you wish:

- readData: read data from data files
- display: display the loaded data
- mean: calculate the average across an array of values
- stddev: calculate the standard deviation across an array of values
- stats: display the mean and standard deviation of each characteristic
- distance: how similar two flowers are, based on Euclidean distance
- nearestNeighbor: find the flower most similar to another
- accuracy: calculate how accurate your machine learning method is
- main: the main function

These functions are discussed in more detail in the following sections. Functions can and should make use of each other. For example, the stddev function would call mean as part of calculating the standard deviation.

Command-line Arguments

The program will accept three command-line arguments:

- training data filename
- testing data filename
- action (display, stats, accuracy, or classify)

Example:

    a.out train.data test.data display

If the number of arguments is less than or greater than expected, print the following message and terminate:

    Usage: a trainfilename testfilename displaystatsaccuracyclassify

If the action is invalid, print the following message and terminate:

    Invalid action Usage: a trainfilename testfilename displaystatsaccuracyclassify

Read Data

You have been given two example files: train.data and test.data. For now, we will focus on train.data. Your first task will be to read the data in this file. There will be up to entries inclusive in this file.
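Before either file is read, main has to apply the argument rules from the Command-line Arguments section. The following is a minimal sketch of that check, assuming C++. The names ArgStatus, isValidAction, and checkArgs are hypothetical helpers that are not part of the assignment. The sketch only reports which rule failed, and printing the exact usage message is left to the caller.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical status codes; the assignment does not define these.
enum ArgStatus { ARGS_OK, ARGS_WRONG_COUNT, ARGS_INVALID_ACTION };

// Returns true if action is one of the four actions named in the spec.
bool isValidAction(const char *action) {
    const char *actions[] = {"display", "stats", "accuracy", "classify"};
    for (int i = 0; i < 4; ++i) {
        if (std::strcmp(action, actions[i]) == 0) {
            return true;
        }
    }
    return false;
}

// Checks the argument count first, then the action.
// argc counts the program name, so three arguments means argc == 4.
ArgStatus checkArgs(int argc, char *argv[]) {
    if (argc != 4) {
        return ARGS_WRONG_COUNT;
    }
    if (!isValidAction(argv[3])) {
        return ARGS_INVALID_ACTION;
    }
    return ARGS_OK;
}
```

In main, a result of ARGS_WRONG_COUNT would lead to the usage message, and ARGS_INVALID_ACTION to the invalid-action message, before terminating.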
The first four columns correspond to the four characteristics mentioned in the overview. The fifth column is the flower type these characteristics describe, called the 'label'. You will write a function to read the data in this file into five arrays. The first four arrays are for the four characteristics, and the fifth array stores the flower type. The function definition should be:

    int readData(char filename[], double sepallengths[], double sepalwidths[], double petallengths[], double petalwidths[], int labels[], int &length);

You will notice the labels array is an array of integers. This is because in machine learning it is common to number each label, as numbers are easier to work with than strings. When reading the file, store Irissetosa as Irisversicolor as and Irisvirginica as

The number of records read from the file is returned as the final reference parameter. For the example files, train.data would be and test.data would be

If the file does not exist, return a value of else return a

In your main method, you should read the data for both files before doing anything else. If either method returns a immediately print the following error and terminate:

    Unable to open file FILENAME

where FILENAME is the filename passed to the function. If both files cannot be opened, only print the training data error.

Examples:

    a.out notafile.txt anotherfakefile.txt display
    Unable to open file notafile.txt

    a.out train.data anotherfakefile.txt display
    Unable to open file anotherfakefile.txt

Display Data

To ensure the data was loaded properly, you will write a function to print out all the stored values. The display function will iterate over each flower and print its sepal length, sepal width, petal length, petal width, and label.
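Returning to readData as specified in the Read Data section, here is a minimal sketch of one way it could look. Several details are assumptions, because the corresponding values are missing from the text above: the file is taken to be comma-separated with labels spelled Iris-setosa, Iris-versicolor, and Iris-virginica; the label codes are taken to be 0, 1, and 2; the return value is taken to be -1 on failure and 0 on success; and MAX_FLOWERS is a hypothetical capacity. The helper labelToInt is also hypothetical, and filename is declared const here so that string literals can be passed.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Assumed capacity of the caller's arrays; the real limit is not given above.
const int MAX_FLOWERS = 200;

// Hypothetical helper: map a label string to an assumed integer code.
// Returns -1 for a label it does not recognize.
int labelToInt(const std::string &name) {
    if (name == "Iris-setosa") return 0;      // assumed code
    if (name == "Iris-versicolor") return 1;  // assumed code
    if (name == "Iris-virginica") return 2;   // assumed code
    return -1;
}

// Reads one flower per line into the five parallel arrays.
// Returns -1 if the file cannot be opened and 0 otherwise (assumed values).
int readData(const char filename[], double sepallengths[], double sepalwidths[],
             double petallengths[], double petalwidths[], int labels[],
             int &length) {
    length = 0;
    std::ifstream in(filename);
    if (!in) {
        return -1;
    }

    std::string line;
    while (length < MAX_FLOWERS && std::getline(in, line)) {
        std::istringstream row(line);
        char comma;
        std::string name;
        // Lines that do not match the assumed layout (such as blank lines) are skipped.
        if (row >> sepallengths[length] >> comma >> sepalwidths[length] >> comma
                >> petallengths[length] >> comma >> petalwidths[length] >> comma
                >> name) {
            labels[length] = labelToInt(name);
            ++length;
        }
    }
    return 0;
}
```

With this shape, main can call readData once per file and compare each return value against the failure code before printing the "Unable to open file" error.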
Each flower printed by display is formatted as:

    sepal length, sepal width, petal length, petal width label

The function definition should be:

    display(double sepallengths[], double sepalwidths[], double petallengths[], double petalwidths[], int labels[], int length);

where the last parameter, length, is how many flowers there are (the length of the arrays).

Example first three lines when calling display on the train.data data

Statistics

When working on a machine learning project, it is important for the data scientist to become familiar with their data. One way to do this is to look at the statistics of the dataset. In this case, we are interested in the mean and standard deviation of each characteristic for each flower type.

Mean

    double mean(double values[], int labels[], int filter, int length);

The mean function takes an array of values and an array of labels. However, we want to know the mean for a specific flower type only. The desired flower type is passed as filter.
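The filtered mean described above can be sketched as follows, assuming C++. Only the entries whose label equals filter contribute to the sum and to the count. Returning 0.0 when no entry matches is an assumption, since the text does not say what should happen in that case.

```cpp
#include <cassert>

// Average of values[i] over the entries where labels[i] == filter.
// Returns 0.0 if no entry matches (assumed behavior, not stated in the spec).
double mean(double values[], int labels[], int filter, int length) {
    double sum = 0.0;
    int count = 0;
    for (int i = 0; i < length; ++i) {
        if (labels[i] == filter) {
            sum += values[i];
            ++count;
        }
    }
    if (count == 0) {
        return 0.0;
    }
    return sum / count;
}
```

For example, with values {1.0, 2.0, 3.0, 10.0} and labels {0, 0, 1, 1}, a filter of 0 averages only the first two entries and gives 1.5. Since stddev is expected to call mean, it would pass the same labels and filter so that both statistics cover the same flowers.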
