Question:

Checkpoint 1 (HW8) TestGeneration - Due one week from release.
Checkpoint 2 (HW9) TestGP - Due two weeks from release.
The next step in the genetic programming project is to read and store a data file (such as the wine quality data set). Each of the trees in a generation of trees will be tested against the selected data file, and its "fitness" thus measured. Here is a simple example. Suppose there is just a single independent variable x0. A (very short) data file might look like this:
y,x0
1.2,1.0
4.1,2.0
8.8,3.0
Now suppose the tree is ((x0+0.25)* x0). We want to find out how close the tree value, for each x0 value, is to the given y value in the data. One standard way of doing this is to add up (over the rows of data) the square of the "deviation". For this example, the result is
Fitness = Math.pow((((1.0+0.25)*1.0)-1.2),2)
+ Math.pow((((2.0+0.25)*2.0)-4.1),2)
+ Math.pow((((3.0+0.25)*3.0)-8.8),2)
= 1.065
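The arithmetic above can be checked with a few lines of Java. This is only a sketch: the class name FitnessExample and the helpers evalTree and sumSquaredDeviations are illustrative names, not part of the assignment, and the tree is hard-coded rather than built from nodes.

```java
// Sketch only: FitnessExample, evalTree and sumSquaredDeviations are illustrative names.
class FitnessExample {
    // Hard-coded stand-in for the tree ((x0 + 0.25) * x0).
    static double evalTree(double[] x) {
        return (x[0] + 0.25) * x[0];
    }

    // Sum, over all rows, of (tree value - y) squared.
    static double sumSquaredDeviations(double[] y, double[][] x) {
        double sum = 0.0;
        for (int i = 0; i < y.length; i++) {
            double deviation = evalTree(x[i]) - y[i];
            sum += deviation * deviation;
        }
        return sum;
    }

    public static void main(String[] args) {
        double[] y = {1.2, 4.1, 8.8};            // the y column of the sample file
        double[][] x = {{1.0}, {2.0}, {3.0}};    // one x0 value per row
        System.out.println("Fitness = " + sumSquaredDeviations(y, x));
    }
}
```

The printed value should agree with 1.065 up to floating-point rounding: the three squared deviations are 0.0025, 0.16, and 0.9025.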
The work you've done so far allows you to evaluate any tree on a double[], an array of values for x0, x1,..., corresponding to a single data row. The next step is to evaluate the tree over multiple rows (subtracting and squaring for each row, as above) and sum the results over all rows.
To do this, use the two classes DataRow and DataSet, from the Linear Regression project.
GPTree
Once the tabular.DataRow and tabular.DataSet classes are working correctly, the next step is to modify the GPTree class to implement Comparable and Cloneable and to have the following methods:
public void evalFitness(DataSet dataSet)- accepts a DataSet object as its argument. Since you already have an eval() method that takes a double[], it shouldn't be too hard to extract each DataRow's array of x values and feed it to your existing method (code reuse at work). The GPTree evalFitness() method should run through each of the DataRows, evaluate the tree, subtract the y value, and square the result, all the while keeping a running sum of the squared differences. The final sum is the GPTree's fitness value.
public double getFitness()- return the fitness computed after evalFitness() is called.
public int compareTo(GPTree t)- compare the fitness values and return -1 for less than, 1 for greater than, and 0 when the values are equal.
public boolean equals(Object o)- return false if o is null or is not a GPTree; otherwise return true when compareTo((GPTree) o) is 0, and false when it is not.
public Object clone()- call clone() on super, as in Node's clone(), and then clone root as well, since it is a Cloneable object (or, if you are using the Algebra implementation from HW8, use the copy constructor to copy root).
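The methods listed above could be sketched roughly as follows. DataRow, DataSet, and Node here are hypothetical stand-ins written only so that the example compiles by itself; the accessor names getY(), getXs(), and getRows() are assumptions, and the real tabular classes and Node hierarchy will differ. The hashCode() override is not requested by the assignment; it is included because Java expects it to agree with equals().

```java
import java.util.ArrayList;
import java.util.List;

class GPTreeSketch {
    public static void main(String[] args) {
        DataSet data = new DataSet();
        data.add(new DataRow(1.2, new double[] {1.0}));
        data.add(new DataRow(4.1, new double[] {2.0}));
        data.add(new DataRow(8.8, new double[] {3.0}));
        GPTree tree = new GPTree(new Node(0.25));
        tree.evalFitness(data);
        System.out.println("Fitness = " + tree.getFitness());
    }
}

// Hypothetical stand-ins: the real tabular.DataRow, tabular.DataSet and Node will differ.
class DataRow {
    private final double y;
    private final double[] xs;
    DataRow(double y, double[] xs) { this.y = y; this.xs = xs; }
    double getY() { return y; }          // assumed accessor name
    double[] getXs() { return xs; }      // assumed accessor name
}

class DataSet {
    private final List<DataRow> rows = new ArrayList<>();
    void add(DataRow row) { rows.add(row); }
    List<DataRow> getRows() { return rows; }   // assumed accessor name
}

// Toy node hard-wired to ((x0 + c) * x0) so that the sketch runs on its own.
class Node implements Cloneable {
    private final double c;
    Node(double c) { this.c = c; }
    double eval(double[] data) { return (data[0] + c) * data[0]; }
    @Override public Object clone() {
        try { return super.clone(); }
        catch (CloneNotSupportedException e) { throw new AssertionError(e); }
    }
}

class GPTree implements Comparable<GPTree>, Cloneable {
    private Node root;
    private double fitness;

    GPTree(Node root) { this.root = root; }

    public double eval(double[] data) { return root.eval(data); }

    // Sum of squared deviations over every row; reuses eval(double[]).
    public void evalFitness(DataSet dataSet) {
        double sum = 0.0;
        for (DataRow row : dataSet.getRows()) {
            double deviation = eval(row.getXs()) - row.getY();
            sum += deviation * deviation;
        }
        fitness = sum;
    }

    public double getFitness() { return fitness; }

    // Integer.signum normalizes the comparison result to exactly -1, 0, or 1.
    @Override public int compareTo(GPTree t) {
        return Integer.signum(Double.compare(fitness, t.fitness));
    }

    @Override public boolean equals(Object o) {
        if (!(o instanceof GPTree)) return false;   // instanceof is false for null
        return compareTo((GPTree) o) == 0;
    }

    // Not requested by the assignment; kept consistent with equals().
    @Override public int hashCode() { return Double.hashCode(fitness); }

    @Override public Object clone() {
        try {
            GPTree copy = (GPTree) super.clone();
            copy.root = (Node) root.clone();    // copy the root too, not just the reference
            return copy;
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e);
        }
    }
}
```

Using Double.compare rather than the < and > operators gives a consistent ordering even if a tree's fitness turns out to be NaN, which matters once the trees are sorted. Note that equals() as specified compares fitness only, so two structurally different trees with the same fitness are considered equal.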
Generation
The last step is to create the Generation class. The Generation class should probably have the following constructor and methods:
Generation(int size, int maxDepth, String fileName)- creates a DataSet from the file named fileName, then creates the factories and random number generator needed to construct a GPTree. It then creates an array of size GPTrees, each with a maximum depth of maxDepth.
public void evalAll()- evaluates the current generation of GPTrees by computing the fitness of each tree against the current DataSet, and then sorts the array in place using Arrays.sort().
public ArrayList getTopTen()- returns an ArrayList of the top 10 GPTrees, i.e. the trees with the lowest fitness, in increasing order of fitness (only works after evaluating all).
public void printBestFitness()- prints the best fitness value (only works after evaluating all.)
public void printBestTree()- prints the best Tree (only works after evaluating all).
public void evolve()- (For Checkpoint 2) select 2 of the more fit trees at random, clone() each tree, and then call crossover on the clones. Add these to the new array of children. Repeat (array size)/2 times, so that the next generation has the same number of trees as the current one.
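A rough sketch of how evalAll(), getTopTen(), and evolve() fit together is shown below. It is not the assignment's Generation class: ToyTree and ToyGeneration are hypothetical stand-ins in which a single number plays the role of a whole tree, the fitness target is made up, and crossover simply exchanges that number. Treating "the more fit trees" as the top half of the sorted array is also an assumption, as is the handling of an odd array size.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Random;

class GenerationSketch {
    public static void main(String[] args) {
        ToyGeneration gen = new ToyGeneration(500, 42L);
        gen.evalAll();
        for (ToyTree t : gen.getTopTen()) System.out.println(t.fitness);
        gen.evolve();   // children need evalAll() again before their fitness is meaningful
    }
}

// Hypothetical stand-in for GPTree: one number plays the role of the whole tree.
class ToyTree implements Comparable<ToyTree>, Cloneable {
    double gene;
    double fitness;
    ToyTree(double gene) { this.gene = gene; }
    void evalFitness() { fitness = (gene - 3.0) * (gene - 3.0); }  // toy target, no data file
    // Toy crossover: the two trees exchange genes (real crossover swaps subtrees).
    void crossover(ToyTree other) { double t = gene; gene = other.gene; other.gene = t; }
    @Override public int compareTo(ToyTree t) { return Double.compare(fitness, t.fitness); }
    @Override public ToyTree clone() {
        try { return (ToyTree) super.clone(); }
        catch (CloneNotSupportedException e) { throw new AssertionError(e); }
    }
}

class ToyGeneration {
    ToyTree[] trees;
    private final Random rand;

    ToyGeneration(int size, long seed) {
        rand = new Random(seed);
        trees = new ToyTree[size];
        for (int i = 0; i < size; i++) trees[i] = new ToyTree(rand.nextDouble() * 10.0);
    }

    // Score every tree, then sort in place so that index 0 holds the lowest fitness.
    void evalAll() {
        for (ToyTree t : trees) t.evalFitness();
        Arrays.sort(trees);
    }

    // Assumes evalAll() has been called, so the first entries are the best trees.
    ArrayList<ToyTree> getTopTen() {
        int n = Math.min(10, trees.length);
        return new ArrayList<>(Arrays.asList(trees).subList(0, n));
    }

    // Assumes evalAll() has been called, so the array is sorted by fitness.
    void evolve() {
        ToyTree[] children = new ToyTree[trees.length];
        int pool = Math.max(1, trees.length / 2);   // assumption: "more fit" = top half
        for (int i = 0; i + 1 < children.length; i += 2) {
            ToyTree a = trees[rand.nextInt(pool)].clone();
            ToyTree b = trees[rand.nextInt(pool)].clone();
            a.crossover(b);                          // parents are untouched; only clones change
            children[i] = a;
            children[i + 1] = b;
        }
        if (children.length % 2 == 1) {              // odd size: carry over a copy of the best tree
            children[children.length - 1] = trees[0].clone();
        }
        trees = children;
    }
}
```

Cloning before crossover keeps the parent generation intact, so the same fit tree can be selected as a parent more than once without being altered by an earlier crossover.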
Checkpoint 1- TestGeneration
As in the last homework, write a test class that demonstrates your stuff. Call it TestGeneration. Have this class's main() method prompt the user for a data file. Then create a generation of 500 GPTrees. Get the data into a DataSet object, and evaluate each GPTree. Print out the GPTree with the smallest fitness. After all, this is the tree that best fits the data. Then print the fitnesses of each of the top ten GPTrees and make sure that they are in increasing order. The output for the top ten fitnesses should start:
