# Question: Compute the true utility function and the best linear approximation

Compute the true utility function and the best linear approximation in x and y (as in Equation (21.9)) for the following environments:

a. A 10 × 10 world with a single +1 terminal state at (10, 10).

b. As in (a), but add a -1 terminal state at (10, 1).

c. As in (b), but add obstacles in 10 randomly selected squares.

d. As in (b), but place a wall stretching from (5, 2) to (5, 9).

e. As in (a), but with the terminal state at (5, 5).

The actions are deterministic moves in the four directions. In each case, compare the results using three-dimensional plots. For each environment, propose additional features (besides x and y) that would improve the approximation and show the results.
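One possible starting point for the computation is sketched below. It is not the textbook's solution, and it rests on assumptions the question does not state: a step reward of -0.04 for every nonterminal state, no discounting (`gamma=1.0`), moves into a wall or off the grid leaving the agent in place, and Equation (21.9) taken to be the linear form θ0 + θ1·x + θ2·y. The names `true_utilities`, `fit_linear`, and `extra_features` are hypothetical. The true utilities come from value iteration and the linear approximation from an ordinary least-squares fit.

```python
import numpy as np

MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]  # deterministic moves in four directions


def true_utilities(terminals, n=10, blocked=(), step_reward=-0.04,
                   gamma=1.0, max_sweeps=10_000):
    """Value iteration on an n x n grid with 1-based (x, y) coordinates.

    step_reward and gamma are assumed values, not given in the question.
    """
    U = {(x, y): 0.0
         for x in range(1, n + 1) for y in range(1, n + 1)
         if (x, y) not in blocked}
    U.update(terminals)
    for _ in range(max_sweeps):
        delta = 0.0
        for s in U:
            if s in terminals:
                continue
            x, y = s
            # Assumption: a move into an obstacle or off the grid is a no-op.
            best = max(U.get((x + dx, y + dy), U[s]) for dx, dy in MOVES)
            new = step_reward + gamma * best
            delta = max(delta, abs(new - U[s]))
            U[s] = new
        if delta < 1e-12:
            break
    return U


def fit_linear(U, extra_features=()):
    """Least-squares fit of U to theta0 + theta1*x + theta2*y (+ extra features)."""
    states = sorted(U)
    A = np.array([[1.0, x, y] + [f(x, y) for f in extra_features]
                  for x, y in states])
    b = np.array([U[s] for s in states])
    theta, *_ = np.linalg.lstsq(A, b, rcond=None)
    rmse = float(np.sqrt(np.mean((A @ theta - b) ** 2)))
    return theta, rmse


# Environment (a): single +1 terminal at (10, 10).
U_a = true_utilities({(10, 10): 1.0})
theta_a, rmse_a = fit_linear(U_a)

# Environment (e): terminal at (5, 5), fitted with and without an extra
# feature (Manhattan distance to the terminal, one candidate among many).
U_e = true_utilities({(5, 5): 1.0})
theta_e, rmse_e = fit_linear(U_e)
theta_e2, rmse_e2 = fit_linear(
    U_e, extra_features=[lambda x, y: abs(x - 5) + abs(y - 5)])
```

Under these assumptions the utility of a state in (a) is 1 - 0.04·d, where d = (10 - x) + (10 - y) is the number of steps to the goal, so the true utility is itself linear in x and y and the fit should be essentially exact. In (e) the utility peaks in the middle of the grid, which a plane cannot represent; a distance-to-terminal feature is one way to recover it. For (c) and (d), obstacle squares can be passed through `blocked`, and the dictionaries returned by `true_utilities` can be reshaped into arrays for three-dimensional surface plots.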

