Question: Please help modify car.py to fit the test case .test_car.py in Python

car.py:

#!/usr/bin/python

import argparse
import logging
import sys

import numpy as np

import gym
#import gym.scoreboard.scoring
from gym import wrappers, logger

#Global variables
OUTOFBOUNDSTATE = -1
X_MAX = 0.6
X_MIN = -1.2
X_RANGE = 1.8

XDOT_MAX = 0.7
XDOT_MIN = -0.7
XDOT_RANGE = 1.4

# Function to discretize the state; could potentially be parallelized with
# a MapReduce technique
def discretize_state( x, xdot, xRes, xdotRes ):
    # Return -1 for out of bounds state
    if x < X_MIN or x > X_MAX:
        return OUTOFBOUNDSTATE

    if xdot < XDOT_MIN or xdot > XDOT_MAX:
        return OUTOFBOUNDSTATE

    #Calculates x and y coordinates of state.
    s_x = discretize_state_helper(x, xRes, X_MAX, X_MIN, X_RANGE)
    s_y = discretize_state_helper(xdot, xdotRes, XDOT_MAX, XDOT_MIN, XDOT_RANGE)
    #return flattened value which corresponds to unique index of state
    return s_x * xdotRes + s_y

# Helper function that bins state variables and returns state in 1D
def discretize_state_helper( val, res, maxi, mini, rng ):
    for box in range(res):
        if val < ( rng*(1+box)/res + mini ):
            return box
    # val sits at the upper bound: use the last bin
    return res - 1

if __name__ == '__main__':
    parser = argparse.ArgumentParser(description=None)

    parser.add_argument('env_id', nargs='?', default='MountainCar-v0', help='Select the environment to run')
    args = parser.parse_args()

    logger = logging.getLogger()
    formatter = logging.Formatter('[%(asctime)s] %(message)s')
    handler = logging.StreamHandler(sys.stderr)
    handler.setFormatter(formatter)
    logger.addHandler(handler)

    # You can set the level to logging.DEBUG or logging.WARN if you
    # want to change the amount of output.
    logger.setLevel(logging.INFO)

    env = gym.make(args.env_id)
    outdir = '/tmp/' + 'qagent' + '-results'
    env = wrappers.Monitor(env, outdir, write_upon_reset=True, force=True)

    env.seed(0)

    # 10 * 4 = 40 discretized states, plus a final row reached through
    # index -1 (the out of bounds state)
    Q = np.zeros([41, env.action_space.n])

    alpha = 0.7
    gamma = 0.97
    #Resolution variables for state space
    xres = 10
    xdotres = 4

    n_episodes = 50001
    for episode in range(n_episodes):
        tick = 0
        reward = 0
        done = False
        state = env.reset()
        s = discretize_state(state[0], state[1], xres, xdotres )
        while not done:
            tick += 1
            # Greedy action selection: first action with the highest Q value
            action = 0
            ri = -999
            for q in range(env.action_space.n):
                if Q[s][q] > ri:
                    action = q
                    ri = Q[s][q]
            state, reward, done, info = env.step(action)
            #print( reward, done)
            sprime = discretize_state(state[0], state[1], xres, xdotres )
            predicted_value = np.max(Q[sprime])
            if sprime < 0:
                predicted_value = 0
                reward = -5
            Q[s,action] += alpha*(reward + gamma*predicted_value - Q[s,action])
            #print(Q[s,action], ri, sprime, Q[s][action])
            s = sprime

        if episode % 1000 == 0:
            alpha *= .99 #decay rate for alpha, each 1000
            print(reward)
        if state[0] >= 0.5:
            print("success")
        else:
            if episode % 1000 == 0:
                print("fail ", state[0], Q[s,action])
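For readers following the training loop, the update it applies at each step can be isolated as a small function. The following is only a sketch in plain Python (lists instead of NumPy); the name `q_update` and its signature are hypothetical and do not appear in car.py. It assumes, as the loop does, that a negative next-state index means "out of bounds" and is handled with a fixed reward of -5 and no bootstrapped value.

```python
def q_update(q, s, a, reward, s_next, alpha=0.7, gamma=0.97):
    """Apply one tabular Q-learning update in place and return the new value.

    q      : list of rows, one per discretized state, one column per action
    s, a   : current state index and the action taken
    s_next : next state index; a negative value marks an out of bounds state
    """
    if s_next < 0:
        # Out of bounds: fixed penalty and no future value, as in the loop above.
        target = -5.0
    else:
        # Standard target: immediate reward plus discounted best next value.
        target = reward + gamma * max(q[s_next])
    q[s][a] += alpha * (target - q[s][a])
    return q[s][a]


# Hypothetical usage: 41 rows and 3 actions, matching the table in car.py.
table = [[0.0] * 3 for _ in range(41)]
first = q_update(table, 5, 2, -1.0, 6)   # moves Q[5][2] toward the target -1
second = q_update(table, 5, 2, -1.0, 6)  # a second visit moves it further
print(first, second)
```

Because every reward in MountainCar-v0 is negative and the table starts at zero, untried actions look better than tried ones, which is what lets a purely greedy policy still explore.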

Test case:

#!/usr/bin/env python3
from car import MountainCar
import unittest
import numpy as np

class TestTicTacToe(unittest.TestCase):
    # def test_init_board(self):
    #     ttt = TicTacToe3D()
    #     # brd,winner = ttt.play_game()
    #     self.assertEqual(ttt.board.shape, (3,3,3))

    def test_1(self):
        player_first = 1
        expected_winner = 1
        env_id = 'MountainCar-v0'
        mountain_car = MountainCar(env_id, False, True, 'car.npy')
        all_states = mountain_car.run()
        max_ = np.max(all_states, axis=0)
        result = max_[0] > 0.5
        print("Your highest attained position = {}".format(max_[0]))
        print("Position threshold for success >= {}".format(0.5))
        self.assertEqual(result,True)

unittest.main()
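The test imports a `MountainCar` class from car.py, constructs it with four arguments, and expects `run()` to return a sequence of (position, velocity) states. One possible shape for that class is sketched below. Several things here are assumptions, not facts from the question: the meaning of the second and third constructor arguments (guessed as `train` and `load` flags), the idea that 'car.npy' holds a Q-table saved with `np.save`, the extra `env` and `q_table` keyword arguments (added only so the class can be tried with a stand-in environment), and the older Gym API in which `reset()` returns the observation and `step()` returns four values, as the original script uses. Only the evaluation path is shown; training is left out.

```python
X_MIN, X_MAX = -1.2, 0.6
XDOT_MIN, XDOT_MAX = -0.7, 0.7  # bounds copied from the question's car.py


def bin_index(val, lo, hi, res):
    """Return the bin of val within [lo, hi], or -1 when val is out of range."""
    if val < lo or val > hi:
        return -1
    return min(int((val - lo) / (hi - lo) * res), res - 1)


def discretize_state(x, xdot, x_res=10, xdot_res=4):
    """Flatten (position, velocity) into one table index; -1 if out of bounds."""
    s_x = bin_index(x, X_MIN, X_MAX, x_res)
    s_y = bin_index(xdot, XDOT_MIN, XDOT_MAX, xdot_res)
    if s_x < 0 or s_y < 0:
        return -1
    return s_x * xdot_res + s_y


class MountainCar:
    """Hypothetical wrapper matching the constructor call in the test case."""

    def __init__(self, env_id, train, load, q_path, env=None, q_table=None):
        self.env_id = env_id
        self.train = train      # assumed meaning; unused in this sketch
        self.load = load        # assumed meaning: load a saved Q-table
        self.q_path = q_path
        self.env = env          # optional injected environment (assumption)
        self.q_table = q_table  # optional injected Q-table (assumption)

    def _setup(self):
        # Imports are deferred so a stand-in env and table can be used instead.
        if self.env is None:
            import gym
            self.env = gym.make(self.env_id)
        if self.q_table is None and self.load:
            import numpy as np
            self.q_table = np.load(self.q_path)

    def run(self, max_steps=200):
        """Play one greedy episode and return every visited state."""
        self._setup()
        state = self.env.reset()
        all_states = [tuple(state)]
        done = False
        steps = 0
        while not done and steps < max_steps:
            row = self.q_table[discretize_state(state[0], state[1])]
            # Greedy choice: first action with the highest Q value.
            action = max(range(len(row)), key=lambda a: row[a])
            state, reward, done, info = self.env.step(action)
            all_states.append(tuple(state))
            steps += 1
        return all_states
```

With this shape, the training script would need to save its table once (for example with `np.save('car.npy', Q)`) using the same discretization that `run()` applies, since the indices are only meaningful if both sides bin states identically.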

