Question: C CODING

The goal of this assignment is to create a Rock, Paper, Scissors simulator, which learns from previous games and becomes progressively better. The simulator begins with no knowledge of the rules of the game. It will learn how to play as the game progresses.

Your task is to write a program which asks the user for a move (Rock, Paper or Scissors) and then, based on the user's move, chooses a response (Rock, Paper or Scissors). This process continues until the user decides to exit the game.

If the program knew the rules of the game, it would be trivial to make the program choose Paper when the user plays Rock, and Scissors when the user plays Paper, etc. Instead, your program should play randomly at first, and change its play based on previous games, through reinforcement learning.

After every game, the program should output the user's move, the computer's move, the outcome (a win for either side, or a draw), and the probability of the computer's move.

Example Output:

 Game #1 User Move: Paper Computer Move: Rock User Wins. Probability of Move: 33.3% 
 Game #2 User Move: Paper Computer Move: Paper Draw. Probability of Move: 34.4% 
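
A minimal sketch of how a report line in this format could be produced. The enum names, the format_report function and the "Computer Wins." text are assumptions made for illustration; the sample output only shows "User Wins." and "Draw.".

```c
#include <stdio.h>

/* Hypothetical encodings; the assignment does not prescribe them. */
enum move { ROCK, PAPER, SCISSORS };
enum outcome { USER_WINS, COMPUTER_WINS, DRAW };

static const char *MOVE_NAMES[] = { "Rock", "Paper", "Scissors" };
/* "Computer Wins." is assumed wording; it does not appear in the sample. */
static const char *OUTCOME_TEXT[] = { "User Wins.", "Computer Wins.", "Draw." };

/* Write one report line into buf. probability is a fraction in [0, 1]
   and is printed as a percentage with one decimal place. */
int format_report(char *buf, size_t size, int game_number,
                  enum move user, enum move computer,
                  enum outcome result, double probability)
{
    return snprintf(buf, size,
                    "Game #%d User Move: %s Computer Move: %s %s "
                    "Probability of Move: %.1f%%",
                    game_number, MOVE_NAMES[user], MOVE_NAMES[computer],
                    OUTCOME_TEXT[result], probability * 100.0);
}
```

Calling format_report(buf, sizeof buf, 1, PAPER, ROCK, USER_WINS, 1.0 / 3.0) and printing buf with puts would reproduce the first sample line.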

Methodology

Reinforcement learning uses a reward and punishment system to weed out bad moves and select good ones.

In Rock-Paper-Scissors, every combination of moves can be represented by the following table of values, in which one dimension is the user's move and the other is the computer's move (i.e. Vrr is the value associated with the rock-rock combination). You should use a two-dimensional array to represent these values. Each value is a number indicating how good the move is.

                       User: Rock   User: Paper   User: Scissors
Computer: Rock         Vrr          Vpr           Vsr
Computer: Paper        Vrp          Vpp           Vsp
Computer: Scissors     Vrs          Vps           Vss

Initially, every combination has the same positive, non-zero value (for example, 10), since every combination is equally likely. After each game, the value for the combination that was played is increased (e.g. +1) if the computer wins or decreased (e.g. -1) if it loses. This is analogous to a reward or punishment.
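
One way the value table and its reward/punishment update might look, as a sketch. The function names, the result encoding and the MIN_VALUE floor are assumptions; the assignment does not say what should happen when a value would reach zero, so the floor is only one possible way to keep every probability well defined.

```c
#define NUM_MOVES 3
#define INITIAL_VALUE 10  /* example starting value from the assignment */
#define MIN_VALUE 1       /* assumption: never let a value drop below 1 */

/* values[user_move][computer_move] corresponds to Vij in the text. */
void init_values(int values[NUM_MOVES][NUM_MOVES])
{
    for (int i = 0; i < NUM_MOVES; i++)
        for (int j = 0; j < NUM_MOVES; j++)
            values[i][j] = INITIAL_VALUE;
}

/* result uses a hypothetical encoding:
   +1 = computer won (reward), -1 = computer lost (punishment), 0 = draw. */
void update_value(int values[NUM_MOVES][NUM_MOVES],
                  int user, int computer, int result)
{
    values[user][computer] += result;
    if (values[user][computer] < MIN_VALUE)
        values[user][computer] = MIN_VALUE;
}
```

Only the cell for the combination that was actually played changes, so the other rows of the table are unaffected by a given game.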

Every time the user makes a new move, the computer must randomly select one of 3 possibilities (Rock, Paper or Scissors). The probability of selecting move j when the user has played move i is Pij, where i is the user's move and j is the computer's move: Pij = Vij / (Vir + Vip + Vis)
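
The formula can be computed directly from one row of the table. This is a sketch; the function name move_probability and the [user][computer] index order are assumptions chosen to match the subscripts of Vij.

```c
#define NUM_MOVES 3

/* Pij = Vij / (Vir + Vip + Vis): the value of the chosen cell divided by
   the sum of the row for the user's move. Assumes the row sum is positive. */
double move_probability(int values[NUM_MOVES][NUM_MOVES],
                        int user, int computer)
{
    int row_total = 0;
    for (int j = 0; j < NUM_MOVES; j++)
        row_total += values[user][j];
    /* Cast before dividing so the result is not truncated to an integer. */
    return (double)values[user][computer] / row_total;
}
```

With every value at 10, each probability is 10/30, or about 33.3%. If one cell in the row then drops to 9, each of the other two moves has probability 10/29, roughly 34.5%, which is close to the 34.4% shown for the second game in the example output.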

Thus, the program must generate a random number (using rand()) and, based on the value of this random number, select one of the 3 choices. Then it needs to assess whether the computer won or lost, and give itself a reward or punishment based on the result (which changes future probabilities).
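
A sketch of the selection and scoring step, assuming moves are encoded 0 = Rock, 1 = Paper, 2 = Scissors and that each row of the table is passed as a plain array. The function names are hypothetical. The random number is mapped onto the row by cumulative sums, so move j is chosen with probability row[j] / total. Using rand() % total introduces a slight bias, which is ignored here.

```c
#include <stdlib.h>

#define NUM_MOVES 3

/* Map a roll in [0, total) onto a move index. Move j owns a slice of
   width row[j], so larger values are selected more often. */
int move_for_roll(const int row[NUM_MOVES], int roll)
{
    int cumulative = 0;
    for (int j = 0; j < NUM_MOVES; j++) {
        cumulative += row[j];
        if (roll < cumulative)
            return j;
    }
    return NUM_MOVES - 1; /* only reached if roll >= total */
}

/* Pick the computer's move for one row of the value table.
   Assumes every entry is positive, so total > 0. */
int choose_move(const int row[NUM_MOVES])
{
    int total = 0;
    for (int j = 0; j < NUM_MOVES; j++)
        total += row[j];
    return move_for_roll(row, rand() % total);
}

/* Returns +1 if the computer wins, -1 if it loses, 0 for a draw.
   With the 0/1/2 encoding, each move is beaten by the next one in
   the cycle Rock -> Paper -> Scissors -> Rock. */
int computer_result(int user, int computer)
{
    if (user == computer)
        return 0;
    return ((user + 1) % NUM_MOVES == computer) ? 1 : -1;
}
```

In a full program, srand would typically be called once at startup (for example with the current time) so that each run plays differently, and the value returned by computer_result would be added to the table entry for the combination just played.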

Requirements

Your code must meet these requirements:

  • The program must be written in C

  • Use sensible variable names

  • Comment and indent your code

  • Use at least 1 function (besides main()), for example to select the move or to check the winner.
