Question: A small store owner uses reinforcement learning to manage inventory for a single product. The goal is to maximize profit by ordering enough stock to meet customer demand while avoiding excessive inventory holding costs. Identify the state, action, and reward, and answer the questions below, stating any assumptions you need to make.
a) Design a simple reward function for this scenario considering both positive and negative rewards. [3 marks]
b) Suppose the current inventory level is 5 units, and the actor randomly chooses to order 3 units. The daily demand turns out to be 8 units. Calculate the daily profit for this scenario. [4 marks]
c) How can the critic potentially use this information to update its value estimates (considering rewards)? Elucidate. [3 marks]
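
One way to make parts (a) to (c) concrete is sketched below. The question gives no prices or costs, so every constant here (SELL_PRICE, UNIT_COST, HOLDING_COST, STOCKOUT_COST, the discount factor gamma) is an assumption chosen only for illustration, and the function names daily_reward and td_error are hypothetical. The sketch also assumes the 5 units already on hand were paid for earlier, so only the 3 newly ordered units are charged today. The state is the inventory level, the action is the order quantity, and the reward is the daily profit.

```python
# All constants are ASSUMED values for illustration; the question specifies none.
SELL_PRICE = 10.0    # assumed revenue per unit sold
UNIT_COST = 6.0      # assumed purchase cost per unit ordered
HOLDING_COST = 1.0   # assumed cost per unit left in stock at day end
STOCKOUT_COST = 2.0  # assumed penalty per unit of unmet demand


def daily_reward(inventory, order, demand):
    """Return (reward, next_inventory) for one day.

    Positive reward: sales revenue. Negative rewards: ordering cost,
    holding cost on leftover stock, and a penalty for unmet demand.
    """
    available = inventory + order
    sold = min(available, demand)
    leftover = available - sold
    unmet = demand - sold
    reward = (SELL_PRICE * sold
              - UNIT_COST * order
              - HOLDING_COST * leftover
              - STOCKOUT_COST * unmet)
    return reward, leftover


def td_error(reward, value_s, value_next, gamma=0.9):
    """One-step TD error a critic could use: r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_s


# Part (b): inventory 5, order 3, demand 8.
reward, next_inventory = daily_reward(inventory=5, order=3, demand=8)
print(reward, next_inventory)  # 62.0 0 under the assumed prices
```

Under these assumed prices all 8 units are sold, nothing is left over, and no demand goes unmet, so the day's profit is 10 × 8 − 6 × 3 = 62. For part (c), a critic could plug that reward into td_error and move its estimate for the state "5 units in stock" by a small step in the direction of the error, for example V(s) ← V(s) + α × td_error with a learning rate α; a positive error suggests the action turned out better than the critic expected.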
