Question:
Consider a single-server system where there can be at most 2 customers in the system (including the one being served). In each hour, a new customer enters the system with probability 1/2 unless there are already 2 customers in the system. Assume that a new arrival occurs at the end of each hour. At the beginning of each hour, the server can choose a configuration if there is a customer in the system. If the configuration is fast, then with probability 0.8 one customer is served and leaves the system during that hour. If the configuration is slow, this probability decreases to 0.6. A revenue of 50 TL is obtained for each customer whose service is completed. The costs of the slow and fast configurations are 3 and 9 TL per hour, respectively. The hourly discount rate is 0.95. We would like to maximize the total expected discounted profit over an infinite horizon.
a) Formulate the problem as an MDP model by clearly defining the states, decision sets, transition probabilities, and expected rewards.
b) Find the optimal policy using Policy Iteration, where the initial policy is to use the fast configuration whenever there is at least one customer in the system.
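One possible way to set up parts a) and b) is sketched below in Python. It is not an official solution, and it rests on modeling assumptions that the question leaves open: the state is the number of customers at the beginning of the hour (0, 1, or 2); a service completion is resolved before the end-of-hour arrival, so an arrival is blocked only if 2 customers remain after service; the expected hourly reward is 50 times the service probability minus the configuration cost; and 0.95 is used as the discount factor. The names `transition`, `evaluate`, `improve`, `policy_iteration`, and the `"idle"` action for the empty system are illustrative choices, not part of the question.

```python
GAMMA = 0.95                          # 0.95 treated as the discount factor (assumption)
ARRIVAL = 0.5                         # arrival probability at the end of an hour
REVENUE = 50.0                        # TL per completed service
SERVICE = {"slow": 0.6, "fast": 0.8}  # service completion probabilities
COST = {"slow": 3.0, "fast": 9.0}     # TL per hour
STATES = [0, 1, 2]                    # customers at the beginning of the hour
ACTIONS = {0: ["idle"], 1: ["slow", "fast"], 2: ["slow", "fast"]}


def transition(s, a):
    """Return (next-state probabilities, expected one-hour reward) for (s, a)."""
    row = [0.0, 0.0, 0.0]
    if s == 0:
        row[0], row[1] = 1 - ARRIVAL, ARRIVAL
        return row, 0.0
    p = SERVICE[a]
    for served, p_serve in ((1, p), (0, 1 - p)):
        after = s - served            # customers left after the service attempt
        if after < 2:                 # room for an end-of-hour arrival
            row[after + 1] += p_serve * ARRIVAL
            row[after] += p_serve * (1 - ARRIVAL)
        else:                         # system still full, arrival is blocked
            row[after] += p_serve
    return row, REVENUE * p - COST[a]


def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [A[i][:] + [b[i]] for i in range(n)]
    for c in range(n):
        piv = max(range(c, n), key=lambda i: abs(M[i][c]))
        M[c], M[piv] = M[piv], M[c]
        for i in range(n):
            if i != c:
                f = M[i][c] / M[c][c]
                M[i] = [x - f * y for x, y in zip(M[i], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]


def q_value(s, a, v):
    """One-step lookahead value of taking action a in state s."""
    row, r = transition(s, a)
    return r + GAMMA * sum(p * vj for p, vj in zip(row, v))


def evaluate(policy):
    """Policy evaluation: solve (I - GAMMA * P) v = r for a fixed policy."""
    rows, rewards = zip(*(transition(s, policy[s]) for s in STATES))
    A = [[(1.0 if i == j else 0.0) - GAMMA * rows[i][j] for j in STATES]
         for i in STATES]
    return solve(A, list(rewards))


def improve(policy, v):
    """Policy improvement: switch only when another action is strictly better."""
    new = {}
    for s in STATES:
        best = max(ACTIONS[s], key=lambda a: q_value(s, a, v))
        keep = q_value(s, policy[s], v) >= q_value(s, best, v) - 1e-9
        new[s] = policy[s] if keep else best
    return new


def policy_iteration(policy):
    """Alternate evaluation and improvement until the policy stops changing."""
    while True:
        v = evaluate(policy)
        new = improve(policy, v)
        if new == policy:
            return policy, v
        policy = new


# Initial policy from part b): fast whenever at least one customer is present.
policy, values = policy_iteration({0: "idle", 1: "fast", 2: "fast"})
print(policy, [round(x, 2) for x in values])
```

Under these assumptions the iteration moves away from the all-fast starting policy and settles on the slow configuration with one customer and the fast configuration with two. A different reading of the arrival rule (for example, blocking arrivals whenever the hour starts with 2 customers) changes the transition probabilities out of state 2 and may change the result.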
