Question 1. You are designing software to control the operation of a service machine. In each time period, a new job may arrive with probability 0.25. If the machine is processing another job, the new job is queued, but no more than 10 jobs can fit in the queue; job requests that cannot be queued are lost. Only one new job can arrive in each time period. The processing speed can be set (for each period) to the normal level or to the high-speed level. The probability that a job will be completed in one unit of time is 0.25 at the normal level of service and 0.75 at the high-speed level. Normal service costs 1 unit per time interval in energy and amortization, while high-speed service costs 3.6 units. Each waiting job costs 0.3 per time interval. If a job is lost, a penalty of 20 units is assessed.

(Q1.a) Formulate an infinite-horizon average-cost Markov decision problem to determine the optimal speed of service at each time so that the average cost per period is minimal.
(Q1.b) Determine an optimal policy using the policy iteration method.
(Q1.c) Solve the problem using linear programming.
(Q1.d) For what discount factor is the discounted infinite-horizon problem equivalent to the average reward problem in this context?

Question 2. Each quarter, the marketing manager of a retail store divides the customers into two groups based on their purchase behavior in the previous quarter. The groups are denoted by L and H. The manager wishes to determine to which group of customers a catalog should be sent. The cost of sending a catalog is $15 per customer. If a customer from group L receives a catalog, then the expected purchase in the current quarter is $20; otherwise it is $10. If a customer from group H receives a catalog, then the expected purchase in the current quarter is $50; otherwise it is $25.

Furthermore, if a customer from group L receives a catalog, then the probability that the customer will stay in group L for the next quarter is 0.3; otherwise it is 0.5. If a customer from group H receives a catalog, then the probability that the customer will stay in group H for the next quarter is 0.8; otherwise it is 0.4.

(Q2.a) Formulate an average reward problem to help the manager.
(Q2.b) Determine an optimal policy using the policy iteration method.
(Q2.c) Solve the problem using linear programming.
(Q2.d) Formulate the dual problem to the linear programming problem in (Q2.c). What is the optimal solution of the dual problem and what is its meaning?
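As an illustration of how (Q2.a) and (Q2.b) might be set up, here is a minimal Python sketch of average-reward policy iteration for the two-group catalog problem. It is not an official solution. It rests on several assumptions that the question does not state explicitly: rewards are counted per customer per quarter as expected purchase minus the $15 catalog cost; a customer who does not stay in a group moves to the other group; and the relative values are normalized by h(L) = 0. The names `REWARD`, `P`, `evaluate`, `improve`, and `policy_iteration` are hypothetical helpers introduced only for this example.

```python
from fractions import Fraction as F

# States: 0 = L, 1 = H. Actions: 0 = no catalog, 1 = send catalog.
# Assumption: reward = expected purchase minus the $15 catalog cost when sent.
REWARD = {
    (0, 0): F(10), (0, 1): F(20 - 15),
    (1, 0): F(25), (1, 1): F(50 - 15),
}

# P[(state, action)] = (prob. next state is L, prob. next state is H).
# Assumption: a customer who leaves one group moves to the other.
P = {
    (0, 0): (F(5, 10), F(5, 10)),
    (0, 1): (F(3, 10), F(7, 10)),
    (1, 0): (F(6, 10), F(4, 10)),
    (1, 1): (F(2, 10), F(8, 10)),
}


def evaluate(policy):
    """Solve g + h(s) = r(s) + sum_j p(j|s) h(j) with h(L) = 0."""
    r_l = REWARD[(0, policy[0])]
    r_h = REWARD[(1, policy[1])]
    p_lh = P[(0, policy[0])][1]  # L -> H under the policy
    p_hh = P[(1, policy[1])][1]  # H -> H under the policy
    # State L: g = r_l + p_lh * h_h
    # State H: g = r_h - (1 - p_hh) * h_h
    h_h = (r_h - r_l) / (1 - p_hh + p_lh)
    g = r_l + p_lh * h_h
    return g, (F(0), h_h)


def improve(h, policy):
    """Pick, in each state, the action maximizing r + sum_j p(j|s,a) h(j)."""
    new_policy = []
    for s in (0, 1):
        def q(a):
            return REWARD[(s, a)] + sum(p * hj for p, hj in zip(P[(s, a)], h))
        best = max((0, 1), key=q)
        # Keep the current action on ties so the iteration cannot cycle.
        if q(policy[s]) == q(best):
            best = policy[s]
        new_policy.append(best)
    return tuple(new_policy)


def policy_iteration(policy=(0, 0)):
    """Alternate evaluation and improvement until the policy is stable."""
    while True:
        g, h = evaluate(policy)
        new_policy = improve(h, policy)
        if new_policy == policy:
            return policy, g, h
        policy = new_policy


policy, gain, bias = policy_iteration()
print("policy (L, H):", policy, "gain:", gain, "relative values:", bias)
```

Under the assumptions above, starting from "send to nobody" the sketch stops at the policy that sends a catalog to both groups, with an average reward of 85/3 (about $28.33) per customer per quarter. Exact fractions are used instead of floating point so that the tie check in the improvement step is reliable.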
