Question: Markov Decision Process
There are two locations, A and B. Shipments of inventory are sent from A to B. The per-kilogram shipping cost is discounted if a shipment reaches a certain weight, but there is a holding cost per kilogram per day for orders that wait. Look for a policy that combines orders to minimize total shipping and holding costs. Px is the probability that the total weight of orders arriving in a day equals x. Each evening you can either keep accumulating orders or ship them. The discount parameter W is the minimum weight for a discount: shipments that weigh less than W cost cn per kg, and shipments that weigh W or more cost cd per kg, where cd is less than cn. Assume the cost of delaying the shipment of an order is E per kg per day.
Formulate the Markov Decision Process (determine the decision epochs, state space, action set, transition probabilities, and reward function).
Determine an optimality equation for a finite horizon, that is, a finite number of days. State your assumptions.
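One possible way to make the formulation concrete is sketched below. It is not a verified solution, and it rests on assumptions that are not in the question: weights are whole kilograms, the state s is the weight waiting at A each evening after that day's arrivals, the only actions are "ship everything" or "hold everything" (no partial shipments), holding s kg overnight costs E*s, and on the last evening T everything left must be shipped. Under those assumptions the optimality equation for t < T would be V_t(s) = min{ c(s) + sum_x Px * V_{t+1}(x), E*s + sum_x Px * V_{t+1}(s + x) }, with V_T(s) = c(s), where c(s) = cn*s if s < W and cd*s otherwise. The names ship_cost and solve_finite_horizon are hypothetical.

```python
def ship_cost(weight, W, cn, cd):
    """Cost of shipping `weight` kg; the discounted rate cd applies once weight >= W."""
    rate = cd if weight >= W else cn
    return rate * weight


def solve_finite_horizon(P, W, cn, cd, E, T):
    """Backward induction over evenings t = 1..T (a sketch under the stated assumptions).

    P[x] is the probability that x kg of orders arrive in a day.
    Returns (V, policy): V[t][s] is the minimal expected cost-to-go from
    evening t with s kg waiting, policy[t][s] is "ship" or "hold".
    Index 0 of both tables is unused so that t matches the day number.
    """
    x_max = len(P) - 1
    s_max = T * x_max  # most weight that can accumulate over T days
    V = [[0.0] * (s_max + 1) for _ in range(T + 1)]
    policy = [[None] * (s_max + 1) for _ in range(T + 1)]

    # Terminal evening (assumption): whatever is left must be shipped.
    for s in range(s_max + 1):
        V[T][s] = ship_cost(s, W, cn, cd)
        policy[T][s] = "ship"

    for t in range(T - 1, 0, -1):
        # After shipping, tomorrow evening's state is just tomorrow's arrivals.
        after_ship = sum(p * V[t + 1][x] for x, p in enumerate(P))
        # Only states reachable by evening t (at most t * x_max kg) are filled in.
        for s in range(t * x_max + 1):
            ship = ship_cost(s, W, cn, cd) + after_ship
            # Holding pays E per kg tonight; tomorrow's arrivals are added to s.
            hold = E * s + sum(p * V[t + 1][s + x] for x, p in enumerate(P))
            if ship <= hold:  # ties are broken in favor of shipping
                V[t][s], policy[t][s] = ship, "ship"
            else:
                V[t][s], policy[t][s] = hold, "hold"
    return V, policy


if __name__ == "__main__":
    # Made-up illustrative numbers: 0 or 1 kg arrives each day with equal probability.
    V, policy = solve_finite_horizon(P=[0.5, 0.5], W=2, cn=10, cd=6, E=1, T=3)
    for t in range(1, 4):
        print(t, policy[t])
```

With these made-up numbers, holding a single kilogram on the first two evenings is cheaper than shipping it at the full rate, because there is a fair chance of reaching W the next day. Allowing partial shipments or a storage limit at A would change the action set and the transition probabilities, so those would need to be stated as separate assumptions.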
