Adapt the vacuum world for reinforcement learning by including rewards for picking up each piece of dirt

Question:

Adapt the vacuum world for reinforcement learning by including rewards for picking up each piece of dirt and for getting home and switching off. Make the world accessible by providing suitable percepts. Now experiment with different reinforcement learning agents. Is function approximation necessary for success? What sort of approximator works for this application?
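
As a starting point for the experiments, here is a minimal sketch of one way the adapted world might look, paired with a tabular Q-learning agent. Everything in it is an assumption made for illustration and is not specified by the exercise: the 2x2 grid, the `VacuumWorld` class and its method names, the reward values (+10 per piece of dirt, +20 for switching off at home, -1 per action), and the learning parameters. The world is made accessible by returning the full state (agent location plus the set of dirty squares) as the percept.

```python
import random
from collections import defaultdict

ACTIONS = ["Up", "Down", "Left", "Right", "Suck", "Off"]
MOVES = {"Up": (-1, 0), "Down": (1, 0), "Left": (0, -1), "Right": (0, 1)}


class VacuumWorld:
    """Hypothetical fully observable vacuum world on a rows x cols grid."""

    def __init__(self, rows=2, cols=2, home=(0, 0)):
        self.rows, self.cols, self.home = rows, cols, home
        self.reset()

    def reset(self):
        # Start at home with every square dirty (an assumed initial condition).
        self.pos = self.home
        self.dirt = frozenset(
            (r, c) for r in range(self.rows) for c in range(self.cols)
        )
        return self.percept()

    def percept(self):
        # Accessible world: the percept is the full state.
        return (self.pos, self.dirt)

    def step(self, action):
        """Apply an action and return (percept, reward, done)."""
        reward, done = -1.0, False  # assumed cost of -1 for every action
        if action in MOVES:
            r = self.pos[0] + MOVES[action][0]
            c = self.pos[1] + MOVES[action][1]
            if 0 <= r < self.rows and 0 <= c < self.cols:
                self.pos = (r, c)  # moves into a wall leave the agent in place
        elif action == "Suck" and self.pos in self.dirt:
            self.dirt = self.dirt - {self.pos}
            reward += 10.0  # assumed reward for each piece of dirt
        elif action == "Off":
            done = True
            if self.pos == self.home:
                reward += 20.0  # assumed reward for switching off at home
        return self.percept(), reward, done


def q_learning(env, episodes=10000, alpha=0.5, gamma=0.95,
               epsilon=0.2, max_steps=50, seed=0):
    """Tabular Q-learning with an epsilon-greedy behaviour policy."""
    rng = random.Random(seed)
    q = defaultdict(float)  # maps (percept, action) to an estimated value
    for _ in range(episodes):
        s = env.reset()
        for _ in range(max_steps):
            if rng.random() < epsilon:
                a = rng.choice(ACTIONS)
            else:
                a = max(ACTIONS, key=lambda x: q[(s, x)])
            s2, r, done = env.step(a)
            best_next = 0.0 if done else max(q[(s2, x)] for x in ACTIONS)
            q[(s, a)] += alpha * (r + gamma * best_next - q[(s, a)])
            s = s2
            if done:
                break
    return q


def greedy_rollout(env, q, max_steps=50):
    """Follow the greedy policy from the start; return (actions, total reward)."""
    s, total, trace = env.reset(), 0.0, []
    for _ in range(max_steps):
        a = max(ACTIONS, key=lambda x: q[(s, x)])
        s, r, done = env.step(a)
        total += r
        trace.append(a)
        if done:
            break
    return trace, total
```

Calling `q_learning(VacuumWorld())` and then `greedy_rollout` on a fresh world shows what the learned policy does. On a grid this small, a lookup table is enough to represent the Q-function. The number of distinct states, however, is the number of squares times 2 raised to the number of squares (4 x 16 = 64 here, but 25 x 2^25 on a 5x5 grid), so enlarging the grid is a natural way to explore when a table stops being practical and which features (for example, distance to the nearest dirt or to home) a function approximator might use.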