The taxicab problem is a well-known domain in reinforcement learning. Simply put, a taxi driver must pick up a passenger and deliver them to their destination in as few steps as possible. Typically, the taxi is also constrained by a limit on the amount of fuel it can carry.
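To make the task concrete, here is a minimal sketch of the state and a single step of the problem in Python. It is an illustration only, not a reference implementation: the names (TaxiState, step, MOVES), the action set, the default grid size, the rule that one move costs one unit of fuel, and the absence of interior walls are all assumptions made for this example.

```python
from dataclasses import dataclass

# Hypothetical action set for this sketch: four moves plus "pickup" and "dropoff".
MOVES = {"north": (-1, 0), "south": (1, 0), "east": (0, 1), "west": (0, -1)}


@dataclass
class TaxiState:
    taxi: tuple          # (row, col) of the taxi
    passenger: tuple     # (row, col) where the passenger waits
    destination: tuple   # (row, col) of the drop-off cell
    fuel: int            # remaining fuel; assumed cost of one unit per move
    in_taxi: bool = False
    delivered: bool = False
    steps: int = 0       # number of actions taken so far


def step(state: TaxiState, action: str, size: int = 5) -> TaxiState:
    """Apply one action in place and return the state.

    Every action counts as one step. Moves are clipped at the grid
    boundary; interior walls are not modeled in this sketch.
    """
    if state.delivered:
        return state  # episode is over, nothing changes
    state.steps += 1
    if action in MOVES:
        if state.fuel == 0:
            return state  # out of fuel: the taxi cannot move
        d_row, d_col = MOVES[action]
        row = min(max(state.taxi[0] + d_row, 0), size - 1)
        col = min(max(state.taxi[1] + d_col, 0), size - 1)
        state.taxi = (row, col)
        state.fuel -= 1  # fuel is spent even if the move was clipped
    elif action == "pickup":
        if not state.in_taxi and state.taxi == state.passenger:
            state.in_taxi = True
    elif action == "dropoff":
        if state.in_taxi and state.taxi == state.destination:
            state.in_taxi = False
            state.delivered = True
    else:
        raise ValueError(f"unknown action: {action}")
    return state


if __name__ == "__main__":
    s = TaxiState(taxi=(0, 0), passenger=(0, 1), destination=(1, 1), fuel=10)
    for a in ["east", "pickup", "south", "dropoff"]:
        step(s, a)
    print(s.delivered, s.steps, s.fuel)
```

In this example the taxi delivers the passenger in four steps and spends two units of fuel. A learning agent would try to find, for any start and destination, an action sequence that keeps the step count low without running out of fuel.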

The canonical taxicab problem is a 5x5 gridworld. Four of its cells serve as the possible starting locations and destinations for the passenger. There is a...