This agent demonstrates how to solve the water jug problem using reinforcement learning. It is a modification of the Simple Water Jug Agent. Two main changes are made: templates are used to generate the full space of possible moves the agent can perform, and rewards are set based on the problem solution.
This agent simulates an omniscient taxi driver that uses reinforcement learning and hierarchical task decomposition to improve its performance over runs.
This agent combines operator subgoaling (means-ends analysis) with reinforcement learning. All search-control knowledge (operator evaluation rules) is removed from blocks-world-operator-subgoaling; in its place are RL rules, supplemented with rules that compute reward in both the top state and the substate. Implemented for four blocks.
This agent contains a version of blocks-world that uses one level of problem spaces and look-ahead, with two important extensions. It demonstrates how RL rules can be learned by chunking and then updated on later runs. The advantage over simple look-ahead is that, after chunking, the agent does not lock onto the single path found during look-ahead; it still does some exploration.
An agent that performs graph search using the selection-astar default rules. This approach can be modified to use any of the different selection approaches.
The basic idea behind this agent: there is a mission to go from place to place (in the mission structure) using go-to-location; for each go-to-location, an iterative form of A* search finds the minimal path between places; the agent then iteratively selects go-to-waypoint to move through the graph. Key data structures (initialized...
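The A*-over-a-waypoint-graph step can be sketched in plain Python. This is not the agent's iterative Soar implementation; the graph, place names, and coordinates below are made up for illustration, and the heuristic is straight-line distance.

```python
import heapq

def astar(graph, coords, start, goal):
    """Minimal A* over a waypoint graph. graph maps node -> [(neighbor, cost)];
    coords maps node -> (x, y) and feeds a straight-line-distance heuristic."""
    def h(n):
        (x1, y1), (x2, y2) = coords[n], coords[goal]
        return ((x1 - x2) ** 2 + (y1 - y2) ** 2) ** 0.5

    frontier = [(h(start), 0.0, start, [start])]  # (f, g, node, path)
    best = {start: 0.0}
    while frontier:
        _, g, node, path = heapq.heappop(frontier)
        if node == goal:
            return path, g
        for nbr, cost in graph[node]:
            g2 = g + cost
            if g2 < best.get(nbr, float("inf")):
                best[nbr] = g2
                heapq.heappush(frontier, (g2 + h(nbr), g2, nbr, path + [nbr]))
    return None, float("inf")

# Hypothetical waypoint graph (names and coordinates are made up).
coords = {"A": (0, 0), "B": (1, 0), "C": (2, 0), "D": (1, 1)}
graph = {
    "A": [("B", 1.0), ("D", 1.5)],
    "B": [("A", 1.0), ("C", 1.0)],
    "C": [("B", 1.0), ("D", 1.5)],
    "D": [("A", 1.5), ("C", 1.5)],
}
```

Calling `astar(graph, coords, "A", "C")` returns the minimal-cost waypoint sequence, which the agent would then walk with go-to-waypoint selections.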
Various agents for learning and playing Infinite Mario domain from RLCompetition 2009. They encode different state representations and learning strategies.
Soar capabilities: Reinforcement learning
Downloads: This agent is packaged with the Infinite Mario environment download.
An agent that tests four RL-rule sequence cases to behaviorally test Soar's reinforcement-learning update mechanism. The learning rate and discount rate are set to known values; other parameters can be tweaked. Some of the expected behavior is documented in the included README file.
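Checking updates against known parameter values amounts to verifying the one-step Q-learning rule by hand. The sketch below shows that arithmetic in Python; the alpha and gamma defaults are illustrative stand-ins, not the values from the agent's README.

```python
def q_update(q, reward, q_next, alpha=0.3, gamma=0.9):
    """One-step Q-learning update:
        Q <- Q + alpha * (reward + gamma * Q_next - Q)
    where Q_next is the value of the best successor. With alpha and gamma
    fixed to known values, each update's result can be predicted exactly,
    which is the kind of check the test agent performs behaviorally."""
    return q + alpha * (reward + gamma * q_next - q)
```

For example, a first update from Q = 0 with reward 1 and no successor value yields exactly alpha (0.3 here), and a second identical update yields 0.3 + 0.3 * (1 - 0.3) = 0.51.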