pomdp_py.problems.load_unload package
Load/Unload
The problem was originally introduced in Solving POMDPs by Searching the Space of Finite Policies.
The original paper describes the problem as follows:
The load/unload problem with 8 locations: the agent starts in the “Unload” location (U) and receives a reward each time it returns to this place after passing through the “Load” location (L). The problem is partially observable because the agent cannot distinguish the different locations in between Load and Unload, and because it cannot perceive if it is loaded or not (\(|S| = 14\), \(|O| = 3\) and \(|A| = 2\)).
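The quoted sizes can be checked with a short enumeration. This is only a sketch under stated assumptions: it assumes the Unload and Load locations sit at the two ends of the 8 locations, that the agent is never loaded at Unload and always loaded at Load (which removes 2 of the 16 location/load combinations), and the observation labels are hypothetical names, not taken from the paper or the library.

```python
from itertools import product

N_LOCATIONS = 8                       # from the quoted description
UNLOAD, LOAD = 0, N_LOCATIONS - 1     # assumed: Unload and Load at opposite ends

# Assumption: the agent is never loaded at Unload and always loaded at Load.
states = [
    (x, loaded)
    for x, loaded in product(range(N_LOCATIONS), (False, True))
    if not (x == UNLOAD and loaded) and not (x == LOAD and not loaded)
]
observations = ["unload", "load", "middle"]   # hypothetical labels
actions = ["move-left", "move-right"]

print(len(states), len(observations), len(actions))  # 14 3 2
```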
[Figure from the paper]
Run the example with:
python -m pomdp_py -r load_unload
Submodules
pomdp_py.problems.load_unload.load_unload module
The load unload problem. An agent is placed in a one-dimensional grid world and is tasked with loading itself up on the right side of the world and unloading on the left. The agent can observe whether it is in the load or unload block, but cannot tell its exact location or whether it is loaded or unloaded. Therefore, the agent must maintain a belief about its location and load status.
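To illustrate what maintaining a belief means here, the following is a minimal sketch of a discrete Bayes filter over (location, loaded) pairs. It does not use the pomdp_py classes; the world length `N`, the deterministic `transition` and `observe` helpers, and the observation labels are all assumptions made for illustration.

```python
N = 8                        # hypothetical world length
UNLOAD, LOAD = 0, N - 1      # assumed: unload on the left end, load on the right end

def transition(state, action):
    """Deterministic move (assumed); load status is set at the two ends."""
    x, loaded = state
    x = max(UNLOAD, x - 1) if action == "move-left" else min(LOAD, x + 1)
    if x == LOAD:
        loaded = True
    elif x == UNLOAD:
        loaded = False
    return (x, loaded)

def observe(state):
    """Noise-free observation (assumed): only the end blocks are recognizable."""
    x, _ = state
    return "load" if x == LOAD else "unload" if x == UNLOAD else "middle"

def update_belief(belief, action, observation):
    """Predict with the transition, keep states consistent with the observation."""
    new_belief = {}
    for state, p in belief.items():
        nxt = transition(state, action)
        if observe(nxt) == observation:
            new_belief[nxt] = new_belief.get(nxt, 0.0) + p
    total = sum(new_belief.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under this belief")
    return {s: p / total for s, p in new_belief.items()}

# Uniform belief over the six middle cells with unknown load status.
belief = {(x, l): 1.0 / 12 for x in range(1, N - 1) for l in (False, True)}
belief = update_belief(belief, "move-right", "load")
print(belief)  # {(7, True): 1.0}
```

Observing the load block after one move collapses the belief onto a single state, while a "middle" observation would leave both the location and the load status uncertain.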
States are defined by the location of the agent and whether or not it is loaded.
Actions: “move-left”, “move-right”
Rewards:
+100 for moving into the unload block while loaded
-1 otherwise
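The state, action, and reward definitions above can be sketched as a single step function. This is an illustrative sketch rather than the module's implementation: the world length `N`, the deterministic movement, and the rule that the load status is set on reaching either end are assumptions.

```python
N = 8                        # hypothetical world length
UNLOAD, LOAD = 0, N - 1      # assumed: unload on the left end, load on the right end

def step(state, action):
    """Return (next_state, reward) for one deterministic move (assumed dynamics)."""
    x, loaded = state
    if action == "move-left":
        nx = max(UNLOAD, x - 1)
    elif action == "move-right":
        nx = min(LOAD, x + 1)
    else:
        raise ValueError(f"unknown action: {action}")
    # +100 for moving into the unload block while loaded, -1 otherwise
    reward = 100 if (nx == UNLOAD and loaded) else -1
    if nx == LOAD:
        loaded = True
    elif nx == UNLOAD:
        loaded = False
    return (nx, loaded), reward

# One round trip: walk right to load, then walk left to unload.
state, total = (UNLOAD, False), 0
for action in ["move-right"] * (N - 1) + ["move-left"] * (N - 1):
    state, reward = step(state, action)
    total += reward
print(state, total)  # (0, False) 87
```

Under these assumptions a round trip costs 13 steps at -1 each before the final +100 step, for a total of 87.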
- class pomdp_py.problems.load_unload.load_unload.LUObservation(obs)
Bases: Observation
- class pomdp_py.problems.load_unload.load_unload.LUObservationModel
Bases: ObservationModel
This problem is small enough for the probabilities to be directly given externally.
- probability(self, observation, next_state, action)
Returns the probability of \(\Pr(o|s',a)\).
- Parameters:
observation (Observation) – the observation \(o\)
next_state (State) – the next state \(s'\)
action (Action) – the action \(a\)
- Returns:
the probability \(\Pr(o|s',a)\)
- Return type:
float
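As a rough illustration of what a directly specified \(\Pr(o|s',a)\) might look like for this problem, the sketch below assumes noise-free observations that depend only on the next location. The function name, the observation labels, and the world length `n` are hypothetical and are not the library's API.

```python
OBSERVATIONS = ("unload", "load", "middle")   # hypothetical labels

def observation_probability(observation, next_state, action, n=8):
    """Pr(o | s', a) for a noise-free sensor (assumed); the action is ignored."""
    x, _loaded = next_state
    expected = "unload" if x == 0 else "load" if x == n - 1 else "middle"
    return 1.0 if observation == expected else 0.0

# For any next state the probabilities over all observations sum to 1.
for s in [(0, False), (3, True), (7, True)]:
    probs = [observation_probability(o, s, "move-left") for o in OBSERVATIONS]
    print(s, probs)
```

Because the load status never enters the computation, two states that differ only in whether the agent is loaded produce identical observation probabilities, which is the source of the partial observability described earlier.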
- class pomdp_py.problems.load_unload.load_unload.LUTransitionModel
Bases: TransitionModel
This problem is small enough for the probabilities to be directly given externally.
- class pomdp_py.problems.load_unload.load_unload.LURewardModel
Bases: RewardModel
- probability(self, reward, state, action, next_state)
Returns the probability of \(\Pr(r|s,a,s')\).
- class pomdp_py.problems.load_unload.load_unload.LUPolicyModel
Bases: RandomRollout
This is an extremely simple policy model, included only to stay consistent with the framework.
- class pomdp_py.problems.load_unload.load_unload.LoadUnloadProblem(init_state, init_belief)
Bases: POMDP