pomdp_py.problems.tag.models package
Subpackages
Submodules
pomdp_py.problems.tag.models.observation_model module
- class pomdp_py.problems.tag.models.observation_model.TagObservationModel
Bases: ObservationModel
In this observation model, the robot deterministically observes the target location when it is in the same grid cell as the target. Otherwise, the robot does not observe anything.
- probability(self, observation, next_state, action)
Returns the probability \(\Pr(o|s',a)\).
- Parameters:
observation (Observation) – the observation \(o\)
next_state (State) – the next state \(s'\)
action (Action) – the action \(a\)
- Returns:
the probability \(\Pr(o|s',a)\)
- Return type:
float
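The deterministic behaviour described for TagObservationModel can be illustrated with a minimal standalone sketch. It does not use pomdp_py: the function name `observation_probability` and the reduction of a state to a pair of position tuples are assumptions made for illustration only, and the action is omitted because the description above does not make the observation depend on it.

```python
from typing import Optional, Tuple

Position = Tuple[int, int]


def observation_probability(observed_target: Optional[Position],
                            robot_position: Position,
                            target_position: Position) -> float:
    """Pr(o | s', a) for a deterministic co-location sensor (hypothetical helper).

    `observed_target` is the observed target position, or None when
    nothing is observed.
    """
    if robot_position == target_position:
        # Robot and target share a grid cell: the target location is observed.
        expected = target_position
    else:
        # Otherwise the robot observes nothing.
        expected = None
    # Deterministic model: all probability mass sits on the expected observation.
    return 1.0 if observed_target == expected else 0.0
```

Because the model is deterministic, the function only ever returns 0.0 or 1.0.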
pomdp_py.problems.tag.models.policy_model module
- class pomdp_py.problems.tag.models.policy_model.TagPolicyModel(grid_map=None)
Bases: RolloutPolicy
pomdp_py.problems.tag.models.reward_model module
- class pomdp_py.problems.tag.models.reward_model.TagRewardModel(small=1, big=10)
Bases: RewardModel
- probability(self, reward, state, action, next_state)
Returns the probability \(\Pr(r|s,a,s')\).
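For a deterministic reward, \(\Pr(r|s,a,s')\) reduces to an indicator on the single reward value that the transition yields. The sketch below illustrates this under a loudly stated assumption: this page only gives the defaults `small=1` and `big=10`, so the reward structure used here (a cost of `small` per move, `+big` for a successful tag, `-big` for a failed tag) is the conventional Tag benchmark structure, not something confirmed by this page. The helper names and the string-valued action are hypothetical.

```python
from typing import Tuple

Position = Tuple[int, int]


def tag_reward(action: str, robot_position: Position,
               target_position: Position,
               small: float = 1, big: float = 10) -> float:
    """Assumed reward structure (not confirmed by this page)."""
    if action == "tag":
        # Assumption: tagging succeeds only when robot and target share a cell.
        return big if robot_position == target_position else -big
    # Assumption: every non-tag (motion) action costs `small`.
    return -small


def reward_probability(reward: float, action: str,
                       robot_position: Position, target_position: Position,
                       small: float = 1, big: float = 10) -> float:
    """Pr(r | s, a, s') for a deterministic reward: an indicator function."""
    expected = tag_reward(action, robot_position, target_position, small, big)
    return 1.0 if reward == expected else 0.0
```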
pomdp_py.problems.tag.models.transition_model module
The Tag problem. Implemented according to the paper Anytime Point-Based Approximations for Large POMDPs.
- Transition model: the robot moves deterministically. The target’s movement depends on the robot: with Pr=0.8 the target moves away from the robot, and with Pr=0.2 the target stays in the same place. The target never moves closer to the robot.
- class pomdp_py.problems.tag.models.transition_model.TagTransitionModel(grid_map, target_motion_policy)
Bases: TransitionModel
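The target motion described in the module summary (move away with Pr=0.8, stay with Pr=0.2, never move closer) can be sketched as a distribution over the target's next position. This is a standalone illustration, not the pomdp_py implementation. Everything beyond the two probabilities is an assumption: the obstacle-free rectangular grid, the four-connected moves, the use of Manhattan distance to decide what "away" means, the uniform split of the 0.8 mass among the away-moves, and the choice to stay with probability 1 when no away-move exists. The function name `target_next_distribution` is hypothetical.

```python
from typing import Dict, Tuple

Position = Tuple[int, int]

# Assumption: four-connected motion on an obstacle-free grid.
MOVES = [(1, 0), (-1, 0), (0, 1), (0, -1)]


def manhattan(a: Position, b: Position) -> int:
    return abs(a[0] - b[0]) + abs(a[1] - b[1])


def target_next_distribution(robot: Position, target: Position,
                             width: int, height: int,
                             pr_move_away: float = 0.8) -> Dict[Position, float]:
    """Distribution over the target's next position (hypothetical helper)."""
    current = manhattan(robot, target)
    away = []
    for dx, dy in MOVES:
        nxt = (target[0] + dx, target[1] + dy)
        in_bounds = 0 <= nxt[0] < width and 0 <= nxt[1] < height
        # Keep only moves that strictly increase the distance to the robot.
        if in_bounds and manhattan(robot, nxt) > current:
            away.append(nxt)
    if not away:
        # Assumption: a cornered target stays put with probability 1.
        return {target: 1.0}
    # Assumption: the move-away mass is split uniformly among away-moves.
    dist = {pos: pr_move_away / len(away) for pos in away}
    dist[target] = 1.0 - pr_move_away
    return dist
```

Positions that would bring the target closer to the robot never appear in the returned dictionary, which matches the statement that the target never moves closer.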