pomdp_py.problems.multi_object_search.models package


Submodules

pomdp_py.problems.multi_object_search.models.observation_model module

Defines the ObservationModel for the 2D Multi-Object Search domain.

Origin: Multi-Object Search using Object-Oriented POMDPs (ICRA 2019) (extensions: action space changes, different sensor model, gridworld instead of topological graph)

Observation: {objid: pose(x,y) or NULL}. The sensor model can vary; it could be the fan-shaped model of the original paper, or something else. Either way, the resulting observation should be a map from object id to the observed pose, or NULL if the object is not observed.

Observation Model

The agent can observe its own state, as well as object poses that are within its sensor range. We only need to model object observation.
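For concreteness, here is a minimal sketch of the {objid: pose(x,y) or NULL} mapping described above, written in plain Python rather than with the package's own Observation classes (representing NULL by None is an illustrative choice):

   # Sketch only: plain Python, not the package's Observation classes.
   NULL = None  # stands in for the "not observed" value

   observation = {
       3: (2, 5),   # object 3 observed at grid cell (2, 5)
       7: NULL,     # object 7 not observed
   }

   for objid, pose in observation.items():
       print(objid, "->", "not observed" if pose is NULL else pose)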

class pomdp_py.problems.multi_object_search.models.observation_model.MosObservationModel(dim, sensor, object_ids, sigma=0.01, epsilon=1)[source]

Bases: OOObservationModel

Object-oriented observation model

sample(self, next_state, action, argmax=False, **kwargs)[source]

Returns random observation

class pomdp_py.problems.multi_object_search.models.observation_model.ObjectObservationModel(objid, sensor, dim, sigma=0, epsilon=1)[source]

Bases: ObservationModel

probability(observation, next_state, action, **kwargs)[source]

Returns the probability \(\Pr(o|s',a)\).

Parameters:
  • observation (Observation) – the observation \(o\)

  • next_state (State) – the next state \(s'\)

  • action (Action) – the action \(a\)

Returns:

the probability \(\Pr(o|s',a)\)

Return type:

float
sample(next_state, action, **kwargs)[source]

Returns observation

argmax(self, next_state, action)[source]

Returns the most likely observation
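To make the roles of sigma and epsilon concrete, here is a hedged, self-contained sketch of sampling one object's observation. The package's actual event model may be more refined (e.g., distinguishing true positives, false positives, and null readings), so the rule below is an assumption for illustration only:

   import random

   def sample_object_observation(true_pose, in_range, sigma=0.01, epsilon=1.0):
       # Assumed rule, not the package's exact model: an in-range object
       # is detected with probability epsilon, with Gaussian noise of
       # standard deviation sigma on each coordinate; otherwise NULL (None).
       if not in_range or random.random() > epsilon:
           return None
       x, y = true_pose
       return (round(random.gauss(x, sigma)), round(random.gauss(y, sigma)))

   print(sample_object_observation((4, 2), in_range=True, sigma=0.5, epsilon=0.9))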

pomdp_py.problems.multi_object_search.models.observation_model.unittest()[source]

pomdp_py.problems.multi_object_search.models.policy_model module

Policy model for the 2D Multi-Object Search domain. The agent may optionally be equipped with an occupancy grid map of the environment.

class pomdp_py.problems.multi_object_search.models.policy_model.PolicyModel(robot_id, grid_map=None)[source]

Bases: RolloutPolicy

Simple policy model. All actions are possible at any state.

sample(self, state)[source]

Returns action randomly sampled according to the distribution of this policy model.

Parameters:

state (State) – the state \(s\)

Returns:

the action \(a\)

Return type:

Action

probability(self, action, state)[source]

Returns the probability of \(\pi(a|s)\).

Parameters:
  • action (Action) – the action \(a\)

  • state (State) – the state \(s\)

Returns:

the probability \(\pi(a|s)\)

Return type:

float

argmax(state, **kwargs)[source]

Returns the most likely action

get_all_actions(state=None, history=None)[source]

Returns all valid actions. Note: Find can only happen after Look (see the sketch after this class entry).

rollout(self, State state, tuple history=None)[source]

Samples an action for rollout, given the state and, optionally, the history.
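Here is a hedged sketch of the Find-after-Look rule noted in get_all_actions above. The action names and the (action, observation) history layout are illustrative assumptions, not the package's Action classes:

   # Sketch only: action names and history layout are illustrative.
   MOTIONS = ["move-north", "move-south", "move-east", "move-west"]

   def get_all_actions(history=None):
       actions = MOTIONS + ["look"]
       # Offer Find only if the most recent action in the
       # (action, observation) history was a Look.
       if history and history[-1][0] == "look":
           actions.append("find")
       return actions

   print(get_all_actions())                    # no "find"
   print(get_all_actions([("look", "o1")]))    # includes "find"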

pomdp_py.problems.multi_object_search.models.reward_model module

Reward model for 2D Multi-object Search domain

class pomdp_py.problems.multi_object_search.models.reward_model.MosRewardModel(target_objects, big=1000, small=1, robot_id=None)[source]

Bases: RewardModel

probability(self, reward, state, action, next_state)[source]

Returns the probability of \(\Pr(r|s,a,s')\).

Parameters:
  • reward (float) – the reward \(r\)

  • state (State) – the state \(s\)

  • action (Action) – the action \(a\)

  • next_state (State) – the next state \(s'\)

Returns:

the probability \(\Pr(r|s,a,s')\)

Return type:

float

sample(self, state, action, next_state)[source]

Returns a reward randomly sampled according to the distribution of this reward model. This method is required; every reward model is assumed to implement it.

Parameters:
  • state (State) – the state \(s\)

  • action (Action) – the action \(a\)

  • next_state (State) – the next state \(s'\)

Returns:

the reward \(r\)

Return type:

float

argmax(state, action, next_state, normalized=False, robot_id=None)[source]

Returns the most likely reward

class pomdp_py.problems.multi_object_search.models.reward_model.GoalRewardModel(target_objects, big=1000, small=1, robot_id=None)[source]

Bases: MosRewardModel

This is a reward model in which the agent receives reward only for detection-related actions.
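A hedged sketch of such a sparse goal reward follows. The constants mirror the big/small parameters above, but the exact conditions are assumptions, not the package's implementation:

   def goal_reward(action, newly_found, wrong_declaration, big=1000, small=1):
       # Sketch only: detect-related actions earn the reward;
       # every step pays a small cost.
       if action == "find":
           if wrong_declaration:
               return -big                       # penalize a bad declaration
           return big * newly_found - small      # reward newly found targets
       return -small                             # motion / look: step cost only

   print(goal_reward("move-north", 0, False))    # -1
   print(goal_reward("find", 2, False))          # 1999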

pomdp_py.problems.multi_object_search.models.transition_model module

Defines the TransitionModel for the 2D Multi-Object Search domain.

Origin: Multi-Object Search using Object-Oriented POMDPs (ICRA 2019) (extensions: action space changes, different sensor model, gridworld instead of topological graph)

Description: Multi-Object Search in a 2D grid world.

Transition: deterministic

class pomdp_py.problems.multi_object_search.models.transition_model.MosTransitionModel(dim, sensors, object_ids, epsilon=1e-09)[source]

Bases: OOTransitionModel

Object-oriented transition model. This transition model supports the multi-robot case, where each robot is equipped with a sensor. The multi-robot transition model should be used by the Environment, but not necessarily by each robot for planning. A sketch of the object-oriented factoring follows this class entry.

sample(self, state, action, argmax=False, **kwargs)[source]

Returns random next_state

argmax(self, state, action, **kwargs)[source]

Returns the most likely next state
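As referenced above, here is a hedged sketch of the object-oriented factoring this class inherits from OOTransitionModel: each object (and each robot) transitions under its own model, and the joint next state is their combination. The dictionary layout and the callables below are illustrative assumptions:

   # Sketch only: the dict layout stands in for an object-oriented state.
   def sample_joint_next_state(state, action, transition_models):
       # Apply each object's own transition model independently.
       return {objid: model(state[objid], action)
               for objid, model in transition_models.items()}

   state = {"robot-1": (0, 0), 10: (3, 4)}
   models = {
       "robot-1": lambda pose, a: (pose[0] + 1, pose[1]) if a == "east" else pose,
       10: lambda pose, a: pose,  # static object: never moves
   }
   print(sample_joint_next_state(state, "east", models))  # robot moves; object stays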

class pomdp_py.problems.multi_object_search.models.transition_model.StaticObjectTransitionModel(objid, epsilon=1e-09)[source]

Bases: TransitionModel

This model assumes the object is static (a sketch of this behavior follows the class entry).

probability(self, next_state, state, action)[source]

Returns the probability of \(\Pr(s'|s,a)\).

Parameters:
  • state (State) – the state \(s\)

  • next_state (State) – the next state \(s'\)

  • action (Action) – the action \(a\)

Returns:

the probability \(\Pr(s'|s,a)\)

Return type:

float

sample(state, action)[source]

Returns next_object_state

argmax(state, action)[source]

Returns the most likely next object_state
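A hedged sketch of the behavior documented for StaticObjectTransitionModel above; epsilon plays the same tiny-residual role as in the constructor signature, but the code is an illustration, not the package's implementation:

   import copy

   def static_sample(object_state, action):
       # Sketch only: a static object keeps its state regardless of the action.
       return copy.deepcopy(object_state)

   def static_probability(next_object_state, object_state, epsilon=1e-09):
       # Nearly all probability mass stays on the unchanged state.
       return 1.0 - epsilon if next_object_state == object_state else epsilon

   print(static_sample((3, 4), "move-north"))  # (3, 4)
   print(static_probability((3, 4), (3, 4)))   # ~1.0
   print(static_probability((3, 5), (3, 4)))   # ~0.0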

class pomdp_py.problems.multi_object_search.models.transition_model.RobotTransitionModel(sensor, dim, epsilon=1e-09)[source]

Bases: TransitionModel

We assume that the robot control is perfect and transitions are deterministic.

classmethod if_move_by(robot_id, state, action, dim, check_collision=True)[source]

Defines the dynamics of robot motion. dim (tuple): the (width, length) of the search world. A sketch of this motion rule follows the class entry.

probability(self, next_state, state, action)[source]

Returns the probability of \(\Pr(s'|s,a)\).

Parameters:
  • state (State) – the state \(s\)

  • next_state (State) – the next state \(s'\)

  • action (Action) – the action \(a\)

Returns:

the probability \(\Pr(s'|s,a)\)

Return type:

float

argmax(state, action)[source]

Returns the most likely next robot_state

sample(state, action)[source]

Returns next_robot_state
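As referenced above, here is a hedged sketch of the deterministic motion that if_move_by defines. The (dx, dy) motion encoding and the stay-in-place rule on invalid moves are assumptions for illustration, and collision checking is omitted:

   def if_move_by(robot_pose, motion, dim):
       # Sketch only: apply a (dx, dy) motion inside a dim = (width, length)
       # grid; an out-of-bounds move leaves the robot where it is.
       (x, y), (dx, dy) = robot_pose, motion
       width, length = dim
       nx, ny = x + dx, y + dy
       if 0 <= nx < width and 0 <= ny < length:
           return (nx, ny)
       return robot_pose

   print(if_move_by((0, 0), (0, -1), dim=(5, 5)))  # (0, 0): blocked at the edge
   print(if_move_by((0, 0), (1, 0), dim=(5, 5)))   # (1, 0)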

pomdp_py.problems.multi_object_search.models.transition_model.valid_pose(pose, width, length, state=None, check_collision=True, pose_objid=None)[source]

Returns True if the given pose (x,y) is a valid pose. If check_collision is True, the pose is valid only if it does not overlap with any object pose in the environment state.

pomdp_py.problems.multi_object_search.models.transition_model.in_boundary(pose, width, length)[source]

Returns True if the given pose (x,y) is within a world of the given width and length.
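A hedged sketch of the two checks above; object_poses stands in for the environment state and is an illustrative assumption:

   def in_boundary(pose, width, length):
       # True if pose (x, y) lies inside the width-by-length grid.
       x, y = pose
       return 0 <= x < width and 0 <= y < length

   def valid_pose(pose, width, length, object_poses=(), check_collision=True):
       # Sketch only: valid when in bounds and, if requested,
       # not overlapping any object pose.
       if not in_boundary(pose, width, length):
           return False
       return not (check_collision and pose in set(object_poses))

   print(valid_pose((2, 3), 5, 5, object_poses=[(2, 3)]))  # False: collision
   print(valid_pose((4, 4), 5, 5))                         # True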

Module contents

Defines the models, including the transition, observation, reward, and policy models; also includes additional components such as the sensor model and the grid map.