problems.multi_object_search.models package¶
Submodules¶
problems.multi_object_search.models.observation_model module¶
Defines the ObservationModel for the 2D Multi-Object Search domain.
Origin: Multi-Object Search using Object-Oriented POMDPs (ICRA 2019) (extensions: action space changes, different sensor model, gridworld instead of topological graph)
- Observation: {objid : pose(x,y) or NULL}. The sensor model may vary;
it could be the fan-shaped model of the original paper, or it could be something else. But the resulting observation should be a map from object id to an observed pose or NULL (not observed).
Observation Model
The agent can observe its own state, as well as object poses that are within its sensor range. We only need to model object observation.
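A minimal sketch of how such an observation could be assembled, assuming a simple circular sensor range (the actual sensor may be fan-shaped, and the function name and signature here are hypothetical, not part of the package):

```python
import math

def observe(robot_pose, object_poses, sensor_range):
    """Hypothetical sketch: map each object id to its pose if it lies
    within the robot's (circular) sensor range, else None (i.e. NULL,
    not observed)."""
    rx, ry = robot_pose
    observation = {}
    for objid, (ox, oy) in object_poses.items():
        if math.hypot(ox - rx, oy - ry) <= sensor_range:
            observation[objid] = (ox, oy)
        else:
            observation[objid] = None
    return observation
```

The returned dictionary has the required shape: every object id appears, mapped either to an observed pose or to NULL.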
- class problems.multi_object_search.models.observation_model.MosObservationModel(dim, sensor, object_ids, sigma=0.01, epsilon=1)[source]¶
Bases:
OOObservationModel
Object-oriented observation model
- class problems.multi_object_search.models.observation_model.ObjectObservationModel(objid, sensor, dim, sigma=0, epsilon=1)[source]¶
Bases:
ObservationModel
- probability(observation, next_state, action, **kwargs)[source]¶
Returns the probability \(\Pr(o \mid s', a)\) of receiving observation \(o\) given next state \(s'\) and action \(a\).
- Parameters:
observation (ObjectObservation) –
next_state (State) –
action (Action) –
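The roles of `sigma` and `epsilon` can be illustrated with a sketch of a per-object observation probability. The exact semantics below are assumptions (Gaussian pose noise with std `sigma`, detection reliability `epsilon`), not taken from the package's implementation:

```python
import math

def object_observation_probability(z, true_pose, in_range, sigma=0.01, epsilon=1):
    """Hypothetical sketch of Pr(observation | next_state, action) for one
    object. Assumed semantics: if the object's true pose is within the
    sensor's field of view, it is detected with probability `epsilon` and
    the observed pose is corrupted by Gaussian noise with std `sigma`;
    otherwise only NULL (None) is observed."""
    if not in_range:
        # Object outside the field of view: only NULL is plausible.
        return 1.0 if z is None else 0.0
    if z is None:
        # Missed detection inside the field of view.
        return 1.0 - epsilon
    # Detected: independent Gaussian noise on x and y.
    gx = math.exp(-(z[0] - true_pose[0]) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    gy = math.exp(-(z[1] - true_pose[1]) ** 2 / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))
    return epsilon * gx * gy
```

With `epsilon=1` (the default), an in-range object is always detected and the probability reduces to the Gaussian density around its true pose.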
problems.multi_object_search.models.policy_model module¶
Policy model for the 2D Multi-Object Search domain. The agent can optionally be equipped with an occupancy grid map of the environment.
- class problems.multi_object_search.models.policy_model.PolicyModel(robot_id, grid_map=None)[source]¶
Bases:
RolloutPolicy
Simple policy model. All actions are possible at any state.
- sample(self, state)[source]¶
Returns action randomly sampled according to the distribution of this policy model.
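Since all actions are possible at any state, sampling reduces to a uniform draw over the action set. A minimal sketch (the action names here are placeholders; the domain's actual Action classes are defined elsewhere in the package):

```python
import random

# Hypothetical action set standing in for the domain's Action objects.
ACTIONS = ["move-north", "move-south", "move-east", "move-west", "look", "find"]

def sample(state, actions=ACTIONS, rng=random):
    """Minimal sketch of PolicyModel.sample: all actions are possible at
    any state, so sample uniformly at random, ignoring the state."""
    return rng.choice(actions)
```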
problems.multi_object_search.models.reward_model module¶
Reward model for the 2D Multi-Object Search domain
- class problems.multi_object_search.models.reward_model.MosRewardModel(target_objects, big=1000, small=1, robot_id=None)[source]¶
Bases:
RewardModel
- probability(self, reward, state, action, next_state)[source]¶
Returns the probability of \(\Pr(r|s,a,s')\).
- class problems.multi_object_search.models.reward_model.GoalRewardModel(target_objects, big=1000, small=1, robot_id=None)[source]¶
Bases:
MosRewardModel
This is a reward model where the agent receives reward only for detect-related actions.
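A sketch of how `big` and `small` could enter such a detect-only reward. The branching below (step cost, correct vs. incorrect detect) is an assumption for illustration, not the package's actual State/Action interface:

```python
def goal_reward(action, detected_correct, big=1000, small=1):
    """Hypothetical sketch of a detect-only reward: a step cost of -small
    for every non-detect action, +big for a detect ("find") that correctly
    locates a target, and -big for a detect that fails."""
    if action != "find":
        return -small
    return big if detected_correct else -big
```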
problems.multi_object_search.models.transition_model module¶
Defines the TransitionModel for the 2D Multi-Object Search domain.
Origin: Multi-Object Search using Object-Oriented POMDPs (ICRA 2019) (extensions: action space changes, different sensor model, gridworld instead of topological graph)
Description: Multi-Object Search in a 2D grid world.
Transition: deterministic
- class problems.multi_object_search.models.transition_model.MosTransitionModel(dim, sensors, object_ids, epsilon=1e-09)[source]¶
Bases:
OOTransitionModel
Object-oriented transition model. This transition model supports the multi-robot case, where each robot is equipped with a sensor. The multi-robot transition model should be used by the Environment, but not necessarily by each robot for planning.
- class problems.multi_object_search.models.transition_model.StaticObjectTransitionModel(objid, epsilon=1e-09)[source]¶
Bases:
TransitionModel
This model assumes the object is static.
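A static-object transition can be sketched as a near-deterministic distribution. Treating `epsilon` as a small probability floor in place of a hard 0/1 is an assumption about its role here:

```python
def static_object_transition_prob(next_pose, cur_pose, epsilon=1e-09):
    """Sketch of a static-object transition: the object stays where it is.
    `epsilon` is assumed to be a small floor so the distribution never
    assigns exactly zero probability to any next pose."""
    return 1.0 - epsilon if next_pose == cur_pose else epsilon
```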
- class problems.multi_object_search.models.transition_model.RobotTransitionModel(sensor, dim, epsilon=1e-09)[source]¶
Bases:
TransitionModel
We assume that the robot control is perfect and transitions are deterministic.
- classmethod if_move_by(robot_id, state, action, dim, check_collision=True)[source]¶
Defines the dynamics of robot motion. dim (tuple): the (width, length) of the search world.
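Deterministic grid motion of this kind can be sketched as follows. The motion names and the stay-in-place behavior on an invalid move are assumptions; collision checking is omitted here:

```python
# Hypothetical action names mapped to (dx, dy) displacements; the actual
# Action classes live in the domain module.
MOTIONS = {
    "move-north": (0, -1),
    "move-south": (0, 1),
    "move-east": (1, 0),
    "move-west": (-1, 0),
}

def if_move_by(pose, action, dim):
    """Sketch of deterministic robot motion: apply the action's
    displacement and keep the move only if the result stays inside the
    (width, length) world; otherwise the robot stays put."""
    dx, dy = MOTIONS[action]
    x, y = pose[0] + dx, pose[1] + dy
    width, length = dim
    if 0 <= x < width and 0 <= y < length:
        return (x, y)
    return pose
```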
- problems.multi_object_search.models.transition_model.valid_pose(pose, width, length, state=None, check_collision=True, pose_objid=None)[source]¶
Returns True if the given pose (x, y) is a valid pose. If check_collision is True, the pose is only valid if it does not overlap with any object pose in the environment state.
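The validity check described above can be sketched as below. Representing the environment state as a plain `{objid: (x, y)}` dict is an assumption for illustration; the real function takes a State object:

```python
def valid_pose(pose, width, length, object_poses=(), pose_objid=None):
    """Sketch of valid_pose: a pose is valid when it lies inside the
    width x length grid and does not overlap any *other* object's pose.
    `object_poses` is an assumed {objid: (x, y)} dict standing in for
    the environment state; pose_objid identifies the moving object so
    it does not collide with itself."""
    x, y = pose
    if not (0 <= x < width and 0 <= y < length):
        return False
    for objid, opose in dict(object_poses).items():
        if objid != pose_objid and opose == pose:
            return False
    return True
```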
Module contents¶
Defines the models, including transition, observation, reward, and policy models. Also includes additional components such as the sensor model and grid map.