pomdp_py.problems.rocksample.cythonize package
Submodules
pomdp_py.problems.rocksample.cythonize.rocksample_problem module
RockSample(n,k) problem

Origin: Heuristic Search Value Iteration for POMDPs (UAI 2004)

Description:

State space:
    Position \(\times\) RockType_1 \(\times\) RockType_2 \(\times \ldots \times\) RockType_k, plus a TerminalState, where Position = {(1,1), (1,2), …, (n,n)} and RockType_i = {Good, Bad}.
    (The positions of the rocks are known to the robot but are not represented explicitly in the state space; Check_i checks rock i at its known location.)
Action space:
    North, South, East, West, Sample, Check_1, …, Check_k.
    The first four move the agent deterministically. Sample samples the rock at the agent's current location. Check_i receives a noisy observation about RockType_i, with noise determined by eta (\(\eta\)): eta=1 gives a perfect sensor, eta=0 a uniformly random one.
Observation:
    the property of rock i, observed when taking Check_i.
Reward:
    +10 for sampling a good rock, -10 for sampling a bad rock, and +10 for moving into the exit area. All other actions have no cost or reward.
Initial belief:
    every rock has equal probability of being Good or Bad.
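To make the pieces documented below concrete, here is a minimal sketch that generates an instance and solves it with POMCP, using generate_instance, init_particles_belief, and test_planner from this module. The RockSampleProblem constructor arguments are an assumption (the class documentation below does not list them), and the planner settings are illustrative:

    import pomdp_py
    from pomdp_py.problems.rocksample.cythonize.rocksample_problem import (
        RockSampleProblem, init_particles_belief, test_planner)

    n, k = 5, 3  # 5x5 grid with 3 rocks
    # generate_instance returns an initial state and the rock locations
    init_state, rock_locs = RockSampleProblem.generate_instance(n, k)
    # particle belief: uniform over the 2^k good/bad combinations
    init_belief = init_particles_belief(k, 200, init_state, belief="uniform")

    # Assumed constructor signature: (n, k, init_state, rock_locs, init_belief)
    rocksample = RockSampleProblem(n, k, init_state, rock_locs, init_belief)

    pomcp = pomdp_py.POMCP(max_depth=20, discount_factor=0.95,
                           num_sims=1000, exploration_const=20,
                           rollout_policy=rocksample.agent.policy_model)
    test_planner(rocksample, pomcp, nsteps=10, discount=0.95)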
- class pomdp_py.problems.rocksample.cythonize.rocksample_problem.CheckAction
Bases: RSAction
- rock_id
- class pomdp_py.problems.rocksample.cythonize.rocksample_problem.MoveAction
Bases: RSAction
- EAST = (1, 0)
- NORTH = (0, 1)
- SOUTH = (0, -1)
- WEST = (-1, 0)
- motion
- class pomdp_py.problems.rocksample.cythonize.rocksample_problem.RSObservation
Bases: Observation
- quality
- class pomdp_py.problems.rocksample.cythonize.rocksample_problem.RSObservationModel
Bases: ObservationModel
- argmax(next_state, action)
Returns the most likely observation.
- probability(self, observation, next_state, action)
Returns the probability \(\Pr(o|s',a)\).
- Parameters:
observation (Observation) – the observation \(o\)
next_state (State) – the next state \(s'\)
action (Action) – the action \(a\)
- Returns:
the probability \(\Pr(o|s',a)\)
- Return type:
float
- sample(self, next_state, action)
Returns an observation randomly sampled according to the distribution of this observation model.
- Parameters:
next_state (State) – the next state \(s'\)
action (Action) – the action \(a\)
- Returns:
the observation \(o\)
- Return type:
Observation
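For intuition about the Check noise, the following sketch implements the distance-dependent sensor from the HSVI paper: the efficiency \(\eta\) decays exponentially with the Euclidean distance between the agent and the rock, and the probability of a correct reading is \((1+\eta)/2\). The half_efficiency_dist constant is an illustrative assumption, not a value taken from this module:

    import math

    def check_probability(agent_pos, rock_pos, half_efficiency_dist=20):
        """Probability that Check_i reports the rock's true type.

        eta = 1 at distance 0 (perfect sensor); eta -> 0 as distance
        grows, so readings approach a uniform coin flip."""
        dist = math.dist(agent_pos, rock_pos)
        eta = 2 ** (-dist / half_efficiency_dist)
        return (1 + eta) / 2

    assert check_probability((1, 1), (1, 1)) == 1.0                 # on top of the rock
    assert abs(check_probability((1, 1), (100, 100)) - 0.5) < 0.05  # far away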
- class pomdp_py.problems.rocksample.cythonize.rocksample_problem.RSPolicyModel
Bases: RolloutPolicy
Simple policy model according to the problem description.
- argmax(state, normalized=False, **kwargs)
Returns the most likely action.
- get_all_actions(self, *args)
Returns a set of all possible actions, if feasible.
- probability(self, action, state)
Returns the probability \(\pi(a|s)\).
- rollout(self, State state, tuple history=None)
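A RolloutPolicy of this kind typically just samples uniformly among the actions available in the current state; a minimal sketch, with plain strings standing in for the module's action classes:

    import random

    def get_all_actions(k):
        """All RockSample(n,k) actions: four moves, Sample, and k checks."""
        return (["North", "South", "East", "West", "Sample"]
                + [f"Check_{i}" for i in range(k)])

    def rollout(state, k, history=None):
        """Uniform-random rollout step, ignoring state and history."""
        return random.choice(get_all_actions(k))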
- class pomdp_py.problems.rocksample.cythonize.rocksample_problem.RSRewardModel
Bases: RewardModel
- argmax(self, state, action, next_state)
Returns the most likely reward. This is optional.
- probability(self, reward, state, action, next_state)
Returns the probability \(\Pr(r|s,a,s')\).
- sample(self, state, action, next_state)
Returns a reward randomly sampled according to the distribution of this reward model. This is required, i.e., assumed to be implemented for a reward model.
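Because the reward is deterministic given \((s, a, s')\), sample can simply return the value described at the top of this page. A sketch, with rock_locs and in_exit_area as assumed inputs (the real model's internals may differ):

    from collections import namedtuple

    State = namedtuple("State", ["position", "rocktypes"])

    def reward_func(state, action, next_state, rock_locs, in_exit_area):
        """Deterministic RockSample reward: +10/-10 for sampling a
        good/bad rock, +10 for entering the exit area, 0 otherwise."""
        if action == "Sample":
            rock_id = rock_locs.get(state.position)  # rock at the agent's cell, if any
            if rock_id is not None:
                return 10 if state.rocktypes[rock_id] == "good" else -10
            return 0
        if in_exit_area(next_state.position):
            return 10
        return 0

    rock_locs = {(2, 3): 0}  # rock 0 sits at (2, 3)
    s = State(position=(2, 3), rocktypes=("good",))
    assert reward_func(s, "Sample", s, rock_locs, lambda pos: False) == 10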
- class pomdp_py.problems.rocksample.cythonize.rocksample_problem.RSState
Bases: State
- position
- rocktypes
- terminal
- class pomdp_py.problems.rocksample.cythonize.rocksample_problem.RSTransitionModel
Bases: TransitionModel
The transition model is deterministic.
- argmax(state, action)
Returns the most likely next state.
- probability(self, next_state, state, action)
Returns the probability \(\Pr(s'|s,a)\).
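Deterministic moves make the transition simple to state: add the MoveAction motion vector to the agent's position and block moves that would leave the grid. (In the standard RockSample layout, moving east past the boundary enters the exit area instead; that case is left to the caller in this sketch.)

    def next_position(position, motion, n):
        """Apply a motion vector such as EAST = (1, 0) on the n-by-n grid
        with 1-indexed cells, blocking moves off the board."""
        x, y = position
        dx, dy = motion
        nx, ny = x + dx, y + dy
        if 1 <= nx <= n and 1 <= ny <= n:
            return (nx, ny)
        return position  # blocked at the boundary

    assert next_position((1, 1), (1, 0), 5) == (2, 1)   # EAST
    assert next_position((1, 1), (-1, 0), 5) == (1, 1)  # WEST blocked at the edge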
- class pomdp_py.problems.rocksample.cythonize.rocksample_problem.RockSampleProblem
Bases: POMDP
- static generate_instance(n, k)
Returns an initial state and rock locations for an instance of RockSample(n,k).
- in_exit_area(pos)
- print_state()
- static random_free_location(n, not_free_locs)
Returns a random (x, y) location in the n×n grid that is free.
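random_free_location is naturally a small rejection sampler; a sketch, assuming 1-indexed grid coordinates:

    import random

    def random_free_location(n, not_free_locs):
        """Sample a uniformly random (x, y) cell of the n-by-n grid that
        is not already occupied (e.g. by a rock or the agent)."""
        while True:
            loc = (random.randint(1, n), random.randint(1, n))
            if loc not in not_free_locs:
                return loc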
- class pomdp_py.problems.rocksample.cythonize.rocksample_problem.RockType
Bases: object
- BAD = 'bad'
- GOOD = 'good'
- static invert(rocktype)
- static random(p=0.5)
- pomdp_py.problems.rocksample.cythonize.rocksample_problem.euclidean_dist(p1, p2)
- pomdp_py.problems.rocksample.cythonize.rocksample_problem.init_particles_belief(k, num_particles, init_state, belief='uniform')
- pomdp_py.problems.rocksample.cythonize.rocksample_problem.main()
- pomdp_py.problems.rocksample.cythonize.rocksample_problem.test_planner(rocksample, planner, nsteps=3, discount=0.95)
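test_planner presumably drives the standard plan-act-observe-update loop; the sketch below shows one such step using pomdp_py's generic Agent/Environment/Planner interface. The exact bookkeeping inside test_planner (printing, discounting) is not documented here, so treat this as an outline rather than the function's actual body:

    def run_step(problem, planner):
        """One step of the POMDP loop: plan on the agent's belief, execute
        in the environment, observe, then update history and planner."""
        action = planner.plan(problem.agent)
        reward = problem.env.state_transition(action, execute=True)
        observation = problem.env.provide_observation(
            problem.agent.observation_model, action)
        problem.agent.update_history(action, observation)
        planner.update(problem.agent, action, observation)
        return action, observation, reward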