pomdp_py.problems.tiger.cythonize package¶
Submodules¶
pomdp_py.problems.tiger.cythonize.run_tiger module¶
pomdp_py.problems.tiger.cythonize.tiger_problem module¶
The classic Tiger problem.
This is a POMDP problem; namely, it specifies both the POMDP (i.e., the state, action, and observation spaces) and the T/O/R models for both the agent and the environment.
The description of the tiger problem is as follows (quoted from POMDP: Introduction to Partially Observable Markov Decision Processes by Kamalzadeh and Hahsler):
A tiger is put with equal probability behind one of two doors, while treasure is put behind the other one. You are standing in front of the two closed doors and need to decide which one to open. If you open the door with the tiger, you will get hurt (negative reward). But if you open the door with treasure, you receive a positive reward. Instead of opening a door right away, you also have the option to wait and listen for tiger noises. But listening is neither free nor entirely accurate. You might hear the tiger behind the left door while it is actually behind the right door and vice versa.
States: tiger-left, tiger-right
Actions: open-left, open-right, listen
Rewards: +10 for opening the treasure door; -100 for opening the tiger door; -1 for listening.
Observations: you can hear either “tiger-left” or “tiger-right”.
Note that in this example, the TigerProblem is a POMDP that also contains the agent and the environment as its fields. In general this doesn’t need to be the case. (Refer to more complicated examples.)
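For orientation, here is a minimal end-to-end sketch of how these pieces fit together. The nested-dict format of obs_probs and trans_probs is an assumption made for illustration (build_setting, documented below, is the module's own way of constructing them), and the 0.85 listening accuracy is the value the classic formulation typically uses:

    import pomdp_py
    from pomdp_py.problems.tiger.cythonize.tiger_problem import (
        TigerProblem, TigerState)

    # Assumed format: P(o | s', a=listen) as nested dicts keyed by name.
    obs_probs = {
        "tiger-left":  {"tiger-left": 0.85, "tiger-right": 0.15},
        "tiger-right": {"tiger-left": 0.15, "tiger-right": 0.85},
    }
    # Assumed format: the tiger's position persists deterministically, P(s' | s, a).
    trans_probs = {
        "tiger-left":  {"tiger-left": 1.0, "tiger-right": 0.0},
        "tiger-right": {"tiger-left": 0.0, "tiger-right": 1.0},
    }
    init_true_state = TigerState("tiger-left")
    init_belief = pomdp_py.Histogram({TigerState("tiger-left"): 0.5,
                                      TigerState("tiger-right"): 0.5})
    tiger_problem = TigerProblem(obs_probs, trans_probs,
                                 init_true_state, init_belief)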
- class pomdp_py.problems.tiger.cythonize.tiger_problem.TigerObservation¶
Bases: Observation
- name¶
- class pomdp_py.problems.tiger.cythonize.tiger_problem.TigerObservationModel¶
Bases: ObservationModel
This problem is small enough for the probabilities to be given directly (externally).
- argmax(next_state, action, normalized=False, **kwargs)¶
Returns the most likely observation
- get_all_observations(self)¶
Returns a set of all possible observations, if feasible.
- get_distribution(next_state, action)¶
Returns the underlying distribution of the model; in this case, it is just a histogram.
- probability(self, observation, next_state, action)¶
Returns the probability of \(\Pr(o|s',a)\).
- Parameters:
observation (Observation) – the observation \(o\)
next_state (State) – the next state \(s'\)
action (Action) – the action \(a\)
- Returns:
the probability \(\Pr(o|s',a)\)
- Return type:
float
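For instance, with the 0.85 listening accuracy assumed in the construction sketch above, one would expect values like the following (illustrative, not verified output):

    # Continuing from the construction sketch above.
    from pomdp_py.problems.tiger.cythonize.tiger_problem import (
        TigerAction, TigerObservation)

    O = tiger_problem.agent.observation_model  # agent is a field of TigerProblem
    s_next = TigerState("tiger-left")
    O.probability(TigerObservation("tiger-left"), s_next, TigerAction("listen"))
    # -> 0.85 under the assumed obs_probs
    O.probability(TigerObservation("tiger-right"), s_next, TigerAction("listen"))
    # -> 0.15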
- sample(self, next_state, action)¶
Returns an observation randomly sampled according to the distribution of this observation model.
- Parameters:
next_state (State) – the next state \(s'\)
action (Action) – the action \(a\)
- Returns:
the observation \(o\)
- Return type:
Observation
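A correspondingly hedged usage sketch of sample, assuming the constructors take the name strings shown in the reprs on this page:

    # Continuing from the sketch above: draw a noisy observation after listening.
    o = O.sample(TigerState("tiger-left"), TigerAction("listen"))
    print(o.name)  # "tiger-left" ~85% of the time under the assumed accuracy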
- class pomdp_py.problems.tiger.cythonize.tiger_problem.TigerPolicyModel¶
Bases: RandomRollout
This is an extremely dumb policy model, included only to keep consistent with the framework.
- argmax(state, normalized=False, **kwargs)¶
Returns the most likely action
- get_all_actions(self, *args)¶
Returns a set of all possible actions, if feasible.
- probability(self, action, state)¶
Returns the probability of \(\pi(a|s)\).
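Since the base class is RandomRollout, the policy is presumably uniform over the three actions; a sketch under that assumption:

    # Continuing from the construction sketch; 1/3 assumes a uniform policy.
    pi = tiger_problem.agent.policy_model
    print(pi.get_all_actions())  # the three TigerActions
    pi.probability(TigerAction("listen"), TigerState("tiger-left"))  # -> ~1/3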
- class pomdp_py.problems.tiger.cythonize.tiger_problem.TigerProblem(obs_probs, trans_probs, init_true_state, init_belief)¶
Bases: POMDP
- ACTIONS = [TigerAction(open-left), TigerAction(open-right), TigerAction(listen)]¶
- OBSERVATIONS = [TigerObservation(tiger-right), TigerObservation(tiger-left)]¶
- STATES = [TigerState(tiger-right), TigerState(tiger-left)]¶
- class pomdp_py.problems.tiger.cythonize.tiger_problem.TigerRewardModel¶
Bases: RewardModel
- argmax(state, action, next_state, normalized=False)¶
Returns the most likely reward
- get_distribution(state, action, next_state)¶
Returns the underlying distribution of the model
- probability(self, reward, state, action, next_state)¶
Returns the probability of \(\Pr(r|s,a,s')\).
- sample(self, state, action, next_state)¶
Returns a reward randomly sampled according to the distribution of this reward model. This is required, i.e., assumed to be implemented, for a reward model.
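Given the reward structure listed at the top of this page, sample is presumably deterministic in (s, a); a sketch assuming the tiger is behind the left door:

    # Continuing from the construction sketch; env is a field of TigerProblem.
    R = tiger_problem.env.reward_model
    s = TigerState("tiger-left")
    R.sample(s, TigerAction("open-right"), s)  # -> 10   (treasure door)
    R.sample(s, TigerAction("open-left"), s)   # -> -100 (tiger door)
    R.sample(s, TigerAction("listen"), s)      # -> -1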
- class pomdp_py.problems.tiger.cythonize.tiger_problem.TigerTransitionModel¶
Bases: TransitionModel
This problem is small enough for the probabilities to be given directly (externally).
- argmax(state, action, normalized=False, **kwargs)¶
Returns the most likely next state
- get_all_states(self)¶
Returns a set of all possible states, if feasible.
- get_distribution(state, action)¶
Returns the underlying distribution of the model
- probability(self, next_state, state, action)¶
Returns the probability of \(\Pr(s'|s,a)\).
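In the classic formulation, listening leaves the tiger's position unchanged; a hedged probability sketch under that assumption:

    # Continuing from the construction sketch.
    T = tiger_problem.agent.transition_model
    s = TigerState("tiger-left")
    T.probability(s, s, TigerAction("listen"))  # -> 1.0 if listen keeps the state static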
- pomdp_py.problems.tiger.cythonize.tiger_problem.build_actions(strings)¶
- pomdp_py.problems.tiger.cythonize.tiger_problem.build_observations(strings)¶
- pomdp_py.problems.tiger.cythonize.tiger_problem.build_setting(setting)¶
- pomdp_py.problems.tiger.cythonize.tiger_problem.build_states(strings)¶
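These helpers presumably map name strings to the corresponding typed objects; a hedged sketch:

    from pomdp_py.problems.tiger.cythonize.tiger_problem import (
        build_states, build_actions, build_observations)

    states = build_states(["tiger-left", "tiger-right"])            # -> TigerStates
    actions = build_actions(["open-left", "open-right", "listen"])  # -> TigerActions
    obs = build_observations(["tiger-left", "tiger-right"])         # -> TigerObservations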
- pomdp_py.problems.tiger.cythonize.tiger_problem.main()¶
- pomdp_py.problems.tiger.cythonize.tiger_problem.test_planner(tiger_problem, planner, nsteps=3)¶
Runs the action-feedback loop of the Tiger problem POMDP.
- Parameters:
tiger_problem (TigerProblem) – an instance of the tiger problem.
planner (Planner) – a planner
nsteps (int) – Maximum number of steps to run this loop.
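A typical invocation, using POUCT from pomdp_py as the planner (the parameter values below are illustrative, not prescribed by this module):

    import pomdp_py
    # Continuing from the construction sketch above.
    planner = pomdp_py.POUCT(max_depth=3, discount_factor=0.95,
                             num_sims=4096, exploration_const=200,
                             rollout_policy=tiger_problem.agent.policy_model)
    test_planner(tiger_problem, planner, nsteps=10)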