pomdp_py.utils package


pomdp_py.utils.interfaces

Utilities for interfacing with external libraries

pomdp_py.utils.debugging

This module contains utility functions making it easier to debug POMDP planning.

pomdp_py.utils.templates

Some particular implementations of the interface for convenience

pomdp_py.utils.cython_utils

Utility functions for Cython code.

pomdp_py.utils.typ

Utilities for typography, i.e. dealing with strings for the purpose of displaying them.

pomdp_py.utils.math

Assorted utilities for math

pomdp_py.utils.misc

Misc Python utilities

pomdp_py.utils.colors

Utilities for dealing with colors

Subpackages

Submodules

pomdp_py.utils.colors module

Utilities for dealing with colors

pomdp_py.utils.colors.lighter(color, percent)[source]

Assumes color is an RGB tuple with components between (0, 0, 0) and (255, 255, 255).

pomdp_py.utils.colors.rgb_to_hex(rgb)[source]
pomdp_py.utils.colors.hex_to_rgb(hx)[source]

hx is a string that begins with '#'; assumes len(hx) == 7.

pomdp_py.utils.colors.inverse_color_rgb(rgb)[source]
pomdp_py.utils.colors.inverse_color_hex(hx)[source]

hx is a string that begins with '#'; assumes len(hx) == 7.

pomdp_py.utils.colors.random_unique_color(colors, ctype=1)[source]

  • ctype=1: completely random

  • ctype=2: red random

  • ctype=3: blue random

  • ctype=4: green random

  • ctype=5: yellow random

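A minimal usage sketch for these helpers; the specific color values and the percent argument 0.3 are illustrative assumptions, while the "#rrggbb" string and (r, g, b) tuple formats follow the docstrings above:

from pomdp_py.utils.colors import hex_to_rgb, rgb_to_hex, lighter, inverse_color_rgb

rgb = hex_to_rgb("#1f77b4")      # a 7-character "#rrggbb" string, as assumed by hex_to_rgb
print(rgb_to_hex(rgb))           # back to a hex string
print(inverse_color_rgb(rgb))    # the inverse RGB color
print(lighter(rgb, 0.3))         # a lighter color; the percent value 0.3 is an assumption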

pomdp_py.utils.cython_utils module

Utility functions for Cython code.

pomdp_py.utils.cython_utils.det_dict_hash(dct, keep=9)

Deterministic hash of a dictionary without sorting.
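
A brief sketch of the intended use; the dictionary contents are hypothetical, and the point is only that the hash is deterministic (the same dictionary always hashes to the same value):

from pomdp_py.utils.cython_utils import det_dict_hash

# Hypothetical belief-like dictionary; det_dict_hash returns the same
# value every time it is called on the same contents.
weights = {"tiger-left": 0.85, "tiger-right": 0.15}
print(det_dict_hash(weights))
print(det_dict_hash(weights, keep=9) == det_dict_hash(weights))  # True (keep defaults to 9)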

pomdp_py.utils.debugging module

This module contains utility functions making it easier to debug POMDP planning.

TreeDebugger

The core debugging functionality for POMCP/POUCT search trees is incorporated into the TreeDebugger. It is designed for ease of use during a pdb or ipdb debugging session. Here is a minimal example usage:

import pomdp_py
from pomdp_py.utils import TreeDebugger
from pomdp_problems.tiger import TigerProblem

# pomdp_py.Agent
agent = TigerProblem.create("tiger-left", 0.5, 0.15).agent

# suppose pouct is a pomdp_py.POUCT object (POMCP works too)
pouct = pomdp_py.POUCT(max_depth=4, discount_factor=0.95,
                       num_sims=4096, exploration_const=200,
                       rollout_policy=agent.policy_model)

action = pouct.plan(agent)
dd = TreeDebugger(agent.tree)
import pdb; pdb.set_trace()

When the program executes, you enter the pdb debugger, and you can:

(Pdb) dd.pp
_VNodePP(n=4095, v=-19.529)(depth=0)
├─── ₀listen⟶_QNodePP(n=4059, v=-19.529)
│    ├─── ₀tiger-left⟶_VNodePP(n=2013, v=-16.586)(depth=1)
│    │    ├─── ₀listen⟶_QNodePP(n=1883, v=-16.586)
│    │    │    ├─── ₀tiger-left⟶_VNodePP(n=1441, v=-8.300)(depth=2)
... # prints out the entire tree; Colored in terminal.

(Pdb) dd.p(1)
_VNodePP(n=4095, v=-19.529)(depth=0)
├─── ₀listen⟶_QNodePP(n=4059, v=-19.529)
│    ├─── ₀tiger-left⟶_VNodePP(n=2013, v=-16.586)(depth=1)
│    │    ├─── ₀listen⟶_QNodePP(n=1883, v=-16.586)
│    │    ├─── ₁open-left⟶_QNodePP(n=18, v=-139.847)
│    │    └─── ₂open-right⟶_QNodePP(n=112, v=-57.191)
... # prints up to depth 1

Note that the printed texts are colored in the terminal.

You can retrieve the subtree through indexing:

(Pdb) dd[0]
listen⟶_QNodePP(n=4059, v=-19.529)
    - [0] tiger-left: VNode(n=2013, v=-16.586)
    - [1] tiger-right: VNode(n=2044, v=-16.160)

(Pdb) dd[0][1][2]
open-right⟶_QNodePP(n=15, v=-148.634)
    - [0] tiger-left: VNode(n=7, v=-20.237)
    - [1] tiger-right: VNode(n=6, v=8.500)

You can obtain the currently preferred action sequence by:

(Pdb) dd.mbp
   listen  []
   listen  []
   listen  []
   listen  []
   open-left  []
 _VNodePP(n=4095, v=-19.529)(depth=0)
 ├─── ₀listen⟶_QNodePP(n=4059, v=-19.529)
 │    └─── ₁tiger-right⟶_VNodePP(n=2044, v=-16.160)(depth=1)
 │         ├─── ₀listen⟶_QNodePP(n=1955, v=-16.160)
 │         │    └─── ₁tiger-right⟶_VNodePP(n=1441, v=-8.300)(depth=2)
 │         │         ├─── ₀listen⟶_QNodePP(n=947, v=-8.300)
 │         │         │    └─── ₁tiger-right⟶_VNodePP(n=768, v=0.022)(depth=3)
 │         │         │         ├─── ₀listen⟶_QNodePP(n=462, v=0.022)
 │         │         │         │    └─── ₁tiger-right⟶_VNodePP(n=395, v=10.000)(depth=4)
 │         │         │         │         ├─── ₁open-left⟶_QNodePP(n=247, v=10.000)

mbp stands for “mark best plan”.

To explore more features, browse the list of methods in the documentation.

class pomdp_py.utils.debugging.TreeDebugger(tree)[source]

Bases: object

Helps you debug the search tree. A search tree contains a subset of future histories, organized into QNodes (the value represents Q(b,a); children are observations) and VNodes (the value represents V(b); children are actions).

num_nodes(kind='all')[source]

Returns the total number of nodes in the tree rooted at “current”

property depth

Tree depth starts from 0 (root node only). It is the largest number of edges on a path from root to leaf.

property d

alias for depth

property num_layers

Returns the number of layers of nodes, which equals depth + 1.

property nl

alias for num_layers

property nn

Returns the total number of nodes in the tree

property nq

Returns the total number of QNodes in the tree

property nv

Returns the total number of VNodes in the tree

l(depth, as_debuggers=True)[source]

alias for layer

layer(depth, as_debuggers=True)[source]

Returns a list of nodes at the given depth. Will only return VNodes. Warning: If depth is high, there will likely be a huge number of nodes.

Parameters:
  • depth (int) – Depth of the tree

  • as_debuggers (bool) – If True, returns a list of TreeDebugger objects, one for each subtree rooted at a node on that layer.

property leaf
step(key)[source]

Updates current interaction node to follow the edge along key

s(key)[source]

alias for step

back()[source]

move current node of interaction back to parent

property b

alias for back

property root

The root node when first creating this TreeDebugger

property r

alias for root

property c

Current node of interaction

p(*args, **kwargs)[source]

print tree

property pp

print tree, with preset options

property mbp

Mark Best and Print. Mark the best sequence, and then print with only the marked nodes

property pm

Print marked only

mark_sequence(seq, color='blue')[source]

Given a list of keys (understandable by __getitem__ in _NodePP), mark nodes (both QNode and VNode) along the path in the tree. Note this sequence starts from self.current; so self.current will also be marked.

mark(seq, **kwargs)[source]

alias for mark_sequence

mark_path(dest, **kwargs)[source]

Marks the path from the current node to the dest node.

markp(dest, **kwargs)[source]

alias for mark_path

property clear

Clear the marks

property bestseq

Returns the sequence of actions and observations that has the highest value at each step; such a sequence is “preferred”.

Also prints out the list of preferred actions for each step into the future.

bestseqd(max_depth)[source]

alias for bestseq, except up to the given max_depth

static single_node_str(node, parent_edge=None, indent=1, include_children=True)[source]

Returns a string for printing given a single vnode.

static preferred_actions(root, max_depth=None)[source]

Print out the currently preferred actions up to given max_depth

path(dest)[source]

alias for path_to; Example usage:

marking path from root to the first node on the second layer:

dd.mark(dd.path(dd.layer(2)[0]))

path_to(dest)[source]

Returns a list of keys (actions / observations) that represents the path from self.current to the given node dest. Returns None if the path does not exist. Uses DFS. Can be useful for marking path to a node to a specific layer. Note that the returned path is a list of keys (i.e. edges), not nodes.

static tree_stats(root, max_depth=None)[source]

Gather statistics about the tree

pomdp_py.utils.debugging.sorted_by_str(enumerable)[source]
pomdp_py.utils.debugging.interpret_color(colorstr)[source]

pomdp_py.utils.math module

Assorted utilities for math

pomdp_py.utils.math.vec(p1, p2)[source]

vector from p1 to p2

pomdp_py.utils.math.proj(vec1, vec2, scalar=False)[source]
pomdp_py.utils.math.R_x(th)[source]
pomdp_py.utils.math.R_y(th)[source]
pomdp_py.utils.math.R_z(th)[source]
pomdp_py.utils.math.T(dx, dy, dz)[source]
pomdp_py.utils.math.to_radians(th)[source]
pomdp_py.utils.math.R_between(v1, v2)[source]
pomdp_py.utils.math.approx_equal(v1, v2, epsilon=1e-06)[source]
pomdp_py.utils.math.euclidean_dist(p1, p2)[source]
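
A small sketch of typical calls; the assumptions that vec returns the displacement p2 - p1, to_radians converts degrees to radians, and R_z takes an angle in radians follow from the names and docstrings rather than from this page:

from pomdp_py.utils.math import vec, euclidean_dist, to_radians, R_z

p1, p2 = (0, 0, 0), (3, 4, 0)
print(euclidean_dist(p1, p2))    # 5.0
print(vec(p1, p2))               # displacement from p1 to p2
print(to_radians(90))            # ~1.5708, assuming degrees in, radians out
print(R_z(to_radians(90)))       # rotation matrix about the z-axis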

pomdp_py.utils.misc module

Misc Python utilities

pomdp_py.utils.misc.remap(oldvalue, oldmin, oldmax, newmin, newmax)[source]
pomdp_py.utils.misc.json_safe(obj)[source]
pomdp_py.utils.misc.safe_slice(arr, start, end)[source]
pomdp_py.utils.misc.similar(a, b)[source]
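
A small sketch, assuming remap performs the usual linear rescaling from [oldmin, oldmax] to [newmin, newmax], similar returns a string-similarity ratio in [0, 1], and safe_slice clamps out-of-range indices:

from pomdp_py.utils.misc import remap, similar, safe_slice

print(remap(0.25, 0, 1, 0, 100))             # 25.0 under a linear remap
print(similar("tiger-left", "tiger-right"))  # a ratio below 1.0 for similar strings
print(safe_slice([1, 2, 3], 0, 10))          # the slice clipped to the list bounds
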
class pomdp_py.utils.misc.special_char[source]

Bases: object

left = '←'
up = '↑'
right = '→'
down = '↓'
longleft = '⟵'
longright = '⟶'
hline = '─'
vline = '│'
bottomleft = '└'
longbottomleft = '└─'
topleft = '┌'
longtopleft = '┌─'
topright = '┐'
longtopright = '─┐'
bottomright = '┘'
longbottomright = '─┘'
intersect = '┼'
topt = '┬'
leftt = '├'
rightt = '┤'
bottomt = '┴'
shadebar = '▒'
SUBSCRIPT = {48: 8320, 49: 8321, 50: 8322, 51: 8323, 52: 8324, 53: 8325, 54: 8326, 55: 8327, 56: 8328, 57: 8329}
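
SUBSCRIPT maps the code points of the digits '0'–'9' (48–57) to their Unicode subscript forms (8320–8329), so it can be used directly as a str.translate table; this is presumably how edge indices such as ₀listen in the tree printouts are rendered:

from pomdp_py.utils.misc import special_char

print("0listen 1open-left".translate(special_char.SUBSCRIPT))
# ₀listen ₁open-left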

pomdp_py.utils.plotting module

pomdp_py.utils.templates module

Some particular implementations of the interface for convenience

class pomdp_py.utils.templates.SimpleState(data)[source]

Bases: State

A SimpleState is a state that stores one piece of hashable data; the equality of two states of this kind depends only on this data.

class pomdp_py.utils.templates.SimpleAction(name)[source]

Bases: Action

A SimpleAction is an action defined by a string name

class pomdp_py.utils.templates.SimpleObservation(data)[source]

Bases: Observation

A SimpleObservation is an observation with a piece of hashable data that defines the equality.
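
A minimal sketch using the three template classes; the tiger-style names are illustrative:

from pomdp_py.utils.templates import SimpleState, SimpleAction, SimpleObservation

s = SimpleState("tiger-left")
a = SimpleAction("listen")
o = SimpleObservation("growl-left")

# Equality follows the stored data / name, per the class descriptions above.
assert s == SimpleState("tiger-left")
assert a == SimpleAction("listen")
assert o != SimpleObservation("growl-right")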

class pomdp_py.utils.templates.DetTransitionModel(epsilon=1e-12)[source]

Bases: TransitionModel

A DetTransitionModel is a deterministic transition model. A probability of 1 - epsilon is given for the correct transition, and epsilon for incorrect transitions.

probability(self, next_state, state, action)[source]

Returns the probability of \(\Pr(s'|s,a)\).

Parameters:
  • state (State) – the state \(s\)

  • next_state (State) – the next state \(s'\)

  • action (Action) – the action \(a\)

Returns:

the probability \(\Pr(s'|s,a)\)

Return type:

float

sample(self, state, action)[source]

Returns next state randomly sampled according to the distribution of this transition model.

Parameters:
  • state (State) – the state \(s\)

  • action (Action) – the action \(a\)

Returns:

the next state \(s'\)

Return type:

State

class pomdp_py.utils.templates.DetObservationModel(epsilon=1e-12)[source]

Bases: ObservationModel

A DetObservationModel is a deterministic observation model. A probability of 1 - epsilon is given for the correct observation, and epsilon for incorrect observations.

probability(self, observation, next_state, action)[source]

Returns the probability of \(\Pr(o|s',a)\).

Parameters:
  • observation (Observation) – the observation \(o\)

  • next_state (State) – the next state \(s'\)

  • action (Action) – the action \(a\)

Returns:

the probability \(\Pr(o|s',a)\)

Return type:

float

sample(self, next_state, action)[source]

Returns observation randomly sampled according to the distribution of this observation model.

Parameters:
  • next_state (State) – the next state \(s'\)

  • action (Action) – the action \(a\)

Returns:

the observation \(o\)

Return type:

Observation

class pomdp_py.utils.templates.DetRewardModel[source]

Bases: RewardModel

A DetRewardModel is a deterministic reward model (the most typical kind).

reward_func(state, action, next_state)[source]
sample(self, state, action, next_state)[source]

Returns reward randomly sampled according to the distribution of this reward model. This is required, i.e. assumed to be implemented for a reward model.

Parameters:
  • state (State) – the state \(s\)

  • action (Action) – the action \(a\)

  • next_state (State) – the next state \(s'\)

Returns:

the reward \(r\)

Return type:

float

argmax(self, state, action, next_state)[source]

Returns the most likely reward. This is optional.
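
A minimal sketch of subclassing DetRewardModel: override reward_func with the deterministic reward; sample and argmax are assumed to delegate to it, consistent with the model being deterministic. The reward values below are hypothetical:

from pomdp_py.utils.templates import DetRewardModel, SimpleAction, SimpleState

class TigerLikeRewardModel(DetRewardModel):
    # Hypothetical reward: listening costs 1, any other action pays 10.
    def reward_func(self, state, action, next_state):
        return -1 if action == SimpleAction("listen") else 10

R = TigerLikeRewardModel()
s = SimpleState("tiger-left")
print(R.sample(s, SimpleAction("listen"), s))      # -1
print(R.argmax(s, SimpleAction("open-right"), s))  # 10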

class pomdp_py.utils.templates.UniformPolicyModel(actions)[source]

Bases: RolloutPolicy

sample(self, state)[source]

Returns action randomly sampled according to the distribution of this policy model.

Parameters:

state (State) – the state \(s\)

Returns:

the action \(a\)

Return type:

Action

get_all_actions(self, *args)[source]

Returns a set of all possible actions, if feasible.

rollout(self, State state, tuple history=None)[source]
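
A small sketch of a uniform rollout policy over an illustrative action set; the state argument to sample is presumably ignored by a uniform policy, so any state works here:

from pomdp_py.utils.templates import SimpleAction, SimpleState, UniformPolicyModel

actions = [SimpleAction(name) for name in ("listen", "open-left", "open-right")]
policy = UniformPolicyModel(actions)

print(policy.sample(SimpleState("tiger-left")))   # one of the three actions, uniformly
print(policy.get_all_actions())                   # all three actions
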
class pomdp_py.utils.templates.TabularTransitionModel(weights)[source]

Bases: TransitionModel

This tabular transition model is built given a dictionary that maps a tuple (state, action, next_state) to a probability. This model assumes that the given weights are complete, that is, they specify the probability of every (state, action, next_state) combination.

probability(self, next_state, state, action)[source]

Returns the probability of \(\Pr(s'|s,a)\).

Parameters:
  • state (State) – the state \(s\)

  • next_state (State) – the next state \(s'\)

  • action (Action) – the action \(a\)

Returns:

the probability \(\Pr(s'|s,a)\)

Return type:

float

sample(self, state, action)[source]

Returns next state randomly sampled according to the distribution of this transition model.

Parameters:
  • state (State) – the state \(s\)

  • action (Action) – the action \(a\)

Returns:

the next state \(s'\)

Return type:

State

get_all_states(self)[source]

Returns a set of all possible states, if feasible.
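
A small sketch of constructing the tabular model; the two-state chain and its probabilities are illustrative, but the weights dictionary covers every (state, action, next_state) combination, as required:

from pomdp_py.utils.templates import SimpleState, SimpleAction, TabularTransitionModel

s0, s1 = SimpleState("s0"), SimpleState("s1")
move = SimpleAction("move")
weights = {
    (s0, move, s0): 0.1, (s0, move, s1): 0.9,
    (s1, move, s0): 0.0, (s1, move, s1): 1.0,
}
T = TabularTransitionModel(weights)

print(T.probability(s1, s0, move))   # 0.9
print(T.sample(s0, move))            # s1 with probability 0.9
print(T.get_all_states())            # s0 and s1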

class pomdp_py.utils.templates.TabularObservationModel(weights)[source]

Bases: ObservationModel

This tabular observation model is built given a dictionary that maps a tuple (next_state, action, observation) to a probability. This model assumes that the given weights are complete.

probability(observation, next_state, action)[source]

Returns the probability \(\Pr(o|s',a)\) that observation is emitted from next_state under action.

sample(self, next_state, action)[source]

Returns observation randomly sampled according to the distribution of this observation model.

Parameters:
  • next_state (State) – the next state \(s'\)

  • action (Action) – the action \(a\)

Returns:

the observation \(o\)

Return type:

Observation

get_all_observations(self)[source]

Returns a set of all possible observations, if feasible.

class pomdp_py.utils.templates.TabularRewardModel(rewards)[source]

Bases: RewardModel

This tabular reward model is built given a dictionary that maps a state, a tuple (state, action), or a tuple (state, action, next_state) to a reward. This model assumes that the given rewards are complete.

sample(self, state, action, next_state)[source]

Returns reward randomly sampled according to the distribution of this reward model. This is required, i.e. assumed to be implemented for a reward model.

Parameters:
  • state (State) – the state \(s\)

  • action (Action) – the action \(a\)

  • next_state (State) – the next state \(s'\)

Returns:

the reward \(r\)

Return type:

float

pomdp_py.utils.test_utils module

pomdp_py.utils.typ module

Utilities for typography, i.e. dealing with strings for the purpose of displaying them.

class pomdp_py.utils.typ.bcolors[source]

Bases: object

WHITE = '\x1b[97m'
CYAN = '\x1b[96m'
MAGENTA = '\x1b[95m'
BLUE = '\x1b[94m'
GREEN = '\x1b[92m'
YELLOW = '\x1b[93m'
RED = '\x1b[91m'
BOLD = '\x1b[1m'
ENDC = '\x1b[0m'
static disable()[source]
static s(color, content)[source]

Returns a string with color when shown on terminal. color is a constant in bcolors class.

pomdp_py.utils.typ.info(content)[source]
pomdp_py.utils.typ.note(content)[source]
pomdp_py.utils.typ.error(content)[source]
pomdp_py.utils.typ.warning(content)[source]
pomdp_py.utils.typ.success(content)[source]
pomdp_py.utils.typ.bold(content)[source]
pomdp_py.utils.typ.cyan(content)[source]
pomdp_py.utils.typ.magenta(content)[source]
pomdp_py.utils.typ.blue(content)[source]
pomdp_py.utils.typ.green(content)[source]
pomdp_py.utils.typ.yellow(content)[source]
pomdp_py.utils.typ.red(content)[source]
pomdp_py.utils.typ.white(content)[source]
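
A brief sketch of the typical use: each helper presumably wraps its argument in the corresponding ANSI escape code and resets with ENDC, so the string renders colored when printed to a terminal:

from pomdp_py.utils import typ

print(typ.info("planning started"))
print(typ.warning("belief update took unusually long"))
print(typ.success("goal reached"))
print(typ.bcolors.s(typ.bcolors.CYAN, "a cyan string"))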

Module contents