TreeModel.java

TreeModel and its variant NewTreeModel are the classes that contain the most fundamental (and most complicated) code in the project.

TreeModel implements the statistical modeling technique that is described in the paper Interaction History Tree Models.

The TreeModel class maintains three similar data structures, each serving its own purpose: a model of observed behavior, a model of expected behavior and delta: a cache that holds information that could be derived from the models each time it is needed, but at great computational cost. Each of these data structures has the form of a tree of interaction histories and a (partial) mapping that maps nodes of the tree to finite distributions.

Every time that a response to the previous action is received from the environment, the underlying policy is adapted. At this point the state of the TreeModel is of the form:

and the response is rt. Now h is used to find a single path from the root node of each tree by finding the child of the root node that corresponds to at, and then the child of that node that corresponds to at-1 rt-1, and then the child of that node that corresponds to at-2 rt-2, and so on. So each tree has its root in the present and its branches extending into the past.

The data structures are updated as follows:

Update observed model

The model of observed behavior is updated

For each node along the path in the tree of the model of observed behavior, the value for rt is incremented by 1.

As a result, the observed model has at node N a finite distribution that holds the frequencies of all the responses seen after the state corresponding to N. Note that this distribution becomes more specific, albeit less trustworthy as we choose N further away from the root of the tree.

Ensure minimal expected model

If at is an action that has never occurred before, a finite distribution is added for it in the model of expected behavior. The new finite distribution is almost trivial: it has value 1 for rt and value 0 for all other possible responses.

Update delta

The delta data structure is updated.

First we determine the depth (distance to root) de of the deepest node along the path in the model of expected behavior that maps to a finite distribution.

Then for each node along the path in delta that has a depth smaller than de, the value for rt is incremented by 1.

As a result, delta has at node N the statistics of the observed data at N that is 'covered' by a finite distribution in the model of expected behavior.

Extend observed model

If the last information in the finite distribution that corresponds to the last node on the path in the observed model is trustworthy enough, the tree of the model of observed behavior is extended with a new node such that the new node will be on the path that corresponds to the current state of the TreeModel. An almost trivial finite distribution is added for the new node.

Update distributions of expected model

For each node along the path in the model of expected behavior that is mapped to a finite distribution, the finite distribution is re-evaluated using the MDL principle. If possible, the finite distribution is adapted to achieve optimal compression on the data described in the finite distribution that is attached to the corresponding node in the model of observed behavior.

Prune expected model

The finite distributions along the path in the model of expected behavior are checked for redundancy, finite distributions farthest from the root first. If the total description length of the data recorded in the model of observed behavior given the current model of expected behavior with the finite distribution under examination deleted is less than the total description length of the data recorded in the model of observed behavior given the current model of expected behavior intact, the offending finite distribution is removed.

Extend expected model

The finite distributions along the path in the model of observed behavior that are farther away from the root than the deepest finite distribution in the model of expected behavior are checked for useful information, finite distributions closest to the root first. If the total description length of the data recorded in the model of observed behavior given the current model of expected behavior with a finite distribution derived from the finite distribution under examination added is less than the total description length of the data recorded in the model of observed behavior given the current model of expected behavior, the proposed finite distribution is added.

Visit the project page on GitHub for project details.