Symbolic Logic:Learning:Hierarchical Object Classification

From Knowino
Jump to: navigation, search

Objects and things may be classified to predict behaviour. This is an example of the compresion of data, leading to inductive inference.

This is described as Hierarchical Cluster Analysis, except that the approach generally described differs from what I describe here.

Contents

[edit] The problem

An object or thing may be described by a tuple of data giving attributes of that thing. Some attributes may be missing.

A set of things is described by a set of tuples.

We wish to classify objects into a class inheritance hierarchy so that we can predict attributes we dont yet know about an object.

If you have a set of tuples of attributes, each tuple representing an object or thing, then you could classify the things into an inheritance tree.

[edit] Minimizing complexity

The representation of a data set may be compressed by organising the information into an inheritance tree. The amount of compression indicates the predictive power of the inheritance tree.

[edit] Computational Complexity

The amount of processing required to create the inheritance tree may be large. At the expense of accuracy we would like the inheritance tree to be added to incrementally, and the time taken to add an element to the tree to be of order Log N.

[edit] The Complexity Tree

The complexity (information content) of a tree is the sum of the complexity of each branch of the tree. In calculating the effect on the complexity of the model of adding a tuple under a node, the complexity of the branch alone needs to be considered.

[edit] The Model

Let an InheritanceClassificationTree has a multiway tree made up of treenodes.

Each TreeNode has attributes

A ValueTuple is a structure of named attribute values. Each value may be missing or have a value of a particular type.

A ClassificationTuple has the samed named attributes. However each attribute is described by a ClassificationAttribute.

Both ValueTuple and ClassificationTuple inherit from Tuple.

[edit] Calculating Complexity

We wish to calculate the change in complexity when a new tuple is added to a tree node.

Adding a tuple adds the data for the tuple to this node, but encodes it using the tree node. The complexity depends on the complexity.

Adding a tuple also changes classification attributes.

[edit] Encoding Boolean Value

If the boolean value is not missing in the classification attribute then it must be missing or have the same value attribute.

[edit] Encoding String Value

If the string value is not missing in then it is encoded using log of the relative frequency of the string.

[edit] Encoding Number Value

If the string value is not missing in then its information content is approximated as the standard deviation * number of values.

[edit] The Algorithm

InheritanceClassificationTree.ClassifyTuple(Tuple newTuple) is

[edit] TreeNode Constructor

Set up the tree node with a classification tuple, and a value tuple.

TreeNode.TreeNode(Tuple newTuple)

[edit] TreeNode.AddTuple

Add a tuple to a tree node. Get back the changes to the classification tuple, for use in updating the classification of any parent node.

void TreeNode.AddTuple(Tuple newTuple, Tuple oldClassificationTuple, Tuple newClassificationTuple) is

[edit] TreeNode.FindBestFit

Find the child node whose classification node is most similar to the tuple. This is the node which encodes the tuple with the least information. Consider the effect on the encoding of the whole sub tree in determining how much extra information is required to encode this tuple.

The tuple should only be considered for adding to a child to the tree if the increase in information size is less than the raw un-encoded size of the tuple.

TreeNode TreeNode.FindBestFit(Tuple newTuple)

[edit] TreeNode.GetNewSize

Calculate the change in the information size that would result from adding the tuple.

double TreeNode.GetDeltaSize(Tuple newTuple)

[edit] ClassificationTuple.GetTotalInfoSize

Calculate the amount of information required to encode a tuple under a classification tuple.

double ClassificationTuple.GetTotalInfoSize()

[edit] ClassificationTuple.UpdateClassification

Update the tuple classification to include this tuple

double ClassificationTuple.Encode(Tuple newTuple)

Personal tools
Variants
Actions
Navigation
Community
Toolbox