jpt.learning.c45

Attributes

logger

DISCRIMINATIVE

GENERATIVE

_locals

Classes

JPTPartition

Represents a partition of the input data during JPT learning.

C45Algorithm

Functions

_initialize_worker_process()

learn_prior(variable, column)

c45split(partition, prune_or_split) → Tuple[Dict[str, Any], JPTPartition, ...]

Creates a node in the decision tree according to the C4.5 algorithm

Module Contents

jpt.learning.c45.logger
jpt.learning.c45.DISCRIMINATIVE = 'discriminative'
jpt.learning.c45.GENERATIVE = 'generative'
jpt.learning.c45._locals
jpt.learning.c45._initialize_worker_process()
class jpt.learning.c45.JPTPartition(data: numpy.ndarray | None, start: int, end: int, node_idx: int, parent_idx: int | None, child_idx: int | None, path: List[Set or Interval], min_samples_leaf: int, depth: int, min_eval_samples: int = 0)

Represents a partition of the input data during JPT learning.

Parameters:
  • data – the indices for the training samples used to calculate the gain.

  • start – the starting index in the data.

  • end – the stopping index in the data.

  • node_idx – the index of the node in the current iteration.

  • parent_idx – the index of the parent node in the current iteration; initially None.

  • child_idx – the index of the child in the current iteration.

  • depth – the depth of the tree in the current recursion level.

data
start
end
node_idx
parent_idx
child_idx
depth
min_samples_leaf
min_eval_samples = 0
path
property n_samples
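
As a rough illustration of the fields above, the following sketch models a partition as a window over a shared list of sample indices. `PartitionSketch` is a hypothetical stand-in, not the library's class; it uses a plain list instead of a numpy array, and computing `n_samples` as `end - start` over a half-open range is an assumption, since this page does not state how the property is derived.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PartitionSketch:
    """Hypothetical stand-in for JPTPartition (illustration only)."""
    data: Optional[List[int]]   # indices of the training samples
    start: int                  # first position covered in `data`
    end: int                    # stopping position in `data`
    node_idx: int
    parent_idx: Optional[int]
    child_idx: Optional[int]
    depth: int

    @property
    def n_samples(self) -> int:
        # Assumption: the partition covers the half-open range [start, end).
        return self.end - self.start


indices = list(range(10))
root = PartitionSketch(indices, 0, 10, node_idx=0, parent_idx=None, child_idx=None, depth=0)
left = PartitionSketch(indices, 0, 4, node_idx=1, parent_idx=0, child_idx=0, depth=1)
right = PartitionSketch(indices, 4, 10, node_idx=2, parent_idx=0, child_idx=1, depth=1)
```

Under these assumptions, the two children share the parent's index list and only differ in their `start`/`end` window, so splitting does not copy any data.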
jpt.learning.c45.learn_prior(variable: jpt.variables.Variable, column: int)
jpt.learning.c45.c45split(partition: JPTPartition, prune_or_split: Callable = None) → Tuple[Dict[str, Any], JPTPartition, JPTPartition | None, JPTPartition | None]

Creates a node in the decision tree according to the C4.5 algorithm.
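
For readers unfamiliar with C4.5, the sketch below shows the textbook gain-ratio criterion for a binary split on a single numeric feature. It is a generic illustration, not the body of `c45split`: the helper names `entropy` and `best_threshold` are hypothetical, and the impurity measure actually used by this module may differ.

```python
import math
from collections import Counter
from typing import Sequence, Tuple


def entropy(labels: Sequence[str]) -> float:
    """Shannon entropy (in bits) of a label sequence."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def best_threshold(values: Sequence[float], labels: Sequence[str]) -> Tuple[float, float]:
    """Return (threshold, gain_ratio) of the best split `value <= threshold`."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    base = entropy([label for _, label in pairs])
    best = (float('nan'), 0.0)
    for i in range(1, n):
        if pairs[i - 1][0] == pairs[i][0]:
            continue  # no threshold fits between identical values
        left = [label for _, label in pairs[:i]]
        right = [label for _, label in pairs[i:]]
        # Information gain: parent entropy minus weighted child entropies.
        gain = base - (i / n) * entropy(left) - ((n - i) / n) * entropy(right)
        # Split information penalizes very unbalanced splits.
        split_info = entropy(['L'] * i + ['R'] * (n - i))
        ratio = gain / split_info
        if ratio > best[1]:
            best = ((pairs[i - 1][0] + pairs[i][0]) / 2, ratio)
    return best


threshold, ratio = best_threshold([1.0, 2.0, 3.0, 4.0], ['a', 'a', 'b', 'b'])
```

In this toy input the midpoint between 2.0 and 3.0 separates the two labels perfectly, so it is the candidate with the highest gain ratio.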

class jpt.learning.c45.C45Algorithm(jpt: jpt.trees.JPT)
jpt
lock = None
c45queue = None
finish = None
_progressbar = None
_prune_or_split = None
queue_length = 0
indices = None
min_samples_leaf = None
min_eval_samples = 0
_node_counter = 0
_node_created(args: Tuple) → None
learn(data: pandas.DataFrame = None, keep_samples: bool = False, close_convex_gaps: bool = True, verbose: bool = False, prune_or_split: Callable | None = None, multicore: int | None = None, split_validation_mask: numpy.ndarray | None = None, split_validation_mode: str = 'both') → None

Fit the JPT to the given data.

Parameters:
  • data ([[str or float or bool]]; (according to self.variables)) – The training examples (assumed in row-shape)

  • rows ([[str or float or bool]]; (according to self.variables)) – The training examples (assumed in row-shape)

  • columns ([[str or float or bool]]; (according to self.variables)) – The training examples (assumed in column-shape)

  • keep_samples – If True, stores the indices of the original data samples in the leaf nodes. For debugging purposes only. Defaults to False.

  • close_convex_gaps

  • prune_or_split

  • multicore – The number of cores to use for learning. If None, all available cores are used.

  • verbose

Returns:

the fitted model
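
The `multicore` rule described above (None means "use every available core") can be sketched as follows. `resolve_worker_count` is a hypothetical helper, not part of this module, and clamping the value to the range from 1 to the number of available cores is an assumption made for the example.

```python
import os
from typing import Optional


def resolve_worker_count(multicore: Optional[int]) -> int:
    """Hypothetical helper: map `multicore` to a number of worker processes."""
    available = os.cpu_count() or 1  # os.cpu_count() may return None
    if multicore is None:
        # None means: use all available cores.
        return available
    # Assumption: never use fewer than 1 or more than the available cores.
    return max(1, min(multicore, available))


workers = resolve_worker_count(None)
```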

postprocess_leaves() → None

Postprocess the leaves so that every point inside the convex hull postulated by this tree has likelihood > 0.
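
The C45Algorithm attributes `c45queue`, `_node_counter`, and `min_samples_leaf` suggest a queue-driven expansion of partitions. The sketch below shows that general pattern only; it is not the class's implementation. The function `grow` is hypothetical, and splitting each range at its midpoint is a placeholder for a real split criterion.

```python
from collections import deque
from typing import List, Tuple


def grow(n_samples: int, min_samples_leaf: int) -> Tuple[int, List[Tuple[int, int]]]:
    """Expand index ranges breadth-first until children would get too small."""
    queue = deque([(0, n_samples)])      # pending partitions as (start, end)
    leaves: List[Tuple[int, int]] = []
    node_counter = 0
    while queue:
        start, end = queue.popleft()
        node_counter += 1
        mid = (start + end) // 2         # placeholder for a real split point
        if mid - start < min_samples_leaf or end - mid < min_samples_leaf:
            leaves.append((start, end))  # too small to split: becomes a leaf
            continue
        queue.append((start, mid))
        queue.append((mid, end))
    return node_counter, leaves


n_nodes, leaf_ranges = grow(8, 2)
```

With 8 samples and a minimum leaf size of 2, this placeholder rule creates 7 nodes, 4 of which are leaves covering 2 samples each.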