jpt.learning.c45

Attributes

`logger`
`DISCRIMINATIVE`
`GENERATIVE`
`_locals`

Classes

`JPTPartition`	Represents a partition of the input data during JPT learning.
`C45Algorithm`

Functions

`_initialize_worker_process`()
`learn_prior`(variable, column)
`c45split`(→ Tuple[Dict[str, Any], JPTPartition, ...)	Creates a node in the decision tree according to the C4.5 algorithm

Module Contents

jpt.learning.c45.logger

jpt.learning.c45.DISCRIMINATIVE = 'discriminative'

jpt.learning.c45.GENERATIVE = 'generative'

jpt.learning.c45._locals

jpt.learning.c45._initialize_worker_process()

class jpt.learning.c45.JPTPartition(data: numpy.ndarray | None, start: int, end: int, node_idx: int, parent_idx: int | None, child_idx: int | None, path: List[Set or Interval], min_samples_leaf: int, depth: int, min_eval_samples: int = 0)

Represents a partition of the input data during JPT learning.

Parameters:

data – the indices for the training samples used to calculate the gain.
start – the starting index in the data.
end – the stopping index in the data.
node_idx – the index of the node in the current iteration.
parent_idx – the parent node of the current iteration, initially None.
child_idx – the index of the child in the current iteration.
depth – the depth of the tree in the current recursion level.

data

start

end

node_idx

parent_idx

child_idx

depth

min_samples_leaf

min_eval_samples = 0

path

property n_samples
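The relationship between `data`, `start`, `end` and `n_samples` is not spelled out above. As a minimal sketch, assuming a partition is a half-open window `[start, end)` over a shared array of sample indices, the idea could look as follows. `PartitionSketch` and its `indices()` helper are hypothetical names for illustration and not part of jpt; a plain list stands in for the `numpy.ndarray`.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class PartitionSketch:
    """Hypothetical stand-in for JPTPartition; not the library class."""
    data: List[int]                 # shared list of training-sample indices
    start: int                      # first position of this partition in `data`
    end: int                        # stop position (assumed to be exclusive)
    node_idx: int                   # index of the node this partition belongs to
    parent_idx: Optional[int] = None
    depth: int = 0

    @property
    def n_samples(self) -> int:
        # Assumption: the partition covers data[start:end].
        return self.end - self.start

    def indices(self) -> List[int]:
        # The sample indices that fall into this partition.
        return self.data[self.start:self.end]


indices = list(range(10))
root = PartitionSketch(data=indices, start=0, end=10, node_idx=0)
left = PartitionSketch(data=indices, start=0, end=4, node_idx=1, parent_idx=0, depth=1)
right = PartitionSketch(data=indices, start=4, end=10, node_idx=2, parent_idx=0, depth=1)
```

Under this assumption, child partitions share the parent's index array and differ only in their window, so splitting does not copy any data.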

jpt.learning.c45.learn_prior(variable: jpt.variables.Variable, column: int)

jpt.learning.c45.c45split(partition: JPTPartition, prune_or_split: Callable = None) → Tuple[Dict[str, Any], JPTPartition, JPTPartition | None, JPTPartition | None]

Creates a node in the decision tree according to the C4.5 algorithm.
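For orientation, the textbook C4.5 algorithm ranks candidate splits by gain ratio: the information gain of a split divided by its split information. The sketch below shows that criterion for categorical labels only. Whether `c45split` uses exactly this measure is not stated here, and the function names `entropy` and `gain_ratio` are illustrative, not jpt APIs.

```python
import math
from collections import Counter
from typing import Hashable, List, Sequence


def entropy(labels: Sequence[Hashable]) -> float:
    """Shannon entropy (in bits) of a label sequence."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def gain_ratio(labels: Sequence[Hashable], groups: List[List[Hashable]]) -> float:
    """Textbook C4.5 gain ratio of splitting `labels` into `groups`."""
    n = len(labels)
    non_empty = [g for g in groups if g]
    weights = [len(g) / n for g in non_empty]
    # Expected entropy that remains after the split.
    remainder = sum(w * entropy(g) for w, g in zip(weights, non_empty))
    info_gain = entropy(labels) - remainder
    # Split information penalizes splits into many small groups.
    split_info = -sum(w * math.log2(w) for w in weights)
    return info_gain / split_info if split_info > 0 else 0.0


labels = ["a", "a", "b", "b"]
perfect = gain_ratio(labels, [["a", "a"], ["b", "b"]])   # separates the classes
useless = gain_ratio(labels, [["a", "b"], ["a", "b"]])   # leaves them mixed
```

A split that separates the classes cleanly scores higher than one that leaves each group as mixed as the parent.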

class jpt.learning.c45.C45Algorithm(jpt: jpt.trees.JPT)

jpt

lock = None

c45queue = None

finish = None

_progressbar = None

_prune_or_split = None

queue_length = 0

indices = None

min_samples_leaf = None

min_eval_samples = 0

_node_counter = 0

_node_created(args: Tuple) → None

learn(data: pandas.DataFrame = None, keep_samples: bool = False, close_convex_gaps: bool = True, verbose: bool = False, prune_or_split: Callable | None = None, multicore: int | None = None, split_validation_mask: numpy.ndarray | None = None, split_validation_mode: str = 'both') → None

Fit the JPT to data.

Parameters:

data (pandas.DataFrame; columns according to self.variables) – The training examples, one sample per row.
rows ([[str or float or bool]]; (according to self.variables)) – The training examples (assumed in row-shape)
columns ([[str or float or bool]]; (according to self.variables)) – The training examples (assumed in column-shape)
keep_samples – If true, stores the indices of the original data samples in the leaf nodes. For debugging purposes only. Default is false.
close_convex_gaps –
prune_or_split –
multicore – The number of cores to use for learning. If None, all cores available will be used.
verbose –

Returns:

the fitted model
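The attributes `c45queue`, `queue_length` and `min_samples_leaf` suggest that learning proceeds by working through a queue of partitions. That is an inference from the attribute names, not something this page confirms. As a rough sketch of that general pattern, the example below grows a tree over index ranges with a work queue. The midpoint rule in `split_in_half` is a hypothetical placeholder for a real split criterion, and neither function is part of jpt.

```python
from collections import deque
from typing import Deque, List, Optional, Tuple

Range = Tuple[int, int, int]  # (start, end, depth), with end exclusive


def split_in_half(start: int, end: int, min_samples_leaf: int) -> Optional[int]:
    """Hypothetical split rule: cut at the midpoint if both halves stay large enough."""
    mid = (start + end) // 2
    if mid - start >= min_samples_leaf and end - mid >= min_samples_leaf:
        return mid
    return None


def grow(n_samples: int, min_samples_leaf: int) -> List[Range]:
    """Process partitions from a queue until no partition can be split further."""
    queue: Deque[Range] = deque([(0, n_samples, 0)])  # start with the root partition
    leaves: List[Range] = []
    while queue:
        start, end, depth = queue.popleft()
        cut = split_in_half(start, end, min_samples_leaf)
        if cut is None:
            # Too small to split: this partition becomes a leaf.
            leaves.append((start, end, depth))
        else:
            # Enqueue both children one level deeper.
            queue.append((start, cut, depth + 1))
            queue.append((cut, end, depth + 1))
    return leaves


leaves = grow(n_samples=8, min_samples_leaf=2)
```

Because every queue entry is independent of the others, a loop of this shape is also a natural fit for handing partitions to worker processes, which may be what the `multicore` parameter controls.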

postprocess_leaves() → None

Postprocess leaves such that the convex hull that is postulated from this tree has likelihood > 0 for every point inside the hull.