jpt.learning.c45
================

.. py:module:: jpt.learning.c45


Attributes
----------

.. autoapisummary::

   jpt.learning.c45.logger
   jpt.learning.c45.DISCRIMINATIVE
   jpt.learning.c45.GENERATIVE
   jpt.learning.c45._locals


Classes
-------

.. autoapisummary::

   jpt.learning.c45.JPTPartition
   jpt.learning.c45.C45Algorithm


Functions
---------

.. autoapisummary::

   jpt.learning.c45._initialize_worker_process
   jpt.learning.c45.learn_prior
   jpt.learning.c45.c45split


Module Contents
---------------

.. py:data:: logger

.. py:data:: DISCRIMINATIVE
   :value: 'discriminative'


.. py:data:: GENERATIVE
   :value: 'generative'


.. py:data:: _locals

.. py:function:: _initialize_worker_process()

.. py:class:: JPTPartition(data: Optional[numpy.ndarray], start: int, end: int, node_idx: int, parent_idx: Optional[int], child_idx: Optional[int], path: List[Set or Interval], min_samples_leaf: int, depth: int, min_eval_samples: int = 0)

   Represents a partition of the input data during JPT learning.

   :param data: the indices for the training samples used to calculate the gain.
   :param start: the starting index in the data.
   :param end: the stopping index in the data.
   :param node_idx: the node of the current iteration
   :param parent_idx: the parent node of the current iteration, initially ``None``.
   :param child_idx: the index of the child in the current iteration.
   :param depth: the depth of the tree in the current recursion level.


   .. py:attribute:: data


   .. py:attribute:: start


   .. py:attribute:: end


   .. py:attribute:: node_idx


   .. py:attribute:: parent_idx


   .. py:attribute:: child_idx


   .. py:attribute:: depth


   .. py:attribute:: min_samples_leaf


   .. py:attribute:: min_eval_samples
      :value: 0


   .. py:attribute:: path


   .. py:property:: n_samples


.. py:function:: learn_prior(variable: jpt.variables.Variable, column: int)

.. py:function:: c45split(partition: JPTPartition, prune_or_split: Callable = None) -> Tuple[Dict[str, Any], JPTPartition, Optional[JPTPartition], Optional[JPTPartition]]

   Creates a node in the decision tree according to the C4.5 algorithm


.. py:class:: C45Algorithm(jpt: jpt.trees.JPT)

   .. py:attribute:: jpt


   .. py:attribute:: lock
      :value: None


   .. py:attribute:: c45queue
      :value: None


   .. py:attribute:: finish
      :value: None


   .. py:attribute:: _progressbar
      :value: None


   .. py:attribute:: _prune_or_split
      :value: None


   .. py:attribute:: queue_length
      :value: 0


   .. py:attribute:: indices
      :value: None


   .. py:attribute:: min_samples_leaf
      :value: None


   .. py:attribute:: min_eval_samples
      :value: 0


   .. py:attribute:: _node_counter
      :value: 0


   .. py:method:: _node_created(args: Tuple) -> None


   .. py:method:: learn(data: pandas.DataFrame = None, keep_samples: bool = False, close_convex_gaps: bool = True, verbose: bool = False, prune_or_split: Optional[Callable] = None, multicore: Optional[int] = None, split_validation_mask: Optional[numpy.ndarray] = None, split_validation_mode: str = 'both') -> None

      Fit the jpt to ``data``.

      :param data: The training examples (assumed in row-shape)
      :type data: [[str or float or bool]]; (according to `self.variables`)
      :param rows: The training examples (assumed in row-shape)
      :type rows: [[str or float or bool]]; (according to `self.variables`)
      :param columns: The training examples (assumed in column-shape)
      :type columns: [[str or float or bool]]; (according to `self.variables`)
      :param keep_samples: If true, stores the indices of the original data samples in the leaf nodes. For debugging purposes only.
                           Default is false.
      :param close_convex_gaps:
      :param prune_or_split:
      :param multicore: The number of cores to use for learning. If ``None``, all cores available will be used.
      :param verbose:
      :return: the fitted model


   .. py:method:: postprocess_leaves() -> None

      Postprocess leaves such that the convex hull that is postulated from this tree
      has likelihood > 0 for every point inside the hull.
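
The ``JPTPartition`` parameters documented above (an index array plus ``start`` and ``end`` bounds, together with ``min_samples_leaf``) suggest a scheme in which child partitions share one index array and differ only in their bounds. The following is a minimal sketch of that idea and not the library's implementation: ``IndexPartition`` and ``split_at`` are hypothetical names, and both the exclusive ``end`` bound and the relation ``n_samples == end - start`` are assumptions that this page does not confirm.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class IndexPartition:
    """Hypothetical stand-in for a partition over a shared index list."""

    indices: List[int]         # sample indices, shared by all partitions
    start: int                 # first position covered (inclusive)
    end: int                   # last position covered (exclusive, by assumption)
    min_samples_leaf: int = 1  # smallest admissible child size

    @property
    def n_samples(self) -> int:
        # Assumption: the partition covers indices[start:end].
        return self.end - self.start

    def split_at(
        self, pos: int
    ) -> Optional[Tuple["IndexPartition", "IndexPartition"]]:
        """Split into [start, pos) and [pos, end); None if a child is too small."""
        left = IndexPartition(self.indices, self.start, pos, self.min_samples_leaf)
        right = IndexPartition(self.indices, pos, self.end, self.min_samples_leaf)
        if min(left.n_samples, right.n_samples) < self.min_samples_leaf:
            return None
        return left, right


root = IndexPartition(indices=list(range(10)), start=0, end=10, min_samples_leaf=3)
children = root.split_at(4)   # children of size 4 and 6: accepted
rejected = root.split_at(2)   # left child would hold only 2 samples: rejected
```

Because the children reference the same ``indices`` object as their parent, splitting in this sketch copies no sample data; only the bounds change.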