jpt.trees ========= .. py:module:: jpt.trees .. autoapi-nested-parse:: © Copyright 2021-23, Mareike Picklum, Daniel Nyga. Classes ------- .. autoapisummary:: jpt.trees.Node jpt.trees.DecisionNode jpt.trees.Leaf jpt.trees.JPT Module Contents --------------- .. py:class:: Node(idx: int, parent: Optional[DecisionNode] = None) Wrapper for the nodes of the :class:`jpt.learning.trees.Tree`. Create a Node :param idx: the identifier of a node :param parent: the parent of this node .. py:attribute:: idx .. py:attribute:: parent :type: DecisionNode :value: None .. py:attribute:: samples :value: 0 .. py:attribute:: _path :value: [] .. py:property:: path :type: jpt.variables.VariableMap :return: the path of this Node as VariableMap .. py:method:: consistent_with(evidence: jpt.variables.VariableMap) -> bool Check if the node is consistent with the variable assignments in evidence. :param evidence: A VariableMap that maps to singular values (numeric or symbolic) or ranges (continuous set, set) :return: bool .. py:method:: format_path(fmt: str = None, precision: int = None) -> str .. py:method:: number_of_parameters() -> int :abstractmethod: .. py:method:: __str__() -> str .. py:method:: __repr__() -> str .. py:method:: depth() -> int :return: the depth of this node .. py:method:: contains(samples: numpy.ndarray, variable_index_map: jpt.variables.VariableMap) -> numpy.array Check if this node contains the given samples in parallel. :param samples: The samples to check :param variable_index_map: A VariableMap mapping to the indices in 'samples' :return: numpy array with 0s and 1s .. py:class:: DecisionNode(idx: Optional[int], variable: jpt.variables.Variable, parent: 'DecisionNode' or None = None) Bases: :py:obj:`Node` Represents an inner (decision) node of the :class:`jpt.learning.trees.Tree`. Create a DecisionNode :param idx: The identifier of a node :param variable: The split variable :param parent: The parent of this node .. py:attribute:: _splits :value: None .. 
py:attribute:: variable .. py:attribute:: children :type: None or List[Node] :value: None .. py:method:: __hash__() .. py:method:: __eq__(o) -> bool .. py:method:: to_json() -> Dict[str, Any] :return: The DecisionNode as a JSON-serializable dict. .. py:method:: from_json(tree: JPT, data: Dict[str, Any]) -> DecisionNode :staticmethod: Construct a DecisionNode from a JSON dict. :param tree: The tree to mount the node in :param data: The data describing the members of the node :return: the constructed and mounted DecisionNode .. py:property:: splits :type: List .. py:method:: set_child(idx: int, node: Node) -> None Set the child at index ``idx`` of this node. Also extend the path of the child node with this node's path. :param idx: the index of the child (0 for left, 1 for right) :param node: The child .. py:method:: str_edge(idx_split: int) -> str Convert the edge to the child at ``idx_split`` to a string. :param idx_split: The index of the child :return: str .. py:property:: str_node :type: str .. py:method:: recursive_children() :return: All children of this node .. py:method:: __str__() -> str .. py:method:: __repr__() -> str .. py:method:: number_of_parameters() -> int :return: The number of relevant parameters in this decision node. 2 parameters are necessary, since the variable and its splitting value are sufficient to describe this computation unit. .. py:class:: Leaf(idx: int, parent: DecisionNode or None = None, prior: float or None = None) Bases: :py:obj:`Node` Represents a leaf node of the :class:`jpt.trees.Tree`. Construct a Leaf :param idx: the index of this leaf :param parent: the parent of this leaf :param prior: the prior of this leaf (relative number of samples in this leaf) .. py:attribute:: distributions .. py:attribute:: prior :value: None .. py:attribute:: s_indices :value: [] .. py:property:: str_node :type: str .. py:method:: applies(query: jpt.variables.VariableAssignment) -> bool Checks whether this leaf is consistent with the given ``query``. 
:param query: the query to check :return: bool .. py:property:: value .. py:method:: recursive_children() :return: All children of this node .. py:method:: __str__() -> str .. py:method:: __repr__() -> str .. py:method:: __hash__() .. py:method:: to_json() -> Dict[str, Any] :return: The Leaf as a JSON-serializable dict. .. py:method:: from_json(tree: JPT, data: Dict[str, Any]) -> Leaf :staticmethod: Construct a Leaf from a JSON dict. :param tree: The tree to mount the node in :param data: The data describing the members of the node :return: the constructed and mounted Leaf .. py:method:: __eq__(o) -> bool .. py:method:: consistent_with(evidence: jpt.variables.VariableMap) -> bool Check if the node is consistent with the variable assignments in evidence. :param evidence: A preprocessed VariableMap that maps to singular values (numeric or symbolic) or ranges (continuous set, set) .. py:method:: path_consistent_with(evidence: jpt.variables.VariableMap) -> bool Check if the path of this node is consistent with the variable assignments in evidence. :param evidence: A preprocessed VariableMap that maps to singular values (numeric or symbolic) or ranges (continuous set, set) .. py:method:: probability(query: jpt.variables.VariableAssignment, dirac_scaling: float = 2.0, min_distances: jpt.variables.VariableMap = None) -> float Calculate the probability of a (partial) query. Exploits the independence assumption. :param query: A preprocessed VariableMap that maps to singular values (numeric or symbolic) or ranges (continuous set, set) :type query: VariableMap :param dirac_scaling: the minimal distance between the samples within a dimension is multiplied by this factor if a Dirac impulse is used to model the variable. :type dirac_scaling: float :param min_distances: A dict mapping the variables to the minimal distances between the observations. 
This makes it possible to use the same likelihood parameters for different test sets, for example in cross-validation. :type min_distances: A VariableMap from numeric variables to floats or None .. py:method:: _numeric_probability(variable: jpt.variables.NumericVariable, value, dirac_scaling: float = 2.0, min_distances: jpt.variables.VariableMap = None) Calculate the probability of an arbitrary value for a numeric variable. :param variable: A numeric variable :param dirac_scaling: the minimal distance between the samples within a dimension is multiplied by this factor if a Dirac impulse is used to model the variable. :param min_distances: A dict mapping the variables to the minimal distances between the observations. This makes it possible to use the same likelihood parameters for different test sets, for example in cross-validation. .. py:method:: likelihood(queries: pandas.DataFrame, dirac_scaling: float = 2.0, min_distances: jpt.variables.VariableMap = None, single_likelihoods: bool = False, variables: Iterable[Union[jpt.variables.Variable, str]] = None) -> numpy.ndarray Calculate the probability of a (partial) query. Exploits the independence assumption. :param queries: An array-like object that represents variable assignments in value space. :param dirac_scaling: the minimal distance between the samples within a dimension is multiplied by this factor if a Dirac impulse is used to model the variable. :type dirac_scaling: float :param min_distances: A dict mapping the variables to the minimal distances between the observations. This makes it possible to use the same likelihood parameters for different test sets, for example in cross-validation. :type min_distances: A VariableMap from numeric variables to floats or None :param single_likelihoods: whether likelihoods of each variable shall be reported :param variables: the variables (or their names) to consider in the likelihood calculation .. 
py:method:: copy() -> Leaf Create a copy of this leaf. The copy is unaware of the tree and vice versa. Hence, no path, parent, etc. is set. The copy only provides querying functionality. .. py:method:: conditional_leaf(evidence: jpt.variables.VariableAssignment) -> Leaf Create a leaf that is cropped to the values described in evidence. :param evidence: A VariableAssignment describing evidence. :return: The cropped leaf, which has no parent, path, etc. set. .. py:method:: mpe(minimal_distances: jpt.variables.VariableMap) -> tuple[jpt.variables.VariableMap, float] Calculate the most probable explanation of this leaf as a fully factorized distribution. :return: the configuration as a VariableMap and the likelihood of the maximum as a float .. py:method:: k_mpe() -> Iterator[jpt.variables.LabelAssignment] Compute the ``k`` most probable explanations of this leaf. :return: .. py:method:: number_of_parameters() -> int :return: The number of relevant parameters in this leaf. Leaves require 1 + the sum of all distributions' parameters. The 1 extra parameter represents the prior. .. py:method:: sample(amount) -> numpy.ndarray Draw ``amount`` samples from the leaf. :return: A numpy array of size (amount, self.variables) containing the samples. .. py:class:: JPT(variables: list[jpt.variables.Variable], targets: list[str | jpt.variables.Variable] = None, features: list[str | jpt.variables.Variable] = None, min_samples_leaf: float | int = 1, min_impurity_improvement: float | None = None, max_leaves: int | None = None, max_depth: int | None = None, dependencies=None, min_eval_samples: float | int = 0) Implementation of Joint Probability Trees (JPTs). Create a JPT. :param variables: The variables represented by this model. :param targets: The variables on which the information gain is computed. :param features: The variables from which splits are chosen. :param min_samples_leaf: If int, the minimum number of samples required to form a leaf. 
If float, the minimum fraction of samples. :param min_eval_samples: Minimum number of EVALUATION samples required in each child partition when split validation is active in ``'evaluation'`` mode. Only enforced when a ``split_validation_mask`` is passed to ``learn()`` and ``split_validation_mode='evaluation'``. If int, the absolute minimum. If a float in ``(0, 1)``, the minimum fraction of the *total* training rows (same convention as ``min_samples_leaf``). ``0`` disables the check (default). :param min_impurity_improvement: The minimal information gain to justify a split. :param max_leaves: The maximum number of leaves (deprecated). :param max_depth: The maximum depth the tree may have. :param dependencies: Specifies which targets depend on which features. Accepts three forms: - ``None``: every target depends on every feature (default, fully connected). - ``dict[Variable, list[Variable]]``: explicit mapping from features to their dependent targets. - A ``DependencyDiscovery`` instance: a callable strategy that discovers dependencies from training data during ``learn()``. The strategy is re-invoked on each call to ``learn()`` and its configuration is preserved during serialization. .. py:attribute:: logger .. py:attribute:: _variables .. py:attribute:: varnames :type: collections.OrderedDict[str, jpt.variables.Variable] .. py:attribute:: _targets .. py:attribute:: leaves :type: dict[int, Leaf] .. py:attribute:: innernodes :type: dict[int, DecisionNode] .. py:attribute:: priors :type: jpt.variables.VariableMap .. py:attribute:: min_samples_leaf :value: 1 .. py:attribute:: min_eval_samples :value: 0 .. py:attribute:: _keep_samples :value: False .. py:attribute:: min_impurity_improvement :value: 0 .. py:attribute:: minimal_distances :type: jpt.variables.VariableMap .. py:attribute:: _numsamples :value: 0 .. py:attribute:: root :value: None .. py:attribute:: max_leaves :value: None .. py:attribute:: max_depth .. 
py:method:: _reset() -> None Delete all parameters of this model (not the hyperparameters). .. py:property:: allnodes :type: MutableMapping[int, Node] .. py:property:: variables :type: tuple[jpt.variables.Variable, Ellipsis] .. py:property:: targets :type: tuple[jpt.variables.Variable, Ellipsis] .. py:property:: features :type: tuple[jpt.variables.Variable, Ellipsis] .. py:property:: numeric_variables :type: tuple[jpt.variables.Variable, Ellipsis] .. py:property:: symbolic_variables :type: tuple[jpt.variables.Variable, Ellipsis] .. py:property:: integer_variables :type: tuple[jpt.variables.Variable, Ellipsis] .. py:property:: numeric_targets :type: tuple[jpt.variables.Variable, Ellipsis] .. py:property:: symbolic_targets :type: tuple[jpt.variables.Variable, Ellipsis] .. py:property:: integer_targets :type: tuple[jpt.variables.Variable, Ellipsis] .. py:property:: numeric_features :type: tuple[jpt.variables.Variable, Ellipsis] .. py:property:: symbolic_features :type: tuple[jpt.variables.Variable, Ellipsis] .. py:property:: integer_features :type: tuple[jpt.variables.Variable, Ellipsis] .. py:method:: to_json() -> dict[str, Any] Convert the tree to a JSON-serializable dictionary. .. py:method:: from_json(data: dict[str, Any], variables: Iterable[jpt.variables.Variable] | None = None) -> JPT :staticmethod: Construct a tree from a JSON dict. :param data: The JSON dictionary holding the serialized JPT data. :param variables: (optional) An iterable holding the already de-serialized variables the JPT shall be constructed with. .. py:method:: __getstate__() .. py:method:: __setstate__(state) .. py:method:: __eq__(o) -> bool .. py:method:: encode(samples: numpy.ndarray) -> numpy.ndarray Get the leaf index that describes the partition of each sample. Only works for fully initialized samples, i.e. a matrix with arbitrarily many rows and as many columns as there are variables. :param samples: the samples to evaluate :return: A 1D numpy array of integers containing the leaf index of every sample. .. 
py:method:: pdf(values: jpt.variables.VariableAssignment) -> float Get the likelihood of one world :param values: A VariableMap mapping some variables to one value. :return: The likelihood as float .. py:method:: infer(query: dict[jpt.variables.Variable | str, Any] | jpt.variables.VariableAssignment, evidence: dict[jpt.variables.Variable | str, Any] | jpt.variables.VariableAssignment = None, fail_on_unsatisfiability: bool = True) -> float | None For each candidate leaf ``l`` calculate the number of samples in which `query` is true: .. math:: P(query|evidence) = \frac{p_q}{p_e} :label: query .. math:: p_q = \frac{c}{N} :label: pq .. math:: c = \frac{\prod{F}}{x^{n-1}} :label: c where ``Q`` is the set of variables in `query`, :math:`P_{l}` is the set of variables that occur in ``l``, :math:`F = \{v | v \in Q \wedge~v \notin P_{l}\}` is the set of variables in the `query` that do not occur in ``l``'s path, :math:`x = |S_{l}|` is the number of samples in ``l``, :math:`n = |F|` is the number of free variables and ``N`` is the number of samples represented by the entire tree. reference to :eq:`query` :param query: the event to query for, i.e. the query part of the conditional P(query|evidence) or the prior P(query) :type query: dict of {jpt.variables.Variable : jpt.learning.distributions.Distribution.value} :param evidence: the event conditioned on, i.e. the evidence part of the conditional P(query|evidence) :type evidence: dict of {jpt.variables.Variable : jpt.learning.distributions.Distribution.value} :param fail_on_unsatisfiability: whether an error is raised in case of unsatisfiable evidence or not. .. py:method:: posterior(variables: list[jpt.variables.Variable | str] = None, evidence: dict[jpt.variables.Variable | str, Any] | jpt.variables.VariableAssignment = None, fail_on_unsatisfiability: bool = True, report_inconsistencies: bool = False) -> jpt.variables.VariableMap | None Compute the posterior distribution of every variable in ``variables``. 
The result contains independent distributions. Be aware that they might not actually be independent. :param variables: The query variables of the posterior to be computed :param evidence: The evidence given for the posterior to be computed :param fail_on_unsatisfiability: Whether or not an ``Unsatisfiability`` error is raised if the likelihood of the evidence is 0. :param report_inconsistencies: In case of an ``Unsatisfiability`` error, the exception raised will contain information about the variable assignments that caused the inconsistency. :return: jpt.trees.PosteriorResult containing distributions, candidates and weights .. py:method:: expectation(variables: Iterable[jpt.variables.Variable] | None = None, evidence: jpt.variables.VariableAssignment | dict[str, numbers.Number | jpt.base.intervals.Interval | str] | None = None, fail_on_unsatisfiability: bool = True) -> Optional[jpt.variables.VariableMap] Compute the expected value of all ``variables``. If no ``variables`` are passed, it defaults to all variables not passed as ``evidence``. :param variables: The variables to compute the expectation distributions on :param evidence: The raw evidence applied to the tree :param fail_on_unsatisfiability: Whether or not an ``Unsatisfiability`` error is raised if the likelihood of the evidence is 0. :return: VariableMap .. py:method:: mpe(evidence: Dict[jpt.variables.Variable | str, Any] | jpt.variables.VariableAssignment = None, fail_on_unsatisfiability: bool = True) -> Tuple[list[jpt.variables.LabelAssignment], float] | None Calculate the most probable explanation of all variables of the tree given the evidence. :param evidence: The evidence that is applied to the tree :param fail_on_unsatisfiability: Whether or not an ``Unsatisfiability`` error is raised if the likelihood of the evidence is 0. :return: List of LabelAssignments that describes all maxima of the tree given the evidence. Additionally, a float describing the likelihood of all solutions is returned. .. 
py:method:: kmpe(evidence: dict[jpt.variables.Variable | str, Any] | jpt.variables.VariableAssignment = None, fail_on_unsatisfiability: bool = True, k: int = 0) -> Iterator[Tuple[jpt.variables.LabelAssignment, float]] | None Perform a k-MPE inference on this JPT under the given evidence. k-MPE yields the ``k`` most probable explanation states in decreasing order. :param evidence: The evidence to apply :param fail_on_unsatisfiability: Whether or not to raise an ``Unsatisfiability`` error on impossible evidence. :param k: the number of solutions to return :return: An iterator with states ordered by likelihood. .. py:method:: _preprocess_query(query: dict | jpt.variables.VariableMap, remove_none: bool = True, skip_unknown_variables: bool = False, allow_singular_values: bool = False, space: Literal['labels', 'values'] = 'labels') -> jpt.variables.LabelAssignment Transform a query entered by a user into an internal representation that can be further processed. :param query: the raw query :param remove_none: Whether or not to remove None entries :param skip_unknown_variables: skip preprocessing for variables that do not exist in the tree (may happen in multiple reverse tree inference). If False, an exception is raised; default: False :param allow_singular_values: Allow singular values, such that they are transformed to the domain specification of numeric variables but not transformed to intervals via the PPF. :return: the preprocessed VariableMap .. py:method:: _check_variable_assignment(assignment: jpt.variables.VariableAssignment | None) Check the variable assignment for compatibility with the variables of this JPT. .. py:method:: apply(query: jpt.variables.VariableAssignment | dict[str, int | jpt.base.intervals.Interval | float | str]) -> Iterator[Leaf] Iterator that yields leaves that are consistent with a ``query``. A leaf is consistent with a query if either of the following propositions holds for all constraints expressed by its path to the root node: 1. 
the variable is not constrained by the query 2. the variable is constrained by the query and the query is consistent with the path :param query: the preprocessed query, either an instance of a subclass of ``VariableAssignment`` or a dict mapping variables to their respective labels. :return: .. py:method:: __str__() -> str .. py:method:: __repr__() -> str .. py:method:: to_string() -> str .. py:method:: fancy_tree() -> str .. py:method:: pfmt() -> str :return: a pretty-format string representation of this JPT. .. py:method:: _pfmt(node: Node, indent: int) -> str :param node: The starting node :param indent: the indentation of each new level :return: a pretty-format string representation of this JPT from node downward. .. py:method:: learn(data: pandas.DataFrame | numpy.ndarray, keep_samples: bool = False, close_convex_gaps: bool = False, verbose: bool = False, prune_or_split: Callable[[JPT, Any, numpy.ndarray, numpy.ndarray], bool] | None = None, multicore: int | None = None, split_validation_mask: numpy.ndarray | None = None, split_validation_mode: str = 'both') -> JPT Fit the JPT to ``data``. :param data: The training examples (assumed in row-shape) :type data: [[str or float or bool]]; (according to `self.variables`) :param keep_samples: If true, stores the indices of the original data samples in the leaf nodes. For debugging purposes only. Default is false. :param close_convex_gaps: :param prune_or_split: A callable ``(jpt, partition, indices, data) -> bool`` that is invoked before each split. Returns ``True`` to prune (make the node a leaf) or ``False`` to allow splitting. ``indices`` and ``data`` are numpy arrays. :param multicore: The number of cores to use for learning. If ``None``, all available cores are used. :param verbose: :param split_validation_mask: A boolean or uint8 array of length ``len(data)``. 
``True``/``1`` marks training samples whose feature values serve as candidate split points; ``False``/``0`` marks evaluation samples whose feature values are excluded from candidates. Target values of *all* samples always contribute to the impurity score (unless ``split_validation_mode`` restricts this). ``None`` disables split validation (default). :param split_validation_mode: Controls which targets contribute to the impurity score: ``'both'`` (default) uses all targets, ``'training'`` uses only training targets, ``'evaluation'`` uses only evaluation targets. :return: the fitted model .. py:attribute:: fit .. py:method:: sample(sample, ft) :staticmethod: .. py:method:: likelihood(data: pandas.DataFrame | numpy.ndarray, dirac_scaling: float = 2.0, min_distances: Dict = None, preprocess: bool = True, multicore: int | None = None, verbose: bool = False, single_likelihoods: bool = False, variables: Iterable[jpt.variables.Variable] = None) -> numpy.ndarray Get the probabilities of a list of worlds. The worlds must be fully assigned with scalar values (no intervals or sets). :param variables: The variables to consider in the likelihood computation :param data: An array containing the worlds. The shape is (x, len(variables)). :param dirac_scaling: the minimal distance between the samples within a dimension is multiplied by this factor if a Dirac impulse is used to model the variable. :param min_distances: A dict mapping the variables to the minimal distances between the observations. This makes it possible to use the same likelihood parameters for different test sets, for example in cross-validation. :param verbose: print status information to the console :param multicore: how many cores should be used (defaults to all) :param preprocess: whether to apply the preprocessing to the data passed. 
:param single_likelihoods: will not only return the overall likelihoods but also the likelihoods per variable :return: A np.ndarray with shape (x, ) containing the probabilities. .. py:method:: parallel_likelihood(data: numpy.ndarray | pandas.DataFrame, dirac_scaling: float = 2.0, min_distances: Dict = None, single_likelihoods: bool = False) -> numpy.ndarray Get the probabilities of a list of worlds. The worlds must be fully assigned with scalar values (no intervals or sets). :param data: An array containing the worlds. The shape is (x, len(variables)). :param dirac_scaling: the minimal distance between the samples within a dimension is multiplied by this factor if a Dirac impulse is used to model the variable. :param min_distances: A dict mapping the variables to the minimal distances between the observations. This makes it possible to use the same likelihood parameters for different test sets, for example in cross-validation. :param single_likelihoods: will not only return the overall likelihoods but also the likelihoods per variable :returns: An np.array with shape (x, ) containing the probabilities. .. py:method:: reverse(query: Dict, confidence: float = 0.05) -> List[tuple] Determines the leaf nodes that match the query best and returns them along with their respective confidence. :param query: a mapping from feature names to either numeric value intervals or an iterable of categorical values :param confidence: the confidence level for this MPE inference :returns: a tuple of probabilities and jpt.trees.Leaf objects that match the requirement (representing path to root) .. py:method:: plot(title: str = 'unnamed', filename: str | None = None, directory: str = None, plotvars: Iterable[jpt.variables.Variable] = None, view: bool = True, max_symb_values: int = 10, nodefill: str = None, leaffill: str = None, alphabet: bool = False, verbose: bool = False, engine=None) -> str Generates an SVG representation of the generated regression tree. 
:param title: title of the plot :param filename: the name of the JPT (will also be used as filename; extension will be added automatically) :param directory: the location to save the SVG file to :param plotvars: the variables to be plotted in the graph :param view: whether the generated SVG file will be opened automatically :param max_symb_values: limit the maximum number of symbolic values that are plotted to this number :param nodefill: the color of the inner nodes in the plot; accepted formats: RGB, RGBA, HSV, HSVA or color name :param leaffill: the color of the leaf nodes in the plot; accepted formats: RGB, RGBA, HSV, HSVA or color name :param alphabet: whether to plot symbolic variables in alphabetic order; if False, they are sorted by probability (descending); default is False :param verbose: :param engine: the rendering engine for the distribution plots in the leaves; either 'matplotlib' or 'plotly' :return: (str) the path under which the rendered image has been saved. .. py:method:: pickle(fpath: str) -> None Pickles the fitted regression tree to a file at the given location ``fpath``. :param fpath: the location for the pickled file .. py:method:: calcnorm(sigma: float, mu: float, intervals) :staticmethod: Computes the CDF for a multivariate normal distribution. :param sigma: the standard deviation :param mu: the expected value :param intervals: the boundaries of the integral :type intervals: list of matcalo.utils.utils.Interval :return: .. py:method:: copy() -> JPT :return: a new copy of this JPT where all references to the original tree are cut. .. py:method:: conditional_jpt(evidence: jpt.variables.VariableAssignment | None = None, fail_on_unsatisfiability: bool = True) -> Optional[JPT] Apply evidence on a JPT and get a new JPT that represents P(x|evidence). :param evidence: A VariableAssignment mapping the observed variables to their observed values :param fail_on_unsatisfiability: whether an error is raised in case of unsatisfiable evidence or not .. 
py:method:: multiply_by_leaf_prior(prior: dict[int, float]) -> JPT Multiply every leaf's prior by the given priors. This serves to handle the factor messages from factor nodes. Be wary, since this method overwrites the JPT in-place. :param prior: The priors, a Dict mapping from leaf indices to float :return: self .. py:method:: normalize() -> JPT Normalize the tree such that the sum of all leaf priors is 1. :return: self .. py:method:: save(file: str | IO, protocol: Literal['pickle', 'json'] = 'pickle') -> None Write this JPT persistently to disk. :param file: either a string or file-like object. :param protocol: .. py:attribute:: dump .. py:method:: dumps(protocol: Literal['pickle', 'json'] = 'pickle') -> bytes .. py:method:: load(file: str | IO, protocol: Literal['pickle', 'json'] = 'pickle') -> JPT :staticmethod: Load a JPT from disk. :param file: either a string or file-like object. :param protocol: :return: the JPT described in ``file`` .. py:method:: loads(data: typing_extensions.Buffer, protocol: Literal['pickle', 'json'] = 'pickle') -> JPT :staticmethod: .. py:method:: depth() -> int :return: the maximal depth of a leaf in the tree. .. py:method:: total_samples() -> int :return: the total number of samples represented by this tree. .. py:method:: number_of_parameters() -> int :return: The number of relevant parameters in the entire tree .. py:method:: bind(*arg, **kwargs) -> jpt.variables.LabelAssignment Returns a ``LabelAssignment`` object with the assignments passed. This method accepts one optional positional argument, which -- if passed -- must be a dictionary of the desired variable assignments. Keyword arguments may specify additional variable, value pairs. If a positional argument is passed, the following options may be passed in addition as keyword arguments: :param allow_singular_values: Allow singular values, such that they are transformed to the domain specification of numeric variables but not transformed to intervals via the PPF. 
:param space: Literal['values', 'labels'] Whether the variables shall be assigned to terms in value or label space of the JPT. .. py:method:: moment(order: int = 1, center: jpt.variables.VariableAssignment | None = None, evidence: jpt.variables.VariableAssignment | None = None, fail_on_unsatisfiability: bool = True) -> jpt.variables.VariableMap | None Calculate the moment of the given order for each numeric/integer random variable given the evidence. :param order: The order of the moment :param center: A VariableAssignment mapping each numeric/integer variable to some constant. If a variable has a constant, it will be interpreted as 'c' for the central moment. If it is not set, 0 will be used by default. :param evidence: The evidence given for the posterior to be computed :param fail_on_unsatisfiability: Whether or not an ``Unsatisfiability`` error is raised if the likelihood of the evidence is 0. .. py:method:: get_hyperparameters_dict() -> dict[str, Any] Get all hyperparameters as dict that can be used for MLFlow model tracking. .. py:method:: prune(similarity_threshold: float, approximate: float | dict[jpt.variables.Variable | str, float] | jpt.variables.VariableMap | None = None) -> JPT Prune this tree by repeatedly merging leaves with very similar distributions. :param similarity_threshold: the average similarity of distributions in [0, 1] that two leaves must exhibit in order to be considered for a merge. :param approximate: :return:
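The roles of ``order`` and ``center`` in ``moment`` can be illustrated with a small, library-independent sketch. It does not use or reproduce the ``jpt`` implementation: the ``LeafSketch`` type and the functions ``leaf_moment`` and ``mixture_moment`` are hypothetical names introduced only for this example, and each leaf is assumed to hold a single discrete distribution over integer values. Under these assumptions, the moment of the tree is the prior-weighted sum of the per-leaf moments :math:`E[(X - c)^{order}]`, with :math:`c = 0` when no center is given.

```python
from typing import Dict, List, Tuple

# Hypothetical stand-in for a fitted tree (NOT the jpt data model):
# each leaf is a pair (prior, distribution), where the distribution
# maps integer values to their probabilities.
LeafSketch = Tuple[float, Dict[int, float]]


def leaf_moment(dist: Dict[int, float], order: int = 1, center: float = 0.0) -> float:
    """Return E[(X - center) ** order] for one discrete distribution."""
    return sum(p * (x - center) ** order for x, p in dist.items())


def mixture_moment(leaves: List[LeafSketch], order: int = 1, center: float = 0.0) -> float:
    """Return the moment of the mixture as the prior-weighted sum of leaf moments."""
    total_prior = sum(prior for prior, _ in leaves)  # guard against unnormalized priors
    return sum(
        prior / total_prior * leaf_moment(dist, order, center)
        for prior, dist in leaves
    )


leaves = [
    (0.5, {0: 0.5, 2: 0.5}),  # leaf with mean 1.0
    (0.5, {4: 1.0}),          # leaf with mean 4.0
]

# First raw moment (center defaults to 0): the expectation.
mean = mixture_moment(leaves, order=1)
# Second moment centered at the mean: the variance.
variance = mixture_moment(leaves, order=2, center=mean)
```

Passing the expectation as the center yields central moments, which is why the variance above is obtained with ``order=2`` and ``center=mean``; leaving the center unset gives raw moments.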