jpt.trees

© Copyright 2021-23, Mareike Picklum, Daniel Nyga.

Classes

Node

Wrapper for the nodes of the jpt.learning.trees.Tree.

DecisionNode

Represents an inner (decision) node of the jpt.learning.trees.Tree.

Leaf

Represents a leaf node of the jpt.trees.Tree.

JPT

Implementation of Joint Probability Trees (JPTs).

Module Contents

class jpt.trees.Node(idx: int, parent: DecisionNode | None = None)

Wrapper for the nodes of the jpt.learning.trees.Tree.

Create a Node.

Parameters:
  • idx – the identifier of a node

  • parent – the parent of this node

idx
parent: DecisionNode = None
samples = 0
_path = []
property path: jpt.variables.VariableMap
Returns:

the path of this Node as VariableMap

consistent_with(evidence: jpt.variables.VariableMap) bool

Check if the node is consistent with the variable assignments in evidence.

Parameters:

evidence – A VariableMap that maps to singular values (numeric or symbolic) or ranges (continuous set, set)

Returns:

bool

format_path(fmt: str = None, precision: int = None) str
abstract number_of_parameters() int
__str__() str
__repr__() str
depth() int
Returns:

the depth of this node

contains(samples: numpy.ndarray, variable_index_map: jpt.variables.VariableMap) numpy.array

Check if this node contains the given samples in parallel.

Parameters:
  • samples – The samples to check

  • variable_index_map – A VariableMap mapping to the indices in ‘samples’

Returns:

numpy array with 0s and 1s

class jpt.trees.DecisionNode(idx: int | None, variable: jpt.variables.Variable, parent: 'DecisionNode' or None = None)

Bases: Node

Represents an inner (decision) node of the jpt.learning.trees.Tree.

Create a DecisionNode

Parameters:
  • idx – The identifier of a node

  • variable – The split variable

  • parent – The parent of this node

_splits = None
variable
children: None or List[Node] = None
__hash__()
__eq__(o) bool
to_json() Dict[str, Any]
Returns:

The DecisionNode as a json serializable dict.

static from_json(tree: JPT, data: Dict[str, Any]) DecisionNode

Construct a DecisionNode from a JSON dict.

Parameters:
  • tree – The tree to mount the node in

  • data – The data describing the members of the node

Returns:

the constructed and mounted DecisionNode

property splits: List
set_child(idx: int, node: Node) None

Set the child at index idx of this node. Also extend the path of the child node with this node's path.

Parameters:
  • idx – the index of the child (0 for left, 1 for right)

  • node – The child

str_edge(idx_split: int) str

Convert the edge to the child at idx_split to a string.

Parameters:

idx_split – The index of the child

Returns:

str

property str_node: str
recursive_children()
Returns:

All children of this node

__str__() str
__repr__() str
number_of_parameters() int
Returns:

The number of relevant parameters in this decision node. Two parameters are necessary, since the variable and its splitting value are sufficient to describe this computation unit.

class jpt.trees.Leaf(idx: int, parent: DecisionNode or None = None, prior: float or None = None)

Bases: Node

Represents a leaf node of the jpt.trees.Tree.

Construct a Leaf.

Parameters:
  • idx – the index of this leaf

  • parent – the parent of this leaf

  • prior – the prior of this leaf (relative number of samples in this leaf)

distributions
prior = None
s_indices = []
property str_node: str
applies(query: jpt.variables.VariableAssignment) bool

Checks whether this leaf is consistent with the given query.

Parameters:

query – the query to check

Returns:

bool

property value
recursive_children()
Returns:

All children of this node

__str__() str
__repr__() str
__hash__()
to_json() Dict[str, Any]
Returns:

The Leaf as a JSON-serializable dict.

static from_json(tree: JPT, data: Dict[str, Any]) Leaf

Construct a Leaf from a JSON dict.

Parameters:
  • tree – The tree to mount the node in

  • data – The data describing the members of the node

Returns:

the constructed and mounted Leaf

__eq__(o) bool
consistent_with(evidence: jpt.variables.VariableMap) bool

Check if the node is consistent with the variable assignments in evidence.

Parameters:

evidence – A preprocessed VariableMap that maps to singular values (numeric or symbolic) or ranges (continuous set, set)

path_consistent_with(evidence: jpt.variables.VariableMap) bool

Check if the path of this node is consistent with the variable assignments in evidence.

Parameters:

evidence – A preprocessed VariableMap that maps to singular values (numeric or symbolic) or ranges (continuous set, set)

probability(query: jpt.variables.VariableAssignment, dirac_scaling: float = 2.0, min_distances: jpt.variables.VariableMap = None) float

Calculate the probability of a (partial) query. Exploits the independence assumption.

Parameters:
  • query (VariableMap) – A preprocessed VariableMap that maps to singular values (numeric or symbolic) or ranges (continuous set, set)

  • dirac_scaling (float) – the minimal distance between the samples within a dimension is multiplied by this factor if a Dirac impulse is used to model the variable.

  • min_distances (A VariableMap from numeric variables to floats or None) – A dict mapping the variables to the minimal distances between the observations. This can be useful to use the same likelihood parameters for different test sets for example in cross validation processes.
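The independence assumption means that, within a single leaf, the probability of a partial query factorizes into a product of per-variable probabilities. The following is a minimal sketch in plain Python, not the library's implementation. The dictionary layout and the name `leaf_probability` are assumptions made for illustration, and only symbolic variables are covered (numeric variables, `dirac_scaling` and `min_distances` are left out).

```python
from math import prod


def leaf_probability(distributions, query):
    """Probability of a partial query under a fully factorized leaf.

    distributions: hypothetical layout {variable: {value: probability}}
    query: {variable: set of admissible values}; variables that are missing
           from the query are unconstrained and contribute a factor of 1.
    """
    return prod(
        sum(p for value, p in distributions[var].items() if value in allowed)
        for var, allowed in query.items()
    )


# A toy leaf with two independent symbolic variables.
leaf = {
    "color": {"red": 0.5, "green": 0.25, "blue": 0.25},
    "shape": {"round": 0.75, "square": 0.25},
}

# P(color in {red, green}) * P(shape = round) = 0.75 * 0.75
p = leaf_probability(leaf, {"color": {"red", "green"}, "shape": {"round"}})
```

An empty query yields 1, since no variable is restricted.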

_numeric_probability(variable: jpt.variables.NumericVariable, value, dirac_scaling: float = 2.0, min_distances: jpt.variables.VariableMap = None)

Calculate the probability of an arbitrary value for a numeric variable.

Parameters:
  • variable – A numeric variable

  • dirac_scaling – the minimal distance between the samples within a dimension is multiplied by this factor if a Dirac impulse is used to model the variable.

  • min_distances – A dict mapping the variables to the minimal distances between the observations. This can be useful to use the same likelihood parameters for different test sets for example in cross validation processes.

likelihood(queries: pandas.DataFrame, dirac_scaling: float = 2.0, min_distances: jpt.variables.VariableMap = None, single_likelihoods: bool = False, variables: Iterable[jpt.variables.Variable | str] = None) numpy.ndarray

Calculate the probability of a (partial) query. Exploits the independence assumption.

Parameters:
  • queries – An array-like object that represents variable assignments in value space.

  • dirac_scaling (float) – the minimal distance between the samples within a dimension is multiplied by this factor if a Dirac impulse is used to model the variable.

  • min_distances (A VariableMap from numeric variables to floats or None) – A dict mapping the variables to the minimal distances between the observations. This can be useful to use the same likelihood parameters for different test sets for example in cross validation processes.

  • single_likelihoods – whether likelihoods of each variable shall be reported

  • variables – the variables indices to consider in the likelihood calculation

copy() Leaf

Create a copy of this leaf. The copy is unaware of the tree and vice versa. Hence, no path, parent, etc. is set. The copy only provides querying functionality.

conditional_leaf(evidence: jpt.variables.VariableAssignment) Leaf

Create a leaf that is cropped to the values described in evidence.

Parameters:

evidence – A VariableAssignment describing evidence.

Returns:

The cropped leaf, which has no parent, path, etc. set.

mpe(minimal_distances: jpt.variables.VariableMap) tuple[jpt.variables.VariableMap, float]

Calculate the most probable explanation of this leaf as a fully factorized distribution.

Returns:

the configuration as a VariableMap and the likelihood of the maximum as a float
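Because the leaf distribution is fully factorized, its joint maximum can be found one variable at a time: pick the most probable value per variable and multiply the individual probabilities. The sketch below illustrates this for symbolic variables only. The function name `factorized_mpe` and the dictionary layout are hypothetical, and the handling of numeric variables via `minimal_distances` is not modelled.

```python
from math import prod


def factorized_mpe(distributions):
    """Return (assignment, likelihood) of the mode of a factorized distribution.

    distributions: hypothetical layout {variable: {value: probability}}
    Ties are broken in favour of the value listed first.
    """
    assignment = {
        var: max(dist, key=dist.get) for var, dist in distributions.items()
    }
    likelihood = prod(distributions[var][val] for var, val in assignment.items())
    return assignment, likelihood


leaf = {
    "color": {"red": 0.5, "green": 0.25, "blue": 0.25},
    "shape": {"round": 0.75, "square": 0.25},
}
state, likelihood = factorized_mpe(leaf)
```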

k_mpe() Iterator[jpt.variables.LabelAssignment]

Compute the k most probable explanations of this leaf.

number_of_parameters() int
Returns:

The number of relevant parameters in this leaf. Leaves require 1 + the sum of all distributions' parameters. The 1 extra parameter represents the prior.

sample(amount) numpy.ndarray

Sample amount many samples from the leaf.

Returns:

A numpy array of size (amount, self.variables) containing the samples.

class jpt.trees.JPT(variables: list[jpt.variables.Variable], targets: list[str | jpt.variables.Variable] = None, features: list[str | jpt.variables.Variable] = None, min_samples_leaf: float | int = 1, min_impurity_improvement: float | None = None, max_leaves: int | None = None, max_depth: int | None = None, dependencies=None, min_eval_samples: float | int = 0)

Implementation of Joint Probability Trees (JPTs).

Create a JPT.

Parameters:
  • variables – The variables represented by this model.

  • targets – The variables where the information gain will be computed on.

  • features – The variables where splits are chosen from.

  • min_samples_leaf – If int, the minimum number of samples required to form a leaf. If float, the minimum fraction of samples.

  • min_eval_samples – Minimum number of EVALUATION samples required in each child partition when split validation is active in 'evaluation' mode. Only enforced when a split_validation_mask is passed to learn() and split_validation_mode='evaluation'. If int, the absolute minimum. If a float in (0, 1), the minimum fraction of the total training rows (same convention as min_samples_leaf). 0 disables the check (default).

  • min_impurity_improvement – The minimal information gain to justify a split.

  • max_leaves – The maximum number of leaves (deprecated).

  • max_depth – The maximum depth the tree may have.

  • dependencies

    Specifies which targets depend on which features. Accepts three forms:

    • None: every target depends on every feature (default, fully connected).

    • dict[Variable, list[Variable]]: explicit mapping from features to their dependent targets.

    • A DependencyDiscovery instance: a callable strategy that discovers dependencies from training data during learn(). The strategy is re-invoked on each call to learn() and its configuration is preserved during serialization.
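The int-or-float convention used by `min_samples_leaf` and `min_eval_samples` can be read as "absolute count versus fraction of the training rows". The helper below is a hypothetical illustration of that reading, not part of the library. In particular, rounding fractions up is an assumption; the library may round differently.

```python
import math


def resolve_min_samples(threshold, n_rows):
    """Translate an int (absolute) or float (fraction) threshold into a row count."""
    if isinstance(threshold, int):
        # Integers are taken as absolute sample counts.
        return threshold
    # Assumption: fractions of the training rows are rounded up.
    return math.ceil(threshold * n_rows)


absolute = resolve_min_samples(5, 1000)      # stays an absolute count
fraction = resolve_min_samples(0.25, 1000)   # a quarter of the training rows
```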

logger
_variables
varnames: collections.OrderedDict[str, jpt.variables.Variable]
_targets
leaves: dict[int, Leaf]
innernodes: dict[int, DecisionNode]
priors: jpt.variables.VariableMap
min_samples_leaf = 1
min_eval_samples = 0
_keep_samples = False
min_impurity_improvement = 0
minimal_distances: jpt.variables.VariableMap
_numsamples = 0
root = None
max_leaves = None
max_depth
_reset() None

Delete all parameters of this model (not the hyperparameters)

property allnodes: MutableMapping[int, Node]
property variables: tuple[jpt.variables.Variable, Ellipsis]
property targets: tuple[jpt.variables.Variable, Ellipsis]
property features: tuple[jpt.variables.Variable, Ellipsis]
property numeric_variables: tuple[jpt.variables.Variable, Ellipsis]
property symbolic_variables: tuple[jpt.variables.Variable, Ellipsis]
property integer_variables: tuple[jpt.variables.Variable, Ellipsis]
property numeric_targets: tuple[jpt.variables.Variable, Ellipsis]
property symbolic_targets: tuple[jpt.variables.Variable, Ellipsis]
property integer_targets: tuple[jpt.variables.Variable, Ellipsis]
property numeric_features: tuple[jpt.variables.Variable, Ellipsis]
property symbolic_features: tuple[jpt.variables.Variable, Ellipsis]
property integer_features: tuple[jpt.variables.Variable, Ellipsis]
to_json() dict[str, Any]

Convert the tree to a JSON-serializable dictionary.

static from_json(data: dict[str, Any], variables: Iterable[jpt.variables.Variable] | None = None) JPT

Construct a tree from a json dict.

Parameters:
  • data – The JSON dictionary holding the serialized JPT data.

  • variables – (optional) An iterable holding the already de-serialized variables the JPT shall be constructed with.

__getstate__()
__setstate__(state)
__eq__(o) bool
encode(samples: numpy.ndarray) numpy.ndarray

Get the leaf index that describes the partition of each sample. Only works for fully initialized samples, i.e. a matrix with arbitrarily many rows but #variables many columns.

Parameters:

samples – the samples to evaluate

Returns:

A 1D numpy array of integers containing the leaf index of every sample.

pdf(values: jpt.variables.VariableAssignment) float

Get the likelihood of one world.

Parameters:

values – A VariableMap mapping some variables to one value.

Returns:

The likelihood as float

infer(query: dict[jpt.variables.Variable | str, Any] | jpt.variables.VariableAssignment, evidence: dict[jpt.variables.Variable | str, Any] | jpt.variables.VariableAssignment = None, fail_on_unsatisfiability: bool = True) float | None

For each candidate leaf l, calculate the number of samples in which query is true:

\[P(query|evidence) = \frac{p_q}{p_e} \tag{1}\]
\[p_q = \frac{c}{N} \tag{2}\]
\[c = \frac{\prod{F}}{x^{n-1}} \tag{3}\]

where Q is the set of variables in query, \(P_{l}\) is the set of variables that occur in l, \(F = \{v | v \in Q \wedge~v \notin P_{l}\}\) is the set of variables in the query that do not occur in l’s path, \(x = |S_{l}|\) is the number of samples in l, \(n = |F|\) is the number of free variables and N is the number of samples represented by the entire tree.

Parameters:
  • query (dict of {jpt.variables.Variable : jpt.learning.distributions.Distribution.value}) – the event to query for, i.e. the query part of the conditional P(query|evidence) or the prior P(query)

  • evidence (dict of {jpt.variables.Variable : jpt.learning.distributions.Distribution.value}) – the event conditioned on, i.e. the evidence part of the conditional P(query|evidence)

  • fail_on_unsatisfiability – whether an error is raised in case of unsatisfiable evidence or not.
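The ratio P(query|evidence) = p_q / p_e can be illustrated with a toy mixture of factorized leaves. This is a sketch under stated assumptions, not the library's algorithm: the names `mass` and `toy_infer` and the data layout are hypothetical, p_q is taken to be the joint mass of query and evidence, probabilities are used in place of sample counts, and only symbolic variables are handled.

```python
from math import prod


def mass(leaves, event):
    """Total probability of an event under a prior-weighted mixture of leaves.

    leaves: list of (prior, {variable: {value: probability}}) pairs
    event: {variable: set of admissible values}
    """
    return sum(
        prior * prod(
            sum(p for value, p in dists[var].items() if value in allowed)
            for var, allowed in event.items()
        )
        for prior, dists in leaves
    )


def toy_infer(leaves, query, evidence):
    """P(query | evidence) = p_q / p_e for disjoint query and evidence variables."""
    p_e = mass(leaves, evidence)
    if p_e == 0:
        # Mirrors the idea behind fail_on_unsatisfiability=True.
        raise ValueError("unsatisfiable evidence")
    p_q = mass(leaves, {**evidence, **query})
    return p_q / p_e


leaves = [
    (0.6, {"weather": {"sun": 0.9, "rain": 0.1}, "mood": {"good": 0.8, "bad": 0.2}}),
    (0.4, {"weather": {"sun": 0.2, "rain": 0.8}, "mood": {"good": 0.3, "bad": 0.7}}),
]

# p_e = 0.6*0.1 + 0.4*0.8 = 0.38, p_q = 0.6*0.1*0.8 + 0.4*0.8*0.3 = 0.144
result = toy_infer(leaves, {"mood": {"good"}}, {"weather": {"rain"}})
```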

posterior(variables: list[jpt.variables.Variable | str] = None, evidence: dict[jpt.variables.Variable | str, Any] | jpt.variables.VariableAssignment = None, fail_on_unsatisfiability: bool = True, report_inconsistencies: bool = False) jpt.variables.VariableMap | None

Compute the posterior distribution of every variable in variables. The result contains independent distributions. Be aware that they might not actually be independent.

Parameters:
  • variables – The query variables of the posterior to be computed

  • evidence – The evidence given for the posterior to be computed

  • fail_on_unsatisfiability – Whether or not an Unsatisfiability error is raised if the likelihood of the evidence is 0.

  • report_inconsistencies – In case of an Unsatisfiability error, the exception raised will contain information about the variable assignments that caused the inconsistency.

Returns:

jpt.trees.PosteriorResult containing distributions, candidates and weights
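One way to picture how per-variable posteriors arise from a tree is as a mixture: each leaf is weighted by its prior times the likelihood of the evidence in that leaf, and the marginal of every query variable is the weighted sum of the leaf marginals. The sketch below follows this reading. It is an assumption about the general idea, not the library's code; `toy_posterior` and the data layout are hypothetical, and evidence on the query variables themselves is not handled.

```python
from math import prod


def toy_posterior(leaves, variables, evidence):
    """Per-variable posterior marginals of a prior-weighted mixture of leaves."""
    # Unnormalized leaf weights: prior * P(evidence | leaf).
    weights = [
        prior * prod(
            sum(p for value, p in dists[var].items() if value in allowed)
            for var, allowed in evidence.items()
        )
        for prior, dists in leaves
    ]
    total = sum(weights)
    if total == 0:
        raise ValueError("unsatisfiable evidence")
    result = {}
    for var in variables:
        marginal = {}
        for weight, (_, dists) in zip(weights, leaves):
            for value, p in dists[var].items():
                marginal[value] = marginal.get(value, 0.0) + weight / total * p
        result[var] = marginal
    return result


leaves = [
    (0.5, {"size": {"small": 0.9, "large": 0.1}, "color": {"red": 0.2, "blue": 0.8}}),
    (0.5, {"size": {"small": 0.1, "large": 0.9}, "color": {"red": 0.7, "blue": 0.3}}),
]
post = toy_posterior(leaves, ["color"], {"size": {"large"}})
```

The returned marginals are reported independently per variable, which is why they may hide dependencies that exist in the joint distribution.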

expectation(variables: Iterable[jpt.variables.Variable] | None = None, evidence: jpt.variables.VariableAssignment | dict[str, numbers.Number | jpt.base.intervals.Interval | str] | None = None, fail_on_unsatisfiability: bool = True) jpt.variables.VariableMap | None

Compute the expected value of all variables. If no variables are passed, it defaults to all variables not passed as evidence.

Parameters:
  • variables – The variables to compute the expectation distributions on

  • evidence – The raw evidence applied to the tree

  • fail_on_unsatisfiability – Whether or not an Unsatisfiability error is raised if the likelihood of the evidence is 0.

Returns:

VariableMap

mpe(evidence: Dict[jpt.variables.Variable | str, Any] | jpt.variables.VariableAssignment = None, fail_on_unsatisfiability: bool = True) Tuple[list[jpt.variables.LabelAssignment], float] | None

Calculate the most probable explanation of all variables of the tree given the evidence.

Parameters:
  • evidence – The evidence that is applied to the tree

  • fail_on_unsatisfiability – Whether or not an Unsatisfiability error is raised if the likelihood of the evidence is 0.

Returns:

List of LabelAssignments that describes all maxima of the tree given the evidence. Additionally, a float describing the likelihood of all solutions is returned.

kmpe(evidence: dict[jpt.variables.Variable | str, Any] | jpt.variables.VariableAssignment = None, fail_on_unsatisfiability: bool = True, k: int = 0) Iterator[Tuple[jpt.variables.LabelAssignment, float]] | None

Perform a k-MPE inference on this JPT under the given evidence.

k-MPE yields the k most probable explanation states in decreasing order.

Parameters:
  • evidence – The evidence to apply

  • fail_on_unsatisfiability – Whether or not to raise an Unsatisfiability error on impossible evidence.

  • k – the number of solutions to return

Returns:

An iterator with states ordered by likelihood.

_preprocess_query(query: dict | jpt.variables.VariableMap, remove_none: bool = True, skip_unknown_variables: bool = False, allow_singular_values: bool = False, space: Literal['labels', 'values'] = 'labels') jpt.variables.LabelAssignment

Transform a query entered by a user into an internal representation that can be further processed.

Parameters:
  • query – the raw query

  • remove_none – Whether or not to remove None entries

  • skip_unknown_variables – skip preprocessing for variables that do not exist in the tree (may happen in multiple reverse tree inference). If False, an exception is raised; default: False

  • allow_singular_values – Allow singular values, such that they are transformed to the domain specification of numeric variables but not transformed to intervals via the PPF.

Returns:

the preprocessed VariableMap

_check_variable_assignment(assignment: jpt.variables.VariableAssignment | None)

Check the variable assignment for compatibility with the variables of this JPT.

apply(query: jpt.variables.VariableAssignment | dict[str, int | jpt.base.intervals.Interval | float | str]) Iterator[Leaf]

Iterator that yields leaves that are consistent with a query.

A leaf is consistent with a query if either of the following propositions holds for all constraints expressed by its path to the root node:

  1. the variable is not constrained by the query

  2. the variable is constrained by the query and the query is consistent with the path
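The consistency check can be sketched for numeric path restrictions as follows. This example reads case 2 as requiring that the query's restriction overlaps with the path's restriction, which is an assumption. The names `leaf_matches` and `toy_apply`, the half-open interval tuples and the leaf names are hypothetical and do not reflect the library's data structures.

```python
INF = float("inf")


def leaf_matches(path, query):
    """Check one leaf: path and query map variables to (lower, upper) intervals."""
    for var, (p_lo, p_hi) in path.items():
        if var not in query:
            continue  # case 1: the query does not constrain this variable
        q_lo, q_hi = query[var]
        if max(p_lo, q_lo) >= min(p_hi, q_hi):
            return False  # the intervals do not overlap
    return True


def toy_apply(leaves, query):
    """Yield the names of all leaves whose path is consistent with the query."""
    for name, path in leaves.items():
        if leaf_matches(path, query):
            yield name


# A tree with a single split on x at 2.0.
leaves = {
    "left": {"x": (-INF, 2.0)},
    "right": {"x": (2.0, INF)},
}

only_left = list(toy_apply(leaves, {"x": (0.0, 1.0)}))
both = list(toy_apply(leaves, {"x": (1.0, 3.0)}))
unconstrained = list(toy_apply(leaves, {"y": (0.0, 1.0)}))
```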

Parameters:

query – the preprocessed query, either an instance of a subclass of VariableAssignment or a dict mapping variables to their respective labels.

Returns:

__str__() str
__repr__() str
to_string() str
fancy_tree() str
pfmt() str
Returns:

a pretty-format string representation of this JPT.

_pfmt(node: Node, indent: int) str
Parameters:
  • node – The starting node

  • indent – the indentation of each new level

Returns:

a pretty-format string representation of this JPT from node downward.

learn(data: pandas.DataFrame | numpy.ndarray, keep_samples: bool = False, close_convex_gaps: bool = False, verbose: bool = False, prune_or_split: Callable[[JPT, Any, numpy.ndarray, numpy.ndarray], bool] | None = None, multicore: int | None = None, split_validation_mask: numpy.ndarray | None = None, split_validation_mode: str = 'both') JPT

Fit the jpt to data.

Parameters:
  • data ([[str or float or bool]]; (according to self.variables)) – The training examples (assumed in row-shape)

  • keep_samples – If true, stores the indices of the original data samples in the leaf nodes. For debugging purposes only. Default is false.

  • close_convex_gaps

  • prune_or_split – A callable (jpt, partition, indices, data) -> bool that is invoked before each split. Returns True to prune (make the node a leaf) or False to allow splitting. indices and data are numpy arrays.

  • multicore – The number of cores to use for learning. If None, all available cores are used.

  • verbose

  • split_validation_mask – A boolean or uint8 array of length len(data). True/1 marks training samples whose feature values serve as candidate split points; False/0 marks evaluation samples whose feature values are excluded from candidates. Target values of all samples always contribute to the impurity score (unless split_validation_mode restricts this). None disables split validation (default).

  • split_validation_mode – Controls which targets contribute to the impurity score: 'both' (default) uses all targets, 'training' uses only training targets, 'evaluation' uses only evaluation targets.

Returns:

the fitted model
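The `prune_or_split` callback and the `split_validation_mask` can be prepared as in the following sketch. The factory `make_size_pruner`, the helper `make_split_validation_mask` and the chosen thresholds are hypothetical; only the callback signature `(jpt, partition, indices, data) -> bool` and the meaning of the mask entries are taken from the parameter descriptions above.

```python
import random


def make_size_pruner(min_rows):
    """Build a callback that prunes partitions with fewer than min_rows samples."""
    def prune_or_split(jpt, partition, indices, data):
        # True: prune (make the node a leaf); False: allow the split.
        # `indices` is documented as a numpy array; only len() is needed here.
        return len(indices) < min_rows
    return prune_or_split


def make_split_validation_mask(n_rows, training_fraction=0.8, seed=0):
    """Randomly mark rows as training (True) or evaluation (False) samples."""
    rng = random.Random(seed)
    return [rng.random() < training_fraction for _ in range(n_rows)]


pruner = make_size_pruner(min_rows=3)
mask = make_split_validation_mask(10)

# Hypothetical call, assuming `model` is a JPT and `data` holds the training rows
# (the mask would first be converted to a boolean numpy array of length len(data)):
# model.learn(data, prune_or_split=pruner, split_validation_mask=mask,
#             split_validation_mode='evaluation')
```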

fit
static sample(sample, ft)
likelihood(data: pandas.DataFrame | numpy.ndarray, dirac_scaling: float = 2.0, min_distances: Dict = None, preprocess: bool = True, multicore: int | None = None, verbose: bool = False, single_likelihoods: bool = False, variables: Iterable[jpt.variables.Variable] = None) numpy.ndarray

Get the probabilities of a list of worlds. The worlds must be fully assigned with scalar values (no intervals or sets).

Parameters:
  • variables – Which variables to consider for the likelihood computation

  • data – An array containing the worlds. The shape is (x, len(variables)).

  • dirac_scaling – the minimal distance between the samples within a dimension is multiplied by this factor if a Dirac impulse is used to model the variable.

  • min_distances – A dict mapping the variables to the minimal distances between the observations. This can be useful to use the same likelihood parameters for different test sets for example in cross validation processes.

  • verbose – print status information to the console

  • multicore – how many cores should be used (defaults to all)

  • preprocess – whether to apply the preprocessing to the data passed.

  • single_likelihoods – will not only return the overall likelihoods but also the likelihoods per variable

Returns:

A np.ndarray with shape (x, ) containing the probabilities.

parallel_likelihood(data: numpy.ndarray | pandas.DataFrame, dirac_scaling: float = 2.0, min_distances: Dict = None, single_likelihoods: bool = False) numpy.ndarray

Get the probabilities of a list of worlds. The worlds must be fully assigned with scalar values (no intervals or sets).

Parameters:
  • data – An array containing the worlds. The shape is (x, len(variables)).

  • dirac_scaling – the minimal distance between the samples within a dimension is multiplied by this factor if a Dirac impulse is used to model the variable.

  • min_distances – A dict mapping the variables to the minimal distances between the observations. This can be useful to use the same likelihood parameters for different test sets for example in cross validation processes.

  • single_likelihoods – will not only return the overall likelihoods but also the likelihoods per variable

Returns:

An np.array with shape (x, ) containing the probabilities.

reverse(query: Dict, confidence: float = 0.05) List[tuple]

Determines the leaf nodes that match query best and returns them along with their respective confidence.

Parameters:
  • query – a mapping from featurenames to either numeric value intervals or an iterable of categorical values

  • confidence – the confidence level for this MPE inference

Returns:

a tuple of probabilities and jpt.trees.Leaf objects that match requirement (representing path to root)

plot(title: str = 'unnamed', filename: str | None = None, directory: str = None, plotvars: Iterable[jpt.variables.Variable] = None, view: bool = True, max_symb_values: int = 10, nodefill: str = None, leaffill: str = None, alphabet: bool = False, verbose: bool = False, engine=None) str

Generates an SVG representation of the generated regression tree.

Parameters:
  • title – title of the plot

  • filename – the name of the JPT (will also be used as filename; extension will be added automatically)

  • directory – the location to save the SVG file to

  • plotvars – the variables to be plotted in the graph

  • view – whether the generated SVG file will be opened automatically

  • max_symb_values – limit the maximum number of symbolic values that are plotted to this number

  • nodefill – the color of the inner nodes in the plot; accepted formats: RGB, RGBA, HSV, HSVA or color name

  • leaffill – the color of the leaf nodes in the plot; accepted formats: RGB, RGBA, HSV, HSVA or color name

  • alphabet – whether to plot symbolic variables in alphabetic order, if False, they are sorted by probability (descending); default is False

  • verbose

  • engine – the rendering engine for the distribution plots in the leaves; either ‘matplotlib’ or ‘plotly’

Returns:

(str) the path under which the rendered image has been saved.

pickle(fpath: str) None

Pickles the fitted regression tree to a file at the given location fpath.

Parameters:

fpath – the location for the pickled file

static calcnorm(sigma: float, mu: float, intervals)

Computes the CDF for a multivariate normal distribution.

Parameters:
  • sigma – the standard deviation

  • mu – the expected value

  • intervals (list of matcalo.utils.utils.Interval) – the boundaries of the integral

Returns:

copy() JPT
Returns:

a new copy of this JPT where all references to the original tree are cut.

conditional_jpt(evidence: jpt.variables.VariableAssignment | None = None, fail_on_unsatisfiability: bool = True) JPT | None

Apply evidence on a JPT and get a new JPT that represents P(x|evidence).

Parameters:
  • evidence – A VariableAssignment mapping the observed variables to their observed values

  • fail_on_unsatisfiability – whether an error is raised in case of unsatisfiable evidence or not

multiply_by_leaf_prior(prior: dict[int, float]) JPT

Multiply every leaf's prior by the given priors. This serves to handle the factor message from factor nodes. Be wary, since this method overwrites the JPT in place.

Parameters:

prior – The priors, a Dict mapping from leaf indices to float

Returns:

self

normalize() JPT

Normalize the tree such that the sum of all leaf priors is 1.

Returns:

self
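The interplay of scaling leaf priors by a factor message and normalizing afterwards can be sketched on plain dictionaries. The function `multiply_and_normalize` is a hypothetical stand-in: unlike the methods described here, it returns a new dictionary instead of modifying a tree in place, and it assumes that a factor is given for every leaf index.

```python
def multiply_and_normalize(priors, factors):
    """Scale each leaf prior by its factor, then rescale so the priors sum to 1.

    priors, factors: dicts mapping leaf indices to floats
    """
    scaled = {idx: p * factors[idx] for idx, p in priors.items()}
    total = sum(scaled.values())
    if total == 0:
        raise ValueError("all leaf priors vanished; cannot normalize")
    return {idx: p / total for idx, p in scaled.items()}


priors = {0: 0.5, 1: 0.3, 2: 0.2}
factors = {0: 0.2, 1: 1.0, 2: 0.5}

# Scaled priors are 0.1, 0.3 and 0.1; dividing by their sum 0.5 normalizes them.
new_priors = multiply_and_normalize(priors, factors)
```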

save(file: str | IO, protocol: Literal['pickle', 'json'] = 'pickle') None

Write this JPT persistently to disk.

Parameters:
  • file – either a string or file-like object.

  • protocol

dump
dumps(protocol: Literal['pickle', 'json'] = 'pickle') bytes
static load(file: str | IO, protocol: Literal['pickle', 'json'] = 'pickle') JPT

Load a JPT from disk.

Parameters:
  • file – either a string or file-like object.

  • protocol

Returns:

the JPT described in file

static loads(data: typing_extensions.Buffer, protocol: Literal['pickle', 'json'] = 'pickle') JPT
depth() int
Returns:

the maximal depth of a leaf in the tree.

total_samples() int
Returns:

the total number of samples represented by this tree.

number_of_parameters() int
Returns:

The number of relevant parameters in the entire tree
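The per-node counts documented for DecisionNode (2 parameters) and Leaf (1 + the parameters of all its distributions) can be combined into a count for a whole tree. The sketch below assumes that the tree total is simply the sum over all nodes. The nested-dict layout and the name `count_parameters` are hypothetical.

```python
def count_parameters(node):
    """Count parameters of a toy tree given as nested dicts.

    Decision nodes: {"children": [...]} contribute 2 (variable and split value).
    Leaves: {"distribution_parameters": [...]} contribute 1 (the prior)
            plus the parameter count of every distribution.
    """
    if "children" in node:
        return 2 + sum(count_parameters(child) for child in node["children"])
    return 1 + sum(node["distribution_parameters"])


tree = {
    "children": [
        {"distribution_parameters": [3, 2]},  # leaf: 1 + 5 = 6
        {"distribution_parameters": [3, 2]},  # leaf: 1 + 5 = 6
    ]
}
total = count_parameters(tree)  # 2 + 6 + 6
```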

bind(*arg, **kwargs) jpt.variables.LabelAssignment

Returns a LabelAssignment object with the assignments passed.

This method accepts one optional positional argument, which – if passed – must be a dictionary of the desired variable assignments.

Keyword arguments may specify additional variable, value pairs.

If a positional argument is passed, the following options may be passed in addition as keyword arguments:

Parameters:
  • allow_singular_values – Allow singular values, such that they are transformed to the domain specification of numeric variables but not transformed to intervals via the PPF.

  • space – Literal[‘values’, ‘labels’] Whether the variables shall be assigned to terms in value or label space of the JPT.

moment(order: int = 1, center: jpt.variables.VariableAssignment | None = None, evidence: jpt.variables.VariableAssignment | None = None, fail_on_unsatisfiability: bool = True) jpt.variables.VariableMap | None

Calculate the moment of the given order of each numeric/integer random variable given the evidence.

Parameters:
  • order – The order of the moment

  • center – A VariableAssignment mapping each numeric/integer variable to some constant. If a variable has a constant, it will be interpreted as ‘c’ for the central moment. If it is not set, 0 will be used by default.

  • evidence – The evidence given for the posterior to be computed

  • fail_on_unsatisfiability – Whether or not an Unsatisfiability error is raised if the likelihood of the evidence is 0.
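The roles of `order` and `center` follow the usual definition of a moment, E[(X - c)^order], with c = 0 giving the raw moment. A minimal sketch for a single discrete variable; the function `toy_moment` and the example distribution are hypothetical and independent of the library.

```python
def toy_moment(distribution, order=1, center=0.0):
    """Moment E[(X - center) ** order] of a discrete distribution {value: probability}."""
    return sum(p * (x - center) ** order for x, p in distribution.items())


dist = {1: 0.2, 2: 0.5, 3: 0.3}

mean = toy_moment(dist, order=1)                     # raw first moment
variance = toy_moment(dist, order=2, center=mean)    # second central moment
```

With `center` set to the mean, the second-order moment is the variance; with the default center of 0 it is the raw second moment.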

get_hyperparameters_dict() dict[str, Any]

Get all hyperparameters as dict that can be used for MLFlow model tracking.

prune(similarity_threshold: float, approximate: float | dict[jpt.variables.Variable | str, float] | jpt.variables.VariableMap | None = None) JPT

Prune this tree by repeatedly merging leaves with very similar distributions.

Parameters:
  • similarity_threshold – the average similarity of distributions in [0, 1] that two leaves must exhibit in order to be considered for a merge.

  • approximate

Returns: