jpt.distributions.univariate
Submodules
Classes
- Distribution: Abstract supertype of all domains and distributions.
- Numeric: Wrapper class for numeric domains and distributions.
- ScaledNumeric: Scaled numeric distribution represented by mean and variance.
- Integer: Abstract supertype of all domains and distributions.
- Multinomial: Abstract supertype of all symbolic domains and distributions.
- Bool: Wrapper class for Boolean domains and distributions.
- Gaussian: Extension of dnutils.stats.Gaussian.
Functions
- IntegerType(name, lmin, lmax)
- SymbolicType(name, labels)
Package Contents
- class jpt.distributions.univariate.Distribution(**settings)
Abstract supertype of all domains and distributions
- SETTINGS
- _cl = 'jpt.distributions.univariate.distribution.Distribution'
- settings
- __getattr__(name)
- abstract classmethod hash()
- __hash__()
- __getitem__(value)
- abstract classmethod value2label(value)
- abstract classmethod label2value(label)
- abstract _sample(n: int) Iterable
- abstract _sample_one()
- sample(n: int) Iterable
- sample_one() Any
- abstract p(value) float
- abstract _p(value) float
- abstract mpe()
- abstract crop(restriction: Set) Distribution
- abstract _crop(restriction: Set) Distribution
- abstract entropy() float
- abstract static merge(distributions: Iterable[Distribution], weights: Iterable[numbers.Real]) Distribution
- abstract update(dist: Distribution, weight: float) Distribution
- abstract fit(data: numpy.ndarray, rows: numpy.ndarray = None, col: numbers.Integral = None) Distribution
- abstract _fit(data: numpy.ndarray, rows: numpy.ndarray = None, col: numbers.Integral = None) Distribution
- abstract set(params: Any) Distribution
- abstract kl_divergence(other: Distribution)
- abstract number_of_parameters() int
- abstract static jaccard_similarity(d1: Distribution, d2: Distribution) float
- abstract plot(engine: str, title: str = None, fname: str = None, directory: str = '/tmp', view: bool = False, **kwargs) Any
Generates a plot of the distribution.
- Parameters:
title – the name of the variable this distribution represents
fname – the name of the file to be stored. Available file formats: png, svg, jpeg, webp, html
directory – the directory to store the generated plot files
view – whether to display generated plots, default False (only stores files)
- Returns:
the figure object of the plotting engine
- abstract to_json()
- __reduce__()
- static type_from_json(data: Dict[str, Any]) Type[Distribution]
- static from_json(dtype: Dict[str, Any], dinst: Dict[str, Any] = None) Distribution | Type[Distribution]
- class jpt.distributions.univariate.Numeric(**settings)
Bases: jpt.distributions.univariate.Distribution
Wrapper class for numeric domains and distributions.
- PRECISION = 'precision'
- values
- labels
- SETTINGS
- _quantile: jpt.distributions.qpd.QuantileDistribution = None
- to_json
- classmethod hash()
- __str__()
- __getitem__(value)
- classmethod value2label(value: float | jpt.base.intervals.NumberSet) float | jpt.base.intervals.NumberSet
- classmethod label2value(label: numbers.Real | jpt.base.intervals.NumberSet) numbers.Real | jpt.base.intervals.NumberSet
- classmethod equiv(other)
- property cdf
- property pdf
- property ppf
- approximate_fast(eps: float)
- _sample(n)
- _sample_one()
- number_of_parameters() int
- Returns:
The number of relevant parameters in this decision node. 1 if this is a dirac impulse, number of intervals times two else
- _expectation() float
- _variance() float
- expectation() float
- variance() float
- quantile(gamma: numbers.Real) numbers.Real
- create_dirac_impulse(value)
Create a Dirac impulse at the given value as a quantile distribution.
- is_dirac_impulse() bool
Checks if this distribution is a dirac impulse.
- mpe()
- _mpe(value_transform: Callable | None = None)
Calculate the most probable configuration of this quantile distribution.
- Returns:
The mpe itself as UnionSet and the likelihood of the mpe as float
- _k_mpe(k: int | None = None) List[Tuple[jpt.base.intervals.NumberSet, float]]
Calculate the k most probable explanation states.
- Parameters:
k – The number of solutions to generate, defaults to the maximum possible number.
- Returns:
A list of (state, likelihood) tuples, in descending order of likelihood.
- k_mpe(k: int | None = None) List[Tuple[jpt.base.intervals.NumberSet, float]]
Calculate the k most probable explanation states.
- Parameters:
k – The number of solutions to generate, defaults to the maximum possible number.
- Returns:
A list of (state, likelihood) tuples, in descending order of likelihood.
- fit
- _p(value: numbers.Number | jpt.base.intervals.NumberSet) numbers.Real
- p(labels: numbers.Number | jpt.base.intervals.NumberSet | List[float]) numbers.Real
- copy()
- _crop(restriction: jpt.base.intervals.NumberSet | numbers.Number) Numeric
Apply a restriction to this distribution. The restricted distribution only assigns mass to the given range and preserves the relative proportions of the pdf within that range.
- Parameters:
restriction (float or int or ContinuousSet) – The range (or single value) to which this distribution is limited
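The restriction described above can be illustrated on a piecewise-constant density. The following is a minimal sketch and not the jpt implementation: the `(lower, upper, density)` segment representation and the helper name `crop_segments` are assumptions made for illustration, and the single-value (Dirac) case is left out.

```python
from typing import List, Tuple

# Hypothetical representation: (lower, upper, density) per segment.
Segment = Tuple[float, float, float]


def crop_segments(segments: List[Segment], lower: float, upper: float) -> List[Segment]:
    """Restrict a piecewise-constant pdf to [lower, upper] and renormalize it."""
    clipped = []
    for a, b, h in segments:
        lo, hi = max(a, lower), min(b, upper)
        if lo < hi:  # keep only the part that overlaps the restriction
            clipped.append((lo, hi, h))
    mass = sum((b - a) * h for a, b, h in clipped)
    if mass == 0:
        raise ValueError("restriction carries no probability mass")
    # Dividing every density by the remaining mass keeps their ratios intact.
    return [(a, b, h / mass) for a, b, h in clipped]


pdf = [(0.0, 1.0, 0.25), (1.0, 2.0, 0.75)]
cropped = crop_segments(pdf, 0.5, 1.5)
```

In this example the two densities keep their 1:3 ratio after cropping, while the total mass inside the restriction becomes 1.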
- classmethod type_to_json()
- inst_to_json()
- static from_json(data)
- classmethod type_from_json(data: Dict[str, Any])
- insert_convex_fragments(left: jpt.base.intervals.ContinuousSet | None, right: jpt.base.intervals.ContinuousSet | None, number_of_samples: int)
Insert fragments of distributions on the right and left part of this distribution. This should only be used to create a convex hull around the JPT's domain whose density is never 0.
- Parameters:
right – The right (lower) interval to add on if needed and None else
left – The left (upper) interval to add on if needed and None else
number_of_samples – The number of samples to use as basis for the weight
- classmethod cumsum(distributions: Iterable[Numeric], error_max: float = np.inf, n_segments: int = None) Iterable[Numeric]
Generator yielding the distributions that correspond to the cumulative sums of the passed distributions.
- Parameters:
distributions –
error_max –
n_segments –
- Returns:
- moment(order: int, center: float) float
- _moment(order: int, center: float, value_transform: Callable | None = None) float
Calculate the central moment of the r-th order almost everywhere.
\[\int (x - c)^{r} p(x) \, dx\]
- Parameters:
order – The order of the moment to calculate
center – The constant to subtract in the base of the exponent. If center is 0, the result corresponds to the order-th raw moment. If center is set to the distribution's mean (i.e. its expectation, or self._moment(1, 0)), the result is the central moment of the distribution.
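To make the role of `order` and `center` concrete, here is a small standalone sketch that evaluates the moment integral in closed form for a piecewise-constant density. It does not use the jpt API; the `(lower, upper, density)` segments and the function name `moment_of_segments` are hypothetical.

```python
from typing import List, Tuple


def moment_of_segments(segments: List[Tuple[float, float, float]],
                       order: int, center: float) -> float:
    """Integrate (x - center)**order * p(x) over all (lower, upper, density) segments."""
    total = 0.0
    for a, b, h in segments:
        # Antiderivative of (x - c)^r is (x - c)^(r + 1) / (r + 1).
        total += h * ((b - center) ** (order + 1) - (a - center) ** (order + 1)) / (order + 1)
    return total


uniform = [(0.0, 1.0, 1.0)]                   # uniform density on [0, 1]
mean = moment_of_segments(uniform, 1, 0.0)    # first raw moment
var = moment_of_segments(uniform, 2, mean)    # second moment about the mean
```

With `center=0` the call yields a raw moment; passing the mean as `center` yields the central moment, here the variance of the uniform distribution.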
- entropy() float
- plot(engine=None, **kwargs) Any
Plots the distribution using the given engine.
- Parameters:
engine – Can be either one of ["plotly", "matplotlib"], or an instance of a rendering engine subclassing DistributionRendering.
kwargs – The keyword arguments to pass to the engine as defined in the .plot_numeric() function of DistributionRendering or its respective subclass defined by engine.
- Returns:
the figure object of the plotting engine
- class jpt.distributions.univariate.ScaledNumeric(**settings)
Bases: Numeric
Scaled numeric distribution represented by mean and variance.
- classmethod type_to_json()
- to_json
- static type_from_json(data)
- classmethod from_json(data)
- class jpt.distributions.univariate.Integer(**settings)
Bases: jpt.distributions.univariate.Distribution
Abstract supertype of all domains and distributions
- values: IntegerLabelToValueMap | None
- labels
- OPEN_DOMAIN = 'open_domain'
- AUTO_DOMAIN = 'auto_domain'
- SETTINGS
- min() int | None
- max() int | None
- _min() int | None
- _max() int | None
- _params: Dict[int, float] | None = None
- to_json: types.FunctionType
- classmethod hash()
- property cdf: jpt.base.functions.PiecewiseFunction
- classmethod equiv(other: Type[jpt.distributions.univariate.Distribution]) bool
- classmethod type_to_json() Dict[str, Any]
- inst_to_json() Dict[str, Any]
- static type_from_json(data)
- copy()
- property probabilities: Dict[int, float]
- n_values() int | None
- classmethod value2label(value: int | Iterable[int] | jpt.base.intervals.IntSet | jpt.base.intervals.UnionSet) int | Iterable[int] | jpt.base.intervals.IntSet | jpt.base.intervals.UnionSet
- classmethod label2value(label: int | Iterable[int] | jpt.base.intervals.IntSet | jpt.base.intervals.UnionSet) int | Iterable[int] | jpt.base.intervals.IntSet | jpt.base.intervals.UnionSet
- _sample(n: int) Iterable[int]
- _sample_one() int
- sample(n: int) Iterable[int]
- sample_one() int
- property _pdf: types.FunctionType
- property pdf: types.FunctionType
- p(labels: int | Iterable[int]) float
- _p(values: int | Iterable[int]) float
- expectation() float
- _expectation() float
- variance() float
- _variance() float
- _k_mpe(k: int | None = None) Iterable[Tuple[jpt.base.intervals.NumberSet, float]]
Calculate the k most probable explanation states.
- Parameters:
k – The number of solutions to generate
- Returns:
A list of (state, likelihood) tuples, in descending order of likelihood.
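One way to read "k most probable explanation states" for a discrete distribution is to group values that share the same probability and rank the groups. The sketch below follows that reading; it is an assumption about the semantics, not the jpt implementation, and the function name `k_mpe_sketch` and the dict-based representation are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List, Optional, Set, Tuple


def k_mpe_sketch(probs: Dict[int, float], k: Optional[int] = None) -> List[Tuple[Set[int], float]]:
    """Return up to k (state, likelihood) pairs, most likely first."""
    groups = defaultdict(set)
    for value, p in probs.items():
        groups[p].add(value)  # values with identical probability form one state
    ranked = sorted(((values, p) for p, values in groups.items()),
                    key=lambda pair: pair[1], reverse=True)
    return ranked if k is None else ranked[:k]


dist = {1: 0.5, 2: 0.25, 3: 0.25}
best = k_mpe_sketch(dist, k=1)
all_states = k_mpe_sketch(dist)
```

Grouping by exact float equality is a simplification; a real implementation would likely need a tolerance.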
- k_mpe(k: int = None) Iterable[Tuple[jpt.base.intervals.NumberSet, float]]
- mpe()
- _mpe()
- mode()
- _mode()
- __eq__(other) bool
- __str__()
- __repr__()
- infinite() bool
- finite() bool
- _sorted(exhaustive: bool = True, reverse: bool = False, max_items: int = None) Iterable[Tuple[int, float]]
- sorted(exhaustive: bool = True, reverse: bool = False, max_items: int = None) Iterable[Tuple[int, float]]
- _items(exhaustive: bool = False, max_items: int = None) Iterable[Tuple[int, float]]
Return a list of (probability, value) pairs representing this distribution.
- items(exhaustive: bool = True, max_items: int = None) Iterable[Tuple[int, float]]
Return a list of (probability, label) pairs representing this distribution.
- number_of_parameters() int
- moment(order: int = 1, center: float = 0) float
Calculate the central moment of the r-th order almost everywhere.
\[\int (x - c)^{r} p(x) \, dx\]
- Parameters:
order – The order of the moment to calculate
center – The constant to subtract in the base of the exponent
- plot(engine=None, **kwargs) Any
Plots the distribution using the given engine.
- Parameters:
engine – Can be either one of ["plotly", "matplotlib"], or an instance of a rendering engine subclassing DistributionRendering.
kwargs – The keyword arguments to pass to the engine as defined in the .plot_integer() function of DistributionRendering or its respective subclass defined by engine.
- Returns:
the figure object of the plotting engine
- jpt.distributions.univariate.IntegerType(name: str, lmin: int | None = None, lmax: int | None = None) Type[Integer]
- class jpt.distributions.univariate.Multinomial(**settings)
Bases: jpt.distributions.univariate.Distribution
Abstract supertype of all symbolic domains and distributions.
- values: MultinomialValueMap = None
- labels: MultinomialValueMap = None
- _params: numpy.ndarray | None = None
- to_json: types.MethodType
- classmethod hash()
- classmethod value2label(value: int | Iterable[int]) jpt.base.utils.Symbol | Collection[jpt.base.utils.Symbol]
- classmethod label2value(label: jpt.base.utils.Symbol | Collection[jpt.base.utils.Symbol]) int | Collection[int]
- classmethod pfmt(max_values=10, labels_or_values='labels') str
Returns a pretty-formatted string representation of this class.
By default, a set notation with value labels is used. By setting labels_or_values to "values", the internal value representation is used. If the domain comprises more than max_values values, the middle part of the list of values is abbreviated by “…”.
- property probabilities
- n_values() int
- __contains__(item)
- classmethod equiv(other)
- static jaccard_similarity(*d: Multinomial) float
Calculate the similarity of two or more Multinomial distributions.
\[\text{sim}(D_1, \ldots, D_n) = \frac{\sum_{x \in \text{dom}(D)} \min_{i}(p_i(x))}{\sum_{x \in \text{dom}(D)} \max_{i}(p_i(x))}\]
Adapted from the Jaccard coefficient:
\[\text{sim}(S_1, \ldots, S_n) = \frac{|\bigcap_{i=1}^{n} S_i|}{|\bigcup_{i=1}^{n} S_i|}\]
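The similarity formula above can be computed directly from probability tables. This is a standalone sketch under the assumption that each distribution is given as a plain dict from label to probability (labels missing from a dict count as probability 0); the function name `jaccard_sketch` is hypothetical and the code does not call into jpt.

```python
from typing import Dict, Hashable


def jaccard_sketch(*dists: Dict[Hashable, float]) -> float:
    """Sum of per-label minima divided by the sum of per-label maxima."""
    domain = set()
    for d in dists:
        domain.update(d)  # union of all labels
    numerator = sum(min(d.get(x, 0.0) for d in dists) for x in domain)
    denominator = sum(max(d.get(x, 0.0) for d in dists) for x in domain)
    return numerator / denominator if denominator else 0.0


d1 = {"a": 0.5, "b": 0.5}
d2 = {"a": 0.25, "b": 0.75}
sim = jaccard_sketch(d1, d2)   # (0.25 + 0.5) / (0.5 + 0.75)
```

Identical distributions yield a similarity of 1, and distributions with disjoint support yield 0.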
- mover_dist(other: Multinomial) float
- similarity(other: Multinomial) float
- distance(other: Multinomial) float
- __getitem__(value)
- __setitem__(label, p)
- __eq__(other)
- __str__()
- __repr__()
- sorted() Iterable[Tuple[float, jpt.base.utils.Symbol]]
Generate a sequence of (label, prob) pairs representing this distribution, ordered by descending probability.
- _items() Iterable[Tuple[float, int]]
Generate a sequence of (probability, value) pairs representing this distribution.
- items() Iterable[Tuple[float, jpt.base.utils.Symbol]]
Generate a sequence of (probability, label) pairs representing this distribution.
- copy()
- _pdf(value: int) float
- pdf(label: jpt.base.utils.Symbol) float
- p(event: jpt.base.utils.Symbol | Set[jpt.base.utils.Symbol] | List[jpt.base.utils.Symbol] | Tuple[jpt.base.utils.Symbol] | numpy.ndarray) float
Compute the probability of a certain event given this multinomial distribution.
An event can be an atomic random event, or a disjunction thereof. For example, given the domain values {'Head', 'Tail'}, event may be used as follows:
dist.p('Head')
dist.p({'Tail'})
dist.p({'Head', 'Tail'})
- Parameters:
event – the event in label space, the probability of which is to be computed.
- Returns:
the probability of the event
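The treatment of atomic events and disjunctions can be sketched without the library as follows. The dict-based distribution and the helper name `event_probability` are assumptions for illustration; in particular, treating unknown labels as probability 0 is a choice of this sketch and may differ from jpt's behavior.

```python
from typing import Dict, Hashable


def event_probability(probs: Dict[Hashable, float], event) -> float:
    """Probability of a single label or of a disjunction (collection) of labels."""
    if isinstance(event, (set, frozenset, list, tuple)):
        labels = set(event)   # a disjunction: duplicates must not be counted twice
    else:
        labels = {event}      # an atomic event
    return sum(probs.get(label, 0.0) for label in labels)


coin = {"Head": 0.25, "Tail": 0.75}   # a biased coin
p_head = event_probability(coin, "Head")
p_any = event_probability(coin, {"Head", "Tail"})
```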
- _p(event: int | Set[int] | List[int] | Tuple[int] | numpy.ndarray) float
Compute the probability of a certain event given this multinomial distribution.
See also Multinomial.p()
- Parameters:
event – the event in value space, the probability of which is to be computed.
- Returns:
the probability of the event
- create_dirac_impulse(value: int) Multinomial
Create a singular modification of this distribution object, in which the value has probability 1, whereas all other events have probability 0.
- Parameters:
value – the singular value to get assigned probability 1.
- Returns:
the created distribution object
- _sample(n: int) Iterable[int]
Returns n sample values according to their respective probability
- _sample_one() jpt.base.utils.Symbol
Returns one sample value according to its probability
- _expectation() Set[int]
Returns the value with the highest probability for this variable
- expectation() Set[jpt.base.utils.Symbol]
For symbolic variables the expectation is equal to the mpe.
- Returns:
The set of all most likely values
- mpe() Tuple[Set[jpt.base.utils.Symbol], float]
- _mpe() Tuple[Set[int], float]
Calculate the most probable configuration of this distribution in value space.
- Returns:
The mpe itself as Set and the likelihood of the mpe as float
- _k_mpe(k: int = None) List[Tuple[Set[jpt.base.utils.Symbol], float]]
- k_mpe(k: int | None = None) List[Tuple[Set[jpt.base.utils.Symbol], float]]
- mode() Set
- _mode() Set
- kl_divergence(other: Multinomial) float
Compute the KL-divergence of this distribution to the other distribution.
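For reference, the textbook Kullback-Leibler divergence between two discrete distributions over the same ordered domain can be sketched as below. Whether jpt uses the natural logarithm, another base, or a smoothed variant is not stated here, so the natural logarithm and the handling of zeros are assumptions of this sketch; `kl_sketch` is a hypothetical name.

```python
import math
from typing import Sequence


def kl_sketch(p: Sequence[float], q: Sequence[float]) -> float:
    """D_KL(p || q) = sum_i p_i * ln(p_i / q_i), with 0 * ln(0 / q) taken as 0."""
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue           # terms with p_i = 0 contribute nothing
        if qi == 0:
            return math.inf    # p puts mass where q has none
        total += pi * math.log(pi / qi)
    return total


same = kl_sketch([0.5, 0.5], [0.5, 0.5])
skewed = kl_sketch([1.0, 0.0], [0.5, 0.5])
```

Note that the divergence is not symmetric: swapping the arguments in the second call gives an infinite result.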
- _crop(restriction: int | Collection[int]) Multinomial
- crop(restriction: jpt.base.utils.Symbol | Collection[jpt.base.utils.Symbol]) Multinomial
Apply a restriction to this distribution such that all values are in the given set.
- Parameters:
restriction – The values to remain
- Returns:
Copy of self that is consistent with the restriction
- _fit(data: numpy.ndarray, rows: numpy.ndarray = None, col: int = None) Multinomial
- set(params: Iterable[numbers.Real]) Multinomial
- update(dist: Multinomial, weight: float) Multinomial
Update this multinomial distribution with dist and weight.
The resulting distribution will be a weighted mean of self and dist, where self will have a weight of (1-weight), and dist will have a weight of weight.
- Parameters:
dist – the update distribution
weight – the weight
- Returns:
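The weighted mean described for update() amounts to a convex combination of the two probability vectors. The sketch below shows that arithmetic on plain lists; it returns a new list, whereas whether the library method mutates the object in place is not specified here. The name `weighted_update` is hypothetical.

```python
from typing import List, Sequence


def weighted_update(self_probs: Sequence[float], dist_probs: Sequence[float],
                    weight: float) -> List[float]:
    """Mix two probability vectors: (1 - weight) * self + weight * dist."""
    if len(self_probs) != len(dist_probs):
        raise ValueError("both distributions must share the same domain")
    return [(1 - weight) * p + weight * q for p, q in zip(self_probs, dist_probs)]


mixed = weighted_update([1.0, 0.0], [0.0, 1.0], weight=0.25)
```

Because both inputs sum to 1 and the weights sum to 1, the result is again a valid probability vector.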
- static merge(distributions: Iterable[Multinomial], weights: Iterable[float]) Multinomial
Merge the distributions, taking the weights into account.
- Parameters:
distributions –
weights –
- Returns:
- classmethod type_to_json()
- inst_to_json()
- static type_from_json(data)
- classmethod from_json(data)
- is_dirac_impulse()
- number_of_parameters() int
- Returns:
The number of relevant parameters in this decision node. 1 if this is a dirac impulse, number of parameters else
- plot(engine=None, **kwargs) Any
Plots the distribution using the given engine.
- Parameters:
engine – Can be either one of ["plotly", "matplotlib"], or an instance of a rendering engine subclassing DistributionRendering.
kwargs – The keyword arguments to pass to the engine as defined in the .plot_multinomial() function of DistributionRendering or its respective subclass defined by engine.
- Returns:
the figure object of the plotting engine
- jpt.distributions.univariate.SymbolicType(name: str, labels: Iterable[Any]) Type[Multinomial]
- class jpt.distributions.univariate.Bool(**settings)
Bases: Multinomial
Wrapper class for Boolean domains and distributions.
- values
- labels
- __setitem__(v, p)
- class jpt.distributions.univariate.Gaussian(mean=None, cov=None, data=None, weights=None)
Bases: dnutils.stats.Gaussian
Extension of dnutils.stats.Gaussian
Creates a new Gaussian distribution.
- Parameters:
mean (float if univariate else [float] if multivariate) – the mean of the Gaussian
cov (float if univariate else [[float]] if multivariate) – the covariance of the Gaussian
data ([[float]]) – if mean and cov are not provided, data may be a data set (matrix) from which the parameters of the distribution are estimated.
weights ([float]) – [optional] weights for the data points. The weights do not need to be normalized.
- PRECISION = 1e-15
- _cl = 'jpt.distributions.univariate.gaussian.Gaussian'
- _sum_w = 0
- _sum_w_sq = 0
- _mean
- _cov
- data = []
- mean()
- cov()
- var()
- property std
- deviation(x)
Computes the deviation of x in multiples of the standard deviation.
- Parameters:
x –
- Returns:
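For a univariate Gaussian, "deviation in multiples of the standard deviation" corresponds to the familiar z-score. The sketch below shows that calculation; keeping the sign of the difference is an assumption (the library may return an absolute value, and the multivariate case is not covered), and `deviation_sketch` is a hypothetical name.

```python
import math


def deviation_sketch(x: float, mean: float, var: float) -> float:
    """Signed distance of x from the mean, measured in standard deviations."""
    if var <= 0:
        raise ValueError("variance must be positive")
    return (x - mean) / math.sqrt(var)


z = deviation_sketch(7.0, mean=5.0, var=4.0)   # std is 2, so x lies one std above the mean
```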
- __add__(alpha)
- __radd__(other)
- __iadd__(other)
- __mul__(alpha)
- __rmul__(other)
- __imul__(other)
- dim()
- sample(n)
Return n samples from this Gaussian distribution.
- Parameters:
n – number of samples
- Returns:
array of shape (n,) for 1-D or (n, d) for d-dimensional Gaussians
- property pdf
- cdf(*x)
- eval(lower, upper)
- copy()
- __eq__(other)
- linreg()
Compute a 4-tuple <m, b, rss, noise> of a linear regression represented by this Gaussian.
- Returns:
m - the slope of the line
b - the intercept of the line
rss - the residual sum-of-squares error
noise - the square of the sample correlation coefficient r^2
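The link between a two-dimensional Gaussian and a regression line can be sketched from its mean vector and covariance matrix alone. This standalone example derives the slope, the intercept, and r^2 using the standard least-squares identities; it omits rss, which additionally depends on the number of data points, and it is not the library's code. The name `linreg_sketch` is hypothetical.

```python
from typing import Sequence, Tuple


def linreg_sketch(mean: Sequence[float],
                  cov: Sequence[Sequence[float]]) -> Tuple[float, float, float]:
    """Slope, intercept and r^2 of the regression of y on x for a 2-D Gaussian."""
    var_x, cov_xy, var_y = cov[0][0], cov[0][1], cov[1][1]
    m = cov_xy / var_x                     # slope = cov(x, y) / var(x)
    b = mean[1] - m * mean[0]              # the line passes through the mean
    r2 = cov_xy ** 2 / (var_x * var_y)     # squared correlation coefficient
    return m, b, r2


# Parameters of data lying exactly on y = 2x + 1.
m, b, r2 = linreg_sketch(mean=[1.0, 3.0], cov=[[1.0, 2.0], [2.0, 4.0]])
```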
- update_all(data, weights=None)
Update the distribution with new data points given in data.
- estimate(data, weights=None)
Estimate the distribution parameters from the given data points.
- update(x, w=1)
Update the Gaussian distribution with a new data point x and weight w.
- retract(x, w=1)
Retract the data point x with weight w from the Gaussian distribution.
If the data points are kept in the distribution, the retracted point must actually exist and have the right weight associated. Otherwise, a ValueError will be raised.
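Incremental update and retraction of data points can be illustrated with a weighted running-moments accumulator for the univariate case. This is a sketch of the general technique (a weighted Welford-style update and its inverse), not the library's implementation: the class `RunningGaussian`, its attributes, and the use of the population variance are assumptions, and it does not store the data points.

```python
class RunningGaussian:
    """Univariate mean/variance that supports adding and removing weighted points."""

    def __init__(self) -> None:
        self.sum_w = 0.0   # total weight seen so far
        self.mean = 0.0
        self.m2 = 0.0      # weighted sum of squared deviations from the mean

    def update(self, x: float, w: float = 1.0) -> None:
        self.sum_w += w
        delta = x - self.mean
        self.mean += (w / self.sum_w) * delta
        self.m2 += w * delta * (x - self.mean)

    def retract(self, x: float, w: float = 1.0) -> None:
        remaining = self.sum_w - w
        if remaining < 0:
            raise ValueError("cannot retract more weight than was added")
        if remaining == 0:
            self.sum_w, self.mean, self.m2 = 0.0, 0.0, 0.0
            return
        delta = x - self.mean                       # deviation from the mean that includes x
        new_mean = self.mean - (w / remaining) * delta
        self.m2 -= w * delta * (x - new_mean)       # exact inverse of the update step
        self.mean, self.sum_w = new_mean, remaining

    @property
    def var(self) -> float:
        return self.m2 / self.sum_w if self.sum_w else 0.0


g = RunningGaussian()
for value in (1.0, 2.0, 3.0):
    g.update(value)
g.update(10.0)
g.retract(10.0)   # back to the statistics of 1, 2, 3
```

Retracting a point that was never added silently corrupts the statistics in this sketch, which is why an implementation that keeps its data can check for the point and raise an error instead.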
- sym()
- plot(engine=None, **kwargs) Any
Plots the distribution using the given engine.
- Parameters:
engine – Can be either one of
["plotly", "matplotlib"], or an instance of a rendering engine subclassingDistributionRendering.kwargs – The keyword arguments to pass to the engine as defined in the
.plot_gaussian()function ofDistributionRenderingor its respective subclass defined byengine.
- Returns:
the figure object of the plotting engine