jpt.variables

© Copyright 2021, Mareike Picklum, Daniel Nyga.

Attributes

INVERT_IMPURITY

Classes

Variable

Abstract class for a variable name along with its distribution class type.

NumericVariable

Represents a continuous variable.

IntegerVariable

Represents an integer-valued variable.

SymbolicVariable

Represents a symbolic variable.

VariableMap

Convenience class for mapping a Variable object to anything else.

VariableAssignment

Specialization of a VariableMap that maps a set of variables to values of the respective variables.

LabelAssignment

Maps a set of variables to values represented by their exterior representation, i.e. the perspective of a user.

ValueAssignment

Maps a set of variables to values represented by their interior representation, i.e. the internal value representation used by JPTs.

Functions

infer_from_dataframe(df[, scale_numeric_types, ...])

Creates the Variable instances from column types in a Pandas or Spark data frame.

Module Contents

class jpt.variables.Variable(name: str, domain: type[jpt.distributions.Distribution] | None = None, **settings)

Abstract class for a variable name along with its distribution class type.

Parameters:
  • name – name of the variable

  • domain – the class type (not an instance!) of the represented Distribution

MIN_IMPURITY_IMPROVEMENT = 'min_impurity_improvement'
SETTINGS
_name
_domain = None
settings
__getattr__(name)
property name: str
property domain
distribution() → jpt.distributions.Distribution

Create and return a new instance of the distribution type attached to this variable.
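The distinction between a class type and an instance can be illustrated with a minimal, self-contained sketch. It does not import jpt; `Distribution`, `Gaussian`, and `SketchVariable` are hypothetical stand-ins, and the real `Variable` may construct its distribution differently.

```python
class Distribution:
    """Hypothetical stand-in for jpt.distributions.Distribution."""


class Gaussian(Distribution):
    """Hypothetical concrete distribution type, used only in this sketch."""


class SketchVariable:
    """Stores a distribution *class* and instantiates it on demand."""

    def __init__(self, name, domain=None):
        self.name = name
        self.domain = domain  # a class object, not an instance

    def distribution(self):
        # Calling the stored class yields a fresh, independent instance.
        return self.domain()


x = SketchVariable("x", Gaussian)
d1 = x.distribution()
d2 = x.distribution()
print(type(d1).__name__, d1 is d2)
```

Each call returns a new object, so state attached to one returned distribution does not leak into another.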

__str__()
__repr__()
__eq__(other)
__hash__()
property symbolic: bool
property numeric: bool
property integer: bool
abstract str(assignment, **kwargs) → str
to_json() → dict[str, Any]
static from_json(data: dict[str, Any]) → NumericVariable | SymbolicVariable | IntegerVariable
__getstate__()
__setstate__(state)
copy()
abstract assignment2set(assignment: Any)

Return the canonical set representation of a variable assignment, using the set type that corresponds to the variable.

For a NumericVariable, a scalar assignment is converted to a ContinuousSet instance; for a SymbolicVariable, a single value is converted to a set collection.

If assignment is already in its canonical set representation, it is returned unmodified.
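The symbolic case of this contract can be sketched with built-in sets. The helper name `to_canonical_set` is hypothetical and the snippet does not use jpt; the numeric case (scalar to ContinuousSet) is not modelled here.

```python
def to_canonical_set(assignment):
    """Wrap a single value in a set; pass existing sets through untouched."""
    if isinstance(assignment, (set, frozenset)):
        # Already canonical: return the very same object, unmodified.
        return assignment
    return {assignment}


labels = {"red", "green"}
single = to_canonical_set("red")
same = to_canonical_set(labels)
print(single, same is labels)
```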

class jpt.variables.NumericVariable(name: str, domain: type[jpt.distributions.Numeric] | None = Numeric, min_impurity_improvement: float | None = None, blur: float | None = None, max_std: float | None = None, precision: float | None = None)

Bases: Variable

Represents a continuous variable.

Parameters:
  • name – name of the variable

  • domain – the class type (not an instance!) of the represented Distribution

BLUR = 'blur'
MAX_STDEV = 'max_std_lbl'
PRECISION = 'precision'
SETTINGS
to_json() → dict[str, Any]
static from_json(data: dict[str, Any]) → NumericVariable
property _max_std
property max_std
str(assignment: list | set | numbers.Number | jpt.base.intervals.NumberSet, **kwargs) → str

Construct a pretty-formatted string representation of the respective variable assignment.

Parameters:
  • assignment – the value(s) assigned to this variable.

  • fmt – [“set” | “logic”] use either set or logical notation.

  • precision – (int) the number of decimals to use for rounding.
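To make the difference between the two notations concrete, here is a small sketch for a closed interval. The function `format_interval` and its exact output strings are assumptions chosen for illustration; the strings produced by `NumericVariable.str` may look different.

```python
def format_interval(name, lower, upper, fmt="set", precision=2):
    """Render lower <= name <= upper in set or logical notation."""
    lo = format(lower, f".{precision}f")  # round to `precision` decimals
    hi = format(upper, f".{precision}f")
    if fmt == "set":
        return f"{name} in [{lo}, {hi}]"
    if fmt == "logic":
        return f"{lo} <= {name} <= {hi}"
    raise ValueError(f"unknown fmt: {fmt!r}")


as_set = format_interval("x", 0.12345, 1.98765, fmt="set")
as_logic = format_interval("x", 0.12345, 1.98765, fmt="logic")
print(as_set)
print(as_logic)
```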

assignment2set(assignment: float | jpt.base.intervals.NumberSet) → jpt.base.intervals.NumberSet

Return the canonical set representation of a variable assignment, using the set type that corresponds to the variable.

For a NumericVariable, a scalar assignment is converted to a ContinuousSet instance; for a SymbolicVariable, a single value is converted to a set collection.

If assignment is already in its canonical set representation, it is returned unmodified.

class jpt.variables.IntegerVariable(name: str, domain: type[jpt.distributions.Integer] | None, min_impurity_improvement: float | None = None)

Bases: Variable

Represents an integer-valued variable.

Parameters:
  • name – name of the variable

  • domain – the class type (not an instance!) of the represented Distribution

str(assignment, **kwargs) → str
_intset_logic_str(i: jpt.base.intervals.IntSet) → str
assignment2set(assignment: int | jpt.base.intervals.IntSet) → jpt.base.intervals.IntSet

Return the canonical set representation of a variable assignment, using the set type that corresponds to the variable.

For a NumericVariable, a scalar assignment is converted to a ContinuousSet instance; for a SymbolicVariable, a single value is converted to a set collection.

If assignment is already in its canonical set representation, it is returned unmodified.

static from_json(data: dict[str, Any]) → IntegerVariable
to_json() → dict[str, Any]
jpt.variables.INVERT_IMPURITY = 'invert_impurity'
class jpt.variables.SymbolicVariable(name: str, domain: type[jpt.distributions.Multinomial] | None, min_impurity_improvement: float | None = None, invert_impurity: bool | None = None)

Bases: Variable

Represents a symbolic variable.

Parameters:
  • name – name of the variable

  • domain – the class type (not an instance!) of the represented Distribution

SETTINGS
static from_json(data: dict[str, Any]) → SymbolicVariable
to_json() → dict[str, Any]
str(assignment: set | numbers.Number, **kwargs) → str
assignment2set(assignment: Any)

Return the canonical set representation of a variable assignment, using the set type that corresponds to the variable.

For a NumericVariable, a scalar assignment is converted to a ContinuousSet instance; for a SymbolicVariable, a single value is converted to a set collection.

If assignment is already in its canonical set representation, it is returned unmodified.

jpt.variables.infer_from_dataframe(df, scale_numeric_types: bool = True, min_impurity_improvement: float | None = None, blur: float | None = None, max_std: float | None = None, precision: float | None = None, unique_domain_names: bool = False, excluded_columns: dict[str, type] | None = None, remove_nan: bool = False)

Creates the Variable instances from column types in a Pandas or Spark data frame.

Parameters:
  • df – the data frame object to generate the variables from.

  • scale_numeric_types – whether or not to use scaled types for the numeric variables.

  • min_impurity_improvement – the minimum improvement that a split must induce to be acceptable.

  • blur – blur parameter for numeric variables.

  • max_std – maximum standard deviation.

  • precision – precision in [0, 1].

  • unique_domain_names – if set, the generated domain names are unique even when multiple calls of infer_from_dataframe encounter duplicate column names.

  • excluded_columns – user-provided domains for specific columns.

  • remove_nan – skip all None, NaN, and Inf values in the data when constructing the numeric variable domains.
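The idea of deriving variable kinds from column contents can be sketched without jpt or pandas. Everything here is an assumption made for illustration: a plain dict of lists stands in for the data frame, `infer_kinds` is a hypothetical helper, and the rule "all ints → integer, ints or floats → numeric, anything else → symbolic" is not taken from the library, which inspects the column types of the data frame.

```python
import math


def _is_missing(value):
    """True for None, NaN and +/-Inf."""
    return value is None or (isinstance(value, float) and not math.isfinite(value))


def _all_instances(values, types):
    # bool is a subclass of int, so it is excluded explicitly.
    return bool(values) and all(
        isinstance(v, types) and not isinstance(v, bool) for v in values
    )


def infer_kinds(columns, remove_nan=False):
    """Map each column name to 'integer', 'numeric' or 'symbolic'."""
    kinds = {}
    for name, values in columns.items():
        if remove_nan:
            # Drop missing entries before looking at the remaining values.
            values = [v for v in values if not _is_missing(v)]
        if _all_instances(values, int):
            kinds[name] = "integer"
        elif _all_instances(values, (int, float)):
            kinds[name] = "numeric"
        else:
            kinds[name] = "symbolic"
    return kinds


columns = {
    "height": [1.82, float("nan"), 1.65],
    "children": [0, 2, None],
    "color": ["red", "blue", "red"],
}
kinds = infer_kinds(columns, remove_nan=True)
print(kinds)
```

In this sketch, leaving remove_nan at False lets the None entry in "children" push that column into the symbolic fallback, which shows why filtering missing values first can matter.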

class jpt.variables.VariableMap(data: list[tuple] | dict = None, variables: collections.abc.Iterable[Variable] = None)

Convenience class for mapping a Variable object to anything else. Unlike a plain dictionary, this map supports accessing a mapped value both by the variable object itself and by its name.

data may be an iterable of (variable, value) pairs.
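The dual lookup can be sketched in a few lines. The classes `Var` and `NameOrObjectMap` are hypothetical stand-ins that do not import jpt, and the real VariableMap offers a much larger interface; in particular, this sketch only accepts variable objects when setting items.

```python
class Var:
    """Hypothetical stand-in for a Variable, identified by its name."""

    def __init__(self, name):
        self.name = name


class NameOrObjectMap:
    """Maps variables to values; lookups accept the object or its name."""

    def __init__(self, data=None):
        self._variables = {}  # name -> variable object
        self._map = {}        # name -> mapped value
        for var, value in (data or []):
            self[var] = value

    @staticmethod
    def _key(key):
        # Normalise both kinds of key to the variable name.
        return key if isinstance(key, str) else key.name

    def __setitem__(self, var, value):
        self._variables[var.name] = var
        self._map[var.name] = value

    def __getitem__(self, key):
        return self._map[self._key(key)]

    def __contains__(self, key):
        return self._key(key) in self._map


x = Var("x")
m = NameOrObjectMap([(x, 42)])
print(m[x], m["x"], "y" in m)
```

Keying the internal dictionaries by name is one simple way to get both access paths; it relies on variable names being unique within one map.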

_variables
_map
property variables: set[Variable]
property varnames: dict[str, Variable]
property map: dict
__getitem__(key: str | Variable) → Any
__setitem__(variable: str | Variable, value: Any) → None
__delitem__(key: str | Variable) → None
__contains__(item: str | Variable) → bool
__iter__()
__len__()
__bool__()
__eq__(o: VariableMap)
__hash__()
__isub__(other)
__iadd__(other)
get(key: str | Variable, default: Any = None) → Any
keys() → collections.abc.Iterator[Variable]
values() → collections.abc.Iterator[Any]
items() → collections.abc.Iterator[tuple]
to_json() → dict[str, Any]
update(varmap: VariableMap) → VariableMap
copy(deep: bool = False) → VariableMap
classmethod from_json(variables: collections.abc.Iterable[Variable], d: dict[str, Any], typ=None, args=()) → VariableMap
__repr__()
class jpt.variables.VariableAssignment(data: collections.abc.Iterable[tuple] = None, variables: collections.abc.Iterable[Variable] = None)

Bases: VariableMap

Specialization of a VariableMap that maps a set of variables to values of the respective variables. This is an abstract base class that cannot be instantiated; use one of its two specializations, LabelAssignment or ValueAssignment, instead.

data may be an iterable of (variable, value) pairs.

scalar2sets()
classmethod from_json(variables: collections.abc.Iterable[Variable], d: dict[str, Any], typ=None, args=()) → VariableMap
class jpt.variables.LabelAssignment(data: collections.abc.Iterable[tuple] = None, variables: collections.abc.Iterable[Variable] = None)

Bases: VariableAssignment

Maps a set of variables to values represented by their exterior representation, i.e. the perspective of a user.

data may be an iterable of (variable, value) pairs.

__setitem__(variable: Variable, value: set[int] | set[str] | jpt.base.intervals.NumberSet | numbers.Number | str) → None
value_assignment() → ValueAssignment
to_json() → dict[str, Any]

Convert this LabelAssignment to a JSON-serializable dictionary. To achieve that, sets are replaced with lists.
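The replacement is needed because Python's json module cannot encode sets. A minimal sketch of the idea follows; the helper `jsonable` is hypothetical, operates on a plain dict keyed by variable name, and sorts the elements only to make the output deterministic, which the library may not do.

```python
import json


def jsonable(assignment):
    """Return a copy of the mapping in which every set becomes a list."""
    return {
        name: sorted(value) if isinstance(value, (set, frozenset)) else value
        for name, value in assignment.items()
    }


payload = jsonable({"color": {"red", "blue"}, "size": 3})
encoded = json.dumps(payload)
print(encoded)
```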

class jpt.variables.ValueAssignment(data: collections.abc.Iterable[tuple] = None, variables: collections.abc.Iterable[Variable] = None)

Bases: VariableAssignment

Maps a set of variables to values represented by their interior representation, i.e. the internal value representation used by JPTs.

data may be an iterable of (variable, value) pairs.

__setitem__(variable: Variable, value: set[int] | jpt.base.intervals.NumberSet | numbers.Number) → None
label_assignment() → LabelAssignment
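The relationship between the exterior (label) and interior (value) representations can be sketched for a single symbolic domain. This is an illustration under stated assumptions, not the library's conversion code: the domain `LABELS` and both helper functions are hypothetical, and the interior representation is assumed to be the integer index of each label.

```python
LABELS = ["red", "green", "blue"]  # hypothetical symbolic domain


def to_values(labels):
    """Exterior -> interior: replace each label by its index."""
    return {LABELS.index(label) for label in labels}


def to_labels(values):
    """Interior -> exterior: replace each index by its label."""
    return {LABELS[value] for value in values}


values = to_values({"red", "blue"})
labels = to_labels(values)
print(values, labels)
```

Under this assumption the two conversions are inverses of each other, which is the round trip that value_assignment() and label_assignment() suggest.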