jpt.variables
=============

.. py:module:: jpt.variables

.. autoapi-nested-parse::

   © Copyright 2021, Mareike Picklum, Daniel Nyga.


Attributes
----------

.. autoapisummary::

   jpt.variables.INVERT_IMPURITY


Classes
-------

.. autoapisummary::

   jpt.variables.Variable
   jpt.variables.NumericVariable
   jpt.variables.IntegerVariable
   jpt.variables.SymbolicVariable
   jpt.variables.VariableMap
   jpt.variables.VariableAssignment
   jpt.variables.LabelAssignment
   jpt.variables.ValueAssignment


Functions
---------

.. autoapisummary::

   jpt.variables.infer_from_dataframe


Module Contents
---------------

.. py:class:: Variable(name: str, domain: type[jpt.distributions.Distribution] | None = None, **settings)

   Abstract class for a variable name along with its distribution class type.

   :param name: name of the variable
   :param domain: the class type (not an instance!) of the represented Distribution

   .. py:attribute:: MIN_IMPURITY_IMPROVEMENT
      :value: 'min_impurity_improvement'

   .. py:attribute:: SETTINGS

   .. py:attribute:: _name

   .. py:attribute:: _domain
      :value: None

   .. py:attribute:: settings

   .. py:method:: __getattr__(name)

   .. py:property:: name
      :type: str

   .. py:property:: domain

   .. py:method:: distribution() -> jpt.distributions.Distribution

      Create and return a new instance of the distribution type attached to this variable.

   .. py:method:: __str__()

   .. py:method:: __repr__()

   .. py:method:: __eq__(other)

   .. py:method:: __hash__()

   .. py:property:: symbolic
      :type: bool

   .. py:property:: numeric
      :type: bool

   .. py:property:: integer
      :type: bool

   .. py:method:: str(assignment, **kwargs) -> str
      :abstractmethod:

   .. py:method:: to_json() -> dict[str, Any]

   .. py:method:: from_json(data: dict[str, Any]) -> NumericVariable | SymbolicVariable | IntegerVariable
      :staticmethod:

   .. py:method:: __getstate__()

   .. py:method:: __setstate__(state)

   .. py:method:: copy()

   .. py:method:: assignment2set(assignment: Any)
      :abstractmethod:

      Return a canonical representation of the variable ``assignment`` as a set in the corresponding type of set.
      For a ``NumericVariable``, a scalar ``assignment`` is converted to a ``ContinuousSet`` instance;
      for a ``SymbolicVariable``, a single value is converted to a ``set`` collection.
      If ``assignment`` is already in its canonical set representation, it is returned unmodified.


.. py:class:: NumericVariable(name: str, domain: type[jpt.distributions.Numeric] | None = Numeric, min_impurity_improvement: float | None = None, blur: float | None = None, max_std: float | None = None, precision: float | None = None)

   Bases: :py:obj:`Variable`

   Represents a continuous variable.

   :param name: name of the variable
   :param domain: the class type (not an instance!) of the represented Distribution

   .. py:attribute:: BLUR
      :value: 'blur'

   .. py:attribute:: MAX_STDEV
      :value: 'max_std_lbl'

   .. py:attribute:: PRECISION
      :value: 'precision'

   .. py:attribute:: SETTINGS

   .. py:method:: to_json() -> dict[str, Any]

   .. py:method:: from_json(data: dict[str, Any]) -> NumericVariable
      :staticmethod:

   .. py:property:: _max_std

   .. py:property:: max_std

   .. py:method:: str(assignment: list | set | numbers.Number | jpt.base.intervals.NumberSet, **kwargs) -> str

      Construct a pretty-formatted string representation of the respective variable assignment.

      :param assignment: the value(s) assigned to this variable.
      :param fmt: ["set" | "logic"] use either set or logical notation.
      :param precision: (int) the number of decimals to use for rounding.

   .. py:method:: assignment2set(assignment: float | jpt.base.intervals.NumberSet) -> jpt.base.intervals.NumberSet

      Return a canonical representation of the variable ``assignment`` as a set in the corresponding type of set.
      For a ``NumericVariable``, a scalar ``assignment`` is converted to a ``ContinuousSet`` instance;
      for a ``SymbolicVariable``, a single value is converted to a ``set`` collection.
      If ``assignment`` is already in its canonical set representation, it is returned unmodified.


.. py:class:: IntegerVariable(name: str, domain: type[jpt.distributions.Integer] | None, min_impurity_improvement: float | None = None)

   Bases: :py:obj:`Variable`

   Represents an integer-valued variable.

   :param name: name of the variable
   :param domain: the class type (not an instance!) of the represented Distribution

   .. py:method:: str(assignment, **kwargs) -> str

   .. py:method:: _intset_logic_str(i: jpt.base.intervals.IntSet) -> str

   .. py:method:: assignment2set(assignment: int | jpt.base.intervals.IntSet) -> jpt.base.intervals.IntSet

      Return a canonical representation of the variable ``assignment`` as a set in the corresponding type of set.
      For a ``NumericVariable``, a scalar ``assignment`` is converted to a ``ContinuousSet`` instance;
      for a ``SymbolicVariable``, a single value is converted to a ``set`` collection.
      If ``assignment`` is already in its canonical set representation, it is returned unmodified.

   .. py:method:: from_json(data: dict[str, Any]) -> IntegerVariable
      :staticmethod:

   .. py:method:: to_json() -> dict[str, Any]


.. py:data:: INVERT_IMPURITY
   :value: 'invert_impurity'


.. py:class:: SymbolicVariable(name: str, domain: type[jpt.distributions.Multinomial] | None, min_impurity_improvement: float | None = None, invert_impurity: bool | None = None)

   Bases: :py:obj:`Variable`

   Represents a symbolic variable.

   :param name: name of the variable
   :param domain: the class type (not an instance!) of the represented Distribution

   .. py:attribute:: SETTINGS

   .. py:method:: from_json(data: dict[str, Any]) -> SymbolicVariable
      :staticmethod:

   .. py:method:: to_json() -> dict[str, Any]

   .. py:method:: str(assignment: set | numbers.Number, **kwargs) -> str

   .. py:method:: assignment2set(assignment: Any)

      Return a canonical representation of the variable ``assignment`` as a set in the corresponding type of set.
      For a ``NumericVariable``, a scalar ``assignment`` is converted to a ``ContinuousSet`` instance;
      for a ``SymbolicVariable``, a single value is converted to a ``set`` collection.
      If ``assignment`` is already in its canonical set representation, it is returned unmodified.


.. py:function:: infer_from_dataframe(df, scale_numeric_types: bool = True, min_impurity_improvement: float | None = None, blur: float | None = None, max_std: float | None = None, precision: float | None = None, unique_domain_names: bool = False, excluded_columns: dict[str, type] | None = None, remove_nan: bool = False)

   Creates the ``Variable`` instances from column types in a Pandas or Spark data frame.

   :param df: the data frame object to generate the variables from.
   :param scale_numeric_types: whether or not to use scaled types for the numeric variables.
   :param min_impurity_improvement: the minimum improvement that a split must induce to be acceptable.
   :param blur: blur parameter for numeric variables.
   :param max_std: maximum standard deviation.
   :param precision: precision in ``[0, 1]``.
   :param unique_domain_names: if ``True``, the generated domain names will be unique across multiple calls
                               of ``infer_from_dataframe`` with data frames containing duplicate column names.
   :param excluded_columns: user-provided domains for specific columns.
   :param remove_nan: skip all ``None``, ``NaN`` and ``Inf`` values in the data when constructing
                      the numeric variable domains.


.. py:class:: VariableMap(data: list[tuple] | dict = None, variables: collections.abc.Iterable[Variable] = None)

   Convenience class for mapping a ``Variable`` object to anything else. This special map, however,
   supports accessing the image set both by the variable object instance itself *and* its name.
   ``data`` may be an iterable of (variable, value) pairs.

   .. py:attribute:: _variables

   .. py:attribute:: _map

   .. py:property:: variables
      :type: set[Variable]

   .. py:property:: varnames
      :type: dict[str, Variable]

   .. py:property:: map
      :type: dict

   .. py:method:: __getitem__(key: str | Variable) -> Any

   .. py:method:: __setitem__(variable: str | Variable, value: Any) -> None

   .. py:method:: __delitem__(key: str | Variable) -> None

   .. py:method:: __contains__(item: str | Variable) -> bool

   .. py:method:: __iter__()

   .. py:method:: __len__()

   .. py:method:: __bool__()

   .. py:method:: __eq__(o: VariableMap)

   .. py:method:: __hash__()

   .. py:method:: __isub__(other)

   .. py:method:: __iadd__(other)

   .. py:method:: get(key: str | Variable, default: Any = None) -> Any

   .. py:method:: keys() -> collections.abc.Iterator[Variable]

   .. py:method:: values() -> collections.abc.Iterator[Any]

   .. py:method:: items() -> collections.abc.Iterator[tuple]

   .. py:method:: to_json() -> dict[str, Any]

   .. py:method:: update(varmap: VariableMap) -> VariableMap

   .. py:method:: copy(deep: bool = False) -> VariableMap

   .. py:method:: from_json(variables: collections.abc.Iterable[Variable], d: dict[str, Any], typ=None, args=()) -> VariableMap
      :classmethod:

   .. py:method:: __repr__()


.. py:class:: VariableAssignment(data: collections.abc.Iterable[tuple] = None, variables: collections.abc.Iterable[Variable] = None)

   Bases: :py:obj:`VariableMap`

   Specialization of a ``VariableMap`` that maps a set of variables to values of the respective variables.
   This is an abstract base class that cannot be instantiated. Use one of its two specializations,
   ``LabelAssignment`` or ``ValueAssignment``, instead.
   ``data`` may be an iterable of (variable, value) pairs.

   .. py:method:: scalar2sets()

   .. py:method:: from_json(variables: collections.abc.Iterable[Variable], d: dict[str, Any], typ=None, args=()) -> VariableMap
      :classmethod:


.. py:class:: LabelAssignment(data: collections.abc.Iterable[tuple] = None, variables: collections.abc.Iterable[Variable] = None)

   Bases: :py:obj:`VariableAssignment`

   Maps a set of variables to values represented by their exterior representation, i.e. the perspective of a user.
   ``data`` may be an iterable of (variable, value) pairs.

   .. py:method:: __setitem__(variable: Variable, value: set[int] | set[str] | jpt.base.intervals.NumberSet | numbers.Number | str) -> None

   .. py:method:: value_assignment() -> ValueAssignment

   .. py:method:: to_json() -> dict[str, Any]

      Convert this ``LabelAssignment`` to a JSON-serializable dictionary.
      To achieve that, sets are replaced with lists.


.. py:class:: ValueAssignment(data: collections.abc.Iterable[tuple] = None, variables: collections.abc.Iterable[Variable] = None)

   Bases: :py:obj:`VariableAssignment`

   Maps a set of variables to values represented by their interior representation, i.e. the internal value
   representation used by JPTs.
   ``data`` may be an iterable of (variable, value) pairs.

   .. py:method:: __setitem__(variable: Variable, value: set[int] | jpt.base.intervals.NumberSet | numbers.Number) -> None

   .. py:method:: label_assignment() -> LabelAssignment
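

The dual lookup described for ``VariableMap`` (access by the variable object *and* by its name) can be illustrated with a small stand-alone sketch. It does not import or reproduce jpt's implementation: ``SketchVariable`` and ``SketchVariableMap`` are hypothetical names introduced here only for illustration, and the sketch assumes that every variable has a unique name. Unlike the documented ``__setitem__``, the toy setter accepts only variable objects.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable, Iterator, Optional, Tuple, Union


@dataclass(frozen=True)
class SketchVariable:
    """Hypothetical stand-in for a variable; only its name matters here."""
    name: str


class SketchVariableMap:
    """Toy map keyed by variable, addressable by object or by name."""

    def __init__(
        self,
        data: Optional[Iterable[Tuple[SketchVariable, Any]]] = None,
    ) -> None:
        self._variables: Dict[str, SketchVariable] = {}  # name -> variable
        self._map: Dict[str, Any] = {}                   # name -> value
        for variable, value in (data or ()):
            self[variable] = value

    @staticmethod
    def _name(key: Union[str, SketchVariable]) -> str:
        # Normalize both kinds of key to the variable name.
        return key if isinstance(key, str) else key.name

    def __setitem__(self, variable: SketchVariable, value: Any) -> None:
        # Simplification: new entries must be registered via the object.
        self._variables[variable.name] = variable
        self._map[variable.name] = value

    def __getitem__(self, key: Union[str, SketchVariable]) -> Any:
        return self._map[self._name(key)]

    def __contains__(self, key: Union[str, SketchVariable]) -> bool:
        return self._name(key) in self._map

    def __len__(self) -> int:
        return len(self._map)

    def items(self) -> Iterator[Tuple[SketchVariable, Any]]:
        # Yield (variable, value) pairs, mirroring the ``data`` format.
        for name, value in self._map.items():
            yield self._variables[name], value


x = SketchVariable("x")
varmap = SketchVariableMap([(x, 1.5)])
print(varmap[x], varmap["x"])  # both lookups reach the same entry
```

Storing values under the name keeps the two access paths consistent with a single dictionary; whether jpt uses the same internal layout is not stated in this reference.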