jpt.learning.dependency

Dependency discovery for JPT learning.

Classes

DependencyDiscovery

Abstract base class for dependency discovery.

XiDependencyDiscovery

Dependency discovery via Chatterjee's xi.

Package Contents

class jpt.learning.dependency.DependencyDiscovery

Bases: abc.ABC

Abstract base class for dependency discovery.

Subclasses implement a strategy for determining which target variables depend on which features, given the training data. The result is used by the JPT learning algorithm to restrict impurity computation to dependent variable pairs.

Implementations must be serializable via to_json()/from_json() so that the discovery strategy is preserved when the JPT model is saved and loaded.

_REGISTRY: dict[str, type[DependencyDiscovery]]

classmethod __init_subclass__(**kwargs: Any) → None

Auto-register subclasses for deserialization.

abstract __call__(data: numpy.ndarray, features: list[jpt.variables.Variable], targets: list[jpt.variables.Variable], variables: list[jpt.variables.Variable]) → dict[jpt.variables.Variable, list[jpt.variables.Variable]]

Discover dependencies from data.

Parameters:
  • data – preprocessed data array (n_samples x n_variables)

  • features – list of feature Variables

  • targets – list of target Variables

  • variables – list of all Variables (defines column order)

Returns:

dict mapping each feature Variable to a list of dependent target Variables

abstract to_json() → dict[str, Any]

Serialize the strategy configuration.

Must include a 'type' key with the class name for deserialization dispatch.

Returns:

JSON-serializable dict

classmethod from_json(data: dict[str, Any]) → DependencyDiscovery | None

Deserialize a strategy from JSON.

Dispatches to the appropriate subclass based on the 'type' key.

Parameters:

data – dict from to_json()

Returns:

DependencyDiscovery instance, or None as permitted by the return annotation
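The registration and dispatch pattern described above can be sketched as follows. This is a minimal, self-contained illustration and not the library's implementation: Strategy and ThresholdStrategy are hypothetical stand-ins, the "alpha" key in the JSON dict is an assumed layout, and returning None for an unknown 'type' is only one plausible reading of the `DependencyDiscovery | None` annotation.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any


class Strategy(ABC):
    """Hypothetical stand-in for DependencyDiscovery (not the library's code)."""

    # Maps a class name to the class, filled in as subclasses are defined.
    _REGISTRY: dict[str, type[Strategy]] = {}

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # Auto-register every subclass under its class name.
        Strategy._REGISTRY[cls.__name__] = cls

    @abstractmethod
    def to_json(self) -> dict[str, Any]:
        """Return a JSON-serializable dict that includes a 'type' key."""

    @classmethod
    def from_json(cls, data: dict[str, Any]) -> Strategy | None:
        # Dispatch on the 'type' key. Assumption: an unknown or missing
        # type yields None instead of raising.
        target = Strategy._REGISTRY.get(data.get("type"))
        if target is None:
            return None
        # Subclasses are expected to override from_json themselves.
        return target.from_json(data)


class ThresholdStrategy(Strategy):
    """Hypothetical concrete strategy with a single parameter."""

    def __init__(self, alpha: float = 0.05) -> None:
        self.alpha = alpha

    def to_json(self) -> dict[str, Any]:
        # The 'type' key carries the class name used for dispatch.
        return {"type": type(self).__name__, "alpha": self.alpha}

    @classmethod
    def from_json(cls, data: dict[str, Any]) -> ThresholdStrategy:
        return cls(alpha=data["alpha"])


# Round trip through the base class: the 'type' key selects the subclass.
restored = Strategy.from_json(ThresholdStrategy(alpha=0.01).to_json())
```

Keying the registry by class name keeps the serialized form free of import paths, at the cost of requiring unique class names across all registered strategies.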

class jpt.learning.dependency.XiDependencyDiscovery(alpha: float = 0.05)

Bases: jpt.learning.dependency.base.DependencyDiscovery

Dependency discovery via Chatterjee’s xi.

Computes the xi correlation for every feature-target pair and retains only the pairs whose correlation is statistically significant at level alpha under the asymptotic null distribution of xi.

Under H0 (independence, with continuous Y), sqrt(n) * xi converges in distribution to N(0, 2/5), where n is the number of samples and 2/5 is the variance.

Parameters:

alpha – significance level for the independence test

alpha: float = 0.05

__call__(data: numpy.ndarray, features: list[jpt.variables.Variable], targets: list[jpt.variables.Variable], variables: list[jpt.variables.Variable]) → dict[jpt.variables.Variable, list[jpt.variables.Variable]]

Discover dependencies via xi correlation.

Parameters:
  • data – preprocessed data array (n_samples x n_variables)

  • features – list of feature Variables

  • targets – list of target Variables

  • variables – list of all Variables

Returns:

dict mapping features to their dependent targets
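The test described for this class can be sketched in plain NumPy as shown below. This is an illustration under stated assumptions, not the library's code: the function names are hypothetical, columns are addressed by integer index instead of Variable objects, the simplified xi formula assumes no ties in Y, ties in X are resolved by a stable sort, and the test is taken to be one-sided (large xi rejects independence).

```python
import math

import numpy as np


def xi_correlation(x, y):
    """Chatterjee's xi of y on x, simplified formula assuming no ties in y."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Order the samples by x, then rank the y values in that order (1..n).
    order = np.argsort(x, kind="stable")
    ranks = np.argsort(np.argsort(y[order], kind="stable"), kind="stable") + 1
    # xi = 1 - 3 * sum |r_{i+1} - r_i| / (n^2 - 1)
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)


def xi_p_value(xi, n):
    """Upper-tail p-value using sqrt(n) * xi ~ N(0, 2/5) under H0."""
    z = math.sqrt(n) * xi / math.sqrt(2.0 / 5.0)
    # Survival function of the standard normal via the complementary
    # error function.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


def discover(data, feature_cols, target_cols, alpha=0.05):
    """Map each feature column to the target columns that pass the test."""
    n = data.shape[0]
    deps = {}
    for f in feature_cols:
        deps[f] = [
            t
            for t in target_cols
            if xi_p_value(xi_correlation(data[:, f], data[:, t]), n) < alpha
        ]
    return deps


# Deterministic example: column 1 is a monotone function of column 0,
# column 2 zigzags between low and high values (negative xi).
x = np.arange(100, dtype=float)
zigzag = np.empty(100)
zigzag[0::2] = np.arange(50)
zigzag[1::2] = np.arange(99, 49, -1)
data = np.column_stack([x, x**2, zigzag])
deps = discover(data, feature_cols=[0], target_cols=[1, 2])
# deps == {0: [1]}: only the monotone column is kept.
```

For strictly monotone data without ties the rank differences are all 1, so xi equals 1 - 3/(n + 1), which approaches 1 as n grows. Because xi is asymmetric, the sketch always treats the feature as the conditioning variable and the target as the response.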

to_json() → dict[str, Any]

Serialize configuration.

Returns:

JSON-serializable dict

classmethod from_json(data: dict[str, Any]) → XiDependencyDiscovery

Restore from JSON.

Parameters:

data – dict from to_json()

Returns:

XiDependencyDiscovery instance