jpt.learning.dependency
Dependency discovery for JPT learning.
Submodules
Classes
DependencyDiscovery | Abstract base class for dependency discovery.
XiDependencyDiscovery | Dependency discovery via Chatterjee's xi.
Package Contents
- class jpt.learning.dependency.DependencyDiscovery
Bases: abc.ABC
Abstract base class for dependency discovery.
Subclasses implement a strategy for determining which target variables depend on which features, given the training data. The result is used by the JPT learning algorithm to restrict impurity computation to dependent variable pairs.
Implementations must be serializable via to_json()/from_json() so that the discovery strategy is preserved when the JPT model is saved and loaded.
- _REGISTRY: dict[str, type[DependencyDiscovery]]
- classmethod __init_subclass__(**kwargs: Any) -> None
Auto-register subclasses for deserialization.
- abstract __call__(data: numpy.ndarray, features: list[jpt.variables.Variable], targets: list[jpt.variables.Variable], variables: list[jpt.variables.Variable]) -> dict[jpt.variables.Variable, list[jpt.variables.Variable]]
Discover dependencies from data.
- Parameters:
data – preprocessed data array (n_samples x n_variables)
features – list of feature Variables
targets – list of target Variables
variables – list of all Variables (defines column order)
- Returns:
dict mapping each feature Variable to a list of dependent target Variables
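The call contract above can be illustrated with a minimal sketch. It is not the library's implementation: plain strings stand in for `jpt.variables.Variable` objects, and the hypothetical `discover_by_correlation` function with its `threshold` parameter uses absolute Pearson correlation purely as a placeholder criterion. The point is only the shape of the inputs and of the returned mapping.

```python
import numpy as np


def discover_by_correlation(data, features, targets, variables, threshold=0.3):
    """Toy strategy (hypothetical): keep targets whose |Pearson r| >= threshold."""
    # `variables` defines the column order of `data`.
    column = {v: i for i, v in enumerate(variables)}
    dependencies = {}
    for f in features:
        x = data[:, column[f]]
        dependencies[f] = [
            t for t in targets
            if abs(np.corrcoef(x, data[:, column[t]])[0, 1]) >= threshold
        ]
    return dependencies


# Strings stand in for Variable objects (assumption for illustration only).
a = np.arange(50, dtype=float)
data = np.column_stack([a, 2 * a + 1, np.where(a % 2 == 0, 1.0, -1.0)])
result = discover_by_correlation(data, ["a"], ["b", "c"], ["a", "b", "c"])
print(result)  # {'a': ['b']}
```

Each feature appears as a key even when its list of dependent targets is empty, which matches the "dict mapping each feature Variable to a list" wording of the return description.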
- abstract to_json() -> dict[str, Any]
Serialize the strategy configuration.
Must include a 'type' key with the class name for deserialization dispatch.
- Returns:
JSON-serializable dict
- classmethod from_json(data: dict[str, Any]) -> DependencyDiscovery | None
Deserialize a strategy from JSON.
Dispatches to the appropriate subclass based on the 'type' key.
- Parameters:
data – dict from to_json()
- Returns:
DependencyDiscovery instance
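The registration and dispatch mechanism described for this class can be sketched as follows. This is a stand-alone illustration of the pattern, not the library's code: `Strategy` and `ThresholdStrategy` are hypothetical names, and returning `None` for an unknown `'type'` is an assumption suggested only by the `DependencyDiscovery | None` return annotation.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from typing import Any


class Strategy(ABC):
    """Hypothetical stand-in for an abstract base with a subclass registry."""

    _REGISTRY: dict[str, type[Strategy]] = {}

    def __init_subclass__(cls, **kwargs: Any) -> None:
        super().__init_subclass__(**kwargs)
        # Every subclass registers itself under its class name.
        Strategy._REGISTRY[cls.__name__] = cls

    @abstractmethod
    def to_json(self) -> dict[str, Any]:
        ...

    @classmethod
    def from_json(cls, data: dict[str, Any]) -> Strategy | None:
        # Look up the subclass named by the 'type' key.
        subclass = Strategy._REGISTRY.get(data.get("type"))
        if subclass is None:
            return None  # assumption: unknown types yield None
        return subclass.from_json(data)


class ThresholdStrategy(Strategy):
    """Hypothetical concrete strategy with a single parameter."""

    def __init__(self, alpha: float = 0.05) -> None:
        self.alpha = alpha

    def to_json(self) -> dict[str, Any]:
        return {"type": type(self).__name__, "alpha": self.alpha}

    @classmethod
    def from_json(cls, data: dict[str, Any]) -> ThresholdStrategy:
        return cls(alpha=data["alpha"])


restored = Strategy.from_json(ThresholdStrategy(alpha=0.01).to_json())
print(type(restored).__name__, restored.alpha)  # ThresholdStrategy 0.01
```

Because registration happens in `__init_subclass__`, a new strategy becomes loadable as soon as its class is defined, with no separate registration call.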
- class jpt.learning.dependency.XiDependencyDiscovery(alpha: float = 0.05)
Bases: jpt.learning.dependency.base.DependencyDiscovery
Dependency discovery via Chatterjee’s xi.
Computes the xi correlation between all feature-target pairs and retains only those where the correlation is statistically significant under the asymptotic null distribution of xi.
Under H0 (X and Y independent, Y continuous), sqrt(n) * xi converges in distribution to N(0, 2/5) as n grows.
- Parameters:
alpha – significance level for the independence test
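The test described above can be sketched in a few lines. This is an illustration under stated assumptions, not the library's implementation: it uses the no-ties formula for xi, a one-sided test against the N(0, 2/5) limit, and the hypothetical helper names `xi_correlation` and `xi_is_significant`. How the library treats ties or which tail it tests is not stated in this reference.

```python
import math

import numpy as np


def xi_correlation(x, y):
    """Chatterjee's xi for samples without ties (sketch)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(x)
    # Sort the pairs by x, then take the ranks (1..n) of y in that order.
    order = np.argsort(x, kind="stable")
    ranks = np.argsort(np.argsort(y[order])) + 1
    return 1.0 - 3.0 * np.abs(np.diff(ranks)).sum() / (n**2 - 1)


def xi_is_significant(x, y, alpha=0.05):
    """One-sided test using sqrt(n) * xi ~ N(0, 2/5) under H0 (assumption)."""
    n = len(x)
    z = math.sqrt(n) * xi_correlation(x, y) / math.sqrt(2.0 / 5.0)
    # Upper-tail probability of a standard normal at z.
    p_value = 0.5 * math.erfc(z / math.sqrt(2.0))
    return p_value < alpha


x = np.linspace(0.0, 1.0, 200)
print(xi_is_significant(x, x**2))  # True
```

For a strictly monotone relationship with n samples and no ties, the formula gives xi = 1 - 3 / (n + 1), so the statistic approaches 1 as n grows and the null hypothesis is rejected at any usual significance level.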
- alpha: float = 0.05
- __call__(data: numpy.ndarray, features: list[jpt.variables.Variable], targets: list[jpt.variables.Variable], variables: list[jpt.variables.Variable]) -> dict[jpt.variables.Variable, list[jpt.variables.Variable]]
Discover dependencies via xi correlation.
- Parameters:
data – data array (n x d)
features – list of feature Variables
targets – list of target Variables
variables – list of all Variables
- Returns:
dict mapping features to their dependent targets
- to_json() -> dict[str, Any]
Serialize configuration.
- Returns:
JSON-serializable dict
- classmethod from_json(data: dict[str, Any]) -> XiDependencyDiscovery
Restore from JSON.
- Parameters:
data – dict from to_json()
- Returns:
XiDependencyDiscovery instance