Quick Start
Install pyjpt from PyPI:
pip install "pyjpt[matplotlib]"
Complete Workflow
The example below loads the Iris dataset, fits a JPT, and asks four kinds of probabilistic questions (a marginal, a conditional, a full posterior, and a most probable explanation) in roughly 30 lines of code.
import pandas as pd
import sklearn.datasets
from jpt.variables import infer_from_dataframe
from jpt.trees import JPT
# 1 ── Load data
iris = sklearn.datasets.load_iris()
df = pd.DataFrame(iris.data, columns=iris.feature_names)
df['species'] = [iris.target_names[t] for t in iris.target]
# 2 ── Infer variable types and fit
variables = infer_from_dataframe(df)
model = JPT(variables, min_samples_leaf=0.1)
model.fit(df)
# 3 ── Marginal probability
p = model.infer(query={'species': 'setosa'})
print(f'P(setosa) = {p:.3f}')
# 4 ── Conditional probability P(setosa | petal length ∈ [1, 2])
p_cond = model.infer(
    query={'species': 'setosa'},
    evidence={'petal length (cm)': [1.0, 2.0]}
)
print(f'P(setosa | petal length ∈ [1,2]) = {p_cond:.3f}')
# 5 ── Full posterior over species given petal width ≤ 0.5
post = model.posterior(
    variables=['species'],
    evidence={'petal width (cm)': [0.0, 0.5]}
)
for label in model.varnames['species'].domain:
    print(f' P({label} | narrow petal) = {post[model.varnames["species"]].p(label):.3f}')
# 6 ── Most probable explanation
assignment, likelihood = model.mpe(evidence={'species': 'virginica'})
print(f'MPE (virginica): {assignment[0]} likelihood={likelihood:.4f}')
How It Works
jpt.variables.infer_from_dataframe() inspects the DataFrame’s
column dtypes and creates one variable per column:
float/int columns → NumericVariable
object/category columns → SymbolicVariable
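The dtype rule above can be sketched with plain pandas. Note that guess_variable_kinds is a hypothetical helper written for this illustration and is not part of pyjpt; the real infer_from_dataframe() may apply further rules, so treat this only as a picture of the mapping.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype


def guess_variable_kinds(df: pd.DataFrame) -> dict:
    """Map each column name to the variable class name suggested by its dtype.

    Hypothetical helper for illustration only; it is not part of pyjpt.
    """
    kinds = {}
    for name, dtype in df.dtypes.items():
        # Numeric dtypes (float, int) map to NumericVariable;
        # everything else (strings, categories) maps to SymbolicVariable.
        kinds[name] = 'NumericVariable' if is_numeric_dtype(dtype) else 'SymbolicVariable'
    return kinds


toy = pd.DataFrame({
    'petal length (cm)': [1.4, 4.7, 6.0],               # float column
    'count': [1, 2, 3],                                 # int column
    'species': ['setosa', 'versicolor', 'virginica'],   # string column
})
toy['species_cat'] = toy['species'].astype('category')  # category column

kinds = guess_variable_kinds(toy)
for column, kind in kinds.items():
    print(f'{column}: {kind}')
```

Running the same check on your own DataFrame before fitting is a cheap way to spot columns whose dtype does not match what you intend, for example numeric codes stored as strings.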
JPT builds a decision-tree partition of the data
space. Each leaf stores a fully factorised distribution over all
variables, meaning the variables are treated as independent of one
another within that leaf. The min_samples_leaf parameter controls how
fine-grained that partition becomes: smaller values produce more leaves
and a more expressive model at the cost of higher variance.
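To make the role of the leaves concrete, here is a toy sketch of how a query could be answered from a partition with factorised leaves. Each leaf contributes its prior weight times the product of its per-variable probabilities, and a conditional probability is the ratio of two such sums. The Leaf class, the uniform stand-in for the numeric distribution, and all numbers are invented for illustration; they are not pyjpt internals or fitted values.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    prior: float          # fraction of training samples that fell into this leaf
    species: dict         # P(species = label) inside the leaf
    petal_length: tuple   # (low, high) of a uniform density, a toy stand-in


def p_interval(bounds, lo, hi):
    """Mass of a uniform distribution on `bounds` that falls inside [lo, hi]."""
    a, b = bounds
    overlap = max(0.0, min(b, hi) - max(a, lo))
    return overlap / (b - a)


def joint(leaves, label, lo, hi):
    """P(species = label, petal_length in [lo, hi]) as a prior-weighted sum over leaves."""
    return sum(
        leaf.prior * leaf.species.get(label, 0.0) * p_interval(leaf.petal_length, lo, hi)
        for leaf in leaves
    )


def conditional(leaves, label, lo, hi):
    """P(species = label | petal_length in [lo, hi]) = joint / P(evidence)."""
    evidence = sum(leaf.prior * p_interval(leaf.petal_length, lo, hi) for leaf in leaves)
    return joint(leaves, label, lo, hi) / evidence


# Two made-up leaves; the numbers are chosen for easy arithmetic, not fitted to Iris.
leaves = [
    Leaf(prior=0.5, species={'setosa': 0.75, 'versicolor': 0.25}, petal_length=(1.0, 3.0)),
    Leaf(prior=0.5, species={'versicolor': 0.5, 'virginica': 0.5}, petal_length=(1.0, 5.0)),
]

marginal = joint(leaves, 'setosa', float('-inf'), float('inf'))
posterior = conditional(leaves, 'setosa', 1.0, 2.0)
print(f'P(setosa) = {marginal:.3f}')                             # 0.375
print(f'P(setosa | petal length in [1, 2]) = {posterior:.3f}')   # 0.500
```

Within a leaf the variables are independent, but the mixture over leaves is not: conditioning on petal length shifts weight between leaves, which is why the conditional differs from the marginal.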
Next Steps
Introduction — understand what JPTs are and how they work
Tutorials — step-by-step tutorials
How-to Guides — task-oriented recipes for classification, regression, visualisation, and model persistence
API Reference — full API reference