pyjpt — Joint Probability Trees

Learn joint distributions from data. Query, predict, explain.


pyjpt is a Python library for learning and querying joint probability distributions directly from data — no structural assumptions, no manual feature engineering. Feed it a DataFrame, get a model that can answer marginal and conditional probability queries, compute posteriors, find most-probable explanations, and generate samples — all in a single, interpretable tree structure that handles mixed symbolic and numeric data out of the box.

pip install pyjpt[matplotlib]

import pandas as pd
from jpt.variables import infer_from_dataframe
from jpt.trees import JPT

model = JPT(infer_from_dataframe(df), min_samples_leaf=0.1)
model.fit(df)

p    = model.infer(query={'species': 'setosa'})
post = model.posterior(['species'], evidence={'petal length (cm)': [1, 2]})
mpe  = model.mpe(evidence={'species': 'virginica'})

Why pyjpt?

  • Hybrid — symbolic and numeric variables in a single model, no encoding needed

  • No assumptions — tree partition and distributions are learned directly from data

  • Tractable — marginals, conditionals, posteriors, MPE and k-MPE in one tree pass

  • White-box — every inference result traces back to interpretable leaves

  • Linear scaling — training and inference scale linearly in the number of leaves

Tutorials

How-to Guides

Examples


Lead Developers

Note

If you use pyjpt in scientific publications, any acknowledgement is highly appreciated. The original paper can be found here:

@inproceedings{nyga23jpts,
    title={{Joint Probability Trees}},
    author={Daniel Nyga and Mareike Picklum and Tom Schierenbeck
            and Michael Beetz},
    year={2023},
    booktitle = {arxiv.org},
    note = {Preprint},
    url = {http://arxiv.org/abs/2302.07167}
}

Indices and tables