Learn joint distributions from data. Query, predict, explain.

pyjpt is a Python library for learning and querying joint probability distributions directly from data, with no structural assumptions and no manual feature engineering. Feed it a DataFrame and you get a model that can answer marginal and conditional probability queries, compute posteriors, find most probable explanations (MPE), and generate samples. All of this lives in a single, interpretable tree structure that handles mixed symbolic and numeric data out of the box.

pip install pyjpt[matplotlib]

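import pandas as pd
from jpt.variables import infer_from_dataframe
from jpt.trees import JPT

# df is a pandas DataFrame you have already loaded; the queries below assume
# Iris-style columns such as 'species' and 'petal length (cm)'.

# Infer the variable types from the columns, then learn the tree.
model = JPT(infer_from_dataframe(df), min_samples_leaf=0.1)
model.fit(df)

# Probability of a query
p    = model.infer(query={'species': 'setosa'})
# Posterior over 'species' given evidence on the petal length
post = model.posterior(['species'], evidence={'petal length (cm)': [1, 2]})
# Most probable explanation given the evidence
mpe  = model.mpe(evidence={'species': 'virginica'})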
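
To make the query types concrete, here is a small pure-Python sketch that answers a marginal and a posterior query by brute-force counting over a handful of rows. It does not use pyjpt and is not how pyjpt computes its answers; the rows are made-up toy values (not real measurements), and the helper names `probability` and `posterior_species` are hypothetical.

```python
from collections import Counter

# Hypothetical toy rows: (species, petal length in cm). Not real measurements.
rows = [
    ("setosa", 1.4), ("setosa", 1.5), ("setosa", 1.3),
    ("versicolor", 4.5), ("versicolor", 4.1),
    ("virginica", 5.8), ("virginica", 6.0), ("virginica", 5.1),
]


def probability(rows, predicate):
    """Empirical probability that a row satisfies the predicate."""
    return sum(1 for row in rows if predicate(row)) / len(rows)


def posterior_species(rows, evidence):
    """Empirical distribution over species among rows whose length matches the evidence."""
    matching = [species for species, length in rows if evidence(length)]
    counts = Counter(matching)
    return {species: count / len(matching) for species, count in counts.items()}


# Marginal query: P(species = setosa)
p_setosa = probability(rows, lambda row: row[0] == "setosa")

# Posterior query: P(species | 1 <= petal length <= 2)
post = posterior_species(rows, lambda length: 1.0 <= length <= 2.0)
```

Counting like this only works when the data contains rows that match the evidence exactly; a learned model is meant to answer the same questions from a compact representation instead of rescanning the data.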

Why pyjpt?

Hybrid — symbolic and numeric variables in a single model, no encoding needed
No assumptions — tree partition and distributions are learned directly from data
Tractable — marginals, conditionals, posteriors, MPE and k-MPE in one tree pass
White-box — every inference result traces back to interpretable leaves
Linear scaling — training and inference scale linearly in the number of leaves
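
The "one tree pass" idea can be sketched in plain Python under a simplified picture: the model is a mixture over leaves, each leaf carries a prior weight, and variables are treated as independent inside a leaf. Everything below is an assumption made for illustration. The `Leaf` class, the uniform density for the numeric variable, and all numbers are hypothetical, and pyjpt's actual leaf distributions and API differ.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    prior: float          # share of the training samples that fell into this leaf
    species: dict         # symbolic variable: category -> probability
    petal_length: tuple   # numeric variable: (low, high) bounds of a uniform density


def interval_mass(bounds, lo, hi):
    """Probability that a uniform variable on `bounds` falls inside [lo, hi]."""
    a, b = bounds
    overlap = max(0.0, min(b, hi) - max(a, lo))
    return overlap / (b - a)


def joint(leaves, species=None, length=None):
    """P(species, length in interval), accumulated in a single pass over the leaves."""
    total = 0.0
    for leaf in leaves:
        p = leaf.prior
        if species is not None:
            p *= leaf.species.get(species, 0.0)
        if length is not None:
            p *= interval_mass(leaf.petal_length, *length)
        total += p
    return total


def posterior(leaves, species_values, length):
    """P(species | length in interval) as a ratio of two joint queries."""
    evidence = joint(leaves, length=length)
    return {s: joint(leaves, species=s, length=length) / evidence
            for s in species_values}


# Two hypothetical leaves with made-up parameters.
leaves = [
    Leaf(prior=0.5, species={"setosa": 1.0}, petal_length=(1.0, 2.0)),
    Leaf(prior=0.5, species={"versicolor": 0.5, "virginica": 0.5},
         petal_length=(3.0, 7.0)),
]

p_setosa = joint(leaves, species="setosa")
post = posterior(leaves, ["setosa", "versicolor", "virginica"], length=(1.5, 4.0))
```

Each query visits every leaf once and does a constant amount of work per constrained variable, which is the sense in which the cost of this sketch grows linearly with the number of leaves.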

Getting Started

Tutorials

How-to Guides

Examples

Reference

Further Information

Lead Developers

Daniel Nyga
Mareike Picklum
Tom Schierenbeck

Note

If you use pyjpt in scientific publications, any acknowledgement is highly appreciated. The original paper can be found here:

@inproceedings{nyga23jpts,
    title={{Joint Probability Trees}},
    author={Daniel Nyga and Mareike Picklum and Tom Schierenbeck
            and Michael Beetz},
    year={2023},
    booktitle = {arxiv.org},
    note = {Preprint},
    url = {http://arxiv.org/abs/2302.07167}
}
