pdpbox.pdp.pdp_interact

pdpbox.pdp.pdp_interact(model, dataset, model_features, features, num_grid_points=None, grid_types=None, percentile_ranges=None, grid_ranges=None, cust_grid_points=None, memory_limit=0.5, n_jobs=1, predict_kwds={}, data_transformer=None)

Calculate the data for a PDP (partial dependence plot) interaction plot between two features

Parameters:
model: a fitted sklearn model
dataset: pandas DataFrame

data set on which the model is trained

model_features: list or 1-d array

list of model features

features: list

[feature1, feature2]

num_grid_points: list, default=None

[feature1 num_grid_points, feature2 num_grid_points]

grid_types: list, default=None

[feature1 grid_type, feature2 grid_type]

percentile_ranges: list, default=None

[feature1 percentile_range, feature2 percentile_range]

grid_ranges: list, default=None

[feature1 grid_range, feature2 grid_range]

cust_grid_points: list, default=None

[feature1 cust_grid_points, feature2 cust_grid_points]

memory_limit: float, (0, 1), default=0.5

fraction of memory to use

n_jobs: integer, default=1

number of jobs to run in parallel. Keep n_jobs=1 when using an XGBoost model, since multiprocessing can interact badly with third-party libraries. See: (1) https://pythonhosted.org/joblib/parallel.html#bad-interaction-of-multiprocessing-and-third-party-libraries (2) https://github.com/scikit-learn/scikit-learn/issues/6627

predict_kwds: dict, optional, default={}

keywords to be passed to the model’s predict function

data_transformer: function or None, optional, default=None

function that transforms the data set after the values of some features have been changed
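
A data_transformer is useful when one model feature is derived from another, so that overwriting the swept feature would otherwise leave the derived one stale. The sketch below illustrates the idea only. It assumes the transformer receives the modified data set and returns the transformed one; rows are plain dicts standing in for a pandas DataFrame, and `recompute_area`, `sweep_feature`, and the column names are hypothetical, not part of pdpbox.

```python
def recompute_area(rows):
    """Hypothetical data_transformer: refresh a derived feature.

    `rows` is a list of dicts standing in for a pandas DataFrame.
    """
    out = [dict(r) for r in rows]  # copy so the input is not mutated
    for r in out:
        r["area"] = r["width"] * r["height"]
    return out


def sweep_feature(rows, feature, value, data_transformer=None):
    """Set `feature` to `value` in every row, then apply the transformer."""
    changed = [dict(r, **{feature: value}) for r in rows]
    if data_transformer is not None:
        changed = data_transformer(changed)
    return changed


rows = [
    {"width": 2.0, "height": 3.0, "area": 6.0},
    {"width": 4.0, "height": 5.0, "area": 20.0},
]

# Without a transformer, "area" keeps its old value after "width" changes.
stale = sweep_feature(rows, "width", 10.0)

# With the transformer, "area" is recomputed from the new "width".
fresh = sweep_feature(rows, "width", 10.0, data_transformer=recompute_area)
```

With a real DataFrame the transformer would presumably take and return a DataFrame in the same way.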

Returns:
pdp_interact_out: instance of PDPInteract
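
To make the computation concrete, here is a minimal pure-Python sketch of a two-feature partial dependence calculation. It is not pdpbox's implementation: `percentile_grid`, `pdp_interaction`, and `toy_predict` are hypothetical helpers, rows are plain dicts rather than a DataFrame, and percentile-based grids are assumed. For every pair of grid values, both features are overwritten in every row and the predictions are averaged.

```python
def percentile_grid(values, num_grid_points):
    """Unique grid points at evenly spaced percentiles (needs >= 2 points)."""
    ordered = sorted(values)
    n = len(ordered)
    grid = []
    for i in range(num_grid_points):
        # Fractional position in [0, n - 1], linearly interpolated.
        pos = (n - 1) * i / (num_grid_points - 1)
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        point = ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)
        if point not in grid:
            grid.append(point)
    return grid


def pdp_interaction(predict, rows, features, num_grid_points):
    """Average prediction for every (feature1, feature2) grid combination."""
    f1, f2 = features
    grid1 = percentile_grid([r[f1] for r in rows], num_grid_points[0])
    grid2 = percentile_grid([r[f2] for r in rows], num_grid_points[1])
    surface = {}
    for v1 in grid1:
        for v2 in grid2:
            # Overwrite both features in a copy of every row.
            modified = [dict(r, **{f1: v1, f2: v2}) for r in rows]
            preds = [predict(r) for r in modified]
            surface[(v1, v2)] = sum(preds) / len(preds)
    return grid1, grid2, surface


def toy_predict(row):
    """Stand-in for a fitted model's predict function."""
    return row["x1"] * row["x2"] + row["x3"]


rows = [
    {"x1": 0.0, "x2": 1.0, "x3": 1.0},
    {"x1": 1.0, "x2": 2.0, "x3": 2.0},
    {"x1": 2.0, "x2": 3.0, "x3": 3.0},
]
grid1, grid2, surface = pdp_interaction(
    toy_predict, rows, ["x1", "x2"], [3, 2]
)
```

Each key of `surface` is a pair of grid values and each value is the mean prediction at that pair, which is the quantity an interaction plot displays.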