shapFlex Consistency

Nickalus Redell

2020-01-29

Purpose

Example

Load Packages and Data

Model Training

Stochastic Shapley Values

Predict function

  • For shapFlex, the required user-defined prediction function takes 2 positional arguments, the trained model and a data.frame of model features, and returns a 1-column data.frame of predictions.

  • Note the creation of the catboost-specific data format inside this function.
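The contract described in the bullets above can be sketched without catboost. This is a minimal illustration only: the `lm()` model and the `mtcars` data are stand-ins chosen so the snippet runs in base R, and the names `predict_function`, `data_pred`, and `y_pred` are illustrative rather than required by the package. With a catboost model, the conversion to the catboost-specific data format would take the place of the comment inside the function.

```r
# Stand-in model (an assumption for illustration; the vignette's model is a
# catboost model trained on different data).
model <- lm(mpg ~ wt + hp, data = mtcars)

# Argument 1: the trained model. Argument 2: a data.frame of model features.
predict_function <- function(model, data) {
  # Any model-specific conversion of `data` belongs here, for example
  # building the catboost-specific data format for a catboost model.
  data_pred <- data.frame(
    y_pred = as.numeric(predict(model, newdata = data))
  )
  data_pred  # a 1-column data.frame of predictions
}

preds <- predict_function(model, mtcars[, c("wt", "hp")])
str(preds)
```

Keeping the format conversion inside the function means the caller only ever has to pass a plain data.frame of features.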

Tree-Based Shapley Values

Results

## # A tibble: 13 x 2
##    feature_name   cor_coef
##    <chr>             <dbl>
##  1 age               0.997
##  2 capital_gain      0.999
##  3 capital_loss      0.986
##  4 education         0.992
##  5 education_num     0.998
##  6 hours_per_week    0.995
##  7 marital_status    0.993
##  8 native_country    0.997
##  9 occupation        0.997
## 10 race              0.997
## 11 relationship      0.991
## 12 sex               0.967
## 13 workclass         0.982
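
A table of this shape, one correlation coefficient per feature, could be produced along the following lines. This is a sketch under stated assumptions: the two data.frames hold synthetic numbers rather than real Shapley values, the object names `shap_tree`, `shap_stochastic`, and `cor_by_feature` are hypothetical, and Pearson correlation is assumed because the type of coefficient is not stated here.

```r
set.seed(1)
n <- 200

# Synthetic stand-ins for per-instance Shapley values from two methods
# (illustrative data only; not the values behind the table above).
shap_tree <- data.frame(age = rnorm(n), sex = rnorm(n))
shap_stochastic <- data.frame(
  age = shap_tree$age + rnorm(n, sd = 0.1),
  sex = shap_tree$sex + rnorm(n, sd = 0.1)
)

# One correlation per feature, comparing the two methods across instances.
# cor() defaults to Pearson correlation (an assumption about the table).
cor_by_feature <- data.frame(
  feature_name = names(shap_tree),
  cor_coef = unname(vapply(
    names(shap_tree),
    function(f) cor(shap_tree[[f]], shap_stochastic[[f]]),
    numeric(1)
  ))
)

print(cor_by_feature)
```

A coefficient near 1 for a feature would indicate that the two methods rank and scale that feature's per-instance contributions almost identically.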