I trained a linear model to predict house prices, and when I compare the Shapley values I calculate manually with the values returned by the SHAP library, they are slightly different.
My understanding is that for a linear model the Shapley value of feature i is its coefficient times the difference between the observation's feature value and the mean of that feature in the training set. Or, as stated in the SHAP documentation: coef[i] * (x[i] - X.mean(0)[i]).
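As a sanity check, here is a minimal self-contained sketch (on synthetic data, not the California dataset) confirming that this formula gives per-feature contributions that sum to the prediction minus the mean prediction over the background data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Synthetic data and coefficients are illustrative only.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + 3.0

model = LinearRegression().fit(X, y)

x = X[0]                                  # one observation
contrib = model.coef_ * (x - X.mean(0))   # per-feature Shapley values

# For a linear model the contributions sum exactly to
# f(x) minus the expected prediction over the background data.
assert np.isclose(contrib.sum(),
                  model.predict(x[None])[0] - model.predict(X).mean())
```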
The question is, why does SHAP return different values from the manual calculation?
Here is the code:
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import MinMaxScaler
import shap
X, y = fetch_california_housing(return_X_y=True, as_frame=True)
X = X.drop(columns=["Latitude", "Longitude", "AveBedrms"])
X_train, X_test, y_train, y_test = train_test_split(
X, y, test_size=0.3, random_state=0,
)
scaler = MinMaxScaler().set_output(transform="pandas").fit(X_train)
X_train = scaler.transform(X_train)
X_test = scaler.transform(X_test)
linreg = LinearRegression().fit(X_train, y_train)
coeffs = pd.Series(linreg.coef_, index=linreg.feature_names_in_)
X_test.reset_index(inplace=True, drop=True)
obs = 6188
# manual Shapley calculation
effect = coeffs * X_test.loc[obs]
effect - coeffs * X_train.mean()
which returns:
MedInc 0.123210
HouseAge -0.459784
AveRooms -0.128162
Population 0.032673
AveOccup -0.001993
dtype: float64
And the SHAP library returns something slightly different:
explainer = shap.LinearExplainer(linreg, X_train)
shap_values = explainer(X_test)
shap_values[obs]
Here is the result:
.values =
array([ 0.12039244, -0.47172515, -0.12767778, 0.03473923, -0.00251017])
.base_values =
2.0809714707337523
.data =
array([0.25094137, 0.01960784, 0.06056066, 0.07912217, 0.00437137])
The explainer is set to the interventional perturbation, which ignores feature correlations:
explainer.feature_perturbation
returning
'interventional'
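One thing I considered: with the interventional perturbation the base value is the mean prediction over the background data, so if the explainer internally used only a subsample of X_train (the subsample size of 100 below is purely an assumption for illustration), the background mean, and hence the attributions, would shift slightly. A minimal sketch of that effect on synthetic data:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Illustrative synthetic data, not the California housing set.
rng = np.random.default_rng(1)
X = rng.normal(size=(5000, 3))
y = X @ np.array([1.5, -2.0, 0.7]) + 1.0
model = LinearRegression().fit(X, y)

x = X[0]
# Attributions with the full background mean vs a 100-row subsample.
full = model.coef_ * (x - X.mean(0))
sub = model.coef_ * (x - X[rng.choice(len(X), 100, replace=False)].mean(0))

# The two attribution vectors are close but not identical.
assert not np.allclose(full, sub)
assert np.allclose(full, sub, atol=1.0)
```

Could something like this be what is happening with LinearExplainer?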
Thank you!