LIME & SHAP: Two Ways to Explain Natural Language Processing Models

2019-07-05 11:36:42

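Last week I watched a talk on "Hands-on Feature Engineering for NLP". It focused on how LIME and SHAP work for text classification interpretability. I decided to write an article about them because they are fun, easy to use, and visually appealing.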

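Every machine learning model operates in more dimensions than the human brain can directly visualize, so in that sense every model can be called a black box. That is what makes model interpretability a concern. In NLP in particular, the feature space is usually very large, which makes explaining feature importance considerably harder.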
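To get a feel for how quickly the feature space grows, the sketch below counts the distinct word n-grams in three made-up posts, first with unigrams only and then with 1- to 3-grams. The posts and the helper name `ngrams` are illustrative assumptions, not part of the dataset used in this article.

```python
def ngrams(tokens, n_min=1, n_max=3):
    """Yield every contiguous word n-gram with n_min <= n <= n_max."""
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            yield " ".join(tokens[i:i + n])

# Three made-up posts, purely for illustration
posts = [
    "how to join two tables in sql",
    "python list comprehension with range",
    "select rows from sql table in python",
]

unigram_vocab = set()
trigram_vocab = set()
for post in posts:
    tokens = post.split()
    unigram_vocab.update(ngrams(tokens, 1, 1))   # single words only
    trigram_vocab.update(ngrams(tokens, 1, 3))   # words, pairs, and triples

print(len(unigram_vocab), "unigram features")
print(len(trigram_vocab), "features with 1- to 3-grams")
```

Even on three short sentences the n-gram vocabulary is several times larger than the word vocabulary, and on a real corpus the gap is far wider.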

LIME & SHAP help us explain how an NLP model works, not only to end users but also to ourselves.

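Using the Stack Overflow question tag classification dataset, we will build a multi-class text classification model and then apply LIME and SHAP in turn to explain it. Since we have done text classification many times before, we will build the NLP model quickly and focus on interpretability.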

Data preprocessing, feature engineering, and logistic regression

import re

import numpy as np
import pandas as pd
from bs4 import BeautifulSoup
from nltk.corpus import stopwords
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, f1_score, precision_score, recall_score
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.utils import shuffle

from lime.lime_text import LimeTextExplainer

# Load the data and drop rows without a tag
df = pd.read_csv('stack-overflow-data.csv')
df = df[pd.notnull(df['tags'])]

# Work on a random half of the data, then shuffle it
df = df.sample(frac=0.5, random_state=99).reset_index(drop=True)
df = shuffle(df, random_state=22)
df = df.reset_index(drop=True)

# Map each tag to an integer class label and keep lookups in both directions
df['class_label'] = df['tags'].factorize()[0]
class_label_df = df[['tags', 'class_label']].drop_duplicates().sort_values('class_label')
label_to_id = dict(class_label_df.values)
id_to_label = dict(class_label_df[['class_label', 'tags']].values)

# Characters that are replaced by a space, and characters that are dropped
REPLACE_BY_SPACE_RE = re.compile(r'[/(){}\[\]\|@,;]')
BAD_SYMBOLS_RE = re.compile(r'[^0-9a-z #+_]')
# STOPWORDS = set(stopwords.words('english'))

def clean_text(text):
    """
    text: a string
    return: modified initial string
    """
    # HTML decoding: BeautifulSoup's .text returns the string stripped of HTML tags and metadata
    text = BeautifulSoup(text, "lxml").text
    text = text.lower()  # lowercase text
    text = REPLACE_BY_SPACE_RE.sub(' ', text)  # replace matched symbols with a space
    text = BAD_SYMBOLS_RE.sub('', text)  # delete any remaining disallowed symbols
    # text = ' '.join(word for word in text.split() if word not in STOPWORDS)  # remove stopwords from text
    return text

df['post'] = df['post'].apply(clean_text)
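A quick way to see what the two regular expressions do is to run them on a short string. The sketch below skips the BeautifulSoup step so that it needs only the standard library. The helper name `clean_plain_text` and the sample sentence are made up for illustration. Note that the square brackets and the pipe are escaped inside the character class.

```python
import re

REPLACE_BY_SPACE_RE = re.compile(r'[/(){}\[\]\|@,;]')
BAD_SYMBOLS_RE = re.compile(r'[^0-9a-z #+_]')

def clean_plain_text(text):
    """Apply the lowercase and regex steps only (no HTML stripping)."""
    text = text.lower()
    text = REPLACE_BY_SPACE_RE.sub(' ', text)   # brackets, slashes, etc. become spaces
    text = BAD_SYMBOLS_RE.sub('', text)         # anything outside 0-9, a-z, space, # + _ is dropped
    return text

sample = "How do I call foo(bar) in C#?"   # made-up post
cleaned = clean_plain_text(sample)
print(cleaned.split())
```

The parentheses turn into spaces, the question mark disappears, and `c#` survives because `#` and `+` are on the allowed list, which matters for tags such as c# and c++.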

list_corpus = df["post"].tolist()
list_labels = df["class_label"].tolist()
X_train, X_test, y_train, y_test = train_test_split(list_corpus, list_labels, test_size=0.2, random_state=40)

# Binary bag-of-words features over word 1- to 3-grams
vectorizer = CountVectorizer(analyzer='word', token_pattern=r'\w{1,}', ngram_range=(1, 3), stop_words='english', binary=True)
train_vectors = vectorizer.fit_transform(X_train)
test_vectors = vectorizer.transform(X_test)

logreg = LogisticRegression(n_jobs=1, C=1e5)
logreg.fit(train_vectors, y_train)
pred = logreg.predict(test_vectors)

accuracy = accuracy_score(y_test, pred)
precision = precision_score(y_test, pred, average='weighted')
recall = recall_score(y_test, pred, average='weighted')
f1 = f1_score(y_test, pred, average='weighted')
print("accuracy = %.3f, precision = %.3f, recall = %.3f, f1 = %.3f" % (accuracy, precision, recall, f1))
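The metrics above are computed with `average='weighted'`, which averages the per-class scores using each class's number of true instances as the weight. As a rough illustration, here is that calculation done by hand for precision on six made-up labels. The function name `weighted_precision` and the data are assumptions for this sketch.

```python
from collections import Counter

y_true = [0, 0, 0, 1, 1, 2]   # made-up labels
y_pred = [0, 0, 1, 1, 1, 2]   # made-up predictions

def weighted_precision(y_true, y_pred):
    """Per-class precision, averaged with class support as the weight."""
    support = Counter(y_true)
    total = 0.0
    for label, n_true in support.items():
        predicted = sum(1 for p in y_pred if p == label)
        correct = sum(1 for t, p in zip(y_true, y_pred) if t == p == label)
        precision = correct / predicted if predicted else 0.0
        total += precision * n_true
    return total / len(y_true)

print(round(weighted_precision(y_true, y_pred), 3))
```

Class 0 has the most true instances, so its precision counts the most toward the final number.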

Producing good results is not the goal right now. I want to get to LIME & SHAP as quickly as possible, and that is what comes next.

Explaining text predictions with LIME

From here on, this is the fun part. The code snippets below are largely borrowed from the LIME tutorial.

# Chain the vectorizer and the classifier so that LIME can pass in raw text
c = make_pipeline(vectorizer, logreg)
class_names = list(df.tags.unique())
explainer = LimeTextExplainer(class_names=class_names)

idx = 1877
exp = explainer.explain_instance(X_test[idx], c.predict_proba, num_features=6, labels=[4, 8])
print('Document id: %d' % idx)
print('Predicted class =', class_names[logreg.predict(test_vectors[idx]).reshape(1,-1)[0,0]])
print('True class: %s' % class_names[y_test[idx]])

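We randomly pick a document from the test set. It happens to be labeled sql, and our model also predicts sql. Using this document, we generate explanations for label 4 (sql) and label 8 (python).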

print('Explanation for class %s' % class_names[4])
print('\n'.join(map(str, exp.as_list(label=4))))
print('Explanation for class %s' % class_names[8])
print('\n'.join(map(str, exp.as_list(label=8))))

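Clearly, this document has the strongest explanation for the label sql. We also notice that the sign of a weight is tied to a particular label: the word "sql" is positive for class sql and negative for class python, and vice versa.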
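Where do these signed weights come from? Roughly, LIME removes words from the document, asks the model for its prediction on each perturbed version, and fits a linear model weighted by how close each version is to the original. The sketch below is a heavily simplified stand-in, not the library's implementation. It enumerates every word mask instead of sampling, uses a made-up proximity kernel, and replaces the trained classifier with a hypothetical two-class toy model (`WORD_SCORES`, `predict_proba`, `solve`, and `explain` are all names invented for this example).

```python
import itertools
import math

# Toy black box: a two-class softmax over hand-picked word scores (hypothetical)
WORD_SCORES = {
    "sql":    {"sql": 2.0, "python": -1.0},
    "select": {"sql": 1.0, "python": 0.0},
    "range":  {"sql": 0.0, "python": 1.0},
}
CLASSES = ["sql", "python"]

def predict_proba(tokens):
    """Return {class: probability} for a list of tokens."""
    logits = {c: sum(WORD_SCORES.get(t, {}).get(c, 0.0) for t in tokens) for c in CLASSES}
    z = sum(math.exp(v) for v in logits.values())
    return {c: math.exp(v) / z for c, v in logits.items()}

def solve(a, b):
    """Solve a x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                factor = m[r][col] / m[col][col]
                m[r] = [x - factor * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]

def explain(tokens, label, kernel_width=0.75):
    """Fit a proximity-weighted linear model on word-presence masks."""
    n = len(tokens)
    rows, targets, weights = [], [], []
    for mask in itertools.product([0, 1], repeat=n):
        kept = [t for t, keep in zip(tokens, mask) if keep]
        removed_fraction = 1 - sum(mask) / n
        rows.append([1.0] + [float(v) for v in mask])   # intercept + word mask
        targets.append(predict_proba(kept)[label])
        # Simplified kernel: versions closer to the full document weigh more
        weights.append(math.exp(-(removed_fraction ** 2) / kernel_width ** 2))
    k = n + 1
    # Weighted normal equations: (X^T W X) beta = X^T W y
    xtwx = [[sum(w * r[i] * r[j] for r, w in zip(rows, weights)) for j in range(k)]
            for i in range(k)]
    xtwy = [sum(w * r[i] * y for r, w, y in zip(rows, weights, targets)) for i in range(k)]
    beta = solve(xtwx, xtwy)
    return dict(zip(tokens, beta[1:]))   # drop the intercept

doc = ["select", "range", "from", "sql"]
for label in CLASSES:
    print(label, {w: round(v, 3) for w, v in explain(doc, label).items()})
```

Because the toy model has only two classes, the probability of python is one minus the probability of sql, so every word weight for python is the exact negative of its weight for sql. With twenty classes the relationship is looser, but the same word can still push one label up and another down.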

Next we generate explanations for the top 2 classes for this document.

exp = explainer.explain_instance(X_test[idx], c.predict_proba, num_features=6, top_labels=2)
print(exp.available_labels())

It gives us sql and python.

exp.show_in_notebook(text=False)

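Let me try to explain this visualization:

1. For this document, the word "sql" has the highest positive score for class sql.

2. Our model predicts that this document should be labeled sql, with a probability of 100%.

3. If we removed the word "sql" from the document, we would expect the model to predict label sql with a probability of 100% - 65% = 35%.

4. On the other hand, the word "sql" is negative for class python, and our model has learned that the word "range" has a small positive score for class python.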
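The reading in point 3 can be checked directly by deleting a word and asking the model again. The sketch below does this against a hypothetical toy logistic model (`LOG_ODDS` and `prob_sql` are invented for illustration). With the real pipeline one would call its `predict_proba` on the edited text instead. Since LIME's weights come from a linear approximation, the measured change will generally be close to the weight rather than identical to it.

```python
import math

# Hypothetical per-word evidence for "sql" versus every other tag (log-odds)
LOG_ODDS = {"sql": 3.0, "select": 1.0, "range": -1.0}

def prob_sql(tokens):
    """Probability of the sql tag under a toy logistic model."""
    score = sum(LOG_ODDS.get(t, 0.0) for t in tokens)
    return 1.0 / (1.0 + math.exp(-score))

doc = "select range from sql".split()
full = prob_sql(doc)

drops = {}
for word in doc:
    without = prob_sql([t for t in doc if t != word])
    drops[word] = full - without   # positive: removing the word hurts the sql prediction
    print(f"{word:>6}: p(sql) {full:.2f} -> {without:.2f}")
```

Removing "sql" costs the prediction the most, removing "from" changes nothing, and removing "range" actually raises the probability of sql, which matches the idea of a negative weight.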

We may want to zoom in and study the explanation for class sql, along with the document itself.

exp.show_in_notebook(text=True, labels=(4,))

Explaining text predictions with SHAP

The following process was learned from this tutorial.

from sklearn.preprocessing import MultiLabelBinarizer
import tensorflow as tf
from tensorflow.keras.preprocessing import text
import shap

# Multi-hot encode the comma-separated tags
tags_split = [tags.split(',') for tags in df['tags'].values]
tag_encoder = MultiLabelBinarizer()
tags_encoded = tag_encoder.fit_transform(tags_split)
num_tags = len(tags_encoded[0])

# 80/20 train/test split by position
train_size = int(len(df) * .8)
y_train = tags_encoded[: train_size]
y_test = tags_encoded[train_size:]

class TextPreprocessor(object):
    def __init__(self, vocab_size):
        self._vocab_size = vocab_size
        self._tokenizer = None

    def create_tokenizer(self, text_list):
        tokenizer = text.Tokenizer(num_words=self._vocab_size)
        tokenizer.fit_on_texts(text_list)
        self._tokenizer = tokenizer

    def transform_text(self, text_list):
        text_matrix = self._tokenizer.texts_to_matrix(text_list)
        return text_matrix

VOCAB_SIZE = 500
train_post = df['post'].values[: train_size]
test_post = df['post'].values[train_size: ]
processor = TextPreprocessor(VOCAB_SIZE)
processor.create_tokenizer(train_post)
X_train = processor.transform_text(train_post)
X_test = processor.transform_text(test_post)
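It helps to know what the resulting matrix looks like. As far as I understand the Keras tokenizer, words are indexed by frequency starting at 1, index 0 is reserved, and only words whose index is below `num_words` get a column. The sketch below imitates that behavior in plain Python on made-up texts. It is a simplified stand-in (no punctuation filtering), and `fit_word_index` and `texts_to_binary_matrix` are hypothetical helper names, not Keras functions.

```python
from collections import Counter

def fit_word_index(texts):
    """Rank words by frequency; the most frequent word gets index 1."""
    counts = Counter(word for t in texts for word in t.lower().split())
    ranked = sorted(counts.items(), key=lambda kv: -kv[1])
    return {word: i + 1 for i, (word, _) in enumerate(ranked)}

def texts_to_binary_matrix(texts, word_index, num_words):
    """One row per text, one column per word index below num_words."""
    matrix = []
    for t in texts:
        row = [0] * num_words
        for word in t.lower().split():
            idx = word_index.get(word)
            if idx is not None and idx < num_words:
                row[idx] = 1          # presence flag, not a count
        matrix.append(row)
    return matrix

texts = ["sql join sql", "python range", "sql python"]   # made-up posts
word_index = fit_word_index(texts)
matrix = texts_to_binary_matrix(texts, word_index, num_words=3)

# Column i holds the word with index i, so position 0 needs a placeholder
word_lookup = [''] + list(word_index.keys())

print(word_index)
print(matrix)
```

Column 0 is never set, and the rarer words "join" and "range" fall outside the three-column limit. This is why any list that maps column positions back to words needs an empty entry at position 0.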

def create_model(vocab_size, num_tags):
    model = tf.keras.models.Sequential()
    model.add(tf.keras.layers.Dense(50, input_shape=(vocab_size,), activation='relu'))
    model.add(tf.keras.layers.Dense(25, activation='relu'))
    model.add(tf.keras.layers.Dense(num_tags, activation='sigmoid'))
    model.compile(loss='binary_crossentropy', optimizer='adam', metrics=['accuracy'])
    return model

model = create_model(VOCAB_SIZE, num_tags)
model.fit(X_train, y_train, epochs=2, batch_size=128, validation_split=0.1)
print('Eval loss/accuracy:{}'.format(model.evaluate(X_test, y_test, batch_size=128)))

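Once the model is trained, we use the first 200 training documents as the background dataset to integrate over, and create a SHAP explainer object.

We get the attribution values for individual predictions on a subset of the test set.

We transform the indices into words.

We use SHAP's summary_plot method to show the top features impacting the model's predictions.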

attrib_data = X_train[:200]
explainer = shap.DeepExplainer(model, attrib_data)
num_explanations = 20
shap_vals = explainer.shap_values(X_test[:num_explanations])

# Column i of the matrix corresponds to the word with tokenizer index i; index 0 is unused
words = processor._tokenizer.word_index
word_lookup = [''] + list(words.keys())

shap.summary_plot(shap_vals, feature_names=word_lookup, class_names=tag_encoder.classes_)
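The attribution values that SHAP reports are estimates of Shapley values: each feature's average contribution to the prediction, taken over every order in which features could be switched on. For a model with a handful of inputs this can be computed exactly, which makes the idea easier to see. The sketch below does so for a hypothetical three-feature scoring function against a single all-zeros baseline. It is not how DeepExplainer works internally (that explainer approximates these values for neural networks using a background dataset), and `model` and `shapley_values` are names made up for this example.

```python
import itertools
import math

def model(x):
    """Toy scoring function over three binary word features (hypothetical)."""
    sql, select, rng = x
    return 2.0 * sql + 1.0 * select - 1.0 * rng + 0.5 * sql * select

def shapley_values(f, x, baseline):
    """Exact Shapley values of f at x relative to a single baseline point."""
    n = len(x)
    values = [0.0] * n
    for order in itertools.permutations(range(n)):
        current = list(baseline)
        previous = f(current)
        for i in order:
            current[i] = x[i]            # switch feature i from baseline to its real value
            new = f(current)
            values[i] += new - previous  # marginal contribution in this ordering
            previous = new
    n_orders = math.factorial(n)
    return [v / n_orders for v in values]

x = [1, 1, 0]          # document contains "sql" and "select" but not "range"
baseline = [0, 0, 0]   # empty document
phi = shapley_values(model, x, baseline)

print(phi)
print(sum(phi), model(x) - model(baseline))
```

The interaction term between "sql" and "select" is split evenly between the two words, the absent word gets zero, and the attributions add up to the difference between the prediction and the baseline prediction.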