
Pattern-Guided Integrated Gradients

Integrated Gradients (IG) and PatternAttribution (PA) are two established explainability methods for neural networks. Both are theoretically well-founded, but they were designed to overcome different challenges. In this work, we combine the two methods into a new method, Pattern-Guided Integrated Gradients (PGIG). PGIG inherits important properties from both parent methods and passes stress tests that the originals fail. In addition, we benchmark PGIG against nine alternative explainability approaches (including its parent methods) in a large-scale image degradation experiment and find that it outperforms all of them.
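
Since PGIG builds directly on Integrated Gradients, a minimal sketch of the standard IG approximation may be useful as background; the gradient function, baseline, and step count below are illustrative assumptions, not the paper's implementation.

import numpy as np

def integrated_gradients(grad_fn, x, baseline, steps=50):
    """Approximate Integrated Gradients with a Riemann sum.

    grad_fn(z) returns dF/dz for the model output of interest and
    stands in for whatever autodiff backend is actually used.
    """
    avg_grad = np.zeros_like(x, dtype=float)
    for alpha in np.linspace(0.0, 1.0, steps):
        # Interpolate between the baseline x' and the input x.
        avg_grad += grad_fn(baseline + alpha * (x - baseline))
    avg_grad /= steps
    # Scale the averaged gradients by the input difference.
    return (x - baseline) * avg_grad

PatternAttribution, by contrast, backpropagates learned signal patterns instead of raw weights; how PGIG combines the two is detailed in the paper itself.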

Bootstrapping Named Entity Recognition in E-Commerce with Positive Unlabeled Learning

In this work, we introduce a bootstrapped, iterative NER model that integrates a positive-unlabeled (PU) learning algorithm for recognizing named entities in a low-resource setting. Our approach combines dictionary-based labeling with syntactically informed label …
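
To illustrate the positive-unlabeled framing of the dictionary step, a toy labeling function is sketched below; the tag names and the example dictionary are made-up placeholders, not the paper's pipeline.

def dictionary_label(tokens, entity_dict):
    """Dictionary-based PU labeling: matched spans become positives ("P"),
    everything else stays unlabeled ("U") rather than negative."""
    labels = ["U"] * len(tokens)
    max_len = max((len(p.split()) for p in entity_dict), default=1)
    i = 0
    while i < len(tokens):
        # Try the longest dictionary match starting at position i.
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if " ".join(tokens[i:i + n]).lower() in entity_dict:
                labels[i:i + n] = ["P"] * n
                i += n
                break
        else:
            i += 1
    return labels

print(dictionary_label("I bought an iPhone in New York".split(), {"iphone", "new york"}))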

Considering Likelihood in NLP Classification Explanations with Occlusion and Language Modeling

Recently, state-of-the-art NLP models have gained an increasing syntactic and semantic understanding of language, and explanation methods are crucial for understanding their decisions. Occlusion is a well-established method that provides explanations on …
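
As a reminder of the basic occlusion procedure, one can score each token by how much the predicted class probability drops when that token is masked; the classifier interface and mask token below are assumptions for illustration.

def occlusion_scores(predict_proba, tokens, target_class, mask_token="[UNK]"):
    """Score tokens by the drop in target-class probability when occluded.

    predict_proba(tokens) -> sequence of class probabilities; it stands in
    for any sentence classifier.
    """
    base = predict_proba(tokens)[target_class]
    scores = []
    for i in range(len(tokens)):
        occluded = tokens[:i] + [mask_token] + tokens[i + 1:]
        scores.append(base - predict_proba(occluded)[target_class])
    return scores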

Probing Linguistic Features of Sentence-Level Representations in Neural Relation Extraction

Despite recent progress, little is known about the features captured by state-of-the-art neural relation extraction (RE) models. Common methods encode the source sentence, conditioned on the entity mentions, before classifying the relation. …
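
A common way to probe fixed sentence representations is to fit a simple linear classifier on them and read the held-out accuracy as a measure of how well a linguistic property is encoded; the scikit-learn setup below is a generic sketch, not the paper's experimental protocol.

from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

def probe(representations, labels, seed=0):
    """Train a linear probe on frozen representations and report test accuracy."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        representations, labels, test_size=0.2, random_state=seed)
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    return accuracy_score(y_te, clf.predict(X_te))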

TACRED Revisited: A Thorough Evaluation of the TACRED Relation Extraction Task

TACRED is one of the largest and most widely used crowdsourced datasets in Relation Extraction (RE). However, even with recent advances in unsupervised pre-training and knowledge-enhanced neural RE, models still show a high error rate. In this paper, we …

Evaluating German Transformer Language Models with Syntactic Agreement Tests

Pre-trained transformer language models (TLMs) have recently refashioned natural language processing (NLP): most state-of-the-art NLP models now operate on top of TLMs to benefit from contextualization and knowledge induction. To explain their …
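
A subject-verb agreement test for a masked TLM can be phrased as checking whether the model scores the grammatical verb form higher than the ungrammatical one at a masked position; the model name, the German example, and the assumption that both verb forms are single vocabulary tokens are all illustrative.

import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

name = "bert-base-german-cased"  # illustrative model choice
tok = AutoTokenizer.from_pretrained(name)
model = AutoModelForMaskedLM.from_pretrained(name).eval()

def prefers_grammatical(template, good, bad):
    """Return True if the masked LM scores the grammatical verb form higher."""
    inputs = tok(template.replace("<verb>", tok.mask_token), return_tensors="pt")
    with torch.no_grad():
        logits = model(**inputs).logits
    pos = (inputs["input_ids"][0] == tok.mask_token_id).nonzero()[0].item()
    # Assumes both candidate verb forms are single tokens in the vocabulary.
    good_id = tok.convert_tokens_to_ids(good)
    bad_id = tok.convert_tokens_to_ids(bad)
    return bool(logits[0, pos, good_id] > logits[0, pos, bad_id])

print(prefers_grammatical("Der Hund <verb> im Garten.", "schläft", "schlafen"))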

Abstractive Text Summarization based on Language Model Conditioning and Locality Modeling

We explore to what extent knowledge about the pre-trained language model being used is beneficial for the task of abstractive summarization. To this end, we experiment with conditioning the encoder and decoder of a Transformer-based neural model on …

Cross-lingual Neural Vector Conceptualization

Recently, Neural Vector Conceptualization (NVC) was proposed as a means to interpret samples from a word vector space. For NVC, a neural model activates higher-order concepts it recognizes in a word vector instance. To this end, the model first needs …
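
Conceptually, the NVC setup can be pictured as a small feed-forward model that maps a word vector to activation values over a fixed concept inventory; the dimensions, concept names, and architecture below are made-up placeholders rather than the paper's configuration.

import torch
import torch.nn as nn

concepts = ["animal", "vehicle", "emotion"]  # toy concept inventory

class ConceptActivator(nn.Module):
    """Map a word vector to multi-label activations over concepts."""
    def __init__(self, dim=300, n_concepts=len(concepts)):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(dim, 128), nn.ReLU(), nn.Linear(128, n_concepts))

    def forward(self, vec):
        return torch.sigmoid(self.net(vec))

model = ConceptActivator()
print(dict(zip(concepts, model(torch.randn(300)).tolist())))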

Layerwise Relevance Visualization in Convolutional Text Graph Classifiers

Representations in the hidden layers of Deep Neural Networks (DNNs) are often hard to interpret, since it is difficult to project them into an interpretable domain. Graph Convolutional Networks (GCNs) allow this projection, but existing explainability …

Fine-Tuning Pre-Trained Transformer Language Models to Distantly Supervised Relation Extraction

We show that generative language model pre-training combined with selective attention improves recall for long-tail relations in distantly supervised neural relation extraction.