LLMCheckup: Conversational Examination of Large Language Models via Interpretability Tools and Self-Explanations

Interpretability tools that offer explanations in the form of a dialogue have demonstrated their efficacy in enhancing users' understanding (Slack et al., 2023; Shen et al., 2023), as one-off explanations may fall short in providing sufficient …

Retrieval-Augmented Knowledge Integration into Language Models: A Survey

Assessing Authenticity and Anonymity of Synthetic User-generated Content in the Medical Domain

Since medical text cannot be shared easily due to privacy concerns, synthetic data bears much potential for natural language processing applications. In the context of social media and user-generated messages about drug intake and adverse drug …

InterroLang: Exploring NLP Models and Datasets through Dialogue-based Explanations

While recently developed NLP explainability methods let us open the black box in various ways (Madsen et al., 2022), a missing ingredient in this endeavor is an interactive tool offering a conversational interface. Such a dialogue system can help …

Factuality Detection using Machine Translation - a Use Case for German Clinical Text

Factuality can play an important role when automatically processing clinical text, as it makes a difference if particular symptoms are explicitly not present, possibly present, not mentioned, or affirmed. In most cases, a sufficient number of …

Inseq: An Interpretability Toolkit for Sequence Generation Models

Past work in natural language processing interpretability focused mainly on popular classification tasks while largely overlooking generation settings, partly due to a lack of dedicated tools. In this work, we introduce Inseq, a Python library to …

Neural Machine Translation Methods for Translating Text to Sign Language Glosses

State-of-the-art techniques common to low resource Machine Translation (MT) are applied to improve MT of spoken language text to Sign Language (SL) glosses. In our experiments, we improve the performance of the transformer-based models via (1) data …

Saliency Map Verbalization: Comparing Feature Importance Representations from Model-free and Instruction-based Methods

Saliency maps can explain a neural model's predictions by identifying important input features. They are difficult to interpret for laypeople, especially for instances with many features. In order to make them more accessible, we formalize the …

Efficient Language Model Training through Cross-Lingual and Progressive Transfer Learning

Most Transformer language models are primarily pretrained on English text, limiting their use for other languages. As the model sizes grow, the performance gap between English and other languages with fewer compute and data resources increases even …

MultiTACRED: A Multilingual Version of the TAC Relation Extraction Dataset

Relation extraction (RE) is a fundamental task in information extraction, whose extension to multilingual settings has been hindered by the lack of supervised resources comparable in size to large English datasets such as TACRED (Zhang et al., 2017). …