Two papers from DFKI-NLP authors have been accepted for publication at EMNLP 2022, the 2022 Conference on Empirical Methods in Natural Language Processing. The conference is planned to be a hybrid meeting and will take place in Abu Dhabi, from Dec 7th to Dec 11th, 2022. The first paper introduces an efficient and effective method that constructs prompts from relation triples and, in the case of in-language prompting, requires only minimal translation of the class labels. We evaluate its performance in fully supervised, few-shot, and zero-shot scenarios across 14 languages, and also examine soft prompt variants and English-task training in cross-lingual settings. The second paper proposes neighborhood contrastive learning for the representation learning of scientific documents and achieves new state-of-the-art results on the SciDocs benchmark.
One paper from DFKI-NLP authors has been accepted for publication at the Workshop on Information Extraction from Scientific Publications (WIESP). The workshop will be held at AACL-IJCNLP 2022, the 2nd Conference of the Asia-Pacific Chapter of the Association for Computational Linguistics and the 12th International Joint Conference on Natural Language Processing, which will take place as an online-only event from Nov. 20 to Nov. 23, 2022. The paper proposes a new method for argument mining in full-text scientific documents by combining argumentative discourse unit recognition with relation extraction.
One paper from DFKI-NLP authors has been accepted for publication at JCDL 2022, the 22nd ACM/IEEE Joint Conference on Digital Libraries. The conference is planned to be a hybrid meeting and will take place in Cologne, Germany, from June 20th through June 24th, 2022. The paper proposes replacing generic document embeddings with specialized, per-section document embeddings, and evaluates this approach on the task of aspect-based similarity computation for research papers.
Six papers from DFKI-NLP authors have been accepted for publication at LREC 2022, the 13th Language Resources and Evaluation Conference. The conference is planned to be a hybrid meeting and will take place in Marseille, France, from June 20th through June 25th, 2022. The paper by Dehio et al. is on claim extraction and matching in COVID-19-related legislation. The paper by Raither et al. presents a novel corpus for German-language Adverse Drug Reaction (ADR) detection in patient-generated content. The paper by Gabryszak et al. also presents a corpus, in this case of German-language tweets annotated with their relevance for public transportation and with sentiment towards aspects related to barrier-free travel. The fourth paper, by Macketanz et al., presents a fine-grained machine translation test suite for the language pair German-English. The test suite is based on a number of linguistically motivated categories and phenomena, and the semi-automatic evaluation is carried out with regular expressions. The fifth paper, by Seiffe et al., presents work on modeling subjective text complexity by constructing and analyzing a German text corpus labeled with expert and non-expert complexity ratings. The final paper, by Calizzano et al., introduces a new dataset called WikinewsSum for English, German, French, Spanish, Portuguese, Polish, and Italian summarisation, tailored for extended summaries of approx. 11 sentences. It compares three multilingual transformer models on the extractive summarisation task, as well as three training scenarios in which we fine-tune mT5 to perform abstractive summarisation.
Four papers from DFKI-NLP authors have been accepted for publication at ACL 2022, the 60th Annual Meeting of the Association for Computational Linguistics. The conference is planned to be a hybrid meeting and will take place in Dublin, Ireland, from May 22nd through May 27th, 2022. The first paper evaluates pre-trained encoders on the task of low-resource NER across several English and German datasets. The second analyzes relation classification evaluation and suggests that using F1 weightings other than micro-F1 tells us much more about model performance, e.g. on imbalanced datasets. The third paper proposes a novel approach to encode and inject hierarchical structure information explicitly into an extractive, transformer-based summarization model. The final paper presents a study that aims to uncover relevant perceptual quality dimensions for one type of machine-generated text, namely Machine Translation. We conducted a crowd-sourcing survey in the style of a Semantic Differential to collect attribute ratings for German MT outputs.