[PLing] Guest lecture by Benjamin Roth on "Evaluation and Learning with Structured Test Sets"

Mon Oct 17 17:51:18 CEST 2022

Dear colleagues,

I would like to cordially invite you to a guest lecture by Benjamin Roth 
(University of Vienna). In his talk he addresses the extraction of 
knowledge from text and an alternative method for training and 
evaluating machine learning models. The talk is titled "Evaluation and 
Learning with Structured Test Sets" and takes place on Wednesday, 
19 October, at 18:30. It is part of the ongoing lecture series of the 
Austrian Research Institute for Artificial Intelligence (OFAI).

The talk will be held in hybrid form, i.e. you can also attend in 
person at OFAI (Freyung 6/6/7, 1010 Vienna). Wearing an FFP2 mask is 
recommended. Alternatively, you can join via Zoom:

URL: https://us06web.zoom.us/j/84282442460?pwd=NHVhQnJXOVdZTWtNcWNRQllaQWFnQT09
Meeting ID: 842 8244 2460
Passcode: 678868

The abstract and biography are appended below.

We look forward to your participation!

Best regards,
Stephanie Gross

_Abstract_: Behavioural testing – verifying system capabilities by 
validating human-designed input-output pairs – is an alternative 
evaluation method for natural language processing systems, proposed to 
address the shortcomings of the standard approach: computing metrics on 
held-out data. While behavioural tests capture human prior knowledge and 
insights, there has been little exploration of how to leverage them for 
model training and development. With this in mind, we explore 
behaviour-aware learning by examining several fine-tuning schemes using 
HateCheck, a suite of functional tests for hate speech detection 
systems. To address potential pitfalls of training on data originally 
intended for evaluation, we train and evaluate models on different 
configurations of HateCheck by holding out categories of test cases, 
which enables us to estimate performance on potentially overlooked 
system properties. The fine-tuning procedure led to improvements in the 
classification accuracy of held-out functionalities and identity groups, 
suggesting that models can potentially generalise to overlooked 
functionalities. However, performance on held-out functionality classes 
and i.i.d. hate speech detection data decreased, which indicates that 
generalisation occurs mostly across functionalities from the same class 
and that the procedure led to overfitting to the HateCheck data 
distribution.

_Biography_: Benjamin Roth is a professor in the area of deep learning & 
statistical NLP, leading the WWTF Vienna Research Group for Young 
Investigators "Knowledge-Infused Deep Learning for Natural Language 
Processing". Prior to this, he was an interim professor at LMU Munich. 
He obtained his PhD from Saarland University and did a postdoc at UMass 
Amherst. His research interests are the extraction of knowledge from 
text with statistical methods and knowledge-supervised learning.

-- 
------------------------------------------------------------------
Mag. Dr. Stephanie Gross MSc     | Austrian Research Institute for
email:stephanie.gross at ofai.at    | Artificial Intelligence (OFAI)
phone: (+43-1)5324621-1          | Freyung 6/3/1a
                                  | A-1010 Vienna, Austria
------------------------------------------------------------------