[PLing] Talk by Michael Pucher on "Synthesizing Dialects, Faces, Singing Voices, Songbirds, and Famous Dead Actors"

Tue Feb 28 13:52:17 CET 2023

Dear colleagues,

I would like to cordially invite you to a talk by Michael Pucher of the 
Austrian Research Institute for Artificial Intelligence (OFAI). His 
talk, titled "Synthesizing Dialects, Faces, Singing Voices, Songbirds, 
and Famous Dead Actors", is part of the current OFAI lecture series and 
will take place on Wednesday, 1 March 2023, at 18:30 (UTC+1) at OFAI 
(Freyung 6/6/7, 1010 Vienna) and also online.

"Synthesizing Dialects, Faces, Singing Voices, Songbirds, and Famous 
Dead Actors"
Dr. Michael Pucher
Austrian Research Institute for Artificial Intelligence (OFAI)

Zoom access:
URL: https://us06web.zoom.us/j/84282442460?pwd=NHVhQnJXOVdZTWtNcWNRQllaQWFnQT09
Meeting ID: 842 8244 2460
Passcode: 678868

The abstract and speaker biography are appended below.

We look forward to your participation!

Kind regards,
Tristan Miller

Talk abstract: Over the last decades, statistical parametric speech 
synthesis has significantly improved the quality and flexibility of 
speech synthesis systems. This development started with hidden Markov 
models (HMMs), and deep neural networks (DNNs) then brought another 
major improvement in acoustic modeling and vocoding. In this 
talk I will present a range of applications of statistical parametric 
speech synthesis that we have investigated. In the field of acoustic 
speech synthesis I will show how dialect interpolation can be realized, 
which allows for the generation of in-between language varieties. In 
audio-visual speech synthesis, joint audio-visual modeling and visual 
control will be presented. For singing synthesis, I will describe our 
work towards an opera-style singing synthesis system that is trained on 
high-quality opera singing data. I will also present a model for 
synthesizing bird song that can be controlled by symbolic input sequences. 
Finally, I will present a DNN-based synthesizer of a famous Austrian 
actor that we built from audiobook data and that was used in a theater 
play. I will conclude my talk with an outlook on the future of speech 
synthesis technologies and on the remaining technical and possible 
societal challenges.

Speaker biography: Michael Pucher is Senior Researcher at the Austrian 
Research Institute for Artificial Intelligence (OFAI) and Senior Speech 
Technologist at Recognosco, Vienna, Austria. He obtained his doctoral 
degree (Dr.techn.) in Electrical and Information Engineering from Graz 
University of Technology in 2007. In 2017 he received the venia docendi 
in Speech Communication at Graz University of Technology with a 
habilitation thesis on Speech Processing for Multimodal and Adaptive 
Systems. His research interests are acoustic modeling for speech 
recognition, semantic language modeling, speech synthesis for language 
varieties, persona design for speech-based systems, multimodal and 
spoken dialog systems, audio-visual speech synthesis, synthesis of 
singing, synthesis of animal vocalisations, digital phonetics, and 
sociophonetics. He has also made significant contributions in the area 
of speaker verification spoofing, where he showed how adaptive 
synthesizers can spoof a speaker verification system.

-- 
Dr.-Ing. Tristan Miller, Research Scientist
Austrian Research Institute for Artificial Intelligence (OFAI)
Freyung 6/6, 1010 Vienna, Austria | Tel: +43 1 5336112 12
https://logological.org/ | https://punderstanding.ofai.at/