[PLing] Talk by Michael Pucher on "Synthesizing Dialects, Faces, Singing Voices, Songbirds, and Famous Dead Actors"
Tristan Miller
tristan.miller at ofai.at
Tue Feb 28 13:52:17 CET 2023
Dear colleagues,
I would like to cordially invite you to a talk by Michael Pucher of the
Austrian Research Institute for Artificial Intelligence (OFAI). His
talk, entitled "Synthesizing Dialects, Faces, Singing Voices, Songbirds,
and Famous Dead Actors", is part of OFAI's current lecture series and
will take place on Wednesday, 1 March 2023 at 18:30 (UTC+1) at OFAI
(Freyung 6/6/7, 1010 Vienna) and also online.
"Synthesizing Dialects, Faces, Singing Voices, Songbirds, and Famous
Dead Actors"
Dr. Michael Pucher
Austrian Research Institute for Artificial Intelligence (OFAI)
Zoom access:
URL:
https://us06web.zoom.us/j/84282442460?pwd=NHVhQnJXOVdZTWtNcWNRQllaQWFnQT09
Meeting ID: 842 8244 2460
Passcode: 678868
The abstract and biography are appended below.
We look forward to your participation!
Kind regards,
Tristan Miller
Talk abstract: Over the last decades, statistical parametric speech
synthesis has significantly improved the quality and flexibility of
speech synthesis systems. This development started with hidden Markov
models (HMMs), followed by another big improvement in acoustic
modeling and vocoding brought by deep neural networks (DNNs). In this
talk I will present a range of applications of statistical parametric
speech synthesis that we have investigated. In the field of acoustic
speech synthesis I will show how dialect interpolation can be realized,
which allows for the generation of in-between language varieties. In
audio-visual speech synthesis, I will present joint audio-visual
modeling and visual control. In singing synthesis, I will describe our
work towards an opera-style singing synthesis system that is trained on
high-quality opera singing data. I will also present a model for
synthesizing birdsong that is controlled by symbolic input sequences.
Finally, I will present a DNN-based synthesizer of a famous Austrian
actor that we have built from audio book data, and that was used in a
theater play. I will conclude my talk with an outlook on the future of
speech synthesis technologies and on the remaining technical and
possible societal challenges.
Speaker biography: Michael Pucher is Senior Researcher at the Austrian
Research Institute for Artificial Intelligence (OFAI) and Senior Speech
Technologist at Recognosco, Vienna, Austria. He obtained his doctoral
degree (Dr.techn.) in Electrical and Information Engineering from Graz
University of Technology in 2007. In 2017 he received the venia docendi
in Speech Communication at Graz University of Technology with a
habilitation thesis on Speech Processing for Multimodal and Adaptive
Systems. His research interests are acoustic modeling for speech
recognition, semantic language modeling, speech synthesis for language
varieties, persona design for speech-based systems, multimodal and
spoken dialog systems, audio-visual speech synthesis, synthesis of
singing, synthesis of animal vocalisations, digital phonetics, and
sociophonetics. He has also made significant contributions in the area
of speaker verification spoofing, where he showed how adaptive
synthesizers can spoof a speaker verification system.
--
Dr.-Ing. Tristan Miller, Research Scientist
Austrian Research Institute for Artificial Intelligence (OFAI)
Freyung 6/6, 1010 Vienna, Austria | Tel: +43 1 5336112 12
https://logological.org/ | https://punderstanding.ofai.at/