Lecture series: Machines that understand?

Large Language Models and Artificial Intelligence (WiSe 2023/2024)

The aim of the lecture series is to make current developments in the field of generative artificial intelligence understandable and to stimulate an informed dialogue about the capabilities, limitations and societal relevance of these models.

Top-class international researchers are invited to present their current research to a broad university audience. In addition to technical aspects, topics include questions of fairness and responsibility in AI models and the importance of AI for the broader university context, e.g. in the digital humanities.

Start: Thursday, October 5, 2023, 16:45-18:15
End: Thursday, January 25, 2024, 16:45-18:15
Location: Hörsaal 1 (lecture hall), Oskar-Morgenstern-Platz 1
Or online via live u:stream (and Zoom on Oct 19 and Jan 18)
Languages: German and English

Information in u:find

Program

Date Speaker / Description
October 5, 2023 Opening and panel discussion:
What Does Generative AI Mean for Higher Education and Society? (German)
Experts:
Dagmar Gromann, Centre for Translation Studies
Matthias Leichtfried, German Studies, Faculty of Philological and Cultural Studies
Claudia Plant, Data Mining and Machine Learning, Faculty of Computer Science
Host:
Benjamin Roth, Digital Philology, Faculty of Philological and Cultural Studies and Faculty of Computer Science
Video on YouTube (Uni Wien Live)
October 12, 2023 Introductory Lecture: Foundations of Neural Language Models and Overview of Topics (English)
(Anna Beer / Benjamin Roth / Claudia Plant)
October 19, 2023 Invited Talk: Dirk Hovy, Bocconi University, Milan / Italy
Unhumanizing Models. Why We Need to Change How We Think About AI (English, online)
Abstract: AI models are seemingly everywhere these days, feeling more human than ever before. However, we should be careful not to humanize those models, as it gives them powers they do not possess and obscures the flaws they do have.
In this talk, I will look at examples where models seem to know, understand, feel, create, judge, or move like humans, and discuss why it is still wrong to anthropomorphize them. At the same time, we will have to find a way to inform AI models about what it means to be human if we want to prevent harm and improve their capabilities. For that, we must look across disciplinary boundaries and build a (social) theory of AI.

Recording available on Moodle.
November 9, 2023 Invited Talk: Alexander Koller, Saarland University, Saarbrücken / Germany
ChatGPT does not really understand you, does not really know anything, but is still revolutionary AI (English)
Abstract: Large neural language models (LLMs), such as ChatGPT, have revolutionized AI. It is hard not to be impressed by the quality of text and program code they generate, and they seem to encode unprecedented amounts of knowledge about the world. At the same time, it is well known that LLMs are prone to hallucinations, often produce factually incorrect language, and the "knowledge" they encode is faulty and inconsistent.
In this talk, I will discuss some strengths and weaknesses in the ability of LLMs to understand language and reason with knowledge. I will present arguments that the way that LLMs are trained cannot, in principle, lead to human-level language understanding. I will also talk about recent work that leverages the knowledge encoded in LLMs for planning. In this way, I hope to offer a balanced picture of the situation and engage in an informed discussion with the audience.
November 16, 2023 Invited Talk: Sepp Hochreiter, JKU Linz, Linz / Austria
Memory Architectures for Deep Learning (English)
Abstract: Currently, the most successful Deep Learning architecture is the transformer. The attention mechanism of the transformer is equivalent to modern Hopfield networks and is therefore an associative memory. However, this associative memory has disadvantages: quadratic complexity in the sequence length when mutually associating sequence elements, a restriction to pairwise associations, limited means of modifying the memory, and insufficient abstraction capabilities. In contrast, recurrent neural networks (RNNs) like LSTMs have linear complexity, associate sequence elements with a representation of all previous elements, can directly modify memory content, and have high abstraction capabilities. However, RNNs cannot store sequence elements that were rare in the training data, since RNNs have to learn to store. Transformers can store rare or even new sequence elements, which, besides their high degree of parallelization, is one of the main reasons they outperformed RNNs in language modelling. I think that future successful Deep Learning architectures should comprise both of these memories: attention for implementing episodic memories and RNNs for implementing short-term memories and abstraction.
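A minimal NumPy sketch of this complexity contrast (illustrative only, not from the talk; the sizes, and the simplification of using the same matrix for queries, keys, and values, are assumptions): self-attention builds an L x L score matrix, while a recurrent update folds the sequence into a fixed-size state in L steps.

import numpy as np

L, d = 512, 64                       # sequence length, feature dimension
X = np.random.randn(L, d)

# Self-attention (simplified: queries = keys = values): every element
# attends to every other element, so the score matrix has L x L entries,
# i.e. quadratic in the sequence length.
scores = X @ X.T / np.sqrt(d)                      # (L, L) pairwise scores
weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
weights /= weights.sum(axis=-1, keepdims=True)     # row-wise softmax
attended = weights @ X                             # (L, d)

# Recurrent update (heavily simplified RNN step): each element is folded
# into a fixed-size state, one step per element, i.e. linear in the
# sequence length; storing must be learned into W and U during training.
W, U = np.random.randn(d, d), np.random.randn(d, d)
h = np.zeros(d)
for x_t in X:
    h = np.tanh(W @ x_t + U @ h)                   # constant-size memory
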
November 23, 2023 Invited Talk: Nikolaus Forgo, University of Vienna, Vienna / Austria
The EU as a "regulatory superpower"? Reflections on European AI regulation. (German)
Abstract: The lecture addresses European approaches to technology regulation, in particular AI regulation, using the EU and Austria as examples. Its focus is, in particular, the (proposed) AI Act.
November 30, 2023 Invited Talk: Barbara Plank, LMU Munich, Munich / Germany
Revisiting Trustworthiness in NLP - Two Views on Uncertainty (English)
Abstract: Despite the recent success of Natural Language Processing (NLP), driven by advances in large language models (LLMs) trained on enormous amounts of data, many challenges remain on the way to making NLP more trustworthy. This talk explores two pivotal views on trustworthiness, both related to the uncertainty of models and of humans in Natural Language Processing. The first view emphasizes trustworthiness and reliability in NLP, focusing on principles for creating reliable NLP systems. The second view centers on human variability in labeling and text production, highlighting the challenges posed by disagreement. It unifies notions of label variation and suggests a path forward for addressing this complex issue, fostering a comprehensive understanding of human variability in NLP and ML.
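As a toy illustration of these two views (not from the talk; all numbers are hypothetical), both kinds of uncertainty can be quantified with the same entropy measure, applied once to a model's output distribution and once to a distribution of human labels:

import numpy as np

def entropy(p):
    """Shannon entropy (in bits) of a probability distribution."""
    p = np.asarray(p, dtype=float)
    p = p / p.sum()
    p = p[p > 0]                      # drop zero-probability entries
    return float(-np.sum(p * np.log2(p)))

# View 1: model uncertainty, the entropy of a classifier's softmax output.
model_probs = [0.70, 0.20, 0.10]      # hypothetical class probabilities
print(f"model predictive entropy: {entropy(model_probs):.2f} bits")

# View 2: human label variation, the entropy of annotator votes on one item.
annotator_votes = [4, 3, 3]           # hypothetical votes for three labels
print(f"annotator disagreement:   {entropy(annotator_votes):.2f} bits")
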
December 7, 2023 Invited Talk: Ondrej Dusek, Charles University, Prague / Czechia
Skipping Chit-chat with ChatGPT: Large Language Models and Structured Outputs (English)
Abstract: The current state of the art in text generation is large language models (LLMs) pretrained on vast amounts of text and finetuned to produce solutions given instructions. LLMs represent significant progress: the user can request outputs for various tasks by stating a query in natural language, and the models can follow examples provided by the user (in-context learning/prompting), without the need for further training (finetuning) on task-specific data. However, they retain some of the problems of the previous generation of language models, in particular their opacity and lack of controllability. This talk will show experiments on using LLMs with prompting only for multiple tasks: data-to-text generation, task-oriented dialogues, and dialogue evaluation. All these tasks operate with structure (structured data input, structured outputs, structured dialogue), which is not what these LLMs were specifically pretrained for. I show that LLMs are usable for these tasks, but also point out their limitations and potential areas of improvement.
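A minimal sketch of what prompting for such a structured task can look like (a hypothetical example, not from the talk; the final LLM call is left abstract): the prompt carries a short instruction and worked input/output pairs, so the model is steered by in-context examples rather than by finetuning.

def build_prompt(examples, data):
    """Assemble a few-shot data-to-text prompt: instruction, worked
    examples, then the new structured input."""
    lines = ["Verbalize the structured data as one English sentence.", ""]
    for triple, text in examples:
        lines += [f"Data: {triple}", f"Text: {text}", ""]
    lines += [f"Data: {data}", "Text:"]
    return "\n".join(lines)

examples = [
    ("(Prague, capital_of, Czechia)", "Prague is the capital of Czechia."),
    ("(Vienna, lies_on, Danube)", "Vienna lies on the Danube."),
]
prompt = build_prompt(examples, "(Brno, city_in, Czechia)")
# response = llm.generate(prompt)   # hypothetical call to any LLM backend
print(prompt)
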
December 14, 2023 Invited Talk: John Pavlopoulos, Athens University of Economics and Business, Athens / Greece
Machine Learning for Ancient Languages (English)
Abstract: Ancient languages preserve the cultures and histories of the past. However, their study is fraught with difficulties, and experts must tackle a range of challenging text-based tasks, from deciphering lost languages to restoring damaged inscriptions to determining the authorship of works of literature. Technological aids have long supported the study of ancient texts, but in recent years advances in artificial intelligence and machine learning have enabled analyses on a scale and in a detail that are reshaping the humanities, much as microscopes and telescopes have contributed to the sciences. Based on a recent survey, this talk will focus on published research using machine learning for the study of ancient texts written in any language, script, and medium, spanning more than three and a half millennia of civilizations around the ancient world. By analysing the relevant literature, the talk highlights lessons learnt and promising directions for future work in this interdisciplinary field.
January 11, 2024 Invited Talk: Asia Biega, Max Planck Institute for Security and Privacy, Bochum / Germany
Data Protection in Data-Driven Systems (English)
Abstract: Modern AI systems are characterized by extensive personal data collection, despite increasing societal costs of such practices. To prevent harms, data protection regulations specify several principles for respectfully processing user data, such as purpose limitation, data minimization, or consent. Yet, practical implementations of these principles leave much to be desired. This talk will delve into the computational and human factors that contribute to such lax implementations, and examine potential improvements.
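As a small sketch of one of these principles, data minimization (the schema, purposes, and records are invented for illustration and do not come from the talk): processing keeps only the attributes required for a stated purpose, rather than the full user profile.

PURPOSE_FIELDS = {
    "shipping": {"name", "address"},
    "recommendation": {"view_history"},
}

def minimize(record: dict, purpose: str) -> dict:
    """Keep only the attributes required for the given processing purpose."""
    allowed = PURPOSE_FIELDS[purpose]
    return {key: value for key, value in record.items() if key in allowed}

user = {
    "name": "A. N. Other",
    "address": "Example Street 1",
    "birthdate": "1990-01-01",
    "view_history": ["item_1", "item_2"],
}
print(minimize(user, "shipping"))   # birthdate and view_history are dropped
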
January 18, 2024 Revision of the lecture topics, question and answer session (English)
(Anna Beer / Claudia Plant / Benjamin Roth)

Online only.
Zoom link, Passcode: 'LLM-AI-[city]' (please replace '[city]' with 'VIE')

January 25, 2024 Exam