Keynotes and Panel

COLM 2025 will include five keynotes and one panel. We are excited to have our keynotes and panel reflect the broad impact and intellectual spectrum of language modeling.

Keynote Speakers

Luke Zettlemoyer (University of Washington / Meta)

Mixed-modal Language Modeling

ABSTRACT: Multimodal architectures are typically designed for specific modalities (image->text, text->image, text only, etc.). In this talk, I will present our recent work on a series of early fusion mixed-modal models with generalized architectures that can instead generate arbitrary mixed sequences of images and text. Such models have the ability to unlock fundamentally new multimodal chain-of-thought reasoning capabilities, as I will show through an early model for multimodal tool use, but determining the best mixed-modal architecture remains an open challenge. I will discuss and contrast two model architectures, Chameleon and Transfusion, that make very different assumptions about how to model mixed-modal data, and argue for moving from a tokenize-everything approach to newer models that are hybrids of autoregressive transformers and diffusion. I will also cover recent efforts to better understand how to more stably train such models at scale without excessive modality competition, using a mixture-of-transformers technique. Together, these advances lay a possible foundation for universal models that can understand and generate data in any modality, and I will also sketch some of the steps that we still need to focus on to reach this goal.

BIO: Luke Zettlemoyer is a Professor in the Paul G. Allen School of Computer Science & Engineering at the University of Washington, and a Senior Research Director at Meta. His research focuses on empirical methods for natural language semantics, and involves designing machine learning algorithms, introducing new tasks and datasets, and, most recently, studying how to best develop self-supervision signals for pre-training. His honors include election as ACL President, being named an ACL Fellow, a PECASE award, an Allen Distinguished Investigator award, and multiple best paper awards. Luke was an undergrad at NC State University, received his PhD from MIT, and was a postdoc at the University of Edinburgh.

Tom Griffiths (Princeton University)

Mapping the Jagged Edges of AI with Cognitive Science

ABSTRACT: Current artificial intelligence systems demonstrate a surprising amount of heterogeneity in their abilities, displaying superhuman competence on some tasks but puzzling limitations on others. I will argue that the tools we need for understanding this heterogeneity can be found in cognitive science, where researchers have spent decades developing theoretical and empirical methods for making sense of the capabilities of intelligent systems. Work by cognitive scientists suggests two strategies for mapping the jagged edges of AI: identifying general properties of neural networks that might translate into limitations for current AI systems, and considering cases where human minds might provide a guide to problems that pose a challenge for AI. I will present examples of both strategies, discussing some surprising cases where large language models perform poorly in predictable ways and recent results using the limits of human cognition to predict cases where large language models and vision language models fail.

BIO: Tom Griffiths is the Henry R. Luce Professor of Information Technology, Consciousness and Culture in the Departments of Psychology and Computer Science at Princeton University, where he is also the Director of the new AI Lab. His research explores connections between human and machine learning, using ideas from statistics and artificial intelligence to understand how people solve the challenging computational problems they encounter in everyday life. He has made contributions to the development of Bayesian models of cognition, probabilistic machine learning, nonparametric Bayesian statistics, and models of cultural evolution, and his recent work has demonstrated how methods from cognitive science can shed light on modern artificial intelligence systems. Tom completed his PhD in Psychology at Stanford University in 2005, and taught at Brown University and the University of California, Berkeley before moving to Princeton. He has received awards for his research from organizations ranging from the American Psychological Association to the National Academy of Sciences and is a co-author of the book Algorithms to Live By, introducing ideas from computer science and cognitive science to a general audience. His new book, The Laws of Thought, tells the story of people trying to use mathematics to understand the mind, and comes out in February 2026.

Nicholas Carlini (Anthropic)

Are the harms and risks of LLMs worth it?

ABSTRACT: Having largely succeeded over the past decades at creating highly effective language models, we now face new risks; this talk examines them. I discuss the immediate harms we are already facing today (e.g., sycophancy), risks that seem inevitable in short order (e.g., LLM-aided abuse), and the 'existential risks' some argue are likely to arrive soon after (e.g., Doom?). I argue that these risks fall on a single continuum, and that progress mitigating any one of them contributes positively towards mitigating the others. I propose several research directions, under-explored by the academic ML/LLM research community, that could help mitigate these risks, and advocate for increased collaboration among all those who are concerned about the risks of 'AI' broadly construed.

BIO: Nicholas Carlini is a research scientist at Anthropic working at the intersection of security and machine learning. His current work studies what harms an adversary could do with, or do to, language models. He has received best paper awards from ICML, EuroCrypt, USENIX Security, and IEEE S&P. He obtained his PhD from UC Berkeley in 2018 under David Wagner.

Shirley Ho (Flatiron Institute)

Building a Polymathic Foundation Model for Science

ABSTRACT: Foundation models like ChatGPT, Claude, and Gemini have dramatically altered the modern work landscape for many industries reliant on language tasks, but no equivalent model exists yet for scientific applications. Incorporating foundation models into research workflows could enable unprecedented discoveries that connect traditionally distinct scientific subdisciplines. However, mainstream foundation models trained on human-generated datasets will be insufficient for analyzing most scientific phenomena — a foundation model for science will require special consideration for the requirements of scientific datasets, especially those with wide dynamic ranges. In this talk, I will introduce the Polymathic AI initiative: Our goal is to accelerate the development of versatile foundation models tailored for numerical datasets and scientific machine learning tasks. The challenge we are undertaking is to build artificial intelligence (AI) models that leverage information from heterogeneous datasets and across different scientific fields, which, contrary to domains like natural language processing, do not share a unifying representation (i.e., text). Such models can then be used as strong baselines or be further fine-tuned by scientists for specific applications. This approach has the potential to democratize AI in science by providing off-the-shelf models that have stronger priors (i.e., background knowledge) for shared general concepts such as causality, measurement, signal processing and even more specialized shared concepts like wave-like behavior, which otherwise would need to be learned from scratch. I will present our most recent results and projects, including large scientific datasets designed for large-scale training: “MultiModal Universe” and “The Well.”

BIO: Dr. Shirley Ho is a Group Leader and Senior Research Scientist at the Simons Foundation and a research professor in the Department of Physics and the Center for Data Science at NYU. She is the PI of the Polymathic AI research collaboration, where she leads a group of ML and domain scientists building next-generation large AI models for science. She is a fellow of the International Astrostatistics Association and serves on multiple advisory boards and committees, including at the National Academy of Sciences. Her previous appointments include Senior Scientist at Lawrence Berkeley National Laboratory, Cooper-Siegel Chair Professor (with tenure) at Carnegie Mellon University, and Chamberlain and Seaborg Fellow at Lawrence Berkeley National Laboratory. She finished her Ph.D. in Astrophysical Sciences at Princeton University under the supervision of Professor David Spergel.

Gillian Hadfield (Johns Hopkins University)

Alignment is social: lessons from human alignment for AI

ABSTRACT: Current approaches conceptualize the alignment challenge as one of eliciting individual human preferences and training models to choose outputs that satisfy those preferences. To the extent these approaches consider the fact that the world is composed of many individuals, they do so only by seeking to reconcile or aggregate pluralistic, but still individual, preferences. But these approaches are not grounded in a well-founded theory of how humans and human societies work. Humans are fundamentally social beings, and the challenge of inducing self-interested humans to act in ways that are good for others is the fundamental alignment challenge of human societies. Alignment in human societies is not achieved by inducing the same or average innate preferences in individuals but by aligning individual behaviors with normative classifications (which behaviors are acceptable, which are not) reached through informal and formal social processes (which we can call institutions). In this talk I'll discuss three ideas for shifting our approaches to AI alignment based on the human model: building normatively competent AI agents; using reinforcement learning to train models to produce aligned justifications for their behaviors that perform well in a discursive social debate context; and developing true jury procedures for democratic human oversight of model behaviors.

BIO: Gillian K. Hadfield is the Bloomberg Distinguished Professor of AI Alignment and Governance at the Whiting School of Engineering and the School of Government and Policy at Johns Hopkins University. She is a faculty member of the Vector Institute for Artificial Intelligence and is a Schmidt Sciences AI2050 Senior Fellow. Originally trained as an economist and legal scholar, Hadfield's research now focuses on innovative design for legal, regulatory, and technical systems for AI, computational models of human normative systems, and building AI systems that understand and respond to human values and norms.

Open LLMs in the Reasoning Era Panel

A panel discussion moderated by Alexander “Sasha” Rush, exploring the current state and future directions of open large language models, with particular focus on their reasoning capabilities and the implications for the broader AI research community.