Lecture 1 (20/09) Probabilistic Models of Human Language: Foundations
Probabilistic models have thoroughly reshaped computational linguistics and continue to profoundly change other areas in the scientific study of language, ranging from psycholinguistics to syntax and phonology and even pragmatics and sociolinguistics. This change has included (a) qualitative improvements in our ability to analyze complex linguistic datasets and (b) new conceptualizations of language knowledge, acquisition, and use. For the most part, these changes have occurred in parallel, but the same theoretical toolkit underlies both advances. In this lecture I give a concise introduction to this toolkit, covering the fundamentals of contemporary probabilistic models in the study of language, with examples including phoneme identification, perceptual magnet effects, and simple hierarchical models. This lecture includes content of theoretical interest in its own right, as well as tools and concepts that are fundamental to the other three lectures of the series.
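As a rough illustration of the kind of model involved in phoneme identification, the sketch below applies Bayes' rule to a single acoustic cue. It is a minimal sketch and not the lecture's own model: the two categories, the cue (voice onset time in milliseconds), the Gaussian form, and every parameter value are assumptions chosen for illustration.

```python
import math

# Hypothetical category parameters: each phoneme is modeled as a Gaussian
# over voice onset time (VOT, in ms) with a prior probability.
# All values are illustrative assumptions, not estimates from data.
CATEGORIES = {
    "b": {"mean": 0.0, "sd": 15.0, "prior": 0.5},
    "p": {"mean": 50.0, "sd": 15.0, "prior": 0.5},
}


def gaussian_pdf(x, mean, sd):
    """Density of a normal distribution with the given mean and sd at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))


def phoneme_posterior(vot_ms, categories=CATEGORIES):
    """Return P(category | cue) via Bayes' rule: prior times likelihood, normalized."""
    joint = {
        name: params["prior"] * gaussian_pdf(vot_ms, params["mean"], params["sd"])
        for name, params in categories.items()
    }
    total = sum(joint.values())
    return {name: value / total for name, value in joint.items()}


if __name__ == "__main__":
    for vot in (0.0, 25.0, 50.0):
        print(vot, phoneme_posterior(vot))
```

With equal priors and equal variances, a cue value halfway between the two category means is maximally ambiguous, and the posterior shifts smoothly toward one category as the cue moves toward that category's mean.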
Lecture 2 (24/09) Surprisal, memory constraints, and the noisy channel in human sentence processing
Human language comprehension poses some of the deepest scientific challenges in accounting for the capabilities of the human mind. In this lecture I describe several major advances we have recently made in this domain that have led to a state-of-the-art theory of language comprehension. First, I describe a detailed expectation-based theory of real-time language understanding, surprisal, that unifies three topics central to the field (ambiguity resolution, prediction, and syntactic complexity) and that finds broad empirical support. I also cover work on memory constraints that seem to influence patterns of processing difficulty in sentence comprehension, independently of surprisal. Finally, I describe a “noisy-channel” theory that generalizes the expectation-based theory by removing the assumption of modularity between the processes of individual word recognition and sentence-level comprehension. This theory accounts for critical outstanding puzzles for previous approaches, and helps move us toward a theoretical integration of surprisal and memory.
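For readers new to the term, surprisal is standardly defined as the negative log probability of a word given its preceding context. The sketch below computes it in bits. The conditional probabilities in the toy table are invented for illustration and the function and table names are hypothetical; in practice the probabilities would come from a language model or corpus estimates.

```python
import math


def surprisal_bits(prob):
    """Surprisal in bits: -log2 P(word | context)."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("probability must lie in (0, 1]")
    return -math.log2(prob)


# Hypothetical next-word distribution after one context.
# These numbers are illustrative assumptions, not corpus estimates.
NEXT_WORD_PROBS = {
    "the dog chased the": {"cat": 0.5, "ball": 0.25, "idea": 0.125},
}


def word_surprisal(context, word, table=NEXT_WORD_PROBS):
    """Look up P(word | context) in the toy table and convert it to surprisal."""
    return surprisal_bits(table[context][word])


if __name__ == "__main__":
    for w in ("cat", "ball", "idea"):
        print(w, word_surprisal("the dog chased the", w))
```

Less expected continuations receive higher surprisal, which expectation-based accounts link to greater processing difficulty.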
Lecture 3 (1/10) Computational and experimental pragmatics
In constructing theories of linguistic meaning in context it has been productive to distinguish between strictly semantic content, or the “literal” meanings of atomic expressions (e.g., words) and the rules of meaning composition, and pragmatic enrichment, by which speakers and listeners can rely on general principles of cooperative communication to take understood communicative intent far beyond literal content. Major open questions remain, however, of how to formalize pragmatic inference and characterize its relationship with semantic composition. In this lecture I describe recent work within a Bayesian framework of interleaved semantic composition and pragmatic inference. First I show how two major principles of Levinson’s typology of conversational implicature fall out of our models: Q(uantity) implicature, in which utterance meaning is refined through exclusion of the meanings of alternative utterances; and I(nformativeness) implicature, in which utterance meaning is refined by strengthening to the prototypical case. Q and I are often in tension; I show that the Bayesian approach derives quantitative predictions regarding their relative strength in interpretation of a given utterance, and present evidence supporting these predictions from a large-scale experiment. I then describe more complex applications of the theory to key cases of compositionality, focusing on two of the most fundamental building blocks of semantic composition, the words “and” and “or”.
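To make the idea of Q implicature as Bayesian inference concrete, here is a minimal sketch in the style of rational-speech-act models. It is an assumption that this generic formulation resembles the lecture's models; the two-state world, the two utterances, the uniform prior, and the rationality parameter `alpha` are all illustrative choices.

```python
# Toy scalar-implicature setup (all names and values are illustrative assumptions).
STATES = ["some_not_all", "all"]
UTTERANCES = ["some", "all"]
PRIOR = {"some_not_all": 0.5, "all": 0.5}

# Literal semantics: "some" is true in both states, "all" only in the "all" state.
TRUTH = {
    ("some", "some_not_all"): 1.0,
    ("some", "all"): 1.0,
    ("all", "some_not_all"): 0.0,
    ("all", "all"): 1.0,
}


def normalize(weights):
    """Scale non-negative weights so that they sum to one."""
    total = sum(weights.values())
    if total == 0:
        raise ValueError("cannot normalize all-zero weights")
    return {key: value / total for key, value in weights.items()}


def literal_listener(utterance):
    """P(state | utterance) using only literal truth conditions and the prior."""
    return normalize({s: TRUTH[(utterance, s)] * PRIOR[s] for s in STATES})


def speaker(state, alpha=1.0):
    """P(utterance | state): prefers utterances that lead the literal listener to the state."""
    return normalize({u: literal_listener(u)[state] ** alpha for u in UTTERANCES})


def pragmatic_listener(utterance, alpha=1.0):
    """P(state | utterance) obtained by reasoning about the speaker's choice."""
    return normalize({s: speaker(s, alpha)[utterance] * PRIOR[s] for s in STATES})


if __name__ == "__main__":
    print(literal_listener("some"))
    print(pragmatic_listener("some"))
```

In this toy setting the literal listener is indifferent between the two states on hearing "some", while the pragmatic listener favors "some but not all", because a speaker in the "all" state would more likely have chosen the alternative utterance "all".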
Lecture 4 (8/10) Productive knowledge & direct experience in human language processing & acquisition
The tension between combinatorial and holistic representation of complex linguistic expressions is central to debates on language processing and acquisition. In this lecture I describe work combining probabilistic models and new large datasets to investigate this tension and uncover the respective contributions of productive knowledge and direct experience. In processing, we focus on binomial expressions (salt and pepper – pepper and salt), finding a frequency-driven tradeoff between the two knowledge sources and a frequency-dependent level of idiosyncrasy in binomial ordering preference across binomials in the language. The former is explained by a rational model of learning from limited experience; the latter we account for with an evolutionary model of transmission of ordering preferences over time. In acquisition, we focus on determiner-noun combinations (“the ball”, “a cold”) and develop a novel Bayesian model to infer the strength of contribution of productive knowledge evident in child speech. We find evidence of low initial levels of productivity and higher levels later in development, consistent with the hypothesis that the earliest months of multi-word speech are not generated using rich grammatical knowledge, but that grammatical productivity emerges rapidly thereafter.