Information-Theoretic Approaches to Linguistics

1 Course Information

Lecture times TF 9:35-11am
Lecture Location Olson 118

2 Instructor Information

Instructor Richard Futrell
Instructor's office hours T 1:00pm or by appointment
Office hours location Olson 105

3 Course Description

Information theory is a mathematical framework for analyzing communication systems. This course examines its applications in linguistics, especially corpus linguistics, psycholinguistics, quantitative syntax, and typology. We study natural language as an efficient code for communication. We introduce the information-theoretic model of communication and the concepts of entropy, mutual information, efficiency, and robustness. We discuss information-theoretic explanations for language universals in terms of efficient coding, including word length, word frequency distributions, and trade-offs between morphological complexity and word order fixedness. We also cover information-theoretic models of language production and comprehension, including the principle of Uniform Information Density, expectation-based models, and noisy-channel models.
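To give a concrete flavor of one concept mentioned above, here is a minimal sketch (not part of the course materials, and the function name is my own) of computing the Shannon entropy of a unigram word distribution in Python:

```python
import math
from collections import Counter

def entropy(tokens):
    """Shannon entropy (in bits) of the empirical unigram distribution."""
    counts = Counter(tokens)
    total = sum(counts.values())
    return -sum((c / total) * math.log2(c / total) for c in counts.values())

# A uniform distribution over 4 word types has entropy log2(4) = 2 bits.
print(entropy(["a", "b", "c", "d"]))  # 2.0
```

Skewed frequency distributions, like the Zipfian distributions found in natural language corpora, have lower entropy than a uniform distribution over the same vocabulary.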

4 Course Format

Course time will be spent on lectures, discussions, exercises, and demos. Evaluation will consist of a single 3-page paper presenting a proposed application of information theory to a linguistic problem and a proposed experiment or set of experiments to test the theory. There will be readings before each class, labeled "Discussion Material" in the schedule below. Lectures and in-class discussions will focus on the content from the discussion material. There are also recommended readings. Reading these will greatly increase the value of the course for you.

5 Intended audience

This course is designed for linguists of all backgrounds. A background in probability theory and computational linguistics will be very helpful but is not required.

Please fill out the introductory survey here so that I know something about your background and experience.

6 Schedule (subject to modification)

Day  | Topic                                            | Discussion Material      | Recommended Readings
6/24 | Introduction to information theory               | Gleick (2011: Ch. 7)     | Pereira (2000), Goldsmith (2007); highly recommended if you do not have a background in probability theory
6/27 | Efficient Coding and the Lexicon                 | Piantadosi et al. (2011) | Dye et al. (2018), Liu et al. (2019)
7/2  | Complexity of Languages                          | Bentz et al. (2017)      | Cotterell et al. (2018), Shannon (1951)
7/5  | Online Processing                                | Smith & Levy (2013)      | Jaeger (2010), Levy et al. (2009)
7/9  | Efficiency in Syntax                             | Futrell et al. (2015)    | Jaeger & Tily (2010), Futrell & Levy (2017), Futrell et al. (2019)
7/12 | Lexical Semantics and the Information Bottleneck | Zaslavsky et al. (2018)  | Zaslavsky et al. (2019), Gibson et al. (2017), Sims (2018)
7/16 | Morphological Complexity                         | Cotterell et al. (2019)  | Koplenig et al. (2017), Ackerman & Malouf (2013)
7/19 | Learning and Algorithmic Information Theory      | Hsu et al. (2013)        | Piantadosi & Fedorenko (2017)

7 Resources

  • On information theory

    There is a Khan Academy video course on information theory, which is highly recommended.

    James Gleick wrote a popular book about information theory, The Information: A History, a Theory, a Flood.

    The comprehensive textbook on information theory is Cover & Thomas (2006). Prof. Cover's lectures based on the book are online. If you have a strong math background, this is the book to work through.

    A more accessible introduction is given in MacKay (2003).

    A short accessible introduction is given in Cherry (1957).

  • On probability

    If you would like to brush up on probability theory, I recommend watching John Tsitsiklis's lectures.

  • On information-theoretic linguistics

    There has been a lot of fascinating work, beyond the papers listed in the schedule, applying information theory to the study of human language. Here is a sampler of work that I had to leave out of the main course.

    Work on information-theoretic phonology started with Cherry, Halle & Jakobson (1953). More recent work includes Goldsmith & Riggle (2011) and Hall et al. (2016).

    There has been a lot of work on information-theoretic models of morphological processing. A good place to start is Milin et al. (2009).

8 Requirements & Grading

  • Grade breakdown

    Your grade will be determined by your final paper. In the final paper (3 pages double-spaced), you will be asked to elaborate on a proposed application of information theory to a linguistic problem, and to propose an experiment or set of experiments to test the theory.

    The final paper will be due at the beginning of the final class meeting.

Author: Richard Futrell

Created: 2019-06-12 Wed 15:21
