# Lsci 159/259: Computational Probabilistic Modeling for Language Science, Fall 2021

## 1 Class information

Lecture Times | TR 1-2:30p |
---|---|
Lecture & Recitation Location | SBSG 1321 |
Canvas site | https://canvas.eee.uci.edu/courses/40740 |
Syllabus | http://socsci.uci.edu/~rfutrell/teaching/lsci259-f2021 |

The course will also be streamed synchronously for any students who cannot attend in person. If you know you will have to be remote for a certain class session, please inform the instructor at least 24 hours in advance so that the stream can be set up.

## 2 Instructor information

Richard Futrell (rfutrell@uci.edu) | Instructor |
---|---|
Instructor's office | SSPB 2215 |
Instructor's office hours | F 11am–12pm or by appointment |

## 3 Class Description

This class covers the intersection of linguistics, cognitive modeling, programming, probability, and machine learning. We cover the development of computationally-implemented probabilistic models for natural language, including models of human language processing and acquisition. We study (1) how to design well-formed probabilistic models with considerations from probability theory and decision theory, (2) how to estimate the parameters of these models from data, (3) how to design efficient algorithms for model fitting and parameter estimation, including sampling-based methods for Bayesian nonparametrics and neural network-based methods, and (4) common probabilistic structures used in language science such as finite-state automata and context-free grammars. Students will learn both the underlying mathematics of these models as well as the technical tools to implement them in Python.

Problem sets will contain a mix of mathematical problems and implementational (Python programming) problems. Some may be entirely mathematical; some may be entirely programming-based.

## 4 Class organization

The class will consist of lectures twice a week along with problem sets and (for graduate students enrolled in Lsci 259) a class project which may be completed in groups.

## 5 Intended Audience

Advanced undergraduate or graduate students in Language Science, Cognitive Sciences, Computer Science, or any of a number of related disciplines. The undergraduate section is Lsci 159; the graduate section is Lsci 259. Postdocs and faculty are also welcome to participate.

The course prerequisites are:

- Some experience with calculus and linear algebra, plus
- The equivalent of two quarters of Python programming, plus
- *Either* one quarter of probability/statistics/machine learning *or* one quarter of introductory linguistics (fulfilled by Lsci 3 or equivalent).

If you think you have the requisite background but have not taken the specific courses just mentioned, please talk to the instructor to work out whether you should take this course or do other prerequisites first.

We will be doing Python programming in this course using the PyTorch library. Familiarity with PyTorch is not assumed.

## 6 Readings & Textbooks

Readings are drawn from various sources. They provide important background and details for the material presented in lecture.

We'll also occasionally draw upon other sources for readings, including original research papers in computational linguistics, psycholinguistics, and other areas of the science of language.

PDFs for readings are provided. For readings that come from books, PDFs of the books will be posted on Canvas; these are not to be shared. When a section range is given, it is inclusive (so 2.5–2.7 means read sections 2.5, 2.6, and 2.7).

The books drawn from are:

- Christopher M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer.
- Jacob Eisenstein. 2019. Introduction to Natural Language Processing. MIT Press.
- Kevin P. Murphy. 2012. Machine Learning: A Probabilistic Perspective. MIT Press.

## 7 Schedule (*subject to modification!*)

Day | Topic | Readings | Related materials | Deadlines |
---|---|---|---|---|
R 9/23 | Introduction to Probability | Eisenstein App. A | Goldsmith, 2007; Jaynes Ch. 2 | |
T 9/28 | Fitting Distributions to Data | Bishop 2-2.3 | | |
R 9/30 | Log-Linear Models | Eisenstein 2.5-2.7 | Log-Linear Models (Eisner); Bresnan et al., 2007 | |
T 10/5 | Optimization for Log-Linear Models | Bishop 1.6 | | Pset 1 due |
R 10/7 | Information Theory | Bishop 2.4 | | |
T 10/12 | Information Theory and Language Models | Eisenstein 6.1-6.2 | | |
R 10/14 | Markov Processes | Murphy 17-17.2 | Discrete-time Markov chains | Pset 2 due |
T 10/19 | Finite-State Automata | Eisenstein 9-9.1 | | |
R 10/21 | Dynamic Programming and Semirings | Eisenstein 7.2-7.5 | Viterbi Algorithm (ritvikmath) | |
T 10/26 | Probabilistic Context-Free Grammars | Eisenstein 9.2 | Chomsky, 1956; Gazdar, 1981; Müller, 2018 | Pset 3 due |
R 10/28 | Parsing | Eisenstein 10-10.3 | | |
T 11/2 | Directed vs. Undirected Models | | | |
R 11/4 | Bayesian Networks | Bishop 8.2-8.3, Levy App. C | | Pset 4 due |
T 11/9 | Inference by Sampling | Bishop 11-11.3 | | |
R 11/11 | Veterans Day, no class | | | |
T 11/16 | Nonparametric Models | Murphy 25.2 | Goldwater et al., 2011; Nikkarinen et al., 2021 | |
R 11/18 | More Nonparametrics | | | Pset 5 due |
T 11/23 | Vector-Space Models of Word Semantics | Eisenstein 14-14.6 | Levy & Goldberg, 2014 | |
R 11/25 | Thanksgiving, no class | | | |
T 11/30 | Modern Language Models | Eisenstein 6.3-6.4 | The Illustrated Transformer (Jay Alammar) | |
R 12/2 | Wrap up | | | Pset 6 due |
F 12/10 | | | | Project due |

## 8 Requirements & grading

Work | Grade percentage (Lsci 159) | Grade percentage (Lsci 259) |
---|---|---|
Participation | 5% | 5% |
Problem sets | 95% | 65% |
Class project | -- | 30% |

### 8.1 Pset late policy

Psets can be turned in up to 7 days late; 10% of your score will be deducted for each 24 hours of lateness (rounded up). For example, if a homework assignment is worth 80 points, you turn it in 3 days late, and earn a 70 before lateness is taken into account, your score will be (1-0.3)*70=49.
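The late-penalty arithmetic above can be sketched in Python. This is a hypothetical helper for illustration, not part of the course materials; the function name and signature are my own.

```python
import math

def late_adjusted_score(raw_score, hours_late):
    """Illustrate the pset late policy: 10% of the earned score is
    deducted per 24 hours of lateness (rounded up to whole days),
    with a maximum of 7 days of lateness."""
    days_late = math.ceil(hours_late / 24)
    if days_late > 7:
        raise ValueError("Psets can be turned in at most 7 days late.")
    return raw_score * (1 - 0.10 * days_late)

# The example from the policy: a 70 turned in 3 days (72 hours) late
print(round(late_adjusted_score(70, 72), 2))  # → 49.0
```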

### 8.2 Mapping of class score to letter grade

It is unlikely that I will grade on a curve. I guarantee minimum grades on the basis of the following thresholds:

Threshold | Guaranteed minimum grade |
---|---|
>=90% | A- |
>=80% | B- |
>=70% | C- |
>=60% | D |

So, for example, an overall score of 90.0001% guarantees you at least an A-, but you could well wind up with a higher grade depending on the final grade thresholds determined at the end of the quarter.
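The guaranteed-minimum mapping can be sketched in Python. Again a hypothetical helper for illustration only; the actual letter grade may be higher than the guaranteed minimum.

```python
def guaranteed_minimum_grade(score_percent):
    """Return the guaranteed minimum letter grade for an overall score,
    or None when no minimum is guaranteed by the thresholds."""
    for cutoff, grade in [(90, "A-"), (80, "B-"), (70, "C-"), (60, "D")]:
        if score_percent >= cutoff:
            return grade
    return None

print(guaranteed_minimum_grade(90.0001))  # → A-
```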

## 9 Pandemic information

When physically present in a classroom, other instructional space, or any other space owned or controlled by UCI, all students and all employees (faculty and staff) must comply with COVID-19 related UCI executive directives and guidance. This guidance takes into account federal, state, and local guidelines available at https://uci.edu/coronavirus/.

## 10 Academic Integrity

We will be adhering fully to the standards and practices set out in UCI's policy on academic integrity. Any attempt at academic misconduct or plagiarism will be met with consequences in accordance with university regulations.

## 11 Disability

Any student requesting academic accommodations based on a disability is required to register with the Disability Services Center at UCI. For more information, please visit http://disability.uci.edu/.