Computational Linguistics 22-23
The Computational Linguistics course is taught by Raffaella Bernardi; the TA is Helena Bonaldi.
Location: Palazzo Istruzione, Rovereto. Classes will be on Tuesdays, Wednesdays and Thursdays. For updated information, please check the online calendar.
Platform
- For the Reading Groups we will use Perusall: students have to enroll in the course on the platform. You can find the code on Moodle.
- For the CL Labs, we will use CoLab.
Information about the final exam
For CLC students, the exam will consist of three parts:
- 20%: Assignments (RG Baroni et al 2014, Linzen/Herbelot 2020 and CL labs)
- 30%: written exercises on Syntax, Semantics and their interface
- 50%: written report on a topic selected among those presented in class. The report can be either a project proposal based on a literature review or the report on a project based on a literature review. The report has to be written in LaTeX.
Students who earn 9 ECTS with this course (based on previous academic years) should contact the lecturer.
Non-attending students (non-frequentanti) need to contact the lecturer at least two months before the exam.
Students have to agree on the topic of the report with the lecturer at least one month before the exam.
We will rely on programming skills taught in the course Introduction to Computer Programming (Python).
My slides will be posted on Moodle after each class.
Topics with a rough schedule
- 5 classes on Syntax (Nov): Formal Grammars of English, Syntactic Parsing, Statistical Parsing, Dependency Parsing.
- classes on Semantics (Nov-Dec): Formal Semantics (4), Distributional Semantics Models (4)
- 4 Reading Groups (Nov-Dec) on cutting-edge topics related to those discussed in class.
Schedule
- Intro to the course and to CL. Extra material: Manning 2015
- 1.) 08.11.2022 (10:30-12:30) Introduction to CL
- Syntax: Extra material: NLTK Ch. 8 and Syntactic Tree Structures, Demos: DG
- 2.) 09.11.2022 (15:00-17:00) Intro to CFG
- 3.) 10.11.2022 (10:30-12:30) Exercises on CFG
- 4.) 15.11.2022 (10:30-12:30) Reading Group:
- Tenney et al. ACL 2019 NLP pipeline and Neural Networks
- Jawahar et al. ACL 2019 Syntax-Semantics and Neural Networks
- Off-class quiz on the vocabulary used in these two papers.
- 5.) 16.11.2022 (15:00-17:00) Intro to TAG, DG and CG and on-paper exercises
- 6.) 17.11.2022 (10:30-12:30) On-paper exercises about Formal Grammars
- 7.) 22.11.2022 (10:30-12:30) Intro to Parsing
- 8.) 23.11.2022 (15:00-17:00) Reading Group
- Wilcox et al 2018 Filler-Gap and LSTM Blackbox
- Off-class quiz on the concepts introduced in this part (syntax) (self-assessment)
- Distributional Semantics
- Extra Material: Linear Algebra (video by G. Strang)
- Extra Material: Baroni 2013
- Evaluation methods, metrics
- Extra material: Dror et al 2018, Chris Potts and Geiger et al 2020; Baroni 2016 and Kafle, Shrestha and Kanan 2019
- 24.11.2022 (10:30-12:30) (CANCELLED)
- 9.) 29.11.2022 (15:00-17:00) Introduction to DSM
- 10.) 30.11.2022 (15:00-17:00) Reading Group
- Baroni et al. 2014 Don’t count, predict! A systematic comparison of context-counting vs. context-predicting semantic vectors
- 11.) 01.12.2022 (10:30-12:30) Intro to Evaluation Methods in CL, and Lab 1 on Baroni et al 2014 (Cosine similarity and Accuracy)
- 12.) 06.12.2022 (8:30-10:30) Background information (Semantics, Formal Semantics, Logic)
- 13.) 06.12.2022 (10:30-12:30) Compositional DSM
- 14.) 07.12.2022 (15:00-17:00) Reading Group
- Group 1: Tal Linzen How Can We Accelerate Progress Towards Human-like Linguistic Generalization?
- Group 2: A. Herbelot Re-solve it: simulating the acquisition of core semantic competences from small data CoNLL 2020
- Students of Group 1 prepare a quiz for students of Group 2 and vice versa.
- Formal Semantics: Extra material: Semantic Parser by MacCartney
- 15.) 12.12.2022 (9:30-11:00) Compositionality: Lambda calculus (function application)
- 16.) 13.12.2022 (10:30-12:30) Compositionality: Lambda calculus (abstraction)
- 17.) 14.12.2022 (15:00-17:00) Discussion of exercises of Lab 1 and Lab 2 on Baroni et al 2014 (Correlation and Purity)
- 18.) 15.12.2022 (10:30-12:30) Exercises on Syntax-Semantics
- Off-class quiz on the concepts introduced in this part (self-assessment)
- 19.) 21.12.2022 (15:00-17:00 ONLINE) Sample exam (CFG, CG, lambda)
- Extra material: MacCartney
- 20.) 22.12.2022 (8:30-10:00 ONLINE) Project Presentations
- 21.) 22.12.2022 (10:30-12:30 ONLINE) Discussion of the sample exam
Off-class quiz on the concepts introduced in this part (self-assessment)
Students should start thinking about the project they want to bring to the exam.
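As a rough illustration of two of the measures named in the Lab 1 session above (cosine similarity and accuracy), here is a minimal sketch in plain Python. It is not the lab material itself: the toy vectors, the word list and the function names are hypothetical and chosen only for this example.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        # Convention adopted in this sketch: a zero vector is similar to nothing.
        return 0.0
    return dot / (norm_u * norm_v)


def accuracy(predicted, gold):
    """Fraction of predictions that match the gold labels."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same length")
    correct = sum(p == g for p, g in zip(predicted, gold))
    return correct / len(gold)


# Hypothetical toy co-occurrence vectors (not taken from any real corpus).
vectors = {
    "cat": [2.0, 1.0, 0.0],
    "dog": [1.0, 2.0, 0.0],
    "car": [0.0, 0.0, 3.0],
}

print(cosine_similarity(vectors["cat"], vectors["dog"]))  # high: similar contexts
print(cosine_similarity(vectors["cat"], vectors["car"]))  # zero: no shared contexts
print(accuracy(["animal", "animal", "animal"], ["animal", "animal", "vehicle"]))
```

In practice one would likely use a library such as NumPy or Gensim (listed under Tools below) rather than hand-written loops; the plain-Python version only makes the definitions explicit.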
Further Materials
- Online courses at MIT
- Online course at Stanford on Natural Language Understanding by Christopher Potts and MacCartney
- Introduction to Semantics and Pragmatics by Christopher Potts
- A case for deep learning in semantics 2018 by Christopher Potts.
- ACL 2020 TUTORIAL "Reviewing Natural Language Processing Research"
- Speech and Language Processing (SLP)
- Steven Bird, Ewan Klein, and Edward Loper Natural Language Processing with Python: Ch. 8 (CFG and DG), Ch. 9 (Feature Structures) and Ch. 10 (meaning)
If you are interested in textbooks about FS:
- For formal semantics and in particular lambda-calculus: Mathematical Methods in Linguistics by Barbara Partee, Alice ter Meulen, Robert Wall
- For a general intro to FS: Introduction to Natural Language Semantics by Henriette de Swart
For further information, see the surveys and tutorials below:
- Barbara Partee (2018) Formal Semantics. In the handbook of Formal Semantics ed. Maria Aloni and Paul Dekker. (in dropbox)
- Alessandro Lenci (2008) Distributional semantics in linguistic and cognitive research
- Turney and Pantel (2010) From Frequency to Meaning: Vector Space Models of Semantics
- Katrin Erk. (2012) Vector space models of word meaning and phrase meaning: a survey. Language and Linguistics Compass 6(10), 635-653, October 2012. (in dropbox)
- Marco Baroni (2013) Composition in Distributional Semantics. Language and Linguistics Compass 7(10), 511-522. (in dropbox)
- Gemma Boleda and Aurelie Herbelot (2016) Formal Distributional Semantics: Introduction to the Special issue. Computational Linguistics 42:4
- Raffaella Bernardi, Ruket Cakici, Desmond Elliott, Aykut Erdem, Erkut Erdem, Nazli Ikizler-Cinbis, Frank Keller, Adrian Muscat and Barbara Plank Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures JAIR (Journal of Artificial Intelligence Research). Vol. 55, 2016. DOI:10.1613/jair.4900
- Emily Bender Semantics and Pragmatics ACL 2018
- Mrinmaya Sachan, Minjoon Seo, Hannaneh Hajishirzi, and Eric Xing Standardized Tests as benchmarks for Artificial Intelligence
- Percy Liang and Christopher Potts Bringing machine learning and compositional semantics together
Tools and further links
- Count based DSM
- SippyCup semantic parser
- lambda viewer and lambda interpreter
- Tree viewer
- Online interface to query English and Italian semantic models
- DISCO, another online interface (multiple languages)
- word2vec, the tool and pre-compiled semantic vectors
- Semantic vectors, pre-compiled using word2vec with optimal parameters
- Gensim, Python Framework for Vector Space Modeling
- spaCy
- See Awesome Community-Curated NLP List
- Mailing lists: Corpora
- Top conferences are run by ACL