Computational Linguistics 19-20
The Computational Linguistics course is taught by Raffaella Bernardi (UniTN)
Classes are on Mondays and Wednesdays at 13:00-15:00 and on Thursdays at 10:30-12:30. Detailed schedule
Please note that there are no classes on the 2nd of October and on the 13th of November.
For updated information, see the online time-table
Information about the final exam
The exam will consist of two parts, each contributing 50% of the total mark:
- written exercises on Syntax, Semantics and their interface
- written report on a topic selected from those presented in class. The report can be either a project proposal based on a literature review or a report on a project based on a literature review. The report has to be written in LaTeX.
Students of the Text Processing course will do one of the two parts, of their choice. They have to inform the lecturer of their choice at least two months before the exam.
Students have to agree on the topic of the report with the lecturer at least one month before the exam.
July Exam: The exam will follow the same format as the winter exam: project (see above) and written exam. The latter will be held over Zoom. The lecturer will send the link to students beforehand. Students will have to position their webcam so that the lecturer can see the paper they are writing on. At the end of the exam, students will send the written exercises to the lecturer by email before leaving the Zoom meeting. Students can leave the meeting only when all students have finished the exam and have sent it to the lecturer.
We will rely on programming skills taught by Luca Ducceschi in the course Computational Skills for Text Analysis (first semester). Students are highly recommended to attend it, in particular if they lack a computational background. My course is complementary to Carlo Strapparava's course on Human Language Technologies (second semester). The Formal Semantics part will be presented in depth in Roberto Zamparelli's course on Logical Structures of Natural Language.
Materials
- Speech and Language Processing (SLP)
- Steven Bird, Ewan Klein, and Edward Loper, Natural Language Processing with Python: Ch. 8 (CFG and DG), Ch. 9 (Feature Structures) and Ch. 10 (Meaning)
- My slides will be posted below after each class.
If you are interested in textbooks about FS:
- For formal semantics and in particular lambda-calculus: Mathematical Methods in Linguistics by Barbara Partee, Alice ter Meulen, Robert Wall
- For a general intro to FS: Introduction to Natural Language Semantics by Henriette de Swart
For further information, see the surveys and tutorials below:
- Barbara Partee (2018) Formal Semantics. In the handbook of Formal Semantics ed. Maria Aloni and Paul Dekker. (in dropbox)
- Alessandro Lenci (2008) Distributional semantics in linguistic and cognitive research
- Turney and Pantel (2010) From Frequency to Meaning: Vector Space Models of Semantics
- Katrin Erk. (2012) Vector space models of word meaning and phrase meaning: a survey. Language and Linguistics Compass 6(10), 635-653, October 2012. (in dropbox)
- Marco Baroni (2013) Composition in Distributional Semantics. Language and Linguistics Compass 7(10), 511-522. (in dropbox)
- Gemma Boleda and Aurelie Herbelot (2016) Formal Distributional Semantics: Introduction to the Special issue. Computational Linguistics 42:4
- Raffaella Bernardi, Ruket Cakici, Desmond Elliott, Aykut Erdem, Erkut Erdem, Nazli Ikizler-Cinbis, Frank Keller, Adrian Muscat and Barbara Plank (2016) Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures. JAIR (Journal of Artificial Intelligence Research), Vol. 55. DOI:10.1613/jair.4900
- Emily Bender Semantics and Pragmatics ACL 2018
- Mrinmaya Sachan, Minjoon Seo, Hannaneh Hajishirzi, and Eric Xing Standardized Tests as benchmarks for Artificial Intelligence
Topics with a rough schedule
- 3 classes on cutting-edge topics
- 8 classes on Syntax (Sep-Oct): Formal Grammars of English, Syntactic Parsing, Statistical Parsing, Dependency Parsing.
- 14 classes on Semantics (Oct-Nov): Formal Semantics, Distributional Semantics Models, The Representation of Sentence Meaning, Computational Semantics, Neural Models of Sentence Meaning
- 2 classes on Multimodal Models (Dec): Language and Vision
Schedule
- 1.) 23.09.2019 (13:00-15:00)
- SYNTAX: Introduction to CL: admin, intro to Formal Languages, Regular Languages and Finite State Automata
- Slides
- 2.) 25.09.2019 (10:30-12:30)
- Dialog systems and Chatbots (Luciana Benotti) (SLP: Ch. 24)
- Slides
- 3.) 25.09.2019 (13:00-15:00)
- Reading Group on
- NLP pipeline and Neural Networks: Tenney et al. ACL 2019 and Syntax-Semantics and Neural Networks: Jawahar et al. ACL 2019 (read by all, led by RB)
- organization of future Reading Groups and Brief intro to LaTeX
- 4.) 26.09.2019 (10:30-12:30)
- SYNTAX (GRAMMAR): sentence structure, CFG, Chomsky Hierarchy, which FL for NL syntax, (SLP: Ch. 11)
- Slides
- 5.) 30.09.2019 (10:30-12:30)
- Natural Language Generation for Dialog Systems (Luciana Benotti)
- Slides
- 6.) 30.09.2019 (13:00-15:00)
- GRAMMAR: Pen and paper exercises on CFG
- Summary in LaTeX of Filler-Gap and LSTM, Wilcox et al. BlackBox 2018 (led by Bogdan and Tanise)
- 7.) 03.10.2019 (15:00-17:00)
- GRAMMAR: Feature Agreement and Unification Grammar
- Correction of exercises on CFG.
- Reading group on PoS tagger and gender issue, Garimella et al. ACL 2019 (led by Erick and Emma)
- 8.) 07.10.2019 (13:00-15:00)
- GRAMMAR: Other Formal Grammars (TAG, DG, CG). Slides
- Pen and paper exercises on CG, TAG and DG vs. CFG
- 09.) 09.10.2019 (13:00-15:00)
- SYNTAX (Grammar) Reading Group about available treebanks and parsers:
- CCGbank, (Helena Bonaldi and Margherita Fanton)
- HPSG on Penn Treebank (by Marko Karetic and Xue)
- Dependency Grammar (Milena Paladin and Sofia Simakova).
- 10) 10.10.2019 (10:30-12:30)
- SYNTAX (PARSING): Top-down vs Bottom-up Parsing and Syntactic and Statistical Parsing. Slides (SLP: Ch. 12, 13 and 14)
- Reading Group on
- Shallow parsing and Keystrokes Plank COLING 2016 (by Chiara and Anna)
- 11) 14.10.2019 (13:00-15:00)
- Correction of exercises on CG
- Reading Group on LSTM and CFL, Sennhauser et al. BlackBox 2018 (by Andrea De Varda and Ludovica de Paolis)
- 12) 16.10.2019 (12:30-14:30)
- SEMANTICS: Formal Semantics: Introduction to Semantics, Brief intro to Logic, to Formal Semantics and semantic types.
- Slides
- 13) 17.10.2019 (10:30-12:30)
- SEMANTICS: Compositionality, lambda calculus. Slides Exercises on Formal Semantics.
- 14) 21.10.2019 (13:00-15:00)
- SEMANTICS: More exercises on lambda-calculus
- Exercises
- 15) 23.10.2019 (13:00-15:00)
- SEMANTICS: Abstraction in the lambda calculus Slides
- 16) 24.10.2019 (10:30-12:30)
- SYNTAX-SEMANTICS: Lambda calculus, CFG and CG
- Slides
- 17) 28.10.2019 (13:00-15:00)
- Exercises on lambda and CFG-lambda
- 18) 30.10.2019 (13:00-15:00)
- SEMANTICS: Distributional Semantics Models
- Slides
- 19) 31.10.2019 (10:30-12:30)
- Summary and Discussion on Paper on LSA (Landauer and Dumais (1997)) led by RB
- 20) 04.11.2019 (13:00-15:00)
- SEMANTICS: Summary on vectors and matrices; Lab on DSM; presentation by ALL students of the implementation work done with Luca.
- 21) 06.11.2019 (13:00-15:00)
- SEMANTICS: Discussion on the evaluations in Sahlgren and Lenci (2016), led by Nicola Sartorato and Francesca Pase, and in Baroni, Dinu and Kruszewski (2014), led by Nhut Truong and Zhuolun
- 22) 07.11.2019 (10:30-12:30)
- SEMANTICS: Compositional DSM Slides
- 23) 11.11.2019 (10:00-12:00)
- SEMANTICS: Summary and Discussion on Paper on DSM (Conneau et al. ACL 2018) led by Duygu Buga. Start discussing ideas about projects.
- 24) 18.11.2019 (13:00-15:00)
- SEMANTICS: Summary and Discussion on Papers on CDSM (Baroni, in press) led by Alex Eperon and Valentino Penasa
- 25) 20.11.2019 (13:00-15:00)
- SEMANTICS: Thesis presentation by Ludovica Panifto. Summary and Discussion on (Reddy, S. et al. 2011) by Abdel-akram Anis Saidi and
- 26) 21.11.2019 (10:30-12:30)
- Discussion about projects (Luca Ducceschi will join us)
- Rehearsal of exercises on CG and lambda
- 27) 25.11.2019 (13:00-15:00)
- MULTIMODAL MODELS: Introduction to Language and Vision
- Slides
- 28) 27.11.2019 (13:00-15:00)
- MULTIMODAL MODELS: Overview of current work on LaVi at UniTN
- Slides
- 29) 28.11.2019 (10:30-12:30)
- Sample Written Exam
- 30) 02.12.2019 (13:00-15:00)
- Sample exam correction
- 31) 11.12.2019 (10:30-12:30)
- PROJECT PROPOSAL PRESENTATIONS by CL students
- 32) 12.12.2019 (10:30-12:30)
- PROJECT PROPOSAL PRESENTATIONS by CL students
- Exam: 03.02.2020
Readings we will discuss in class (discussion led by one or two students each time)
- T. Linzen What can linguistics and deep learning contribute to each other?
- Tal Linzen, Emmanuel Dupoux and Yoav Goldberg Assessing the Ability of LSTMs to Learn Syntax-Sensitive Dependencies
- Barbara Plank Keystroke dynamics as signal for shallow syntactic parsing COLING 2016
- Ian Tenney, Dipanjan Das, Ellie Pavlick BERT Rediscovers the Classical NLP Pipeline ACL 2019
- Sennhauser, L., & Berwick, R Evaluating the Ability of LSTMs to Learn Context-Free Grammars BlackBox 2018
- Wei, Pham, O'Connor and Dillon Evaluating Syntactic Properties of Seq2seq Output with a Broad Coverage HPSG: A Case Study on Machine Translation BlackBox 2018
- Phu Mon Htut, Kyunghyun Cho, Samuel Bowman Grammar Induction with Neural Language Models: An Unusual Replication BlackBox 2018
- Wilcox, Levy, Morita and Futrell What do RNN Language Models Learn about Filler–Gap Dependencies? BlackBox 2018
- Conneau, A., Kruszewski, G., Lample, G., Barrault, L., & Baroni, M. What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties ACL 2018 (by Duygu)
- Marco Baroni Linguistic generalization and compositionality in modern artificial neural networks Philosophical Transactions of the Royal Society B. In press
Cognitively-inclined work on dialogue.
- Michael K Tanenhaus and Sarah Brown-Schmidt (2008) Language processing in the natural world
- Sarah Brown-Schmidt, Christine Gunlogson, Michael K. Tanenhaus (2008) Addressees distinguish shared from private information when interpreting questions during interactive conversation
- Mindaugas Mozuraitis, Craig G. Chambers and Meredyth Daneman (2015) Privileged versus shared knowledge about object identity in real-time referential processing
- Bert Oben and Geert Brône (2016) Explaining interactive alignment: A multimodal and multifactorial account
- David Reitter and Johanna D. Moore (2014) Alignment and task success in spoken dialogue
Very recent work on Language and Vision
- Multimodal Hierarchical Reinforcement Learning Policy for Task-Oriented Visual Dialog (SIGDIAL 2018)
- Do Better ImageNet Models Transfer Better?
- Evaluating Feature Importance Estimates
- PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning. (CVPR 2018)
- Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights (ECCV 2018)
- Explainable Neural Computation via Stack Neural Module Networks (ECCV 2018)
- Xinya Du, Junru Shao, Claire Cardie Learning to Ask: Neural Question Generation for Reading Comprehension, ACL 2018
- Zarriess S., Schlangen D. Obtaining referential word meanings from visual and distributional information: Experiments on object naming, ACL 2018
- Ryan Lowe, Michael Noseworthy, Iulian Vlad Serban, Nicolas Angelard-Gontier, Yoshua Bengio, Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses, ACL 2018
Further readings
- Jorge Perez, Javier Marinkovic, Pablo Barcelo On the Turing Completeness of Modern Neural Network Architectures ICLR 2019.
- Yikang Shen, Zhouhan Lin, Chin-Wei Huang & Aaron Courville Neural Language Modeling by Jointly Learning Syntax and Lexicon ICLR 2018
- Richard N. Aslin Statistical learning: a powerful mechanism that operates by mere exposure
- Novotny, Larlus and Vedaldi I Have Seen Enough: Transferring Parts Across Categories
- Zeldes, Amir (2018) "The GUM Corpus: Creating Multilayer Resources in the Classroom". Language Resources and Evaluation 51(3), 581–612.
- L. Bentivogli, R. Bernardi, M. Marelli, S. Menini, M. Baroni, R. Zamparelli SICK through the SemEval glasses. Lesson learned from the evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment In Language Resources and Evaluation. Vol 50, Nr. 1, 95--124, 2016. I will give you a copy
- R. Girju, Preslav Nakov, Vivi Nastase, Stan Szpakowicz, Peter Turney and Deniz Yuret (2007/2009). Classification of semantic relations between nominals. In SemEval 2007. Extended version in Language Resources and Evaluation 2009, 43(2)
- Carina Silberer, Vittorio Ferrari, and Mirella Lapata. (2016) Visually Grounded Meaning Representations
- Aravind Joshi (2009). "Tree-Adjoining Grammars". In The Oxford Handbook of Computational Linguistics (I can give you a copy)
- M.C. de Marneffe, T. Dozat, N. Silveira, K. Haverinen, F. Ginter, J. Nivre, C. D. Manning Universal Stanford Dependencies: A cross-linguistic typology LREC 2014
- Emma Strubell, Patrick Verga, Daniel Andor, David Weiss, Andrew McCallum Linguistically-Informed Self-Attention for Semantic Role Labeling. EMNLP 2018.
- Héctor Martínez Alonso, Željko Agić, Barbara Plank and Anders Søgaard. Parsing Universal Dependencies without training. In EACL 2017 (long), Valencia, Spain.
- Mike Lewis and Mark Steedman. A* CCG Parsing with a Supertag-factored Model, EMNLP 2014
- Daniel Gildea Dependencies vs. Constituents for Tree-Based Alignment EMNLP 2004
- Adam Kilgarriff I don't believe in word sense. Computers and the Humanities 31.2 (1997): 91--113.
- Sebastian Pado and Mirella Lapata (2007) Dependency-Based Construction of Semantic Space Models
- Katrin Erk (2016) What do you know about an alligator when you know the company it keeps?
- Beltagy et al 2016 Representing Meaning with a Combination of Logical and Distributional Models. In Computational Linguistics 42:4
- Magnus Sahlgren and Alessandro Lenci (2016) The Effects of Data Size and Frequency Range on Distributional Semantic Models
- Stephen Roller and Katrin Erk (2016) Relations such as Hypernymy: Identifying and Exploiting Hearst Patterns in Distributional Vectors for Lexical Entailment
- Eva M. Vecchi, Marco Marelli, Roberto Zamparelli, Marco Baroni (2018) Spicy Adjectives and Nominal Donkeys: Capturing Semantic Deviance Using Compositionality in Distributional Spaces
- Richard Socher, Brody Huval, Christopher D. Manning, Andrew Y. Ng (2012) Semantic Compositionality through Recursive Matrix-Vector Spaces
- Jeff Mitchell, Mirella Lapata (2010) Composition in Distributional Models of Semantics Pado and Lapata (2007)
- Blog by Guy Fighel: The 7 Artificial Intelligence Books You Should Read Today
- Lillian Lee “I’m sorry Dave, I’m afraid I can’t do that”: Linguistics, Statistics, and Natural Language Processing circa 2001
- A. M. Turing (1950) Computing Machinery and Intelligence. In Mind, 59(236), pp. 433-460
- Online interface to query English and Italian semantic models
- DISCO, another online interface (multiple languages)
- word2vec, the tool and pre-compiled semantic vectors
- Semantic vectors, pre-compiled using word2vec with optimal parameters
- Gensim, Python Framework for Vector Space Modeling
- spaCy
- GUM
- Computer Vision: textbook by Richard Szeliski.
- Reading list on Deep learning (by Bengio)
- ACL 2018 publications
- CCG page ACL Wiki
- WiCV
- See Awesome Community-Curated NLP List
- Mailing lists: Corpora,
- Top conferences are run by ACL