Computational Linguistics 18-19
The Computational Linguistics course is taught by Raffaella Bernardi (RB) (UniTN)
Classes are on Wednesdays and Thursdays at 13:00-15:00 and on Fridays at 13:30-15:30. For updated information, see the online timetable.
Information about the final exam
The exam will consist of two parts, each contributing 50% of the total mark:
- written exercises on Syntax, Semantics and their interface
- written report on a topic selected among those presented in class. The report can be either a project proposal based on a literature review or the report on a project based on a literature review. The report has to be written in LaTeX.
Students of the Text Processing course will do one of the two parts, of their choice. They have to inform the lecturer of their choice at least two months before the exam.
Students have to agree on the topic of the report with the lecturer at least one month before the exam.
We will rely on programming skills taught by Luca Ducceschi in the course Computational Skills for Text Analysis (first semester). Students are highly recommended to attend it, in particular if they lack a computational background. My course is complementary to Carlo Strapparava's course on Human Language Technologies (second semester). The Formal Semantics part will be presented in depth in Roberto Zamparelli's course on Logical Structures of Natural Language (first semester, from 30 October 2018).
Materials
- Speech and Language Processing (SLP)
- Steven Bird, Ewan Klein, and Edward Loper, Natural Language Processing with Python: Ch. 8 (CFG and DG), Ch. 9 (Feature Structures) and Ch. 10 (Meaning)
- My slides will be posted below after each class.
If you are interested in textbooks about FS:
- For formal semantics and in particular lambda-calculus: Mathematical Methods in Linguistics by Barbara Partee, Alice ter Meulen, Robert Wall
- For a general intro to FS: Introduction to Natural Language Semantics by Henriette de Swart
For further information, see the surveys and tutorials below:
- Barbara Partee (2018) Formal Semantics. In the handbook of Formal Semantics ed. Maria Aloni and Paul Dekker. (in dropbox)
- Alessandro Lenci (2008) Distributional semantics in linguistic and cognitive research
- Turney and Pantel (2010) From Frequency to Meaning: Vector Space Models of Semantics
- Katrin Erk. (2012) Vector space models of word meaning and phrase meaning: a survey. Language and Linguistics Compass 6(10), 635-653, October 2012. (in dropbox)
- Marco Baroni (2013) Composition in Distributional Semantics. Language and Linguistics Compass 7(10), 511-522, October 2013. (in dropbox)
- Gemma Boleda and Aurelie Herbelot (2016) Formal Distributional Semantics: Introduction to the Special issue. Computational Linguistics 42:4
- Raffaella Bernardi, Ruket Cakici, Desmond Elliott, Aykut Erdem, Erkut Erdem, Nazli Ikizler-Cinbis, Frank Keller, Adrian Muscat and Barbara Plank. Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures. JAIR (Journal of Artificial Intelligence Research). Vol. 55, 2016. DOI:10.1613/jair.4900
- Emily Bender Semantics and Pragmatics ACL 2018
- Mrinmaya Sachan, Minjoon Seo, Hannaneh Hajishirzi, and Eric Xing Standardized Tests as benchmarks for Artificial Intelligence
Topics with a rough schedule
- 8 classes on Syntax (Sep-Oct): Formal Grammars of English, Syntactic Parsing, Statistical Parsing, Dependency Parsing.
- 12 classes on Semantics (Oct-Nov): Formal Semantics, Distributional Semantics Models, The Representation of Sentence Meaning
- 3 classes on Syntax-Semantics interface (Nov): Computational Semantics, Semantic Role Labelling and Argument Structure, Neural Models of Sentence Meaning
- 3 classes on Beyond sentences (Nov): Discourse Coherence, Dialogue, Question Answering/IQA
- 5 classes on Multimodal Models (Dec): Language and Vision
Schedule
- 1) 19.09.2018
- SYNTAX: Introduction to CL: admin, intro to Formal Languages, Regular Languages and Finite State Automata
- Slides
- 2) 20.09.2018
- SYNTAX (GRAMMAR): sentence structure, CFG, Chomsky Hierarchy, which FL for NL syntax (SLP: Ch. 11)
- Slides
- 3) 26.09.2018
- GRAMMAR: Pen and pencil exercises on CFG
- Brief intro to LaTeX (LaTeX Base)
- 4) 27.09.2018
- GRAMMAR: Feature Agreement and Unification Grammar
- Slides
- 5) 28.09.2018
- GRAMMAR: Correction of exercises on CFG. [Exercises with solutions]
- 6) 03.10.2018
- GRAMMAR: Summary in LaTeX of a paper on HPSG (Simon)
- Other Formal Grammars. Slides on TAG, DG, CG.
- 7) 04.10.2018
- SYNTAX (GRAMMAR): Pen and pencil exercises on CG, TAG and DG vs. CFG, and comparison of available treebanks (CCGbank, Penn Treebank).
- 8) 05.10.2018
- SYNTAX (PARSING): Top-down vs Bottom-up Parsing. Slides
- Syntactic and Statistical Parsing (SLP: Ch. 12, 13 and 14) [Slides]
- Summary of a paper on TAG (Joshi 2009, Eleonora, Greta and Ali)
- 9) 10.10.2018
- Correction of exercises on CG
- Summary on DG vs. Constituency (Gildea 2004 or Alonso et al. 2017) by Natalia, Erica and Siavosh
- Summary on DG (de Marneffe et al. 2014 or Strubell et al. 2018) by Dina and Darya
- 10) 11.10.2018
- PARSING: Dependency Grammar Parsing by Barbara Plank
- Slides
- 11) 12.10.2018
- SEMANTICS: Formal Semantics: Introduction to Semantics, Brief intro to Logic, to Formal Semantics and semantic types.
- Slides
- Exercises on CFG and feature agreement?? NLTK Ch. 9 (SLP: Ch. 11)
- 12) 17.10.2018
- SEMANTICS: Compositionality, lambda calculus. Slides
- 13) 18.10.2018
- SEMANTICS: Exercises on Formal Semantics.
- 14) 19.10.2018
- SEMANTICS: Abstraction in the lambda calculus. Slides
- 15) 24.10.2018
- SYNTAX-SEMANTICS: Lambda calculus, CFG and CG
- Slides
- 16) 25.10.2018
- Exercises on lambda and CFG-lambda
- 17) 26.10.2018
- SEMANTICS: Distributional Semantics Models
- Slides
- 18) 31.10.2018
- Summary and Discussion on Paper on LS (Landauer and Dumais 1997) led by RB
- 19) 07.11.2018
- SEMANTICS: Summary on vectors and matrices; Lab on DSM; presentation by students of the implementation work done with Luca.
- 20) 08.11.2018
- SEMANTICS: Discussion on the evaluations in Sahlgren and Lenci (2016) and Baroni, Dinu and Kruszewski (2014) led by Daria and Erica
- 21) 09.11.2018
- SEMANTICS: Compositional DSM, Slides
- 22) 14.11.2018
- SEMANTICS: lab on CDSM, DISSECT by Sandro Pezzelle
- 23) 16.11.2018
- SEMANTICS: lab on CDSM, DISSECT by Sandro Pezzelle
- 24) 21.11.2018
- SEMANTICS: Summary and Discussion on Paper on DSM (Erk 2016) led by Siavosh and Natalia
- 25) 22.11.2018
- SEMANTICS: Summary and Discussion on Paper on CDSM (Vecchi et al. 2018) and Linzen (2016) led by Dina and Ali
- 26) 23.11.2018
- SEMANTICS: Summary and Discussion on (Reddy, S. et al. 2011) and (McGregor, S. et al. 2017) led by Greta and Eleonora
- 27) 27.11.2018 (10:30-12:30, NOTE CHANGE OF DATE)
- MULTIMODAL MODELS: Introduction to Language and Vision
- Slides
- 28) 29.11.2018
- MULTIMODAL MODELS: Overview of current work on LaVi
- Slides
- 29) 04.12.2018 (11:00-13:00)
- Sample Written Exam. NOTE DIFFERENT DAY.
- 30) 06.12.2018
- Sample exam correction
- 31) 12.12.2018 (9:30-12:00)
- PROJECTS: PROPOSAL PRESENTATIONS by CL students
- 19.12.2018 (14:00-16:00), Written exam
Readings we will discuss in class (discussion led by one or two students each time)
- Aravind Joshi (2009). "Tree-Adjoining Grammars". In The Oxford Handbook of Computational Linguistics (I can give you a copy)
- M.C. de Marneffe, T. Dozat, N. Silveira, K. Haverinen, F. Ginter, J. Nivre, C. D. Manning. Universal Stanford Dependencies: A cross-linguistic typology. LREC 2014
- Emma Strubell, Patrick Verga, Daniel Andor, David Weiss, Andrew McCallum Linguistically-Informed Self-Attention for Semantic Role Labeling. EMNLP 2018.
- Héctor Martínez Alonso, Željko Agić, Barbara Plank and Anders Søgaard. Parsing Universal Dependencies without training. In EACL 2017 (long), Valencia, Spain.
- Mike Lewis and Mark Steedman. A* CCG Parsing with a Supertag-factored Model, EMNLP 2014
- Daniel Gildea. Dependencies vs. Constituents for Tree-Based Alignment. EMNLP 2004
- Adam Kilgarriff I don't believe in word sense. Computers and the Humanities 31.2 (1997): 91--113.
- Sebastian Pado and Mirella Lapata (2007) Dependency-Based Construction of Semantic Space Models
- Katrin Erk (2016) What do you know about an alligator when you know the company it keeps?
- Beltagy et al. (2016) Representing Meaning with a Combination of Logical and Distributional Models. In Computational Linguistics 42:4
- Magnus Sahlgren and Alessandro Lenci (2016) The Effects of Data Size and Frequency Range on Distributional Semantic Models
- Stephen Roller and Katrin Erk (2016) Relations such as Hypernymy: Identifying and Exploiting Hearst Patterns in Distributional Vectors for Lexical Entailment
- Eva M. Vecchi, Marco Marelli, Roberto Zamparelli, Marco Baroni (2018) Spicy Adjectives and Nominal Donkeys: Capturing Semantic Deviance Using Compositionality in Distributional Spaces
Cognitively-inclined work on dialogue.
- Michael K Tanenhaus and Sarah Brown-Schmidt (2008) Language processing in the natural world
- Sarah Brown-Schmidt, Christine Gunlogson, Michael K. Tanenhaus (2008) Addressees distinguish shared from private information when interpreting questions during interactive conversation
- Mindaugas Mozuraitis, Craig G. Chambers and Meredyth Daneman (2015) Privileged versus shared knowledge about object identity in real-time referential processing
- Bert Oben and Geert Brône (2016) Explaining interactive alignment: A multimodal and multifactorial account
- David Reitter and Johanna D. Moore (2014) Alignment and task success in spoken dialogue
Very recent work on Language and Vision
- Multimodal Hierarchical Reinforcement Learning Policy for Task-Oriented Visual Dialog (SIGDIAL 2018)
- Do Better ImageNet Models Transfer Better?
- Evaluating Feature Importance Estimates
- PackNet: Adding Multiple Tasks to a Single Network by Iterative Pruning. (CVPR 2018)
- Piggyback: Adapting a Single Network to Multiple Tasks by Learning to Mask Weights (ECCV 2018)
- Explainable Neural Computation via Stack Neural Module Networks (ECCV 2018)
- Xinya Du, Junru Shao, Claire Cardie Learning to Ask: Neural Question Generation for Reading Comprehension, ACL 2018
- Zarriess, Schlangen D. Obtaining referential word meanings from visual and distributional information: Experiments on object naming, ACL 2018
- Ryan Lowe, Michael Noseworthy, Iulian Vlad Serban, Nicolas Angelard-Gontier, Yoshua Bengio, Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses, ACL 2018
Further readings
- Richard N. Aslin. Statistical learning: a powerful mechanism that operates by mere exposure
- Novotny, Larlus and Vedaldi. I Have Seen Enough: Transferring Parts Across Categories
- Zeldes, Amir (2018) "The GUM Corpus: Creating Multilayer Resources in the Classroom". Language Resources and Evaluation 51(3), 581–612.
- L. Bentivogli, R. Bernardi, M. Marelli, S. Menini, M. Baroni, R. Zamparelli. SICK through the SemEval glasses. Lesson learned from the evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment. In Language Resources and Evaluation. Vol 50, Nr. 1, 95--124, 2016. (I will give you a copy)
- R. Girju, Preslav Nakov, Vivi Nastase, Stan Szpakowicz, Peter Turney and Deniz Yuret (2007/2009). Classification of semantic relations between nominals. In SemEval 2007. Extended version in Language Resources and Evaluation 2009, 43(2)
- Carina Silberer, Vittorio Ferrari, and Mirella Lapata (2016) Visually Grounded Meaning Representations
- Richard Socher, Brody Huval, Christopher D. Manning, Andrew Y. Ng (2012) Semantic Compositionality through Recursive Matrix-Vector Spaces
- Jeff Mitchell, Mirella Lapata (2010) Composition in Distributional Models of Semantics; Pado and Lapata (2007)
Further (classical) readings
- Lillian Lee “I’m sorry Dave, I’m afraid I can’t do that”: Linguistics, Statistics, and Natural Language Processing circa 2001
- A. M. Turing (1950) Computing Machinery and Intelligence. In Mind, 59(236), pp. 433-460
This year's main CL conferences:
Some of the people who are doing interesting work on C/DSM or Language and Vision
- Online interface to query English and Italian semantic models
- DISCO, another online interface (multiple languages)
- word2vec, the tool and pre-compiled semantic vectors
- Semantic vectors, pre-compiled using word2vec with optimal parameters
- Gensim, Python Framework for Vector Space Modeling
- spaCy
- GUM
- Computer Vision: textbook by Richard Szeliski.
- Reading list on Deep learning (by Bengio)
- ACL 2018 publications
- CCG page ACL Wiki
- WiCV
- See Awesome Community-Curated NLP List (Liling Tan)
- Mailing lists: Corpora
- Top conferences are run by ACL