Computational Linguistics 17-18
The Computational Linguistics course is taught by Raffaella Bernardi (RB) (UniTN)
Classes are on Mondays (13:00-15:00), Wednesdays (15:00-17:00) and Thursdays (13:00-15:00). For updated information, see the online timetable.
Information about the final exam
The exam will consist of two parts, each contributing 50% of the total mark:
- written exercises on Syntax, Semantics and their interface
- a written report on a topic selected among those presented in class. The report can be either a project proposal based on a literature review or a report on a project based on a literature review. The report has to be written in LaTeX.
Students of the Text Processing course will do only one of the two parts, of their choice.
Students have to agree on the topic of the report with the lecturer at least one month before the exam.
This course is complementary to Carlo Strapparava's course on Human Language Technologies. The Formal Semantics part will be presented in depth in Roberto Zamparelli's course on Logical Structures of Natural Language.
Claudio Greco will be the TA for the labs when we use NLTK. Sandro Pezzelle will be the TA for the labs when we use DISSECT.
Materials
- Speech and Language Processing (SLP)
- Steven Bird, Ewan Klein, and Edward Loper, Natural Language Processing with Python: Ch. 8 (CFG and DG), Ch. 9 (Feature Structures) and Ch. 10 (Meaning)
- My slides will be posted below after each class.
See the surveys below:
- Barbara Partee (2017) Formal Semantics. In the handbook of Formal Semantics ed. Maria Aloni and Paul Dekker. (in dropbox)
- Alessandro Lenci (2008) Distributional semantics in linguistic and cognitive research
- Turney and Pantel (2010) From Frequency to Meaning: Vector Space Models of Semantics
- Katrin Erk. (2012) Vector space models of word meaning and phrase meaning: a survey. Language and Linguistics Compass 6(10), 635-653, October 2012. (in dropbox)
- Marco Baroni (2013) Composition in Distributional Semantics. Language and Linguistics Compass 7(10), 511-522, October 2013. (in dropbox)
- Gemma Boleda and Aurelie Herbelot (2016) Formal Distributional Semantics: Introduction to the Special issue. Computational Linguistics 42:4
- Raffaella Bernardi, Ruket Cakici, Desmond Elliott, Aykut Erdem, Erkut Erdem, Nazli Ikizler-Cinbis, Frank Keller, Adrian Muscat and Barbara Plank Automatic Description Generation from Images: A Survey of Models, Datasets, and Evaluation Measures JAIR (Journal of Artificial Intelligence Research). Vol. 55, 2016. DOI:10.1613/jair.4900
If you are interested in textbooks about FS:
- For formal semantics and in particular lambda-calculus: Mathematical Methods in Linguistics by Barbara Partee, Alice ter Meulen, Robert Wall
- For a general intro to FS: Introduction to Natural Language Semantics by Henriette de Swart
Topics with a rough schedule
- 8 classes on Syntax (Sep-Oct): Formal Grammars of English, Syntactic Parsing, Statistical Parsing, Dependency Parsing
- 12 classes on Semantics (Oct-Nov): Formal Semantics, Distributional Semantics Models, The Representation of Sentence Meaning
- 3 classes on Syntax-Semantics interface (Nov): Computational Semantics, Semantic Role Labelling and Argument Structure, Neural Models of Sentence Meaning
- 3 classes on Beyond sentences (Nov): Discourse Coherence, Dialogue, Question Answering/IQA
- 5 classes on Multimodal Models (Dec): Language and Vision
Schedule
- 1.) 18.09.2017
- SYNTAX: Introduction to CL: admin, intro to Formal Languages, Regular Languages and Finite State Automata
- Slides
- 2.) 20.09.2017
- SYNTAX: sentence structure, CFG, Chomsky Hierarchy, which FL for NL syntax, (SLP: Ch. 11)
- Slides
- 3.) 21.09.2017
- SYNTAX: Pen and pencil exercises on CFG
- Brief intro to LaTeX, LaTeX Base
- 4.) 25.09.2017
- SYNTAX: Feature Agreement and Unification Grammar
- Slides, Claudio's Slides on Python and NLTK
- 5.) 02.10.2017
- SYNTAX: Other Formal Grammars (Slides on TAG, DG, CG).
- Summary in LaTeX of a paper on HPSG (Andrea, Lara and Sara)
- 6.) 04.10.2017
- SYNTAX: Pen and pencil exercises on CG, TAG and DG vs. CFG, and comparison of available treebanks (CCGbank, Penn Treebank). Check in class that NLTK works on students' PCs
- Summary on a paper on TAG (Joshi 2009, Roberto and Alberto)
- 7) 05.10.2017
- SYNTAX: Syntactic and Statistical Parsing (SLP: Ch. 12, 13 and 14).
- Summary on a paper on DG (de Marneffe et al. 2014, Atakan and Behnia). Demo of top-down vs. bottom-up parsing and ambiguity with NLTK by Claudio Greco.
- Slides
- 8) 09.10.2017
- SYNTAX: LAB with NLTK: Exercises on top-down and bottom-up parsing.
- 9) 11.10.2017
- SYNTAX: Parsing (RB). Demo on chart, statistical parser, features by Claudio Greco
- Slides. Summary of a paper on DG vs. Constituency (Gildea 2004, Nataliia and Aliia)
- 10) 12.10.2017
- SEMANTICS: Formal Semantics: Introduction to Semantics, Brief intro to Logic, to Formal Semantics and semantic types.
- Slides
- 11) 16.10.2017
- SEMANTICS: Compositionality, lambda calculus. Slides
- 12) 18.10.2017
- SEMANTICS: Exercises on Formal Semantics.
- 13) 23.10.2017
- SEMANTICS: Abstraction in the lambda calculus. Slides
- 14) 25.10.2017
- SYNTAX-SEMANTICS: Lambda calculus, CFG and CG
- Slides
- 15) 26.10.2017
- SEMANTICS: Distributional Semantics Models
- Slides
- 16) 30.10.2017
- SEMANTICS: Lab on DSM
- 17) 02.11.2017
- SEMANTICS:
- Summary and Discussion on Paper on LS (Landauer and Dumais (1997)) led by Raffa
- NOTE: The lecture will be in Povo (DISI, Garda room from 09:45 to 11:15)
- 18) 06.11.2017
- SEMANTICS: Compositional DSM, Slides
- 19) 08.11.2017
- SEMANTICS: lab on CDSM, DISSECT by Sandro Pezzelle
- 20) 09.11.2017
- SEMANTICS: lab on CDSM, DISSECT by Sandro Pezzelle
- 21) 13.11.2017
- SEMANTICS: Discussion on the evaluations in Sahlgren and Lenci (2016) and Baroni, Dinu and Kruszewski (2014) led by Andrea and Lara
- 22) 15.11.2017
- SEMANTICS: Summary and Discussion on Paper on DSM (Erk 2016) led by Atakan and Aliia
- 23) 16.11.2017
- SEMANTICS: Summary and Discussion on Paper on CDSM (Vecchi et al. 2018) led by Alberto, Nataliia and Bahareh
- Slides
- 24) 20.11.2017
- SEMANTICS: Summary and Discussion on Papers on FS & DSM (Beltagy et al. 2016) led by Roberto and Sara
- Slides
- 25) 22.11.2017
- MULTIMODAL MODELS: Introduction to Language and Vision
- Slides
- 26) 23.11.2017
- MULTIMODAL MODELS: Overview of current work on LaVi
- Slides
- 27) 28.11.2017 (09:00-10:30)
- MULTIMODAL MODELS: Discussion on Cooperative Visual Dialogue (Roberto, Sara, Andrea) and GuessWhat (Nataliia, Atakan, Alberto)
- 28) 29.11.2017
- MULTIMODAL MODELS: Project presentations by PhDs at Clic (Sandro Pezzelle and Claudio Greco)
- Slides
- 29) 30.11.2017
- Sample Written Exam
- Slides
- 30) 04.12.2017
- BEYOND SENTENCES: Dialogue
- Slides
- 31) 14.12.2017 (10:30-12:30, 2nd floor CIMeC)
- PROJECTS: Proposal presentations by CL students
- 18.12.2017, 14:00-16:00, aula 11, P. Istruzione: Written exam
Readings we will discuss in class (discussion led by one or two students each time)
- Aravind Joshi (2009). "Tree-Adjoining Grammars". In The Oxford Handbook of Computational Linguistics (I can give you a copy)
- M.C. de Marneffe, T. Dozat, N. Silveira, K. Haverinen, F. Ginter, J. Nivre, C. D. Manning. Universal Stanford Dependencies: A cross-linguistic typology. LREC 2014
- Mike Lewis and Mark Steedman. A* CCG Parsing with a Supertag-factored Model, EMNLP 2014
- Daniel Gildea. Dependencies vs. Constituents for Tree-Based Alignment. EMNLP 2004
- Adam Kilgarriff I don't believe in word sense. Computers and the Humanities 31.2 (1997): 91--113.
- Sebastian Pado and Mirella Lapata (2007) Dependency-Based Construction of Semantic Space Models
- Katrin Erk (2016) What do you know about an alligator when you know the company it keeps?
- Beltagy et al. (2016) Representing Meaning with a Combination of Logical and Distributional Models. In Computational Linguistics 42:4
- Magnus Sahlgren and Alessandro Lenci (2016) The Effects of Data Size and Frequency Range on Distributional Semantic Models
- Stephen Roller and Katrin Erk (2016) Relations such as Hypernymy: Identifying and Exploiting Hearst Patterns in Distributional Vectors for Lexical Entailment
- Eva M. Vecchi, Marco Marelli, Roberto Zamparelli, Marco Baroni (2018) Spicy Adjectives and Nominal Donkeys: Capturing Semantic Deviance Using Compositionality in Distributional Spaces
Readings for projects
- Richard N. Aslin. Statistical learning: a powerful mechanism that operates by mere exposure
- Novotny, Larlus and Vedaldi. I Have Seen Enough: Transferring Parts Across Categories
- Zeldes, Amir (2017) "The GUM Corpus: Creating Multilayer Resources in the Classroom". Language Resources and Evaluation 51(3), 581–612.
- L. Bentivogli, R. Bernardi, M. Marelli, S. Menini, M. Baroni, R. Zamparelli SICK through the SemEval glasses. Lesson learned from the evaluation of compositional distributional semantic models on full sentences through semantic relatedness and textual entailment In Language Resources and Evaluation. Vol 50, Nr. 1, 95--124, 2016. I will give you a copy
- R. Girju, Preslav Nakov, Vivi Nastase, Stan Szpakowicz, Peter Turney and Deniz Yuret (2007/2009). Classification of semantic relations between nominals. In SemEval 2007. Extended version in Language Resources and Evaluation 2009, 43(2)
- Zarriess and Schlangen. Obtaining referential word meanings from visual and distributional information: Experiments on object naming. ACL 2017
- Ryan Lowe, Michael Noseworthy, Iulian Vlad Serban, Nicolas Angelard-Gontier, Yoshua Bengio. Towards an Automatic Turing Test: Learning to Evaluate Dialogue Responses. ACL 2017
- Xinya Du, Junru Shao, Claire Cardie Learning to Ask: Neural Question Generation for Reading Comprehension, ACL 2017
- Carina Silberer, Vittorio Ferrari, and Mirella Lapata (2016) Visually Grounded Meaning Representations
- Richard Socher, Brody Huval, Christopher D. Manning, Andrew Y. Ng (2012) Semantic Compositionality through Recursive Matrix-Vector Spaces
- Jeff Mitchell, Mirella Lapata (2010) Composition in Distributional Models of Semantics
- Pado and Lapata (2007)
Further (classical) readings
- Lillian Lee “I’m sorry Dave, I’m afraid I can’t do that”: Linguistics, Statistics, and Natural Language Processing circa 2001
- A. M. Turing (1950) Computing Machinery and Intelligence. In Mind, 59(236), pp. 433-460
Some of the people who are doing interesting work on C/DSM
Tools for C/DSM
- Online interface to query English and Italian semantic models
- DISCO, another online interface (multiple languages)
- word2vec, the tool and pre-compiled semantic vectors
- Semantic vectors, pre-compiled using word2vec with optimal parameters
- Gensim, Python Framework for Vector Space Modeling
- spaCy
Further references and NLP tools
- GUM
- Computer Vision: textbook by Richard Szeliski.
- Reading list on Deep learning (by Bengio)
- ACL 2017 publications
- CCG page ACL Wiki
- See Awesome Community-Curated NLP List (Liling Tan)
- Mailing lists: Corpora
- Top conferences are run by ACL