
Computational Linguistics Course, Academic Year '09-'10, at FUB

Course Description

Syllabus

Why is language/speech difficult and interesting?; Ambiguity; History of the field; Morphology; Syntax; Semantics; Pragmatics; Formal Grammars; Parsing; Logic and NLP.

Objectives

This course presents a graduate-level introduction to computational linguistics, the primary concern of which is the study of human language use from a computational perspective. The principal objectives of the course are to provide students with a broad overview of the field and to prepare them for further study in computational linguistics and language technologies. No previous knowledge of linguistic theory or linguistic applications is assumed. Some background in First Order Logic is preferred.

Grading

  • 50%: You are to complete an independent project on some topic in computational linguistics. Projects will be presented either (a) to the lecturer only (in which case you will have to send a written report), or (b) to the other students as well, during the lab session (in which case you will have to prepare slides). The presentation must include a brief overview of the literature, a critique of a selected paper, and a description of your own idea/implementation. Project topics must be agreed with the lecturer. You can find tips on how to write a paper and on how to give a talk here.
    • 50%: project presentation. Winter session: 21/01/10 (TBC)
    • 50%: final exam. Winter session: Friday 19/02/10 (TBC)

    Practical Info

    • Students: Compulsory course for (first year) students enrolled in the European Masters Program in LCT. Optional course for 2nd and 3rd year bachelor students and for students of other MSc programs offered at FUB, Faculty of CS.
    • Pre-requisites: None (some background in Logic is preferred).
    • Lecturer: Dr. Raffaella Bernardi
    • Credits: 4 credits (24 hours of lectures, 12 hours of labs)
    • Schedule: 1st semester 2009-2010. Lectures: Tuesdays 08:30-10:30. Labs: Wednesdays 18:00-19:00
    • Place: See the updated info in the RIS
    • Office hours: Tuesdays 10:30-11:30 during the course period (please confirm by e-mail), or by prior arrangement via e-mail throughout the academic year

    News

    03-11-2009
    No lab on the 4th and the 11th of Nov. There will be a two-hour lab on the 12th, 17:00-19:00, in E4.31 instead.
    20-10-2009
    No lab on the 21st of Oct. There will be a lab on the 29th, 17:00-19:00, in E4.31 instead.
    06-10-2009
    Crash course on Prolog: Wednesday 14th of October, 17:00-18:00, in E4.31
    05-10-2009
    There will be no lab on the 21st of Oct. (new date to be agreed with students)
    12-09-2009
    Page online

    Participants

    For organizational reasons, please register for the course by sending an e-mail to the lecturer stating your intent to attend. Please specify whether you are a Bachelor or a Master student and, in the latter case, whether you will be following the European Masters Program in LCT. If you have not done so yet, please fill in this form and return it to the lecturer by e-mail.

    Material

    Textbooks

    The recommended textbooks for the course are:

    Lecture Notes

    During the lectures I will use slides that will be made available after each lesson from this link.

    Labs

    Labs aim to give you hands-on experience with the topics presented during the lectures. We will use Prolog for the first part of the exercises (on syntax and parsing). During the second part we will do pencil-and-paper exercises on the lambda calculus and the interface between syntax and semantics.

    Projects

    During the second part of the labs, students will carry out small projects based on their interests and backgrounds. Some suggestions are listed here.

    • Phoeix: CCG
    • Laura: CG -- logical part -- TBD
    • Le: applied, statistical: e.g. n-grams (overview plus application to spell checker)
    • Thu: Topic Models application to DL
    • Martin: applied, parser (mail sent)
    • Enrico and Albert: DRS and Ace
    • Dmitry and Tenzin: applied, Geffet (mail sent)
    • Karolis: bio-info and NLP
    • Sudeep: Lexical semantics/ontologies. Text2Onto (mail sent)
    • Alexander: TAG
    • Grady and Maria: WSD in Romanian
    • Angelina:??
    • Gernot with Martin: Brill (PoS tagging)?

    • Controlled Natural Language and Ontology Learning
    • Lambek Calculus-theoretical
    • Multilingual Chatter-Bot
    • Treebanks (Dependency Grammar)
    • BoB and OPAC
    • TAG and semantic representation
    • Implement some Parsing Algorithm
    • Underspecification-theoretical
    • Prolog: syntax-semantics (see DRT book)


    • NLTK's many suggestions
    • Critique/Slides on Joshi's paper on "TAG"
    • Question type tagger for BoB
    • Underspecification
    • Critique/Slides on "An Efficient Context-Free Parsing Algorithm" by Jay Earley
    • ACE
    • Chunk Parser
    • Semantics in Prolog
    • Finite State Automata
    • Machine Translation
    • Brill's algorithm and Tiger Corpus
    • LSA
    • TAG
    • Incremental parsing

    • Report on: Lexical Semantics
    • CCG and Boxer
    • A morphological parser (FSA): KIMMO
    • Unification-based syntactic parser (Feature Structures): PATR
    • Dialogue

    Critiques

    Guidelines for preparing the slides and writing critiques

    An example of a critique of "A Prototype Reading Coach that Listens", Mostow et al., AAAI 94.

    If you want to write your critique in LaTeX, you will find this site interesting. Below is a first proposal for the reading material.


    Weekly Programme

    The programme below is provisional, since it will be adapted to the students' background. Slides will be updated throughout the course, after each lesson.
    Each entry gives the lecture or lab number, the date, the topic (slides or lab), the SLP chapters covered, and the related courses ("Deepen in/Related to"), where available.

    • Lecture 1, 06/10/09: Introduction to LCT and CL. SLP Chapters 1-3, 8.1, 8.2: Course Info; Goals of CL; Challenges: Ambiguities at all levels; Morphology; Finite State Automata; Part-of-Speech; Word Class; Constituency. Related to: FSA: Theory of Computing, Formal Languages. PoS: Text Processing.
    • Lecture 2, 13/10/09: Syntax I. SLP Chapter 9: Coordination; Formal Grammars; Context-Free Rules and Trees; Sentence-Level Constructions; Chomsky Hierarchy. Related to: Formal Grammars: Compiler.
    • Lab 0, 14/10/09, 17:00-18:00 (E4.31): Prolog Crash Course, Getting Started.
    • Lab 1, 14/10/09, 18:00-19:00: FSA: morphology. Related to: Formal Grammars: Compiler.
    • Lecture 3, 20/10/09: Parsing. SLP Chapters 10, 11: Bottom up Parsing; Top down Parsing; Depth First Search; Breadth First Search; [Feature Unification]. See also BS. Related to: Text Processing, Compiler.
    • 21/10/09: No lab.
    • Lecture 4, 27/10/09: Semantics I. SLP Chapters 15.1, 15.2: Syntax-Driven Semantics; Lambda-Calculus; [Inference]. See also BB1. Related to: Reasoning methods: Computational Logic, Knowledge Representation.
    • Lecture 5a, 28/10/09, 18:00-19:00: Lambda calculus: exercises with solutions.
    • Lab 3, 29/10/09, 17:00-19:00: CFG in Prolog.
    • Lecture 5b, 03/11/09, 08:30-09:30: Semantics II.
    • 03/11/09, 09:30-10:30: Michael Moortgat: Natural Language as a Programming Language I.
    • Lecture 6, 10/11/09: Syntax-Semantics: Ambiguity.
    • Lab 5, 12/11/09, 17:00-19:00: Bottom-up and Top-down Recognizers.
    • Lecture 7, 17/11/09: Lambek Calculi.
    • Lab 6, 18/11/09: CFG and lambda calculus: exercises.
    • Lecture 8, 24/11/09: Chris Barker.
    • Lab 7, 25/11/09: CFG, CG and lambda calculus.
    • Lecture 9, 01/12/09: Comparison of Formal Grammars.
    • Lab 8, 02/12/09: LC and lambda terms.
    • Lab 9, 09/12/09: LC and lambda terms: Ambiguity.
    • Lecture 10, 15/12/09: Controlled Natural Language and QA.
    • Lab 12, 16/12/09 (2 hours!): Sample written exam.
    • Lecture 11, 12/01/10: Summary; exercises revision; Slides.
    • Lecture 12, 20/01/2010, 18:00-19:35: Projects Presentation.
    • Lecture 13, 28/01/2010: Projects Presentation.