Machine Learning

Projects of A.Y. 2008-2009

(Groups of at most 2 people)

 

Simpler Projects on SVMs

 

The software for SVMs can be downloaded from www.joachims.org

 

The Reuters document corpus (90 categories) can be downloaded from:

http://disi.unitn.it/moschitti/corpora.htm

 

1) Given the software for Support Vector Machines, implement at least 2 weighting schemes for the Text Categorization task.

-    Manually optimize parameters on a held-out set

-    Measure Precision, Recall, F1 on the test set

-    Experiment with n categories

-    Article 1 (shows how SVMs can be parameterized)
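
As a starting point for project 1, here is a minimal sketch of one possible weighting scheme (TF-IDF with length normalization) together with a writer for the SVM-light input format (`<target> <feature>:<value> ...`, feature ids in increasing order). Python and the function names `tf_idf_vectors` and `svmlight_line` are assumptions made for illustration; the course does not prescribe them.

```python
import math
from collections import Counter

def tf_idf_vectors(docs):
    """docs: list of token lists. Returns (vocab, list of {feature_id: weight})."""
    # Feature ids start at 1, as SVM-light expects.
    vocab = {t: i + 1 for i, t in enumerate(sorted({t for d in docs for t in d}))}
    df = Counter(t for d in docs for t in set(d))  # document frequency
    n = len(docs)
    vectors = []
    for d in docs:
        tf = Counter(d)
        vec = {vocab[t]: tf[t] * math.log(n / df[t]) for t in tf}
        norm = math.sqrt(sum(w * w for w in vec.values())) or 1.0
        # Drop zero weights so the output stays sparse.
        vectors.append({f: w / norm for f, w in vec.items() if w > 0})
    return vocab, vectors

def svmlight_line(label, vec):
    """Format one example (label is +1 or -1) as an SVM-light input line."""
    feats = " ".join(f"{f}:{w:.6f}" for f, w in sorted(vec.items()))
    return f"{label:+d} {feats}"
```

A second scheme (for example raw term frequency) can be obtained by replacing the weight expression inside `tf_idf_vectors`.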

 

2) Given the software for Support Vector Machines, implement at least 2 feature selectors.

-    Manually optimize parameters on a held-out set

-    Measure Precision, Recall, F1 on the test set

-    Experiment with n categories

-    Article 1 (shows how SVMs can be parameterized)

-    Article 2 (feature selection)
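
For project 2, one common feature selector is the chi-square statistic computed from the 2x2 term/category contingency table. The sketch below is only an illustration under that assumption (the project leaves the choice of selectors open); `chi_square` and `select_top_k` are hypothetical names.

```python
def chi_square(docs, labels, term):
    """docs: list of token lists; labels: True if the doc belongs to the category."""
    a = b = c = d = 0
    for tokens, y in zip(docs, labels):
        has = term in tokens
        if has and y:
            a += 1      # in category, contains term
        elif has:
            b += 1      # not in category, contains term
        elif y:
            c += 1      # in category, lacks term
        else:
            d += 1      # not in category, lacks term
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - c * b) ** 2 / denom if denom else 0.0

def select_top_k(docs, labels, k):
    """Return the k terms with the highest chi-square score (ties broken by name)."""
    vocab = {t for doc in docs for t in doc}
    return sorted(vocab, key=lambda t: (-chi_square(docs, labels, t), t))[:k]
```

Only the selected terms would then be kept when writing the SVM input files.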

 

3.a) Given the software for Support Vector Machines and data for some categories:

-      Optimize parameters using n-fold cross validation

-      Measure Precision, Recall, F1 using n-fold cross validation

-      Experiment with n categories

-      Article 1 (parameterization of SVMs with n-fold cross validation)

OR

3.b) Question Classification Corpus (c, j, and lambda parameters). PLEASE ASK for the DATASET.

OR

3.c) Predicate Argument Classification Corpus (c, j, and lambda parameters). PLEASE ASK for the DATASET.
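
The n-fold cross validation and the Precision, Recall, F1 measures used in the projects above can be sketched as follows. This is a minimal illustration, not a required design: the fold assignment (every k-th example) and the names `k_fold_indices` and `precision_recall_f1` are assumptions.

```python
def k_fold_indices(n_items, k):
    """Yield (train_idx, test_idx) pairs; fold i tests on items with index % k == i."""
    for i in range(k):
        test = [j for j in range(n_items) if j % k == i]
        train = [j for j in range(n_items) if j % k != i]
        yield train, test

def precision_recall_f1(gold, predicted):
    """gold/predicted: sequences of booleans (or 0/1) for one binary category."""
    tp = sum(1 for g, p in zip(gold, predicted) if g and p)
    fp = sum(1 for g, p in zip(gold, predicted) if not g and p)
    fn = sum(1 for g, p in zip(gold, predicted) if g and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1
```

For each parameter setting (for example a value of c or j), one would train on each `train` split, classify the matching `test` split, and average the resulting scores over the folds.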

General Classifiers  

4) Rocchio's classifier implementation

-   Test on Reuters data

-   Article 1 (shows how to implement Rocchio)
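
A minimal sketch of a Rocchio-style profile for one category, assuming sparse document vectors stored as dictionaries. The default values of `beta` and `gamma`, the clipping of negative weights to zero, and the function names are assumptions for illustration; Article 1 remains the reference for the actual formulation.

```python
import math
from collections import defaultdict

def rocchio_profile(pos, neg, beta=16.0, gamma=4.0):
    """pos/neg: lists of sparse vectors ({feature: weight}) for one category."""
    profile = defaultdict(float)
    for v in pos:
        for f, w in v.items():
            profile[f] += beta * w / len(pos)   # weighted mean of positives
    for v in neg:
        for f, w in v.items():
            profile[f] -= gamma * w / len(neg)  # minus weighted mean of negatives
    # Keep only positive weights (negative components are clipped to zero).
    return {f: w for f, w in profile.items() if w > 0}

def cosine(u, v):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0
```

A document is then assigned to the category when its cosine similarity with the profile exceeds a threshold tuned on held-out data.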

 

5) Naive Bayes classifier implementation

-  Test on Reuters data

-  slides "basic concepts" or Andrea's slides
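
A minimal sketch of a multinomial Naive Bayes classifier with add-one (Laplace) smoothing, working in log space. The multinomial variant, the smoothing choice, and the names `train_nb` and `classify_nb` are assumptions; the slides may present a different variant.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """docs: list of token lists; labels: one category name per document."""
    class_docs = Counter(labels)            # documents per class (for priors)
    term_counts = defaultdict(Counter)      # term occurrences per class
    for tokens, y in zip(docs, labels):
        term_counts[y].update(tokens)
    vocab = {t for d in docs for t in d}
    return class_docs, term_counts, vocab

def classify_nb(model, tokens):
    """Return the class with the highest log posterior; unseen terms are ignored."""
    class_docs, term_counts, vocab = model
    n = sum(class_docs.values())
    best, best_score = None, -math.inf
    for c in sorted(class_docs):
        total = sum(term_counts[c].values())
        score = math.log(class_docs[c] / n)
        for t in tokens:
            if t in vocab:
                score += math.log((term_counts[c][t] + 1) / (total + len(vocab)))
        if score > best_score:
            best, best_score = c, score
    return best
```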

 

6) k-NN classifier implementation

-  Test on Reuters data

-  slides "text categorization" or Andrea's slides
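
A minimal sketch of k-NN classification over sparse vectors, assuming cosine similarity and unweighted majority voting. Both choices, and the name `knn_classify`, are assumptions for illustration.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity between two sparse vectors ({feature: weight})."""
    dot = sum(w * v.get(f, 0.0) for f, w in u.items())
    nu = math.sqrt(sum(w * w for w in u.values()))
    nv = math.sqrt(sum(w * w for w in v.values()))
    return dot / (nu * nv) if nu and nv else 0.0

def knn_classify(train, query, k=3):
    """train: list of (sparse vector, label). Majority vote among the k nearest."""
    ranked = sorted(train, key=lambda ex: cosine(ex[0], query), reverse=True)
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]
```

The value of k is a parameter to tune on held-out data.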

 

7) Document Clustering

-  Test on Reuters (or other) data

-  slides "basic concepts" or Andrea's slides

-  Article 1 (some techniques for basic clustering)
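
One basic clustering technique is k-means. The sketch below assumes dense document vectors of equal length, squared Euclidean distance, and a deterministic initialization from the first k points; all three are simplifying assumptions, and `k_means` is a hypothetical name.

```python
def k_means(points, k, iterations=10):
    """points: list of equal-length tuples. Returns (centroids, assignments)."""
    centroids = [list(p) for p in points[:k]]  # deterministic init (assumption)
    assignments = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        for i, p in enumerate(points):
            assignments[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centroids[c])),
            )
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, a in zip(points, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments
```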

 

Advanced Projects on SVMs


Given the following Software for Support Vector Machines:

-   SVM-Light-TK1.2 available at http://disi.unitn.it/moschitti/Tree-Kernel.htm

-   SVM-Light-TK1.5 available in the home directory of your lab account (or ask me).

-   SVM-light 6.0 available at www.joachims.org

 

8) Given SVM-Light-TK1.2, implement a string kernel (word sequence kernel).

-   Test on a small portion of Reuters

-   Article 1
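
Before writing the kernel inside SVM-Light-TK, it may help to prototype it. The sketch below computes a gap-weighted subsequence kernel over word sequences by dynamic programming, assuming that formulation (subsequences of exactly `n` words, decay factor `lam` per position spanned) is the one intended; Article 1 is the reference for the exact definition. The name `word_sequence_kernel` is hypothetical.

```python
def word_sequence_kernel(s, t, n, lam):
    """s, t: lists of words; n: subsequence length; lam: decay factor in (0, 1]."""
    # kp[l][i][j] is the auxiliary kernel K'_l on the prefixes s[:i] and t[:j].
    kp = [[[0.0] * (len(t) + 1) for _ in range(len(s) + 1)] for _ in range(n)]
    for i in range(len(s) + 1):
        for j in range(len(t) + 1):
            kp[0][i][j] = 1.0
    for l in range(1, n):
        for i in range(l, len(s) + 1):
            kpp = 0.0  # running value of K''_l(s[:i], t[:j])
            for j in range(l, len(t) + 1):
                if s[i - 1] == t[j - 1]:
                    kpp = lam * (kpp + lam * kp[l - 1][i - 1][j - 1])
                else:
                    kpp = lam * kpp
                kp[l][i][j] = lam * kp[l][i - 1][j] + kpp
    # Sum over all pairs of matching final words.
    k = 0.0
    for i in range(n, len(s) + 1):
        for j in range(n, len(t) + 1):
            if s[i - 1] == t[j - 1]:
                k += lam * lam * kp[n - 1][i - 1][j - 1]
    return k
```

In practice the value is usually normalized, dividing by the square root of the product of the two self-kernels, so that long documents do not dominate.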

 

9) Given the SVM-light 6.0 software, implement tree kernels in it.

-   Experiment on Question Classification

-   Article 1

 

10) Given SVM-light-TK-1.5 (built on top of svm-light 5.0), port it to SVM-light 6.0.

 
11) Implement feature selection based on SVMs (using SVM-Light-TK1.5)

     -  Article 1

 

12) Implement and experiment with feature selection in Convolution Kernel Spaces (using SVM-Light-TK1.5)

     -  Article 1