Web Mining

Laurea Magistrale in Computer Science
Academic year 2009-2010, second semester

Previous academic years: 2008/2009, 2007/2008

Draft version of notes

A draft of the course notes, which will become a book in the coming weeks, is available here (please note that not all course topics are covered).

Students are very welcome to contribute by:

  • identifying errors
  • suggesting quotations to put in the different chapters
  • suggesting specific applications of interest
  • ...

All contributions will be duly acknowledged in the final version of the book (which will be available for free to students from our web site).


Teachers

For email messages concerning the course requirements (e.g., handing in assignments), please write to Umut Avci.


Schedule

  • Monday 16:00-18:00, room 104
  • Thursday 08:30-10:30, room 22

Every student in the course is kindly asked to send an email to Umut Avci containing their name and matricola number. Every email exchanged for purposes related to the course must be sent from the student's university email account (name.surname@studenti.unitn.it) and must contain [webmining] in the subject.

Program

  1. Web crawling
  2. Web page indexing
  3. Information retrieval
  4. Unsupervised learning: clustering
  5. PageRank and HITS
  6. Search engine attack and defense strategies
See the detailed program below

Exam

Exams will consist of a written test (exercises) and an oral test (discussion of theory and, optionally, lab assignments).

Lab assignments

Assignments are optional (you can skip them) and individual (no collaboration). If completed in a satisfactory way, they will be the basis of the oral examination; otherwise, the oral will consist of questions about any part of the course program.

To hand in an assignment, your email should contain:

  • the URL of the requested document or of the tar-gzipped source code (NOT attached to the email, but available in the student's web space);
  • instructions for compilation and execution;
  • the name of the student.

Assigned     Due          Subject                        Selected contributions
2010-02-25   2010-03-01   Data collection and plotting

[Assessments]

Bibliography

Official Textbook

    SOUMEN CHAKRABARTI
    Mining the Web - Discovering knowledge from hypertext data
    Morgan Kaufmann - Elsevier, 2003.

Course notes

Draft of the course notes


(Detailed) Program

(Future) Program


Page maintained by Mauro Brunato