Web Mining

Laurea Magistrale in Computer Science
Academic year 2009-2010, second semester

Previous academic years: [ (2008/2009) ] [ (2007/2008) ]

Teachers

For email messages concerning the course requirements (e.g., assignment hand-in), please contact Umut Avci.


Schedule

  • Monday 16:00-18:00, room 104
    (Exceptions: April 12 in room 205, May 3 in room 206)
  • Thursday 08:30-10:30, room 22

Every student in the course is kindly asked to send an email to Umut Avci containing their name and matricola number. Every email exchanged for purposes related to the course must be sent from the student's university account (name.surname@studenti.unitn.it) and contain [webmining] in the subject.

Program

  1. Web crawling
  2. Web page indexing
  3. Information retrieval
  4. Unsupervised learning: clustering
  5. PageRank and HITS
  6. Search engine attack and defense strategies
See the detailed program below.

Exam

Exams will consist of a written test (exercises) and an oral test (discussion of theory and, optionally, lab assignments).

Lab assignments

Assignments are optional (you may skip them) and individual (no collaboration). If completed satisfactorily, they will be the basis of the oral examination; otherwise, the oral will consist of questions about any part of the course program.

To hand in an assignment, your email should contain:

  • URL of the requested document or of the tar-gzipped source code (NOT attached to the email, but available in the student's web space);
  • Instructions for compilation and execution;
  • Name of the student.

Assigned / Due           Subject                                     Selected contributions
2010-02-25 / 2010-03-01  Data collection and plotting with Grapheur
[Assessments]

Bibliography

Official Textbook

  • SOUMEN CHAKRABARTI
    Mining the Web - Discovering knowledge from hypertext data
    Morgan Kaufmann - Elsevier, 2003.

Course notes

External links

Course material

Sample code

Exercises

Exams


Program


Page maintained by Mauro Brunato