Themis Palpanas

University of Trento

Reading Material

Suggested Books

J. Han and M. Kamber. Data Mining: Concepts and Techniques. Morgan Kaufmann, 2nd ed., 2006

D. J. Hand, H. Mannila, and P. Smyth, Principles of Data Mining, MIT Press, 2001

T. M. Mitchell, Machine Learning, McGraw Hill, 1997

I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, 2nd ed., 2005

Lecture Slides

date
content

19 Feb 2013
Course Information
Introduction to Data Mining

21 Feb 2013
Data Preprocessing overview
How to read a scientific paper

26 Feb 2013
Discussion of the New Jersey Data Reduction Report

28 Feb 2013
Basics of Data Warehouses

12 Mar 2013
Association Rules, part 1
Association Rules, part 2

19 Mar 2013
Association Rules, part 3

21 Mar 2013
ICT Days events

26 Mar 2013
Discussion of association rules papers

28 Mar 2013
Discussion of projects

2 Apr 2013
Easter holidays

4 Apr 2013
Clustering, part 1

9 Apr 2013
Clustering, part 2

18 Apr 2013
Classification, part 1

23 Apr 2013
Classification, part 2

30 Apr 2013
Discussion of classification papers

2 May 2013
Discussion of projects

7 May 2013
Data Streams, part 1
Data Streams, part 2

Technical Papers

Daniel Barbará, William DuMouchel, Christos Faloutsos, Peter J. Haas, Joseph M. Hellerstein, Yannis E. Ioannidis, H. V. Jagadish, Theodore Johnson, Raymond T. Ng, Viswanath Poosala, Kenneth A. Ross, Kenneth C. Sevcik: The New Jersey Data Reduction Report. IEEE Data Eng. Bull. 20(4): 3-45 (1997).

Rakesh Agrawal, Ramakrishnan Srikant: Mining Sequential Patterns. ICDE 1995: 3-14.

Douglas Burdick, Manuel Calimlim, Johannes Gehrke: MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases. ICDE 2001: 443-452.

Tian Zhang, Raghu Ramakrishnan, Miron Livny: BIRCH: An Efficient Data Clustering Method for Very Large Databases. SIGMOD Conference 1996: 103-114.

Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu: A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. KDD 1996: 226-231.

Manish Mehta, Rakesh Agrawal, Jorma Rissanen: SLIQ: A Fast Scalable Classifier for Data Mining. EDBT 1996: 18-32.

Johannes Gehrke, Raghu Ramakrishnan, Venkatesh Ganti: RainForest - A Framework for Fast Decision Tree Construction of Large Datasets. VLDB 1998: 416-427.

Charu C. Aggarwal, Jiawei Han, Jianyong Wang, Philip S. Yu: A Framework for Clustering Evolving Data Streams. VLDB 2003.

Pedro Domingos, Geoff Hulten: Mining High-Speed Data Streams. KDD 2000: 71-80.