Suggested Books
- J. Han and M. Kamber, Data Mining: Concepts and Techniques, Morgan Kaufmann, 2nd ed., 2006
- D. J. Hand, H. Mannila, and P. Smyth, Principles of Data Mining, MIT Press, 2001
- T. M. Mitchell, Machine Learning, McGraw Hill, 1997
- I. H. Witten and E. Frank, Data Mining: Practical Machine Learning Tools and Techniques with Java Implementations, Morgan Kaufmann, 2nd ed., 2005
Lecture Slides
| date | content |
| 19 Feb 2013 | Course Information; Introduction to Data Mining |
| 21 Feb 2013 | Data Preprocessing overview; How to read a scientific paper |
| 26 Feb 2013 | Discussion of the New Jersey Data Reduction Report |
| 28 Feb 2013 | Basics of Data Warehouses |
| 12 Mar 2013 | Association Rules, part 1; Association Rules, part 2 |
| 19 Mar 2013 | Association Rules, part 3 |
| 21 Mar 2013 | ICT Days events |
| 26 Mar 2013 | Discussion of association rules papers |
| 28 Mar 2013 | Discussion of projects |
| 2 Apr 2013 | Easter holidays |
| 4 Apr 2013 | Clustering, part 1 |
| 9 Apr 2013 | Clustering, part 2 |
| 18 Apr 2013 | Classification, part 1 |
| 23 Apr 2013 | Classification, part 2 |
| 30 Apr 2013 | Discussion of classification papers |
| 2 May 2013 | Discussion of projects |
| 7 May 2013 | Data Streams, part 1; Data Streams, part 2 |
Technical Papers
- Daniel Barbará, William DuMouchel, Christos Faloutsos, Peter J. Haas, Joseph M. Hellerstein, Yannis E. Ioannidis, H. V. Jagadish, Theodore Johnson, Raymond T. Ng, Viswanath Poosala, Kenneth A. Ross, Kenneth C. Sevcik: The New Jersey Data Reduction Report. IEEE Data Eng. Bull. 20(4): 3-45 (1997).
- Rakesh Agrawal, Ramakrishnan Srikant: Mining Sequential Patterns. ICDE 1995: 3-14.
- Douglas Burdick, Manuel Calimlim, Johannes Gehrke: MAFIA: A Maximal Frequent Itemset Algorithm for Transactional Databases. ICDE 2001: 443-452.
- Tian Zhang, Raghu Ramakrishnan, Miron Livny: BIRCH: An Efficient Data Clustering Method for Very Large Databases. SIGMOD Conference 1996: 103-114.
- Martin Ester, Hans-Peter Kriegel, Jörg Sander, Xiaowei Xu: A Density-Based Algorithm for Discovering Clusters in Large Spatial Databases with Noise. KDD 1996: 226-231.
- Manish Mehta, Rakesh Agrawal, Jorma Rissanen: SLIQ: A Fast Scalable Classifier for Data Mining. EDBT 1996: 18-32.
- Johannes Gehrke, Raghu Ramakrishnan, Venkatesh Ganti: RainForest - A Framework for Fast Decision Tree Construction of Large Datasets. VLDB 1998: 416-427.
- Charu C. Aggarwal, Jiawei Han, Jianyong Wang, Philip S. Yu: A Framework for Clustering Evolving Data Streams. VLDB 2003.
- Pedro Domingos, Geoff Hulten: Mining High-Speed Data Streams. KDD 2000: 71-80.