CSCI 4390/6390 Database Mining

Overview

This course provides an introductory survey of the main topics in data mining and knowledge discovery in databases (KDD), including frequent pattern mining, sequence mining, graph pattern mining, dimensionality reduction, kernel methods, clustering, classification, similarity search, and recommender systems. The emphasis is on algorithmic and system issues in KDD, as well as on practical applications such as Web mining, multimedia mining, and bioinformatics.

Prerequisites

CSCI 2300 and MATH 2800. You should be familiar with calculus, linear algebra, probability and statistics, and algorithms/programming.

Textbook

Data Mining and Analysis: Fundamental Concepts and Algorithms, M. J. Zaki and W. Meira Jr., Cambridge University Press, 2014. http://www.dataminingbook.info/pmwiki.php

Instructor

Wei Liu, Ph.D. (wliu.cu@gmail.com)

TA

Hao Li (lih13@rpi.edu). TA hours: Thursdays 9am-10am, Amos Eaton 217.

Grading Policy

Assignments 50%, projects 50%. There are 6 assignments, of which the 5 best scores count, and 3 projects.
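
As a rough illustration of how this weighting could work out, the sketch below computes a final percentage. It assumes that every score is on a 0-100 scale, that the five counted assignments are weighted equally, and that the three projects are weighted equally; none of these details are stated in the policy, and the function name `course_grade` is hypothetical.

```python
def course_grade(assignment_scores, project_scores):
    """Combine scores (assumed to be on a 0-100 scale) into a final percentage.

    Assumptions not stated in the policy: counted assignments are weighted
    equally, and projects are weighted equally.
    """
    if len(assignment_scores) != 6:
        raise ValueError("expected 6 assignment scores")
    if len(project_scores) != 3:
        raise ValueError("expected 3 project scores")

    # Keep the 5 best assignment scores; the lowest one is dropped.
    best_five = sorted(assignment_scores, reverse=True)[:5]
    assignment_avg = sum(best_five) / len(best_five)
    project_avg = sum(project_scores) / len(project_scores)

    # 50% assignments + 50% projects.
    return 0.5 * assignment_avg + 0.5 * project_avg


if __name__ == "__main__":
    # A missed assignment (score 0) is the one that gets dropped here.
    print(course_grade([100, 90, 80, 70, 60, 0], [90, 80, 70]))
```

Under these assumptions, a missed assignment does not lower the grade as long as it is the only low score, since only the five best scores count.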

 

Syllabus

1st week. Introduction to data mining. Lecture 1, Lecture 2

2nd week. Linear algebra, probability and statistics. Lecture 3, Lecture 4, Assignment 1

3rd week. Convex optimization, probability, and graph pattern mining. Lecture 5, Lecture 6

4th week. Random walks on graphs I and large graph mining I. Lecture 7, Lecture 8, Assignment 2

5th week. Random walks on graphs II and large graph mining II. Lecture 9

6th week. Project discussion and large graph mining III. Lecture 10, Project 1, Prob1_data, Prob2_data

7th week. Itemset mining, sequence mining, and time series analysis. Lecture 11, Lecture 12

8th week. Dimensionality reduction I. Lecture 13 Part I

9th week. Dimensionality reduction II. Lecture 13 Part II

10th week. Project presentations and kernel methods.

11th week. Clustering I and II. Lecture 14, Lecture 15

12th week. Classification I and II, and project discussion. Lecture 16, Lecture 17, Project 2, Prob1_data, Prob2_data

13th week. Classification III and recommender systems I. Assignment 3, Lecture 18, Lecture 19

14th week. Recommender systems II. Lecture 20, Assignment 4

15th week. Project presentations and course summary. Project 3 (free choice)