E-mail: hanj at cs.uiuc.edu, URL: http://www.cs.uiuc.edu/~hanj
List of Supported Students, Staff, and Collaborators
1. Jiawei Han, PI
Xiaoxin Yin, Ph.D. student, Department of Computer Science,
Philip S. Yu, manager of the Software Tools and Techniques group,
Jiong Yang, Schroeder Assistant Professor, Electrical Engineering and Computer Science Department,
Most of the world's structured information is stored in relational databases. The relations in a database are interconnected according to the schema created during database design, and the linkages between relations indicate semantic relationships between different objects. This structural information and these linkages provide a rich source of information for data mining. Unfortunately, most data mining techniques today can only be applied to data stored in single "flat" tables. This project covers a variety of data mining and knowledge discovery tasks on relational databases. It focuses on discovering structural information and linkages in databases and on using that information in tasks such as classification, clustering, and outlier detection. Our methodology includes designing efficient and scalable methods for exploring multi-relational data and using those methods to discover inherent properties of, and linkages among, such data.
This study will contribute to the development of principles and new approaches in knowledge discovery in multi-relational data, which are of essential importance in a variety of strategic applications including financial decision support, customer-relationship analysis, and bioinformatics.
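To make the "flat table" limitation concrete, the following is a minimal sketch and not one of the project's algorithms. It assumes a hypothetical two-relation schema (Loan and Transaction, linked by account_id) with made-up rows, and contrasts a naive join into a single flat table with following the linkage and summarizing the linked tuples per target tuple.

```python
from collections import defaultdict

# Hypothetical toy schema (an assumption for illustration only):
#   Loan(loan_id, account_id, label)  and  Transaction(trans_id, account_id, amount),
# linked through account_id.
loans = [
    {"loan_id": 1, "account_id": 10, "label": "good"},
    {"loan_id": 2, "account_id": 11, "label": "bad"},
    {"loan_id": 3, "account_id": 10, "label": "good"},
]
transactions = [
    {"trans_id": 100, "account_id": 10, "amount": 500.0},
    {"trans_id": 101, "account_id": 10, "amount": 250.0},
    {"trans_id": 102, "account_id": 11, "amount": 40.0},
]


def flatten(loans, transactions):
    """Naive join into one flat table: one row per (loan, transaction) pair."""
    by_account = defaultdict(list)
    for t in transactions:
        by_account[t["account_id"]].append(t)
    return [{**loan, **t} for loan in loans for t in by_account[loan["account_id"]]]


def aggregate_over_link(loans, transactions):
    """Follow the account_id linkage and sum linked transaction amounts per loan."""
    totals = defaultdict(float)
    for t in transactions:
        totals[t["account_id"]] += t["amount"]
    return {loan["loan_id"]: totals[loan["account_id"]] for loan in loans}


flat = flatten(loans, transactions)                     # target tuples get duplicated
features = aggregate_over_link(loans, transactions)     # one value per target tuple
```

In this toy data the flat join repeats each loan once per linked transaction, which distorts the class distribution seen by a single-table learner, while the linkage-aware summary keeps exactly one entry per loan.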
Publications and Products
1. X. Li, J. Han, X. Yin, and D. Xin, Mining Evolving Customer-Product Relationships in Multi-Dimensional Space, Proc. 2005 Int. Conf. on Data Engineering (ICDE'05), Tokyo, Japan, April 2005.
2. X. Yan, X. J. Zhou, J. Han, Mining Closed Relational Graphs with Connectivity Constraints, Proc. 2005 Int. Conf. on Data Engineering (ICDE'05), Tokyo, Japan, April 2005.
3. X. Yin, J. Han, J. Yang, and P. S. Yu, CrossMine: Efficient Classification across Multiple Database Relations, Proc. 2004 Int. Conf. on Data Engineering (ICDE'04),
4. X. Yin and J. Han, CPAR: Classification based on Predictive Association Rules, Proc. 2003 SIAM Int. Conf. on Data Mining (SDM'03),
Research Progress: A set of new algorithms and methods, along with software packages, has been developed for mining multi-relational databases. Many of these methods can be used by industry and other agencies.
Education: Parts of this research are used in a Data Mining graduate course taught at the
Collaborations: For this project we have established a cooperation with
Current and Future Activities
The following are some highlights of our ongoing work. Please refer to the Publications and Products section for related references.
1. Development of efficient and scalable multi-relational clustering approaches, based on our work on CrossMine published at ICDE'04.
2. Development of efficient and accurate record linkage approaches based on multi-relational data.
3. Further development of efficient and accurate methods for multi-relational classification, based on our work on CrossMine published at ICDE'04.
Multi-relational data mining is a new topic, proposed a few years ago. It is related to Inductive Logic Programming, which aims at finding hypotheses by induction from knowledge that may be represented in relational form. Multi-relational data mining has a much broader scope in both methodologies and applications, covering data mining tasks such as classification, clustering, outlier detection, and temporal analysis.
H. Blockeel, L. De Raedt, and J. Ramon. Top-down induction of logical decision trees. In Proc. 1998 Int. Conf.
S. Dzeroski and N. Lavrac (editors). Relational data mining. Springer,
S. Muggleton. Inductive Logic Programming. Academic Press,
S. Muggleton and C. Feng. Efficient induction of logic programs. In Proc. 1990 Conf. Algorithmic Learning Theory,
J. Neville, D. Jensen, L. Friedland, and M. Hay. Learning Relational Probability Trees. Proc. 2003 Int. Conf. Knowledge Discovery and
J. R. Quinlan and R. M. Cameron-Jones. FOIL: A midterm report. In Proc. 1993 European Conf. Machine Learning,
B. Taskar, E. Segal, and D. Koller. Probabilistic classification and clustering in relational data. In Proc. 2001 Int. Joint Conf. Artificial
Potential Related Projects
The project is closely related to many research projects on knowledge discovery in databases and its applications, such as homeland security and bioinformatics.
Project Web site URL: http://www.cs.uiuc.edu/~hanj/projs/dbmine.html
Online software: Online software related to this project can be downloaded at Software Downloads
Online resources: Research publications related to this project can be downloaded at Selected Publications