NSF/BDI: Collaborative Research:
Endowing Biological Databases with Analytical Power: Indexing, Querying, and
Mining of Complex Biological Structures
E-mail: hanj at cs.uiuc.edu, URL: http://www.cs.uiuc.edu/~hanj
List of Supported Students and Staff
§ Dong Xin, Ph.D. student, Department of Computer Science,
§ Deng Cai, Ph.D. student, Department of Computer Science,
Project Summary
We propose in-depth research and development of new, powerful, and scalable indexing,
query processing, and data mining methods for constructing scalable, efficient,
analysis-based, heterogeneous biological database systems. Our study will work on a set of
typical genomic and biological databases and focus on (1) development of efficient and scalable
methods for indexing and accessing complex biological structures, with
emphasis on a) mining structural patterns in large multi-graphs, b)
mining dense recurrent graphs/networks, c) discriminative feature-based
indexing of biological structures, and d) similarity search on biological structures,
and (2) The project investigates
efficient and effective approaches to the implementation of this system. The project also strives to ensure that the
developed technology will enable the development of more advanced analytical
biological database systems for broad applications.
Publications and Products:
Journal articles (including accepted)
1. Jian Pei, Jiawei Han, Hongjun Lu,
Shojiro Nishio, Shiwei Tang, and Dongqing Yang, “H-Mine: Fast and
Space-Preserving Frequent Pattern Mining in Large Databases”, IIE
Transactions, 39:593-605, 2007.
2. Chulyun Kim, Sangkyum Kim, Russell
Dorer, Dan Xie, Jiawei Han, and Sheng Zhong, “TagSmart: Analysis and
Visualization for Yeast Mutant Fitness Data Measured by Tag Microarrays”,
BMC Bioinformatics, 8:128, April 2007.
(http://www.biomedcentral.com/1471-2105/8/128)
3.
4. Jiawei Han, Hong Cheng, Dong Xin,
and Xifeng Yan, “Frequent Pattern Mining: Current Status and Future
Directions”, Data Mining and Knowledge Discovery, 14, 2007. (Online
version published on January 27, 2007, DOI 10.1007/s10618-006-0059-1 SpringerLink).
5. Dong Xin, Jiawei Han, Xifeng Yan and
Hong Cheng, “On Compressing Frequent Patterns”, Data and Knowledge
Engineering (Special issue on Intelligent Data Mining), 60(1): 5-29, 2007.
6. Dong Xin, Jiawei Han, Xiaolei Li,
Zheng Shao, and Benjamin W. Wah, “Computing Iceberg Cubes by Top-Down and
Bottom-Up Integration: The StarCubing Approach”, IEEE Transactions on
Knowledge and Data Engineering, 19(1): 111-126, 2007.
7. Chao Liu, Long Fei, Xifeng Yan,
Jiawei Han, and Samuel P. Midkiff, “Statistical Debugging: A Hypothesis Testing-based
Approach”, IEEE Transactions on Software Engineering, 32(10):831-848,
2006.
8. Yixin Chen, Guozhu Dong, Jiawei Han,
9. Xifeng Yan, Feida Zhu, Philip S. Yu,
and Jiawei Han, “Feature-based Substructure Similarity Search”, ACM
Transactions on Database Systems, 31(4): 1418-1453, 2006.
10. Deng Cai, Xiaofei He, Jiawei Han and
Hong-Jiang Zhang, “Orthogonal Laplacianfaces for Face Recognition”,
IEEE Transactions on Image Processing, 15(11): 3608-3614, 2006.
11. F. Pan, K. Kamath, K. Zhang, S.
Pulapura, A. Achar, J. Nunez-Iglesias, Y. Huang, X. Yan, J. Han, H. Hu, M. Xu,
X. J. Zhou. “Integrative Array Analyzer: A software package for analysis
of cross-platform and cross-species microarray data”, Bioinformatics,
22(13): 1665-1667, 2006.
12. J. Wang, J. Han, and J. Pei, “Closed Constrained-Gradient Mining in Retail
Databases”, IEEE Transactions on Knowledge and Data Engineering,
18(6): 764-769, 2006.
13. X. Yin, J. Han, J. Yang and P. S.
Yu, “Efficient Classification
across Multiple Database Relations: A CrossMine Approach”, IEEE
Transactions on Knowledge and Data Engineering, 18(6): 770-783, 2006.
14. Charu Aggarwal, Jiawei Han, Jianyong
Wang, and Philip S. Yu, “A
Framework for On-Demand Classification of Evolving Data Streams”,
IEEE Transactions on Knowledge and Data Engineering, 18(5):577-589, 2006.
15. Hwanjo Yu, Jiong Yang, Jiawei Han,
and Xiaolei Li, “Making SVM
Scalable to Large Data Sets Using Hierarchical Indexing”, Data Mining
and Knowledge Discovery, 11(3): 295-321, 2005.
16. Jiawei Han, Yixin Chen, Guozhu Dong,
Jian Pei, Benjamin W. Wah, Jianyong Wang, and Y. Dora Cai, “Stream Cube: An Architecture for
Multi-Dimensional Analysis of Data Streams”, Distributed and Parallel
Databases, 18(2): 173-197, 2005.
17. Xifeng Yan, Philip Yu, and Jiawei
Han, “Graph Indexing Based on
Discriminative Frequent Structure Analysis”, ACM Transactions on
Database Systems, 30(4): 960-993, 2005.
18. Deng Cai, Xiaofei He and Jiawei Han,
“Document Clustering Using Locality
Preserving Indexing”, IEEE Transactions on Knowledge and Data
Engineering, 17(12):1624-1637, 2005.
19. C. Aggarwal, J. Han, J. Wang, and P.
S. Yu, “On Efficient Algorithms for
High Dimensional Projected Clustering of Data Streams”, Data Mining and Knowledge Discovery,
10:251-272, 2005.
20. Petre Tzvetkov, Xifeng Yan, and Jiawei
Han, “TSP: Mining Top-K Closed Sequential Patterns”, Knowledge and Information Systems,
7(4): 438-457, 2005.
21. J. Wang, J. Han, Y. Lu, and P.
Tzvetkov, “TFP: An Efficient Algorithm for Mining Top-K Frequent Closed
Itemsets”, IEEE Transactions on Knowledge and Data Engineering,
17(5):652-664, 2005.
22. K. Wang, Y. Jiang, J. X. Yu, G.
Dong, and J. Han, “Divide-and-Approximate:
A Novel Constraint Push Strategy for Iceberg Cube Mining”, IEEE Transactions on Knowledge and Data
Engineering, 17(3):354-368, 2005.
Book and Book Chapters
Refereed Conference Publications (Refereed Workshop Publications
are omitted due to limited space)
1. Chao Liu, Xiangyu Zhang, Jiawei Han,
Yu Zhang and Bharat K. Bhargava, “Failure Indexing: A
Dynamic Slicing Based Approach”, in Proc. 2007 IEEE Int. Conf. on
Software Maintenance (ICSM'07),
2. Deng Cai, Xiaofei He, and Jiawei
Han, “A
Unified Subspace Learning Framework for Content-Based Image Retrieval”,
in Proc. 2007 Int. Conf. on ACM Multimedia (ACM-MM'07), Augsburg, Germany,
Sept. 2007.
3. Tianyi Wu, Yuguo Chen and Jiawei
Han, “Association
Mining in Large Databases: A Re-Examination of Its Measures”, in
Proc. 2007 Int. Conf. on Principles and Practice of Knowledge Discovery in
Databases (PKDD'07), Warsaw, Poland, Sept. 2007.
4. Chen Chen, Xifeng Yan, Philip S. Yu,
Jiawei Han, DongQing Zhang, and Xiaohui Gu, “Towards Graph Containment
Search and Indexing”, in Proc. 2007 Int. Conf. on Very Large Data
Bases (VLDB'07), Vienna, Austria, Sept. 2007.
5. Hector Gonzalez, Jiawei Han, Xiaolei
Li, Margaret Myslinska, and John Paul Sondag, “Adaptive Fastest
Path Computation on a Road Network: A Traffic Mining Approach”, in
Proc. 2007 Int. Conf. on Very Large Data Bases (VLDB'07),
6. Xiaolei Li and Jiawei Han, “Mining Approximate Top-K
Subspace Anomalies in Multi-Dimensional Time-Series Data”, in Proc.
2007 Int. Conf. on Very Large Data Bases (VLDB'07), Vienna, Austria, Sept.
2007.
7. Tianyi Wu, Xiaolei Li, Dong Xin,
Jiawei Han, Jacob Lee, and Ricardo Redder, “DataScope: Viewing
Database Contents in Google Maps' Way”, in Proc. 2007 Int. Conf. on
Very Large Data Bases (VLDB'07), Vienna, Austria, Sept. 2007 (system demo).
8. Xiaoxin Yin, Jiawei Han, and Philip
S. Yu, “Truth
Discovery with Multiple Conflicting Information Providers on the Web”,
in Proc. 2007 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining
(KDD'07), San Jose, CA, Aug. 2007.
9. Xiaolei Li, Jiawei Han, Jae-Gil Lee,
and Hector Gonzalez, “Traffic Density-based
Discovery of Hot Routes in Road Networks”, in Proc. 2007 Int. Symp. on
Spatial and Temporal Databases (SSTD'07),
10. Deng Cai, Xiaofei He and Jiawei Han,
“Isometric
Projection”, in Proc. 2007 AAAI Conf. on Artificial Intelligence
(AAAI-07), Vancouver, B. C., Canada, July 2007.
11. Wen Jin, Anthony K.H. Tung, Martin
Ester, and Jiawei Han, “On Efficient
Processing of Subspace Skyline Queries on High Dimensional Data”, in
Proc. 2007 Int. Conf. on Scientific and Statistical Database Management
(SSDBM'07),
12. Deng Cai, Xiaofei He, Yuxiao Hu,
Jiawei Han, and Thomas Huang, “Learning a Spatially
Smooth Subspace for Face Recognition”, in Proc. 2007 IEEE Conf. on
Computer Vision and Pattern Recognition (CVPR'07),
13. Jae-Gil Lee, Jiawei Han, and
Kyu-Young Whang, “Trajectory
Clustering: A Partition-and-Group Framework”, in Proc. 2007 ACM
SIGMOD Int. Conf. on Management of Data (SIGMOD'07), Beijing, China, June 2007.
14. Dong Xin, Jiawei Han, and Kevin
C.-C. Chang, “Progressive and
Selective Merge: Computing Top-K with Ad-hoc Ranking Functions”, in
Proc. 2007 ACM SIGMOD Int. Conf. on Management of Data (SIGMOD'07), Beijing,
China, June 2007.
15. Feida Zhu, Xifeng Yan, Jiawei Han,
and Philip S. Yu, “gPrune: A
Constraint Pushing Framework for Graph Pattern Mining”, in Proc. 2007
Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD'07), Nanjing,
China, May 2007. (Best Student Paper Award)
16. Jiawei Han, Hong Cheng, Dong Xin,
and Xifeng Yan, “Frequent
Pattern Mining: Current Status and Future Directions”, Data Mining
and Knowledge Discovery, 14(1), 2007. (Online version published on January 27,
2007, DOI 10.1007/s10618-006-0059-1 SpringerLink).
17. Jing Gao, Wei Fan, and Jiawei Han,
“A General
Framework for Mining Concept-Drifting Data Streams with Skewed Distributions”,
in Proc. 2007 SIAM Int. Conf. on Data Mining (SDM'07), Minneapolis, MN, April
2007.
18. Xiaolei Li, Jiawei Han, Sangkyum
Kim, and Hector Gonzalez, “ROAM: Rule- and
Motif-Based Anomaly Detection in Massive Moving Object Data Sets”, in
Proc. 2007
19. Hong Cheng, Xifeng Yan, Jiawei Han,
and Chih-Wei Hsu, “Discriminative
Frequent Pattern Analysis for Effective Classification”, in Proc.
2007 Int. Conf. on Data Engineering (ICDE'07), Istanbul, Turkey, April 2007.
20. Feida Zhu, Xifeng Yan, Jiawei Han,
Philip S. Yu, and Hong Cheng, “Mining Colossal
Frequent Patterns by Core Pattern Fusion”, in Proc. 2007 Int. Conf.
on Data Engineering (ICDE'07), Istanbul, Turkey, April 2007. (Best Student Paper Award)
21. Hector Gonzalez, Jiawei Han, and
Xuehua Shen, “Cost-conscious
Cleaning of Massive RFID Data Sets”, in Proc. 2007 Int. Conf. on Data
Engineering (ICDE'07),
22. Xiaoxin Yin, Jiawei Han, and Philip
S. Yu, “Object
Distinction: Distinguishing Objects with Identical Names by Link Analysis”,
in Proc. 2007 Int. Conf. on Data Engineering (ICDE'07),
23. Wen Jin, Martin Ester, Zengjian Hu,
and Jiawei Han, “The Multi-Relational
Skyline Operator”, in Proc. 2007 Int. Conf. on Data Engineering
(ICDE'07), Istanbul, Turkey, April 2007.
24. Deng Cai, Xiaofei He, Kun Zhou,
Jiawei Han and Hujun Bao, “Locality Sensitive
Discriminant Analysis”, in Proc. 2007 Int. Joint Conf. on
Artificial Intelligence (IJCAI'07), Hyderabad, India, Jan. 2007.
25. Chao Liu, Zeng Lian, and Jiawei Han,
“How
Bayesians Debug?”, in Proc. 2006 Int. Conf. on Data Mining (ICDM'06),
26. Hong Cheng, Philip S. Yu, and Jiawei
Han, “AC-Close:
Efficiently Mining Approximate Closed Itemsets by Core Pattern Recovery”,
in Proc. 2006 Int. Conf. on Data Mining (ICDM'06),
27. Chao Liu and Jiawei Han, “Failure Proximity: A Fault
Localization-Based Approach”, in Proc. 14th ACM SIGSOFT Symposium on
the Foundations of Software Engineering (FSE'06),
28. Hector Gonzalez, Jiawei Han, and
Xiaolei Li, “Mining
Compressed Commodity Workflows From Massive RFID Data Sets”, in Proc.
2006 Int. Conf. on Information and Knowledge Management (CIKM'06), Arlington,
VA, Nov. 2006.
29. Xiaoxin Yin, Jiawei Han, and Philip
Yu, “LinkClus: Efficient
Clustering via Heterogeneous Semantic Links”, in Proc. 2006 Int.
Conf. on Very Large Data Bases (VLDB'06),
30. Hector Gonzalez, Jiawei Han, and
Xiaolei Li, “FlowCube: Constructing RFID
FlowCubes for Multi-Dimensional Analysis of Commodity Flows”,
in Proc. 2006 Int. Conf. on Very Large Data Bases (VLDB'06),
31. Dong Xin, Chen Chen, and Jiawei
Han, “Towards Robust Indexing for
Ranked Queries”, in Proc. 2006 Int. Conf. on Very Large Data
Bases (VLDB'06),
32. Dong Xin, Jiawei Han, Hong Cheng,
and Xiaolei Li, “Answering Top-k Queries with
Multi-Dimensional Selections: The Ranking Cube Approach”, in
Proc. 2006 Int. Conf. on Very Large Data Bases (VLDB'06),
33. Dong Xin, Hong Cheng, Xifeng Yan,
and Jiawei Han, “Extracting Redundancy-Aware
Top-K Patterns”, in Proc. 2006 ACM SIGKDD Int. Conf. on
Knowledge Discovery and Data Mining (KDD'06), Philadelphia, PA, Aug. 2006.
34. Qiaozhu Mei, Dong Xin, Hong Cheng,
ChengXiang Zhai, and Jiawei Han, “Generating Semantic
Annotations for Frequent Patterns with Context Analysis”, in
Proc. 2006 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining
(KDD'06), Philadelphia, PA, Aug. 2006. (Best Student
Paper Runner-Up Award)
35. Chao Liu, Chen Chen, Jiawei Han, and
Philip Yu, “GPLAG: Detection of Software
Plagiarism by Procedure Dependency Graph Analysis”, in Proc.
2006 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'06),
36. Dong Xin, Xuehua Shen, Qiaozhu Mei,
and Jiawei Han, “Discovering Interesting
Patterns Through User's Interactive Feedback”, in Proc. 2006
ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'06),
Philadelphia, PA, Aug. 2006.
37. Deng Cai, Xiaofei He and Jiawei Han,
“Tensor Space Model for
Document Analysis”, in Proc. 2006 Int. ACM SIGIR Conf. on
Research & Development on Information Retrieval (SIGIR'06), Seattle, WA,
Aug. 2006.
38. Hongyan Liu, Ying Lu, Jiawei Han,
and Jun He, “Error-Adaptive and Time-Aware
Maintenance of Frequency Counts over Data Streams”, in Proc.
2006 Int. Conf. on Web-Age Information Management (WAIM'06),
39. Kaushik Chakrabarti, Venkatesh
Ganti, Jiawei Han, and Dong Xin, “Ranking Objects Based on
Relationships”, in Proc. 2006 ACM SIGMOD Int. Conf. on
Management of Data (SIGMOD'06), Chicago, IL, June 2006.
40. Xiaolei Li, Jiawei Han, and Sangkyum
Kim, “Motion-Alert: Automatic
Anomaly Detection in Massive Moving Objects”, Proc. 2006 IEEE
Int. Conf. on Intelligence and Security Informatics (ISI'06), San Diego, CA,
May 2006.
41. Wen Jin, Anthony K. H. Tung, Jiawei
Han, and Wei Wang, “Ranking Outliers Using
Symmetric Neighborhood Relationship,” in Proc. 2006
Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD'06),
Singapore, April 2006.
42. Hongyan Liu, Jiawei Han, Dong Xin,
and Zheng Shao, “Mining Interesting Patterns
from Very High Dimensional Data: A Top-Down Row Enumeration Approach,”
in Proc. 2006 SIAM Int. Conf. on Data Mining (SDM'06), Bethesda, MD, April
2006. (One of “Best of SDM’06”)
43. Chao Liu, Xifeng Yan, and Jiawei
Han, “Mining Control Flow
Abnormality for Logic Error Isolation,” in Proc. 2006 SIAM
Int. Conf. on Data Mining (SDM'06), Bethesda, MD, April 2006.
44. Charu Aggarwal, Chen Chen, and
Jiawei Han, “On the Inverse Classification
Problem and its Applications”, in Proc. 2006 Int. Conf. on
Data Engineering (ICDE'06), Atlanta, Georgia, April 2006.
45. Hector Gonzalez, Jiawei Han, Xiaolei Li, and Diego Klabjan,
“Warehousing and Analysis of
Massive RFID Data Sets”, in Proc. 2006 Int. Conf. on Data Engineering
(ICDE'06), Atlanta, Georgia, April 2006. (Best
Student Paper Award)
46. Hongyan Liu, Jiawei Han, Dong Xin,
and Zheng Shao, “Top-Down Mining of
Interesting Patterns from Very High Dimensional Data”, in
Proc. 2006 Int. Conf. on Data Engineering (ICDE'06), Atlanta, Georgia, April
2006.
47. Dong Xin, Jiawei Han, Zheng Shao,
and Hongyan Liu, “C-Cubing: Efficient
Computation of Closed Cubes by Aggregation-Based Checking”, in
Proc. 2006 Int. Conf. on Data Engineering (ICDE'06), Atlanta, Georgia, April
2006.
48. Xifeng Yan, Feida Zhu, Jiawei Han,
and Philip Yu, “Searching Substructures with
Superimposed Distance”, in Proc. 2006 Int. Conf. on Data
Engineering (ICDE'06), Atlanta, Georgia, April 2006.
49. Deng Cai, Zheng Shao, Xiaofei He,
Xifeng Yan, Jiawei Han, “Community Mining from
Multi-Relational Networks”, in Proc. 2005 European Conf. on
Principles and Practice of Knowledge Discovery in Databases (PKDD'05), Porto,
Portugal, Oct., 2005.
50. Wen Jin, Martin Ester and Jiawei
Han, “Efficient Processing of
Ranked Queries with Sweeping Selection”, in Proc. 2005 European Conf. on Principles
and Practice of Knowledge Discovery in Databases (PKDD'05), Porto, Portugal,
Oct., 2005.
51. Xiaoxin Yin and Jiawei Han, “Efficient Classification from Multiple Heterogeneous
Databases”, in Proc. 2005 European Conf. on Principles and
Practice of Knowledge Discovery in Databases (PKDD'05), Porto, Portugal, Oct.,
2005.
52. C. Liu, X. Yan, L. Fei, J. Han, and
S. Midkiff, “SOBER:
Statistical Model-based Bug Localization”, Proc. 2005 ACM SIGSOFT
Symp. on the Foundations of Software Engineering (FSE 2005), Lisbon, Portugal,
Sept. 2005.
53. D. Xin, J. Han, X. Yan and H. Cheng,
“Mining Compressed
Frequent-Pattern Sets”, Proc. 2005 Int. Conf. on Very Large Data
Bases (VLDB'05), Trondheim, Norway, Aug. 2005.
54. X. Yan, H. Cheng, J. Han, and D.
Xin, “Summarizing
Itemset Patterns: A Profile-Based Approach”, Proc. 2005 Int. Conf. on
Knowledge Discovery and Data Mining (KDD'05), Chicago, IL, Aug. 2005. (Best Student Paper Runner-Up Award)
55. X. Yan, X. J. Zhou, and J. Han,
“Mining
Closed Relational Graphs with Connectivity Constraints”, Proc. 2005
Int. Conf. on Knowledge Discovery and Data Mining (KDD'05), Chicago, IL, Aug.
2005.
56. X. Yin, J. Han, and P.S. Yu, “Cross-Relational
Clustering with User's Guidance”, Proc. 2005 Int. Conf. on Knowledge
Discovery and Data Mining (KDD'05), Chicago, IL, Aug. 2005.
57. S. Cong, J. Han, and D. Padua,
“Parallel
Mining of Closed Sequential Patterns”, Proc. 2005 Int. Conf. on
Knowledge Discovery and Data Mining (KDD'05), Chicago, IL, Aug. 2005.
58. D. Cai and X. He. “Orthogonal Locality
Preserving Indexing”, Proc. 2005 Int. Conf. on Research and
Development in Information Retrieval (SIGIR'05), Salvador, Brazil, Aug. 2005.
Project Impact
§ Education: Parts of the new research results are
used in Data Mining courses (CS412, CS512) for both undergraduate and graduate
students being taught in the Department of Computer Science, the
§ Collaborations: For this project we have established collaborations
with the Department of Computational and Molecular Biology of the
Current and Future Activities
The following are some of the highlights of
our ongoing work. Please refer to the
Publications and Products section for related references.
§ Development of efficient and scalable mechanisms for mining biological
networks: (based on) ISMB’05.
§ Development of multi-dimensional stream data analysis techniques:
VLDB’04, VLDB’06.
§ Development of efficient methods for mining frequent, sequential and
structured patterns: TODS’05, TODS’06, ICDE’06 (C-Cubing).
Area Background
This project builds on previous work on data
mining, stream data/query processing, and moving
object databases. Many research papers have been
published on these themes. Several textbooks provide good overviews
of data mining principles and algorithms, including (Han and Kamber,
2006), (Hand, Mannila, and Smyth, 2001), and (Hastie,
Tibshirani, and Friedman, 2001), and of bioinformatics, such as (Durbin et
al. 1998), (Pevzner 2000), and (Waterman 1995).
Area References
1. Y. Chen, G. Dong, J. Han, B. W. Wah,
and J. Wang, Multi-Dimensional
Regression Analysis of Time-Series Data Streams, VLDB 2002.
2. Y. Cheng and G. Church, Biclustering
of Expression Data, Proc. 2000 Int. Conf. on Intelligent Systems for Molecular
Biology (ISMB'00), 2000.
3. J. Cohen. Bioinformatics: An Introduction for Computer
Scientists. ACM Computing Surveys,
36(2):122-158, 2004.
4. R. Durbin, S. Eddy, A. Krogh, and G.
Mitchison. Biological Sequence Analysis: Probability Models of Proteins and
Nucleic Acids,
5. H. Hu, X. Yan, Y. Huang, J. Han, and X. J. Zhou. Mining coherent dense
subgraphs across massive biological networks for functional discovery. ISMB, 2005.
6. J. Han and M. Kamber. Data Mining: Concepts and Techniques, 2nd ed., Morgan Kaufmann, 2006.
7. D. J. Hand, H. Mannila, and P.
Smyth. Principles of Data Mining, MIT
Press, 2001.
8. T. Hastie, R. Tibshirani, and J.
Friedman. The Elements of Statistical Learning: Data Mining, Inference, and
Prediction, Springer-Verlag 2001.
9. P. A. Pevzner. Computational Molecular Biology: An
Algorithmic Approach. MIT Press. 2000.
10. M. S. Waterman. Introduction to Computational Biology: Maps,
Sequences, and Genomes. CRC Press. 1995.
Potential Related Projects
This project is related to
most research on data mining and biological databases. In
particular, it is related to the P.I.'s NSF IIS-02-09199 (Mining Sequential and
Structured Patterns: Scalability, Flexibility, Extensibility and
Applicability) and the P.I.'s NSF IIS-03-08215 (Mining Dynamics of Data Streams in
Multi-Dimensional Space). We wish to
collaborate or exchange research ideas with research projects
related to knowledge discovery in databases, biological databases and data
analysis, bioinformatics, and their applications.
Project Web site URL: http://www.cs.uiuc.edu/~hanj/projs/biobdi.htm
Online software: Online software related to this project can be downloaded
at www.illimine.cs.uiuc.edu
Online resources: Research publications related to
this project can be downloaded at Selected Publications