III: Medium: Collaborative Research: Towards On-Line Analytical Mining of Heterogeneous Information Networks

 

National Science Foundation Award Number: NSF IIS 09-05215 (07/15/2009 - 06/30/2012)

 

 

Principal Investigator

Jiawei Han

Department of Computer Science

University of Illinois, Urbana-Champaign

2132 Thomas Siebel Center for Computer Science

201 North Goodwin Ave.

Urbana, Illinois 61801

Email: hanj at cs.uiuc.edu

URL: http://www.cs.uiuc.edu/homes/hanj

 

Co-PI

Philip S. Yu

Department of Computer Science

University of Illinois, Chicago

Rm 1138 SEOL

851 S. Morgan St.,

Chicago, Illinois 60607

Email: psyu at cs.uic.edu

URL: http://www.cs.uic.edu/~psyu

 

Co-PI

Xifeng Yan

Computer Science Department
University of California at Santa Barbara

Rm 1111, Harold Frank Hall
University of California
Santa Barbara, CA 93106-5110

Email: xyan at cs.ucsb.edu

URL: http://www.cs.ucsb.edu/~xyan

 

Future Directions

 

Based on our study of information network analysis and data mining, we identify the following promising research directions in this emerging research theme, which has potentially broad and deep impact on science, engineering, and society.

 

1.       Knowledge discovery in large scale heterogeneous information networks:

Various kinds of hidden knowledge can be discovered from heterogeneous information networks by exploring the power of links and “information redundancy” in interconnected networks, e.g., through clustering, classification, ranking, pattern discovery, and outlier analysis of information networks. Many algorithms can be developed and many applications explored.  This is a fertile area of research and may have deep implications for data mining, network science, and their broad applications.

2.       OLAP and similarity search in heterogeneous information networks:

Structuring information networks in multi-dimensional space may facilitate search, OLAP, multidimensional analysis, and interactive mining of massive, heterogeneous information networks.  This is an exciting and emerging research frontier with many applications.  Interesting research topics in this frontier may include summarization of information networks, indexing and similarity search of information networks, and OLAP and cube construction/materialization of information networks.

3.       Evolution of large-scale, temporal information-associated information networks:

Information networks often have temporal information associated with them, and the discovery of trends, evolution regularities, and outliers/anomalies over time is another important task with broad applications.

4.       Knowledge discovery in cyber-physical information networks:

Sensors and GPS systems may be connected into sensor networks, which are in turn connected with information network entities.  The interconnected information and sensor networks form a cyber-physical network and therefore pose many challenging research issues for information discovery and search in cyber-physical networks.

5.       Integration of text search and text information analysis with multi-dimensional information network analysis:

Nodes and links in an information network may contain rich text information, such as blogs, product descriptions, forums, discussions, audio and video information. Moreover, documents could be linked together by co-references, lines of follow-up discussion, or other functions, forming text-based information networks.  Searching and mining such information networks poses many new, challenging research issues and increases the power of information network analysis.

6.       Mining large databases: An information network analysis approach:

One may view a database as a gigantic information network in which data are interconnected, information-related entities (objects).  Thus, information network analysis methods can be developed to analyze large, relatively structured databases.

7.       Web mining by integration of Web structure discovery and information network analysis:

One may view the Web as a set of interconnected information networks rather than as isolated objects stored in a data repository. Information network analysis methods can be developed to analyze database data, which may facilitate mining information in large databases.

8.       Data cleaning, data integration and data validation by information network analysis:

Using interconnected, often redundant information in a networked environment, one can often perform intelligent and effective data cleaning, data integration and data validation (such as veracity analysis) by further development of information network analysis functions.

9.       Role discovery, concept hierarchy discovery and ontology enrichments by information network analysis:

It is often important to generate ontologies and concept hierarchies for a particular domain. Even domain experts may disagree with each other on such information, but consensus can often be built by sophisticated information network analysis methods.

10.   Ranking and promotion analysis by information network analysis and for information network analysis:

One may often want to cluster, rank, and promote objects in data analysis, and such functions are also desirable for information network analysis.  Ranking queries and promotion queries have been studied in databases, and it is important to re-examine their interactions with information network analysis, especially how to support ranking and promotion analysis in a database if we view the database as an information network, and how to perform ranking and promotion analysis in information networks.

 

Keywords

 

Information network analysis

online analytical processing (OLAP)

data cube

knowledge discovery and data mining

graph summarization

graph mining

efficiency and scalability

 

Project Summary

 

Information networks have been expanding rapidly and have attracted broad interest in recent years, with applications ranging from intrusion pattern detection to social community discovery. Typical information networks include communication networks, social networks, the Web, and biological networks. Despite the rising popularity and increasing scale of information networks, no general analytical processing framework is available for them. The lack of such a framework makes sensible navigation and interactive knowledge exploration virtually impossible in large-scale networks.

 

As information networks continue to grow in applications such as social networks and the Web, supporting Online Analytical Processing (OLAP) operations on large networks becomes critical to many next-generation graph-intensive applications. In this proposal, we present the Information Network OLAP Framework (called Infonet-OLAP), an effort to develop a general system that exploits OLAP concepts and measures unique to the graph space, explores constraints and monotonicity hidden in these measures, and performs discovery-driven OLAP operations for fast and accurate knowledge discovery. We will further support the Infonet-OLAP framework with structure discovery, network summarization, and self quality assurance of the underlying networks. If successful, our techniques would simplify information network analytical processing and transform existing ad hoc graph exploratory work into a unified framework, as traditional OLAP does for multidimensional data analysis.
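
As a rough illustration of what an OLAP-style roll-up over a network could look like, the sketch below aggregates a small co-authorship network from the author level to the institution level by summing edge weights. This is not the Infonet-OLAP implementation: the function name roll_up, the toy edge list, and the institution mapping are all hypothetical and chosen only for illustration.

```python
from collections import Counter


def roll_up(edges, node_group):
    """Aggregate an undirected weighted edge list to a coarser level.

    edges: iterable of (u, v, weight) tuples.
    node_group: dict mapping each node to its group at the coarser level
                (a hypothetical "dimension" of the network).
    Returns a Counter mapping a sorted (group_a, group_b) pair to the
    summed weight of all edges that fall between those two groups.
    """
    aggregated = Counter()
    for u, v, w in edges:
        # Sort the pair so that (a, b) and (b, a) share one key.
        key = tuple(sorted((node_group[u], node_group[v])))
        aggregated[key] += w
    return aggregated


# Hypothetical co-authorship network: (author, author, joint papers).
edges = [
    ("alice", "bob", 3),
    ("alice", "carol", 1),
    ("bob", "dave", 2),
    ("carol", "dave", 4),
]
# Hypothetical dimension: the institution of each author.
institution = {"alice": "UIUC", "bob": "UIUC", "carol": "UIC", "dave": "UCSB"}

coarse = roll_up(edges, institution)
for pair, weight in sorted(coarse.items()):
    print(pair, weight)
```

A sum of edge weights is only one possible measure. The measures "unique to the graph space" mentioned above would presumably need more than a simple distributive aggregate like this one.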

 

Project Impact

 

§         Education:  Parts of the new research results are used in Data Mining courses (CS412, CS512) taught to both undergraduate and graduate students in the Department of Computer Science at the University of Illinois at Urbana-Champaign.  Moreover, the research results have been and will continue to be published promptly in international conferences and journals and distributed worldwide for education and research.  The new progress will also be integrated into the new edition of our data mining textbook and other research collections.

§         Collaborations: For this project we have established collaborations with Army Research Lab, NASA, HP Labs, IBM T.J. Watson Research Center, Yahoo! Research, Microsoft Research, Boeing, and NCSA (National Center for Supercomputing Applications).  Through such collaborations we expect to gain access to real datasets and applications and to produce more research results.

Publications and Products

 

Edited Books

 

1.       H. J. Miller and J. Han (eds.), Geographic Data Mining and Knowledge Discovery, 2nd ed., Taylor & Francis, 2009.

2.       Hillol Kargupta, Jiawei Han, Philip S. Yu, and Rajeev Motwani (eds.), Next Generation of Data Mining, (Chapman & Hall/CRC Data Mining and Knowledge Discovery Series), 2009 (605 + xxiv pages).

 

Articles in Refereed Journals

 

1.       Mohammad M. Masud, Jing Gao, Latifur Khan, Jiawei Han, and Bhavani Thuraisingham, “Classification and Novel Class Detection in Concept-Drifting Data Streams under Time Constraints", IEEE Transactions on Knowledge and Data Engineering, accepted Feb. 2010.

2.       Deng Cai, Xiaofei He, and Jiawei Han, “Locally Consistent Concept Factorization for Document Clustering", IEEE Transactions on Knowledge and Data Engineering, 22, 2010, accepted 14-Jan-2010.

3.       Hector Gonzalez, Jiawei Han, Hong Cheng, Xiaolei Li, Diego Klabjan, and Tianyi Wu, “Modeling Massive RFID Datasets: A Gateway-Based Movement-Graph Approach", IEEE Transactions on Knowledge and Data Engineering, 22(1):90-104, 2010.

4.       Tianyi Wu, Yuguo Chen, and Jiawei Han, “Re-Examination of Interestingness Measures in Pattern Mining: A Unified Framework", Data Mining and Knowledge Discovery, 2010 (in print) (online pub. Jan. 05, 2010: DOI 10.1007/s10618-009-0161-2)

5.       Charu C. Aggarwal, Chen Chen and Jiawei Han, “The Inverse Classification Problem", Journal of Computer Science and Technology, accepted, Dec. 2009.

6.       Hongyan Liu, Yuan Lin, and Jiawei Han, “Methods for Mining Frequent Items in Data Streams: An Overview", Knowledge and Information Systems, (Online: Nov 11, 2009) (DOI 10.1007/s10115-009-0267-2)

7.       Jae-Gil Lee, Jiawei Han, Xiaolei Li, and Hong Cheng, “Mining Discriminative Patterns for Classifying Trajectories on Road Networks", IEEE Transactions on Knowledge and Data Engineering, accepted, Nov. 2009.

8.       Xiaofei He, Deng Cai, Yuanlong Shao, Hujun Bao, and Jiawei Han, “Laplacian Regularized Gaussian Mixture Model for Data Clustering", IEEE Transactions on Knowledge and Data Engineering, accepted, Nov. 2009.

9.       Duo Zhang, ChengXiang Zhai, Jiawei Han, Ashok Srivastava, and Nikunj Oza, “Topic Modeling for OLAP on Multidimensional Text Databases: Topic Cube and its Applications", Statistical Analysis and Data Mining, 2(5-6):378-395, 2009.

10.   Hongyan Liu, Xiaoyu Wang, Jun He, Jiawei Han, Dong Xin, Zheng Shao, “Top-down mining of frequent closed patterns from very high dimensional data", Information Sciences, 179(7):899-924, 2009.

11.   Hailiang Chen, Hongyan Liu, Jiawei Han, Xiaoxin Yin, “Exploring Optimization of Semantic Relationship Graph for Multi-relational Bayesian Classification", Decision Support Systems, 2009. Online publication complete: 13-AUG-2009. DOI information: 10.1016/j.dss.2009.07.004.

12.   Chen Chen, Xifeng Yan, Feida Zhu, Jiawei Han, Philip S. Yu, “Graph OLAP: A Multi-Dimensional Framework for Graph Data Analysis", Knowledge and Information Systems (KAIS), 21(1):41-63, 2009.

 

Selected Publications in Refereed Books and Monographs

 

1.       Hector Gonzalez, Jiawei Han, Hong Cheng, Tianyi Wu, “Warehousing RFID and Location-Based Sensor Data", Chapter 3 of Intelligent Techniques for Warehousing and Mining Sensor Network Data, Alfredo Cuzzocrea (ed.), IGI Global, 2009.

2.       Xifeng Yan and Jiawei Han, “Graph Indexing", in Charu C. Aggarwal and Haixun Wang (eds.), Managing and Mining Graph Data, Kluwer Academic Publishers, 2009, pp. 143-164.

3.       Hong Cheng, Xifeng Yan, and Jiawei Han, “Mining Graph Patterns", in Charu C. Aggarwal and Haixun Wang (eds.), Managing and Mining Graph Data, Kluwer Academic Publishers, 2009, pp. 353-382.

4.       Harvey J. Miller and Jiawei Han, “Geographic Data Mining and Knowledge Discovery: An Overview", Harvey J. Miller and Jiawei Han (eds.), Geographic Data Mining and Knowledge Discovery, 2nd ed., Taylor & Francis, 2009, pp. 1-26.

5.       Yvan Bedard and Jiawei Han, “Fundamentals of Spatial Data Warehousing and Geographic Knowledge Discovery", Harvey J. Miller and Jiawei Han (eds.), Geographic Data Mining and Knowledge Discovery, 2nd ed., Taylor & Francis, 2009, pp. 45-68.

6.       Jiawei Han, Jae-Gil Lee and Micheline Kamber, “An Overview of Clustering Methods in Geographic Data Analysis", Harvey J. Miller and Jiawei Han (eds.), Geographic Data Mining and Knowledge Discovery, 2nd ed., Taylor & Francis, 2009, pp. 149-188.

7.       Jiawei Han, “Data Mining", in M. Tamer Ozsu and Ling Liu (eds.), Encyclopedia of Database Systems, Springer, 2009

8.       Hong Cheng and Jiawei Han, “Frequent Itemsets and Association Rules", in M. Tamer Ozsu and Ling Liu (eds.), Encyclopedia of Database Systems, Springer, 2009

9.       Hong Cheng and Jiawei Han, “Pattern-Growth Methods", in M. Tamer Ozsu and Ling Liu (eds.), Encyclopedia of Database Systems, Springer, 2009

10.   Jiawei Han and Bolin Ding, “Stream Mining", in M. Tamer Ozsu and Ling Liu (eds.), Encyclopedia of Database Systems, Springer, 2009

11.   Ronnie Alves, Joel Ribeiro, Orlando Belo, and Jiawei Han, “Ranking Gradients in Multi-Dimensional Spaces", as Chapter 11, in T. M. Nguyen (ed.), Complex Data Warehousing and Knowledge Discovery for Advanced Retrieval Development: Innovative Methods and Applications, IGI Global, 2009. ISBN: 978-1-60566-748-5.

12.   Jiawei Han and Jing Gao, “Research Challenges for Data Mining in Science and Engineering", in H. Kargupta, et al., (eds.), Next Generation of Data Mining, Chapman & Hall/CRC, 2009, pp. 3-28.

13.   Feida Zhu, Xifeng Yan, Jiawei Han and Philip S. Yu, “Mining Frequent Approximate Sequential Patterns", in H. Kargupta, et al., (eds.), Next Generation of Data Mining, Chapman & Hall/CRC, 2009, pp. 69-90.

14.   Jiawei Han and Xiaolei Li, “Classification and Clustering for Homeland Security", in John G. Voeller (ed.), Wiley Handbook of Science and Technology for Homeland Security, John Wiley & Sons, 2009.

 

Selected Publications in Refereed Conference Proceedings

 

1.       Mohammad M. Masud, Jing Gao, Latifur Khan, Jiawei Han, and Bhavani Thuraisingham, “Classification and Novel Class Detection in Data Streams with Active Mining", Proc. 2010 Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD'10), Hyderabad, India, June 2010.

2.       Cindy Xide Lin, Yintao Yu, Jiawei Han, and Bing Liu, “Hierarchical Clustering of Webpages via Cross-Page and In-Page Link Structures", Proc. 2010 Pacific-Asia Conf. on Knowledge Discovery and Data Mining (PAKDD'10), Hyderabad, India, June 2010.

3.       Mohammad Maifi Hasan Khan, Hieu K. Le, Michael LeMay, Parya Moinzadeh, Lili Wang, Yong Yang, Dong K. Noh, Tarek Abdelzaher, Carl A. Gunter, Jiawei Han, Xin Jin, “Diagnostic Powertracing for Sensor Node Failure Analysis", Proc. 2010 Int. Conf. on Information Processing in Sensor Networks (IPSN'10), Stockholm, Sweden, April 2010.

4.       Xin Jin, Scott Spangler, Rui Ma, and Jiawei Han, “Topic Initiator Detection on the World Wide Web", Proc. 2010 Int. World Wide Web Conf. (WWW'10), Raleigh, NC, April 2010.

5.       Tim Weninger, William H. Hsu, and Jiawei Han, “CETR - Content Extraction via Tag Ratios", Proc. 2010 Int. World Wide Web Conf. (WWW'10), Raleigh, NC, April 2010.

6.       Liangliang Cao, Andrey Del Pozo, Xin Jin, Jiebo Luo, Jiawei Han, and Thomas S. Huang, “RankCompete: Simultaneous Ranking and Clustering of Web Photos", Proc. 2010 Int. World Wide Web Conf. (WWW'10), Raleigh, NC, April 2010 (poster paper).

7.       Zhenhui Li, Ding Zhou, YunFang Juan, and Jiawei Han, “Keyword Extraction For Social Snippets", Proc. 2010 Int. World Wide Web Conf. (WWW'10), Raleigh, NC, April 2010. (poster paper)

8.       Xide Lin, Bo Zhao, Tim Weninger, Jiawei Han, and Bing Liu, “Entity Relation Discovery from Web Tables and Links", Proc. 2010 Int. World Wide Web Conf. (WWW'10), Raleigh, NC, April 2010. (poster paper)

9.       Zhijun Yin, Manish Gupta, Tim Weninger and Jiawei Han, “LINKREC: A Unified Framework for Link Recommendation with User Attributes and Graph Structure", Proc. 2010 Int. World Wide Web Conf. (WWW'10), Raleigh, NC, April 2010. (poster paper)

10.   Jie Yu, Xin Jin, Jiawei Han, and Jiebo Luo, “Social Group Suggestion from User Image Collections", Proc. 2010 Int. World Wide Web Conf. (WWW'10), Raleigh, NC, April 2010. (poster paper)

11.   Hyun Duk Kim, ChengXiang Zhai and Jiawei Han, “Aggregation of Multiple Judgments for Evaluating Ordered Lists", Proc. 2010 European Conf. on Information Retrieval (ECIR'10), Milton Keynes, UK, March 2010. (full paper)

12.   Liangliang Cao, Jiebo Luo, Andrew Gallagher, Xin Jin, Jiawei Han, and Thomas S. Huang, “A Worldwide Tourism Recommendation System Based on Geotagged Web Photos", Proc. 2010 Int. Conf. on Acoustics, Speech, and Signal Processing (ICASSP'10), Dallas, TX, March 2010.

13.   Cuiping Li, Jiawei Han, Xin Jin, Yizhou Sun, Yintao Yu, and Tianyi Wu, “Fast Computation of SimRank for Static and Dynamic Information Networks", Proc. 2010 Int. Conf. on Extending Data Base Technology (EDBT'10), Lausanne, Switzerland, March 2010.

14.   Tianyi Wu, Yizhou Sun, Cuiping Li, and Jiawei Han, “Region-based Online Promotion Analysis", Proc. 2010 Int. Conf. on Extending Data Base Technology (EDBT'10), Lausanne, Switzerland, March 2010.

15.   Zhenhui Li, Jae-Gil Lee, Xiaolei Li, and Jiawei Han, “Incremental Clustering for Trajectories", Proc. 2010 Int. Conf. on Database Systems for Advanced Applications (DASFAA'10), Tsukuba, Japan, April 2010.

16.   Lu Liu, Feida Zhu, Chen Chen, Xifeng Yan, Jiawei Han, Philip Yu, and Shiqiang Yang, “Mining Diversity on Networks", Proc. 2010 Int. Conf. on Database Systems for Advanced Applications (DASFAA'10), Tsukuba, Japan, April 2010.

17.   Dustin Bortner and Jiawei Han, “Progressive Clustering of Networks Using Structure-Connected Order of Traversal", Proc. 2010 Int. Conf. on Data Engineering (ICDE'10), Long Beach, CA, March 2010.

18.   Bolin Ding, Bo Zhao, Cindy Xide Lin, Jiawei Han, Chengxiang Zhai, “TopCells: Keyword-Based Search of Top-k Aggregated Documents in Text Cube", Proc. 2010 Int. Conf. on Data Engineering (ICDE'10), Long Beach, CA, March 2010.

19.   Xifeng Yan, Bin He, Feida Zhu, Jiawei Han, “Top-K Aggregation Queries Over Large Networks", Proc. 2010 Int. Conf. on Data Engineering (ICDE'10), Long Beach, CA, March 2010.

20.   Yizhou Sun, Jiawei Han, Jing Gao, and Yintao Yu, “iTopicModel: Information Network-Integrated Topic Modeling", Proc. 2009 Int. Conf. on Data Mining (ICDM'09), Miami, FL, Dec. 2009.

21.   Xiao Yu, Lu An Tang, and Jiawei Han, “Filtering and Refinement: A Two-Stage Approach for Efficient and Effective Anomaly Detection", Proc. 2009 Int. Conf. on Data Mining (ICDM'09), Miami, FL, Dec. 2009.

22.   Samson Hauguel, ChengXiang Zhai, and Jiawei Han, “Parallel PathFinder Algorithms for Mining Structures from Graphs", Proc. 2009 Int. Conf. on Data Mining (ICDM'09), Miami, FL, Dec. 2009.

23.   Jing Gao, Feng Liang, Wei Fan, Yizhou Sun, and Jiawei Han, “Bipartite Graph-based Consensus Maximization among Supervised and Unsupervised Models", Proc. NIPS 2009 Neural Info. Processing Systems Conf. (NIPS'09), Vancouver, B.C., Canada, Dec. 2009.

24.   Peixiang Zhao, Jiawei Han, Yizhou Sun, “P-Rank: A Comprehensive Structural Similarity Measure over Information Networks", Proc. 2009 ACM Conf. on Information and Knowledge Management (CIKM'09), Hong Kong, China, Nov. 2009.

25.   Chandrasekar Ramachandran, Rahul Malik, Xin Jin, Jing Gao, Klara Nahrstedt, and Jiawei Han, “VideoMule: A Consensus Learning Approach to Multi-Label Classification from Noisy User-Generated Videos", Proc. 2009 ACM Int. Conf. on Multimedia (ACM-MM'09), Beijing, China, Oct. 2009.

26.   Tianyi Wu and Jiawei Han, “Subspace Discovery for Promotion: A Cell Clustering Approach", Proc. 12th Int. Conf. on Discovery Science (DS'09), Porto, Portugal, Oct. 2009. (J. Gama et al. (Eds.): DS 2009, LNAI 5808, Springer-Verlag, 2009.)

27.   Min-Soo Kim and Jiawei Han, “CHRONICLE: A Two-Stage Density-based Clustering Algorithm for Dynamic Networks", Proc. 12th Int. Conf. on Discovery Science (DS'09), Porto, Portugal, Oct. 2009. (J. Gama et al. (Eds.): DS 2009, LNAI 5808, Springer-Verlag, 2009.)

28.   Mohammad M. Masud, Jing Gao, Latifur Khan, Jiawei Han, and Bhavani Thuraisingham, “Integrating Novel Class Detection with Classification for Concept-Drifting Data Streams", Proc. 2009 European Conf. on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'09), Bled, Slovenia, Sept. 2009.

29.   Min-Soo Kim and Jiawei Han, “A Particle-and-Density Based Evolutionary Clustering Method for Dynamic Networks", Proc. 2009 Int. Conf. on Very Large Data Bases (VLDB'09), Lyon, France, Aug. 2009.

30.   Tianyi Wu, Dong Xin, Qiaozhu Mei, and Jiawei Han, “Promotion Analysis in Multi-Dimensional Space", Proc. 2009 Int. Conf. on Very Large Data Bases (VLDB'09), Lyon, France, Aug. 2009.

31.   Chen Chen, Cindy Lin, Matt Fredrikson, Mihai Christodorescu, Xifeng Yan, and Jiawei Han, “Mining Graph Patterns Efficiently via Randomized Summaries", Proc. 2009 Int. Conf. on Very Large Data Bases (VLDB'09), Lyon, France, Aug. 2009.

 

System demonstrations and invited keynote speech

 

1.       Zhenhui Li, Ming Ji, Jae-Gil Lee, LuAn Tang, Jiawei Han, Roland Kays, “MoveMine: Mining Moving Object Databases", (system demo), Proc. 2010 ACM SIGMOD Int. Conf. on Management of Data (SIGMOD'10), Indianapolis, Indiana, June 2010

2.       Xin Jin, Jiebo Luo, Jie Yu, Gang Wang, Dhiraj Joshi, and Jiawei Han, “iRIN: Image Retrieval in Image Rich Information Networks", Proc. 2010 Int. World Wide Web Conf. (WWW'10), Raleigh, NC, April 2010. (demo paper)

3.       Jiawei Han, “Mining Heterogeneous Information Networks by Exploring the Power of Links", Proc. 12th Int. Conf. on Discovery Science (DS'09), Porto, Portugal, Oct. 2009, pp. 13-30. (J. Gama et al. (Eds.): DS 2009, LNAI 5808, Springer-Verlag, 2009.)

4.       Yintao Yu, Cindy X. Lin, Yizhou Sun, Chen Chen, Jiawei Han, Binbin Liao, Tianyi Wu, ChengXiang Zhai, Duo Zhang, and Bo Zhao, “iNextCube: Information Network-Enhanced Text Cube", Proc. 2009 Int. Conf. on Very Large Data Bases (VLDB'09), Lyon, France, Aug. 2009. (System demo)

 

Current and Future Activities

The following are some of the highlights of our ongoing work.  Please refer to the Publications and Products section for related references.

§         Development of efficient and scalable mechanisms for OLAP on information networks: see ICDM’08, EDBT’09, SDM’09, KDD’09 and VLDB’09 papers.

§         Development of multi-dimensional text information-based network analysis methods: see ICDM’08 (text cube), SDM’09 (topic cube), VLDB’09 (iNextCube) demo, and ICDM’09 (iTopicModel)

§         Development of efficient methods for data intensive knowledge discovery and data mining: SDM’09, KDD’09, VLDB’09, WWW’10.

Area Background

 

This project is based on previous research on information network analysis, data mining, text data analysis, data cubes, and multidimensional analysis.  Many research papers have been published on these themes.  Several textbooks on data mining, information network analysis, and information retrieval provide good overviews of the principles and algorithms, including (Han and Kamber, 2006), (Hastie, Tibshirani, and Friedman, 2009) and (Manning, Raghavan and Schutze 2008).

 

Area References

 

1.       Chen Chen, Xifeng Yan, Feida Zhu, Jiawei Han, and Philip S. Yu, "Graph OLAP: Towards Online Analytical Processing on Graphs", Proc. 2008 Int. Conf. on Data Mining (ICDM'08), Pisa, Italy, Dec. 2008.

2.       J. Han and M. Kamber. Data Mining: Concepts and Techniques, 2nd ed., Morgan Kaufmann, 2006.

3.       T. Hastie, R. Tibshirani, and J. Friedman. The Elements of Statistical Learning: Data Mining, Inference, and Prediction, Springer-Verlag 2001.

4.       Cindy Xide Lin, Bolin Ding, Jiawei Han, Feida Zhu, and Bo Zhao, "Text Cube: Computing IR Measures for Multidimensional Text Database Analysis", Proc. 2008 Int. Conf. on Data Mining (ICDM'08), Pisa, Italy, Dec. 2008.

5.       Yizhou Sun, Yintao Yu, and Jiawei Han, “Ranking-Based Clustering of Heterogeneous Information Networks with Star Network Schema", Proc. 2009 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'09), Paris, France, June 2009.

6.       Yizhou Sun, Jiawei Han, Peixiang Zhao, Zhijun Yin, Hong Cheng, Tianyi Wu, “RankClus: Integrating Clustering with Ranking for Heterogeneous Information Network Analysis", Proc. 2009 Int. Conf. on Extending Data Base Technology (EDBT'09), Saint-Petersburg, Russia, Mar. 2009.

 

Potential Related Projects

This project is related to most work in information network analysis, data mining, and OLAP.  In particular, it is related to the PI's NSF IIS 08-42769 (NSF/SGER: CS-BibCube: OLAPing and Mining of Computer Science Literature) and the PI’s ARL project NS-CTA INARC (Information Network Academic Research Center).  We wish to collaborate or exchange research ideas with most of the research projects related to information/social network analysis, knowledge discovery in databases, machine learning, web search and data mining, text information systems, and OLAP analysis, and their applications.

Project Web site URL:  http://www.cs.uiuc.edu/homes/hanj/projs/infonet.htm

Online software:  Online software related to this project can be downloaded at www.illimine.cs.uiuc.edu

Online resources:  Research publications related to this project can be downloaded at Selected Publications