NSF III: Small: Multi-Dimensional Structuring, Summarizing and Mining of Social Media Data

National Science Foundation Award Number: NSF IIS 16-18481 (08-01-2016—07-31-2019)

 

Contact Information

 

·         Jiawei Han, PI
Department of Computer Science
University of Illinois, Urbana-Champaign
201 N. Goodwin Ave., Urbana, Illinois 61801 U.S.A.
Office: (217) 333-6903

Fax: (217) 265-6494

E-mail: hanj at cs.uiuc.edu

URL: http://www.cs.uiuc.edu/~hanj

 

List of Supported Students and Staff

 

·         Meng Jiang, Postdoc Research Fellow, Department of Computer Science, University of Illinois at Urbana-Champaign (Finished in Aug. 2017)

·         Quan Yuan, Postdoc Research Fellow, Department of Computer Science, University of Illinois at Urbana-Champaign (collaborative) (Finished in Sept. 2017)

·         Ahmed El-Kishky, Ph.D. student, Department of Computer Science, University of Illinois at Urbana-Champaign (collaborative)

·         Xiang Ren, Ph.D. student, Department of Computer Science, University of Illinois at Urbana-Champaign (collaborative) (graduated Dec. 2017)

·         Jiaming Shen, Ph.D. student, Department of Computer Science, University of Illinois at Urbana-Champaign

·         Chao Zhang, Ph.D. student, Department of Computer Science, University of Illinois at Urbana-Champaign (graduated Dec. 2018)

·         Honglei Zhuang, Ph.D. student, Department of Computer Science, University of Illinois at Urbana-Champaign (graduated May 2019)

·         Xiaotao Gu, Ph.D. student, Department of Computer Science, University of Illinois at Urbana-Champaign

·         Shi Zhi, Ph.D. student, Department of Computer Science, University of Illinois at Urbana-Champaign

Project Award Information

·         Award Number: NSF IIS 16-18481  

·         Duration: 08/01/2016—07/31/2019

·         Title: NSF III: Small: Multi-Dimensional Structuring, Summarizing and Mining of Social Media Data

·         Keywords:  Big data; data mining; social media analysis; data integration; text mining; text summarization and OLAP; information trustworthiness analysis; information network analysis; efficiency and scalability; applications

Project Summary

·         Social media platforms have changed how billions of users around the globe obtain and share information.   This creates great opportunities but also poses tremendous challenges for understanding, summarizing, and mining such data, owing to its huge volume and the dynamic, unstructured nature of its text content.   In response to these challenges, this project focuses on text-based social media and proposes a multi-dimensional data structuring approach that mines unstructured social media data to uncover its hidden multi-dimensional structures.  The project investigates principles, methodologies, and algorithms for social media structuring, summarizing, and mining, and develops effective and scalable technology for multi-dimensional social media data analysis.   The principles and methodologies developed in this study can also be extended to scalable, multi-dimensional analysis of other kinds of massive unstructured data.

 

·         To conduct effective multi-dimensional social media structuring, this project develops a distant supervision-based methodology that requires minimal human curation and labeling.   It takes data in Wikipedia, Freebase, or other knowledge bases as references, integrates social media data with the corresponding news or other relevant documents, conducts phrase mining and entity and event discovery and typing, and uncovers critical aspects, attributes, and values associated with such entities and events from social media.  Once social media data is organized in a structured way, massive social media can be summarized effectively in a context-aware semantic OLAP (online analytical processing) framework and analyzed systematically under a general multi-dimensional social media querying and mining framework for many tasks, such as modeling behavioral patterns, uncovering bursty events, and detecting social frauds or anomalies.
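
As a rough illustration of what a multi-dimensional structure makes possible, the sketch below rolls up a few posts along chosen dimensions and summarizes each cell by its post count and most frequent terms. It is not the project's method: the records, the dimension names (location, topic, day), and the helper functions roll_up and top_terms are all hypothetical, and the posts are assumed to have already been assigned dimension values by an upstream structuring step.

```python
from collections import Counter, defaultdict

# Hypothetical, hand-made records: each post is assumed to carry values
# on three dimensions (location, topic, day) assigned by an earlier step.
POSTS = [
    {"location": "Chicago", "topic": "sports", "day": "2017-06-01",
     "text": "cubs win at wrigley"},
    {"location": "Chicago", "topic": "sports", "day": "2017-06-02",
     "text": "cubs lose late game"},
    {"location": "Chicago", "topic": "weather", "day": "2017-06-01",
     "text": "storm warning downtown"},
    {"location": "Urbana", "topic": "sports", "day": "2017-06-01",
     "text": "illini game tonight"},
]


def roll_up(posts, dims):
    """Group posts by the chosen dimensions and summarize each cell.

    Each cell stores the number of posts and a term-frequency counter.
    """
    cells = defaultdict(lambda: {"count": 0, "terms": Counter()})
    for post in posts:
        key = tuple(post[d] for d in dims)
        cell = cells[key]
        cell["count"] += 1
        cell["terms"].update(post["text"].split())
    return dict(cells)


def top_terms(cell, k=3):
    """Return the k most frequent terms in a cell."""
    return [term for term, _ in cell["terms"].most_common(k)]


if __name__ == "__main__":
    # Roll up to (location, topic), dropping the day dimension.
    by_loc_topic = roll_up(POSTS, ["location", "topic"])
    for key, cell in sorted(by_loc_topic.items()):
        print(key, cell["count"], top_terms(cell))
```

Passing a shorter dimension list (for example only "location") gives a coarser view of the same posts, which is the roll-up operation in OLAP terms; a context-aware summary would replace the raw term counts with something more discriminative.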

 

Intellectual Merit: 

·         We propose a multi-dimensional data structuring approach, which mines unstructured social media data to uncover its hidden multi-dimensional structures. Multi-dimensional structuring will involve integrating social media data with news, Wikipedia, Freebase, and other knowledge-base data, conducting phrase mining and entity/event discovery and typing, and uncovering aspects associated with such entities and events.  Organizing massive social media data in a conceptually structured way will help users understand and summarize social media information effectively, support context-aware semantic OLAP, and facilitate multi-dimensional mining of social media data, such as finding bursty events and detecting anomalies in social media.

·         To systematically develop this approach, we organize the proposal into three themes: (1) multidimensional structuring of social media data, (2) context-aware summarization in multi-dimensional space, and (3) a general framework for multidimensional social media mining. We will systematically develop principles, methodologies, and algorithms along these three lines of the proposed research and generate effective and scalable technology for multi-dimensional social media data structuring, summarization, and mining.

·         Building on our existing work, this project has the following intellectual merit. (1) Developing new principles, methods, and technologies for structuring, summarizing, and mining massive, time-evolving social media data: New technologies will be developed for entity extraction/typing, aspect discovery, context-aware semantic OLAP, and multidimensional event discovery and anomaly mining, thus advancing the state of the art; (2) Enriching the principles and technologies of data mining: Structuring and mining massive, dynamic, and unstructured data, such as social media data, is a major challenge in data mining.

 

Broader Impacts: 

 

·         With tremendous amounts of social media data being generated in all aspects of our society, this project will have the following broad impacts: (1) Benefits to our social-media-permeated society: Social media penetrates every aspect of our lives. By enhancing our power to analyze social media, the project will benefit society in many ways; (2) Benefits to data mining and information technology: New technologies and tools will be generated for mining massive unstructured data and will be transferred to ARL and others, as we have done before;  (3) Benefits to education and training: The project will train a good number of researchers, especially female and minority students, and educate a great number of undergraduates and graduates via our research publications, tutorials, massive online courses, workshops, and demo systems.

·         This project focuses on text-based social media, not on in-depth analysis of image, audio, and video data. Also, we will use publicly accessible social media data (e.g., publicly released tweets) with no links to users' personal information.

·         The research results are to be published in various research and application forums and be integrated into the educational programs at UIUC.  The progress of the project and the research results are also disseminated via the project Web site (http://www.cs.uiuc.edu/homes/hanj/projs/social_media.htm).

Selected Publications and Products:

Books (authored)

 

·         Chao Zhang and Jiawei Han, Multidimensional Mining of Massive Text Data, Morgan & Claypool Publishers, 2019. (Zhang's thesis: 2019 ACM SIGKDD Dissertation Award Runner-Up)

·         Xiang Ren and Jiawei Han, Mining Structures of Factual Knowledge from Text: An Effort-Light Approach, Morgan & Claypool Publishers, 2018. (Ren's thesis: 2018 ACM SIGKDD Dissertation Award)

 

Journal and Refereed Conference Publications

 

·         Yu Shi, Xinwei He, Naijing Zhang, Carl Yang, and Jiawei Han, "User-Guided Clustering in Heterogeneous Information Networks via Motif-Based Comprehensive Transcription", in Proc. 2019 European Conf. on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'19), Würzburg, Germany, Sept. 2019

·         Carl Yang, Huy Hoang Do, Tomas Mikolov, and Jiawei Han, "Place Deduplication with Embeddings", in Proc. the Web Conf. 2019 (WWW'19), San Francisco, CA, May 2019

·         Honglei Zhuang, Timothy Hanratty, and Jiawei Han, “Aspect-Based Sentiment Analysis with Minimal Guidance", in Proc. 2019 SIAM Int. Conf. on Data Mining (SDM'19), Calgary, Alberta, Canada, May 2019

·         Sha Li, Chao Zhang, Dongming Lei, Ji Li, Jiawei Han, “GeoAttn: Fine-Grained Localization of Social Media Messages via Attentional Memory Network", in Proc. 2019 SIAM Int. Conf. on Data Mining (SDM'19), Calgary, Alberta, Canada, May 2019

·         Jiaming Shen, Ruiliang Lyu, Xiang Ren, Michelle Vanni, Brian Sadler, Jiawei Han, “Mining Entity Synonyms with Efficient Neural Set Generation", in Proc. 2019 AAAI Conf. on Artificial Intelligence (AAAI-19), Honolulu, Hawaii, Jan. 2019  

·         Yu Meng, Jiaming Shen, Chao Zhang and Jiawei Han, “Weakly-Supervised Hierarchical Text Classification", in Proc. 2019 AAAI Conf. on Artificial Intelligence (AAAI-19), Honolulu, Hawaii, Jan. 2019

·         Jingbo Shang, Jialu Liu, Meng Jiang, Xiang Ren, Clare R. Voss, Jiawei Han, “Automated Phrase Mining from Massive Text Corpora", IEEE Transactions on Knowledge and Data Engineering, 30(10):1825-1837, 2018

·         Chenguang Wang, Yangqiu Song, Haoran Li, Ming Zhang, and Jiawei Han, “Unsupervised Meta-path Selection for Text Similarity Measure based on Heterogeneous Information Networks", Data Mining and Knowledge Discovery (DMKD), 32(6): 1735-1767 (2018)

·         Julie Yixuan Zhu, Chao Zhang, Huichu Zhang, Shi Zhi, Victor O. K. Li, Jiawei Han, Yu Zheng, “pg-Causality: Identifying Spatiotemporal Causal Pathways for Air Pollutants with Urban Big Data", IEEE Transactions on Big Data (TBD), 4(4): 571-585 (2018)

·         Chao Zhang, Dongming Lei, Quan Yuan, Honglei Zhuang, Lance Kaplan, Shaowen Wang, Jiawei Han, “GeoBurst+: Effective and Real-Time Local Event Detection in Geo-Tagged Tweet Streams", ACM Transactions on Intelligent Systems and Technology (ACM TIST), 9(3): 34:1-34:24 (2018)

·         Jingbo Shang, Meng Jiang, Wenzhu Tong, Jinfeng Xiao, Jian Peng, Jiawei Han, “DPPred: An Effective Prediction Framework with Concise Discriminative Patterns", IEEE Transactions on Knowledge and Data Engineering, 30(7): 1226-1239 (2018)

·         Xuan Wang, Yu Zhang, Qi Li, Cathy Wu, and Jiawei Han, “PENNER: Pattern-enhanced Nested Named Entity Recognition in Biomedical Literature", in Proc. 2018 Int. Conf. on Bioinformatics and Biomedicine (BIBM'18), Madrid, Spain, Dec. 2018, pp. 540-547

·         Qi Li, Xuan Wang, Yu Zhang, Fei Ling, Cathy Wu, and Jiawei Han, “Pattern Discovery for Wide-Window Open Information Extraction in Biomedical Literature", in Proc. 2018 Int. Conf. on Bioinformatics and Biomedicine (BIBM'18), Madrid, Spain, Dec. 2018, pp. 420-427

·         Shi Zhi, Fan Yang, Zheyi Zhu, Qi Li, Zhaoran Wang, and Jiawei Han, “Dynamic Truth Discovery on Numerical Data", in Proc. of 2018 IEEE Int. Conf. on Data Mining (ICDM'18), Singapore, Nov. 2018, pp. 817-826

·         Carl Yang, Yichen Feng, Pan Li, Yu Shi, and Jiawei Han, “Meta-Graph Based HIN Spectral Embedding: Methods, Analyses, and Insights", in Proc. of 2018 IEEE Int. Conf. on Data Mining (ICDM'18), Singapore, Nov. 2018, pp. 657-666

·         Fangbo Tao, Chao Zhang, Xiusi Chen, Meng Jiang, Tim Hanratty, Lance Kaplan, and Jiawei Han, “Doc2Cube: Automated Document Allocation to Text Cube via Dimension-Aware Joint Embedding", in Proc. of 2018 IEEE Int. Conf. on Data Mining (ICDM'18), Singapore, Nov. 2018, pp. 1260-1265

·         Doris Xin, Ahmed El-Kishky, De Liao, Brandon Norick, and Jiawei Han, “Active Learning on Heterogeneous Information Networks: A Multi-armed Bandit Approach", in Proc. of 2018 IEEE Int. Conf. on Data Mining (ICDM'18), Singapore, Nov. 2018, pp. 1350-1355

·         Jingbo Shang, Liyuan Liu, Xiaotao Gu, Xiang Ren, Teng Ren and Jiawei Han, “Learning Named Entity Tagger using Domain-Specific Dictionary", in Proc. of 2018 Conf. on Empirical Methods in Natural Language Processing (EMNLP'18), Brussels, Belgium, Oct. 2018, pp. 2054-2064

·         Liyuan Liu, Xiang Ren, Jingbo Shang, Xiaotao Gu, Jian Peng and Jiawei Han, “Efficient Contextualized Representation: Language Model Pruning for Sequence Labeling", in Proc. of 2018 Conf. on Empirical Methods in Natural Language Processing (EMNLP'18), Brussels, Belgium, Oct. 2018, pp. 1215-1225

·         Quan Yuan, Xiang Ren, Wenqi He, Chao Zhang, Xinhe Geng, Lifu Huang, Heng Ji, Chin-Yew Lin and Jiawei Han, “Open-Schema Event Profiling for Massive News Corpora", in Proc. of 2018 ACM Int. Conf. on Information and Knowledge Management (CIKM'18), Turin, Italy, Oct. 2018, pp. 587-596

·         Yu Meng, Jiaming Shen, Chao Zhang and Jiawei Han, “Weakly-Supervised Neural Text Classification", in Proc. of 2018 ACM Int. Conf. on Information and Knowledge Management (CIKM'18), Turin, Italy, Oct. 2018, pp. 983-992

·         Jingbo Shang, Jiaming Shen, Tianhang Sun, Xingbang Liu, Anja Gruenheid, Flip Korn, Adam Lelkes, Cong Yu and Jiawei Han, “Investigating Rumor News Using Agreement-Aware Search", in Proc. of 2018 ACM Int. Conf. on Information and Knowledge Management (CIKM'18), Turin, Italy, Oct. 2018, pp. 2117-2125

·         Carl Yang, Mengxiong Liu, Frank He, Xikun Zhang, Jian Peng, and Jiawei Han, “Similarity Modeling on Heterogeneous Networks via Automatic Path Discovery", in Proc. of 2018 European Conf. on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'18), Dublin, Ireland, Sept. 2018, pp. 37-54

·         Jingbo Shang, Qi Zhu, Jiaming Shen, Xuan Wang, Xiaotao Gu, Lance Kaplan, Timothy Hanratty and Jiawei Han, "AutoNet: Automated Network Construction and Exploration System from Domain-Specific Corpora", in Proc. of 2018 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'18), (demo paper) London, UK, August 2018

·         Jiaming Shen, Jinfeng Xiao, Yu Zhang, Carl Yang, Jingbo Shang, Jinda Han, Saurabh Sinha, Peipei Ping, Richard Weinshilboum, Zhiyong Lu and Jiawei Han, "SetSearch+: Entity-Set-Aware Search and Mining for Scientific Literature", in Proc. of 2018 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'18), (demo paper), London, UK, August 2018

·         Hanwen Zha, Jiaming Shen, Keqian Li, Warren Greiff, Michelle Vanni, Jiawei Han and Xifeng Yan, "FTS: Faceted Taxonomy Construction and Search for Scientific Publications", in Proc. of 2018 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'18), (demo paper), London, UK, August 2018

·         Carl Yang, Xiaolin Shi, Jie Luo and Jiawei Han, "I Know You’ll Be Back: Interpretable New User Clustering and Churn Prediction on a Mobile Social Application", in Proc. of 2018 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'18), London, UK, August 2018

·         Chao Zhang, Fangbo Tao, Xiusi Chen, Jiaming Shen, Meng Jiang, Brian Sadler, Michelle Vanni and Jiawei Han, "TaxoGen: Constructing Topical Concept Taxonomy by Adaptive Term Embedding and Clustering", in Proc. of 2018 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'18), London, UK, August 2018

·         Qi Li, Meng Jiang, Xikun Zhang, Meng Qu, Timothy Hanratty, Jing Gao and Jiawei Han, "TruePIE: Discovering Reliable Patterns in Pattern-Based Information Extraction", in Proc. of 2018 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'18), London, UK, August 2018

·         Jiaming Shen, Zeqiu Wu, Dongming Lei, Chao Zhang, Xiang Ren, Michelle T. Vanni, Brian M. Sadler and Jiawei Han, "HiExpan: Task-Guided Taxonomy Construction by Hierarchical Tree Expansion", in Proc. of 2018 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'18), London, UK, August 2018

·         Yu Shi, Qi Zhu, Fang Guo, Chao Zhang and Jiawei Han, "Easing Embedding Learning by Comprehensive Transcription of Heterogeneous Information Networks", in Proc. of 2018 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'18), London, UK, August 2018

·         Yuning Mao, Xiang Ren, Jiaming Shen, Xiaotao Gu and Jiawei Han, "End-to-End Reinforcement Learning for Automatic Taxonomy Induction", in Proc. of 2018 Annual Meeting of the Association for Computational Linguistics (ACL'18), Melbourne, Australia, July 2018

·         Jiaming Shen, Jinfeng Xiao, Xinwei He, Jingbo Shang, Saurabh Sinha and Jiawei Han, "Entity Set Search of Scientific Literature: An Unsupervised Ranking Approach", in Proc. of 2018 Int. ACM SIGIR Conf. on Research and Development in Information Retrieval (SIGIR'18), Ann Arbor, MI, July 2018

·         Yu Shi, Huan Gui, Qi Zhu, Lance Kaplan, Jiawei Han, "AspEm: Embedding Learning by Aspects in Heterogeneous Information Networks", in Proc. of 2018 SIAM Int. Conf. on Data Mining (SDM'18), San Diego, CA, May 2018

·         Meng Qu, Xiang Ren, Yu Zhang, and Jiawei Han, “Weakly-supervised Relation Extraction by Pattern-enhanced Embedding Learning”, Proc. of 2018 Int. Conf. on World-Wide Web (WWW’18), Lyon, France, Apr. 2018

·         Qi Zhu, Xiang Ren, Jingbo Shang, Yu Zhang, Frank F. Xu and Jiawei Han, "Open Information Extraction with Global Structure Constraints", (poster paper), in Proc. of 2018 Int. Conf. on World-Wide Web (WWW'18), Lyon, France, Apr. 2018 (received WWW'18 Best Poster Award Honorable Mention)

·         Liyuan Liu, Jingbo Shang, Frank Xu, Xiang Ren, Huan Gui, Jian Peng and Jiawei Han, "Empower Sequence Labeling with Task-Aware Neural Language Model", in Proc. of 2018 AAAI Conf. on Artificial Intelligence (AAAI'18), New Orleans, LA, Feb. 2018

·         Chao Zhang, Mengxiong Liu, Zhengchao Liu, Carl Yang, Luming Zhang, Jiawei Han, "Spatiotemporal Activity Modeling Under Data Scarcity: A Graph-Regularized Cross-Modal Embedding Approach", in Proc. of 2018 AAAI Conf. on Artificial Intelligence (AAAI'18), New Orleans, LA, Feb. 2018

·         Wanzheng Zhu, Chao Zhang, Shuochao Yao, Xiaobin Gao, Jiawei Han, "A Spherical Hidden Markov Model for Semantics-Rich Human Mobility Modeling", in Proc. of 2018 AAAI Conf. on Artificial Intelligence (AAAI'18), New Orleans, LA, Feb. 2018

·         Zeqiu Wu, Xiang Ren, Frank F. Xu, Ji Li and Jiawei Han, "Indirect Supervision for Relation Extraction using Question-Answer Pairs", in Proc. of 2018 ACM Int. Conf. on Web Search and Data Mining (WSDM'18), Los Angeles, CA, Feb. 2018

·         Meng Qu, Jian Tang, and Jiawei Han, "Curriculum Learning for Heterogeneous Star Network Embedding via Deep Reinforcement Learning", in Proc. of 2018 ACM Int. Conf. on Web Search and Data Mining (WSDM'18), Los Angeles, CA, Feb. 2018 

·         Quan Yuan, Jingbo Shang, Xin Cao, Chao Zhang, Xinhe Geng, Jiawei Han, "Detecting Multiple Periods and Periodic Patterns in Event Time Sequences", in Proc. of 2017 ACM Int. Conf. on Information and Knowledge Management (CIKM'17), Singapore, Nov. 2017

·         Mengxiong Liu, Zhengchao Liu, Chao Zhang, Keyang Zhang, Quan Yuan, Tim Hanratty and Jiawei Han, "Urbanity: A System for Interactive Exploration of Urban Dynamics from Streaming Human Sensing Data" (system demo), in Proc. of 2017 ACM Int. Conf. on Information and Knowledge Management (CIKM'17), Singapore, Nov. 2017

·         Huan Gui, Jialu Liu, Fangbo Tao, Meng Jiang, Brandon Norick, Lance Kaplan and Jiawei Han, "Embedding Learning with Events in Heterogeneous Information Networks", IEEE Transactions on Knowledge and Data Engineering, 29(11): 2428-2441, 2017

·         Jiaming Shen, Zeqiu Wu, Dongming Lei, Jingbo Shang, Xiang Ren, Jiawei Han, "SetExpan: Corpus-based Set Expansion via Context Feature Selection and Rank Ensemble", in Proc. of 2017 European Conf. on Machine Learning and Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD'17), Skopje, Macedonia, Sept. 2017

·         Carl Yang, Lanxiao Bai, Chao Zhang, Quan Yuan and Jiawei Han, "Bridging Collaborative Filtering and Semi-Supervised Learning: A Neural Approach for POI recommendation", in Proc. of 2017 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'17), Halifax, Nova Scotia, Canada, Aug. 2017

·         Chao Zhang, Liyuan Liu, Dongming Lei, Quan Yuan, Honglei Zhuang, Tim Hanratty and Jiawei Han, "TrioVecEvent: Embedding-Based Online Local Event Detection in Geo-Tagged Tweet Streams", in Proc. of 2017 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD'17), Halifax, Nova Scotia, Canada, Aug. 2017

·         Xiang Ren, Wenqi He, Meng Qu, Clare R. Voss, Heng Ji, Jiawei Han, "Label Noise Reduction in Entity Typing by Heterogeneous Partial-Label Embedding", in Proc. of 2016 ACM SIGKDD Conf. on Knowledge Discovery and Data Mining (KDD'16), San Francisco, CA, Aug. 2016

·         Meng Jiang, Christos Faloutsos, Jiawei Han, "CatchTartan: Representing and Summarizing Dynamic Multicontextual Behaviors", in Proc. of 2016 ACM SIGKDD Conf. on Knowledge Discovery and Data Mining (KDD'16), San Francisco, CA, Aug. 2016

·         Mengting Wan, Xiangyu Chen, Lance Kaplan, Jiawei Han, Jing Gao, Bo Zhao, "An Uncertainty-Aware Model to Summarize Trustworthy Quantitative Information", in Proc. of 2016 ACM SIGKDD Conf. on Knowledge Discovery and Data Mining (KDD'16), San Francisco, CA, Aug. 2016

 

Project Impact

 

·         Education:  Parts of the new research results are used in the Data Mining courses (CS412, CS512) taught to both undergraduate and graduate students in the Department of Computer Science at the University of Illinois at Urbana-Champaign.    Moreover, the research results have been and will continue to be published in a timely manner in international conferences and journals and distributed worldwide for education and research.  The new progress will also be integrated into the new edition of our data mining textbook and other research collections.

·         Collaborations: For this project we have established collaborations with Boeing, ARL, NASA, IBM T.J. Watson Research Center, Yahoo! Labs, Microsoft Research, Google Research, and NCSA (National Center for Supercomputing Applications).  Through such collaborations we expect to gain access to real datasets and applications and to produce more research results.

 

Current and Future Activities

·         The following are some of the highlights of our ongoing work.  Please refer to the Selected Publications and Products section for related references.

Area Background

 

·         This project is based on the previous research on data mining, information network analysis, spatiotemporal data analysis, and data cube and multidimensional analysis.   

·         There have been many research papers published on these themes.   Several textbooks on data mining, information retrieval and information network analysis provide good overviews of the principles and algorithms, including (Han, Kamber and Pei, 2011) and (Sun and Han 2012).

 

Area References

 

·         P. Yu, J. Han, and C. Faloutsos, editors. Link Mining: Models, Algorithms, and Applications. Springer, 2010

·         X. L. Dong, L. Berti-Equille, and D. Srivastava. Truth discovery and copying detection in a dynamic world. PVLDB, 2(1):562–573, 2009.

·         Xiaoxin Yin, Jiawei Han and Philip S. Yu, "Truth Discovery with Multiple Conflicting Information Providers on the Web", IEEE Transactions on Knowledge and Data Engineering, 20(6):796-808, 2008.

·         Bo Zhao, Benjamin I. P. Rubinstein, Jim Gemmell, and Jiawei Han, "A Bayesian Approach to Discovering Truth from Conflicting Sources for Data Integration", PVLDB 5(6): 550-561 (2012) (Also, Proc. 2012 Int. Conf. on Very Large Data Bases (VLDB'12), Istanbul, Turkey, Aug. 2012)

Potential Related Projects

·         Any project related to social media analysis, information fusion, information and social network analysis, spatiotemporal data mining, and knowledge discovery.

Project Web site URL:  http://www.cs.uiuc.edu/~hanj/projs/social_media.htm

Online software: 

Online software related to this project can be downloaded from GitHub or from www.illimine.cs.uiuc.edu

Online resources:  Research publications related to this project can be downloaded at Selected Publications