Jiawei Han, Co-PI
Michael Aiken Chair Professor
Department of Computer Science
University of Illinois, Urbana-Champaign
201 N. Goodwin Ave., Urbana, Illinois 61801 U.S.A.
Office: (217) 333-6903, Fax: (217) 265-6494
E-mail: hanj at illinois.edu, URL: http://hanj.cs.illinois.edu
List of Supported Students and Staff
§ Xiaotao Gu, Ph.D. student, Department of Computer Science, University of
Illinois at Urbana-Champaign
§ Priyanka Kargupta, Ph.D. student, Department of Computer Science,
University of Illinois at Urbana-Champaign
§ Yu Zhang, Ph.D. student, Department of Computer Science, University of
Illinois at Urbana-Champaign
§ Liyuan Liu, Ph.D. student, Department of Computer Science, University of
Illinois at Urbana-Champaign
§ Yunyi Zhang, Ph.D. student, Department of Computer Science, University
of Illinois at Urbana-Champaign
§ Ming Zhong, Ph.D. student, Department of Computer Science, University of
Illinois at Urbana-Champaign
Project Summary
Recent
years have witnessed the proliferation of various machine-readable knowledge
repositories, such as general knowledge bases and domain-specific
ontologies. Although existing knowledge
repositories have shown their power at simple search and question answering,
their usage in complex problem solving is very limited. In many domains,
knowledge varies with respect to contexts, and a flat structure that is
commonly adopted by existing knowledge repositories cannot capture the
complicated knowledge associated with different contexts. To make knowledge resources more findable,
accessible, interoperable, and reusable (FAIR), this project proposes to
conceptualize a new structure, Knowledge Hypercube (K-CUBE), for organizing and
retrieving knowledge that could support complex applications in various
domains. A knowledge hypercube organizes
knowledge with respect to selected important dimensions or aspects, and thus it
allows people to easily access knowledge in any context, encapsulate
distinctive entities and relationships, and conduct cross-dimensional
comparison and inference. The major
objective of this proposal is to form a paradigm of mining knowledge hypercubes
from massive collections of text documents and leveraging such hypercubes for
complex exploration and prediction tasks. The progress of
the project and the research results are also disseminated via the project Web
site (http://hanj.cs.illinois.edu/projs/hypercube.htm).
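As a rough illustration of the idea described above, the following minimal Python sketch stores facts in cells indexed by one value per dimension and retrieves them by fixing any subset of dimensions. It is a toy under stated assumptions, not the project's implementation: the class name KnowledgeCube, the dimensions "topic" and "location", and the example facts are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Fact = Tuple[str, str, str]  # (head entity, relation, tail entity)

# Hypothetical dimensions, chosen only for illustration.
DIMENSIONS = ("topic", "location")


class KnowledgeCube:
    """Toy cube: each cell is keyed by one value per dimension and holds facts."""

    def __init__(self, dimensions: Tuple[str, ...]):
        self.dimensions = dimensions
        self.cells: Dict[Tuple[str, ...], List[Fact]] = defaultdict(list)

    def add(self, context: Dict[str, str], fact: Fact) -> None:
        # The context must give a value for every dimension of the cube.
        key = tuple(context[d] for d in self.dimensions)
        self.cells[key].append(fact)

    def slice(self, **fixed: str) -> List[Fact]:
        """Return facts from every cell that matches the fixed dimension values."""
        out: List[Fact] = []
        for key, facts in self.cells.items():
            cell = dict(zip(self.dimensions, key))
            if all(cell[d] == v for d, v in fixed.items()):
                out.extend(facts)
        return out


# Made-up toy facts, not data from the project.
cube = KnowledgeCube(DIMENSIONS)
cube.add({"topic": "economy", "location": "US"}, ("fed", "raises", "rates"))
cube.add({"topic": "economy", "location": "EU"}, ("ecb", "holds", "rates"))
cube.add({"topic": "health", "location": "US"}, ("fda", "approves", "drug_x"))

us_facts = cube.slice(location="US")                        # one dimension fixed
eu_economy = cube.slice(topic="economy", location="EU")     # a single cell
```

Fixing one dimension while leaving the others free is what allows facts to be compared across contexts; a real system would additionally need to mine the dimension values and facts from text, which this sketch does not attempt.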
Intellectual Merit:
The proposed research bridges the gap between the
empirical success of network embedding, and existing statistical learning and
optimization theories. The core of this proposed research is the integration of
modern network mining techniques with sophisticated statistical learning and
optimization tools, which lays a foundation to design a new generation of
network embedding algorithms with strong theoretical guarantees, and to derive
new theories for various setups of network embedding. Extensive empirical evaluations
ensure the proposed algorithms' applicability in various application domains.
The proposed research is expected to advance the frontier of network embedding
and enable it to handle modern massive networks in the wild.
Broader Impacts:
The successful completion of this project will lead
to a new advanced way to store, retrieve, share and exploit knowledge for
complex applications. It will have immediate impact on the process of knowledge
distillation, organization and exploitation and will broadly impact the field
of data science which centers around finding and using knowledge. The proposed research will provide an
important source to advance knowledge-based machine learning approaches.
Furthermore, the proposed research to mine and leverage knowledge can
potentially benefit a wide range of domains that have vast literature and
unsolved complex tasks, such as drug repurposing and fake news detection, by
building a bridge between complex tasks and text collections. A repository of the developed software and
constructed knowledge hypercubes for the proposed domains will be built,
and the results of this project will be disseminated both within
computer science and to many other disciplines. This project has the potential to promote the
adoption of knowledge hypercubes by industry, making knowledge resources more
findable, accessible, interoperable, and reusable (FAIR). Moreover, the proposed research work will be
integrated tightly with education as we plan to leverage knowledge hypercubes
for educational tasks such as knowledge tracing. We will also encourage the participation of
undergraduate and minority students in data mining research at all three
institutions.
The
research results are to be published in various research and application forums
and be integrated into the educational programs at UIUC. The
progress of the project and the research results are also disseminated via the
project Web site (http://www.cs.uiuc.edu/homes/hanj/projs/hypercube.htm).
Publications and Products: (Note: major publications closely related to
this project are in bold font)
Note: Please search and download
all the papers in PDF, if available, at our group’s publication website by
following the link: Selected
research publications.
Books
Journal articles
·
Zhizhi Yu, Di Jin, Ziyang Liu, Dongxiao He, Xiao Wang, Hanghang
Tong, Jiawei Han, “Embedding text-rich graph neural networks with sequence and
topical semantic structures”, Knowledge and Information Systems, 65(2): 613-640
(2023)
·
Wei Shen, Yuhan Li, Yinan Liu, Jiawei Han, Jianyong Wang,
Xiaojie Yuan, “Entity Linking Meets Deep Learning: Techniques and Solutions”,
IEEE Trans. Knowl. Data Eng., 35(3): 2556-2578 (2023)
·
Di Jin, Zhizhi Yu, Dongxiao He, Carl Yang, Philip S. Yu, Jiawei
Han, “GCN for HIN via Implicit Utilization of Attention and Meta-Paths”, IEEE
Trans. Knowl. Data Eng., 35(4): 3925-3937 (2023)
·
Yizhou Sun, Jiawei Han, Xifeng Yan, Philip S. Yu, Tianyi Wu,
“Heterogeneous Information Networks: the Past, the Present, and the Future”,
Proc. VLDB Endow., 15(12): 3807-3811 (2022)
·
Di Jin, Wenjun Wang, Guojie Song, Philip S. Yu, Jiawei Han,
“Guest Editorial: Special Issue on Network Structural Modeling and Learning in
Big Data”, IEEE Transactions on Big Data, 8(4): 867-868 (2022)
·
Carl Yang, Yuxin Xiao, Yu Zhang, Yizhou Sun and Jiawei Han,
“Heterogeneous Network Representation Learning: A Unified Framework with Survey
and Benchmark”, IEEE Transactions on Knowledge and Data Engineering, 34(10):
4854-4873 (2022)
·
Wei Shen, Yuwei Yin, Yang Yang, Jiawei Han, Jianyong Wang,
Xiaojie Yuan, “Toward Tweet Entity Linking with Heterogeneous Information
Networks”, IEEE Transactions on Knowledge and Data Engineering, 34(12):
6003-6017 (2022)
·
Yu Meng, Jiaxin Huang, Guangyuan Wang, Zihan Wang, Chao Zhang,
and Jiawei Han, “Unsupervised Word Embedding Learning by Incorporating
Local and Global Contexts”, Frontiers in Big Data, 3:9, 2020
·
Jingbo Shang, Jialu Liu, Meng Jiang, Xiang Ren, Clare R Voss,
Jiawei Han, "Automated Phrase Mining from Massive Text Corpora",
IEEE Transactions on Knowledge and Data Engineering, 30(10):1825-1837 (2018)
·
Jingbo Shang, Meng Jiang, Wenzhu Tong, Jinfeng Xiao, Jian Peng,
Jiawei Han. "DPPred: An Effective Prediction Framework with Concise
Discriminative Patterns", IEEE Transactions on Knowledge and
Data Engineering, 30(7): 1226-1239 (2018)
Refereed
Conference Publications
1.
Sizhe Zhou, Suyu Ge, Jiaming Shen,
Jiawei Han, “Corpus-Based Relation Extraction by Identifying and Refining
Relation Patterns”, in Proc. 2023 European Conf. on Machine Learning and
Principles and Practice of Knowledge Discovery in Databases (ECMLPKDD’23),
Turin, Italy, Sept. 2023
2.
Bowen Jin, Yu Zhang, Qi Zhu, Jiawei Han,
“Heterformer: Transformer-based Deep Node Representation Learning on
Heterogeneous Text-Rich Networks”, in Proc. 2023 ACM SIGKDD Int. Conf. on
Knowledge Discovery and Data Mining (KDD’23), Long Beach, CA, August 2023
3.
Yu Zhang, Bowen Jin, Xiusi Chen, Yanzhen
Shen, Yunyi Zhang, Yu Meng, Jiawei Han, “Weakly Supervised Multi-Label
Classification of Full-Text Scientific Papers”, in Proc. 2023 ACM SIGKDD Int.
Conf. on Knowledge Discovery and Data Mining (KDD’23), Long Beach, CA, August
2023
4.
Nishant Balepur, Shivam Agarwal, Karthik
Venkat Ramanan, Susik Yoon, Diyi Yang and Jiawei Han, “DynaMiTE: Discovering
Explosive Topic Evolutions with User Guidance”, in Proc. 2023 Annual Meeting of
the Association for Computational Linguistics (ACL Findings’23), Toronto,
Canada, July 2023, pp. 194-217
5.
Pengcheng Jiang, Shivam Agarwal, Bowen
Jin, Xuan Wang, Jimeng Sun and Jiawei Han, “Text Augmented Open Knowledge Graph
Completion via Pre-Trained Language Models”, in Proc. 2023 Annual Meeting of
the Association for Computational Linguistics (ACL Findings’23), Toronto,
Canada, July 2023, pp. 11161-11180
6.
Bowen Jin, Wentao Zhang, Yu Zhang, Yu
Meng, Xinyang Zhang, Qi Zhu and Jiawei Han, “Patton: Language Model Pretraining
on Text-Rich Networks”, in Proc. 2023 Annual Meeting of the Association for
Computational Linguistics (ACL’23), Toronto, Canada, July 2023, pp. 7005-7020
7.
Sha Li, Ruining Zhao, Manling Li, Heng
Ji, Chris Callison-Burch and Jiawei Han, “Open-Domain Hierarchical Event Schema
Induction by Incremental Prompting and Verification”, in Proc. 2023 Annual
Meeting of the Association for Computational Linguistics (ACL’23), Toronto,
Canada, July 2023, pp. 5677-5697
8.
Siru Ouyang, Jiaao Chen, Jiawei Han and
Diyi Yang, “Compositional Data Augmentation for Abstractive Conversation
Summarization”, in Proc. 2023 Annual Meeting of the Association for
Computational Linguistics (ACL’23), Toronto, Canada, July 2023, pp. 1471-1488
9.
Ming Zhong, Siru Ouyang, Minhao Jiang,
Vivian Hu, Yizhu Jiao, Xuan Wang and Jiawei Han, “ReactIE: Enhancing Chemical
Reaction Extraction with Weak Supervision”, in Proc. 2023 Annual Meeting of the
Association for Computational Linguistics (ACL Findings’23), Toronto, Canada,
July 2023, pp. 12120-12130
10.
Yu Meng, Martin Michalski, Jiaxin Huang,
Yu Zhang, Tarek Abdelzaher, Jiawei Han, “Tuning Language Models as Training
Data Generators for Augmentation-Enhanced Few-Shot Learning”, in Proc. 2023
Int. Conf. on Machine Learning (ICML’23), Honolulu, Hawaii, July 2023
11.
Susik Yoon, Dongha Lee, Yunyi Zhang and
Jiawei Han, “Unsupervised Story Discovery from Continuous News Streams via
Scalable Thematic Embedding”, in Proc. 2023 ACM SIGIR Int. Conf. on Research
and Development in Information Retrieval (SIGIR’23), Taipei, Taiwan, July 2023
12.
Bowen Jin, Yu Zhang, Yu Meng, Jiawei
Han, “Edgeformers: Graph-Empowered Transformers for Representation Learning on
Textual-Edge Networks”, in Proc. 2023 Int. Conf. on Learning Representations
(ICLR’23), Kigali, Rwanda, May 2023
13.
Susik Yoon, Hou Pong Chan and Jiawei
Han, “PDSum: Prototype-driven Continuous Summarization of Evolving
Multi-document Sets Stream”, in Proc. 2023 The Web Conf. (WWW’23), Austin, TX,
Apr. 2023, pp. 1650-1661
14.
Susik Yoon, Yu Meng, Dongha Lee and
Jiawei Han, “SCStory: Self-supervised and Continual Online Story Discovery”, in
Proc. 2023 The Web Conf. (WWW’23), Austin, TX, Apr. 2023, pp. 1853-1864
15.
Yu Zhang, Bowen Jin, Qi Zhu, Yu Meng and
Jiawei Han, “The Effect of Metadata on Scientific Literature Tagging: A
Cross-Field Cross-Model Study”, in Proc. 2023 The Web Conf. (WWW’23), Austin,
TX, Apr. 2023, pp. 1626-1637
16.
Yizhu Jiao, Ming Zhong, Jiaming Shen,
Yunyi Zhang, Chao Zhang and Jiawei Han, “Unsupervised Event Chain Mining from
Multiple Documents”, in Proc. 2023 The Web Conf. (WWW’23), Austin, TX, Apr.
2023, pp. 1948-1959
17.
Jinfeng Xiao, Mohab Elkaref, Nathan
Herr, Geeth De Mel, and Jiawei Han, “Taxonomy-Guided Fine-Grained Entity Set
Expansion”, in Proc. 2023 SIAM Conf. on Data Mining (SDM’23), Minneapolis, MN,
Apr. 2023, pp. 1626-1637
18.
Suyu Ge, Jiaxin Huang, Yu Meng, and
Jiawei Han, “FineSum: Target-Oriented, Fine-Grained Opinion Summarization”, in
Proc. 2023 ACM Int. Conf. on Web Search and Data Mining (WSDM’23), Singapore,
Feb. 2023, pp. 1093-1101
19.
Yu Zhang, Yunyi Zhang, Martin Michalski,
Yucheng Jiang, Yu Meng, and Jiawei Han, “Effective Seed-Guided Topic Discovery
by Integrating Multiple Types of Contexts”, in Proc. 2023 ACM Int. Conf. on Web
Search and Data Mining (WSDM’23), Singapore, Feb. 2023, pp. 429-437
20.
Yizhu Jiao, Sha Li, Yiqing Xie, Ming
Zhong, Heng Ji and Jiawei Han, “Open-Vocabulary Argument Role Prediction for
Event Extraction”, in Proc. 2022 Conf. on Empirical Methods in Natural Language
Processing (EMNLP’22), Abu Dhabi, UAE, Dec. 2022
21.
Ming Zhong, Yang Liu, Suyu Ge, Yuning
Mao, Yizhu Jiao, Xingxing Zhang, Yichong Xu, Chenguang Zhu, Michael Zeng and
Jiawei Han, “Unsupervised Multi-Granularity Summarization”, in Proc. 2022 Conf.
on Empirical Methods in Natural Language Processing (EMNLP’22), Abu Dhabi, UAE,
Dec. 2022
22.
Sha Li, Heng Ji and Jiawei Han, “Open
Relation and Event Type Discovery with Type Abstraction”, in Proc. 2022 Conf.
on Empirical Methods in Natural Language Processing (EMNLP’22), Abu Dhabi, UAE,
Dec. 2022
23.
Yuning Mao, Ming Zhong and Jiawei Han,
“CiteSum: Citation Text-guided Scientific Extreme Summarization and
Low-resource Domain Adaptation”, in Proc. 2022 Conf. on Empirical Methods in
Natural Language Processing (EMNLP’22), Abu Dhabi, UAE, Dec. 2022
24.
Ming Zhong, Yang Liu, Da Yin, Yuning
Mao, Yizhu Jiao, Pengfei Liu, Chenguang Zhu, Heng Ji and Jiawei Han, “Towards A
Unified Multi-Dimensional Evaluator for Text Generation”, in Proc. 2022 Conf.
on Empirical Methods in Natural Language Processing (Findings of EMNLP’22), Abu
Dhabi, UAE, Dec. 2022
25.
Dongha Lee, Jiaming Shen, Seonghyeon
Lee, Susik Yoon, Hwanjo Yu and Jiawei Han, “Topic Taxonomy Expansion via
Hierarchy-Aware Topic Phrase Generation”, in Proc. 2022 Conf. on Empirical
Methods in Natural Language Processing (Findings of EMNLP 2022), Abu Dhabi,
UAE, Dec. 2022
26.
Xuan Wang, Vivian Hu, Minhao Jiang, Yu
Zhang, Jinfeng Xiao, Danielle Cherrice Loving, Heng Ji, Martin Burke, Jiawei
Han, “REACTCLASS: Cross-Modal Supervision for Subword-Guided Reactant Entity
Classification”, in Proc. 2022 IEEE Int. Conf. on Bioinformatics and
Biomedicine (BIBM’22), Las Vegas, NV, Dec. 2022, pp. 844-847
27.
Yu Meng, Jiaxin Huang, Yu Zhang, Jiawei
Han, “Generating Training Data with Language Models: Towards Zero-Shot Language
Understanding”, in Proc. of 2022 Conf. on Neural Information Processing Systems
(NeurIPS’22), New Orleans, LA, Nov. 2022
28.
Shivam Agarwal, Ramit Sawhney, Megh
Thakkar, Preslav Nakov, Jiawei Han, and Tyler Derr, “THINK: Temporal Hypergraph
Hyperbolic Network”, in Proc. of 2022 IEEE Int. Conf. on Data Mining (ICDM’22),
Orlando, FL, Nov. 2022, pp. 849-854
29.
Jiaxin Huang, Yu Meng, and Jiawei Han,
“Few-Shot Fine-Grained Entity Typing with Automatic Label Interpretation and
Instance Generation”, in Proc. of 2022 ACM SIGKDD Int. Conf. on Knowledge
Discovery and Data Mining (KDD’22), Washington, DC, Aug. 2022, pp. 605-614
30.
Yunyi Zhang, Fang Guo, Jiaming Shen, and
Jiawei Han, “Unsupervised Key Event Detection from Massive Text Corpus”, in
Proc. of 2022 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining
(KDD’22), Washington, DC, Aug. 2022, pp. 2535-2544
31.
Yu Zhang, Yu Meng, Xuan Wang, Sheng
Wang, Jiawei Han, “Seed-Guided Topic Discovery with Out-of-Vocabulary Seeds”,
in Proc. of 2022 Annual Conference of the North American Chapter of the
Association for Computational Linguistics (NAACL’22), Seattle, WA, July 2022,
pp. 279-290
32.
Yuxin Xiao, Zecheng Zhang, Yuning Mao,
Carl Yang, Jiawei Han, “SAIS: Supervising and Augmenting Intermediate Steps for
Document-Level Relation Extraction”, in Proc. of 2022 Annual Conference of the
North American Chapter of the Association for Computational Linguistics
(NAACL’22), Seattle, WA, July 2022, pp. 2395-2409
33.
Xiaotao Gu, Yikang Shen, Jiaming Shen,
Jingbo Shang, Jiawei Han, “Phrase-aware Unsupervised Constituency Parsing”, in
Proc. of 2022 Annual Meeting of the Association for Computational Linguistics
(ACL’22), Dublin, Ireland, May 2022, pp. 6406-6415
34.
Yuning Mao, Lambert Mathias, Rui Hou,
Amjad Almahairi, Hao Ma, Jiawei Han, Wen-tau Yih, Madian Khabsa, “UniPELT: A
Unified Framework for Parameter-Efficient Language Model Tuning”, in Proc. of
2022 Annual Meeting of the Association for Computational Linguistics (ACL’22),
Dublin, Ireland, May 2022, pp. 6253-6264
35.
Yiqing Xie, Jiaming Shen, Sha Li, Yuning
Mao, Jiawei Han, “EIDER: Evidence-enhanced Document-level Relation Extraction”,
in Findings of the Association for Computational Linguistics (ACL’22 Findings),
Dublin, Ireland, May 2022, pp. 257-268
36.
Yu Meng, Chenyan Xiong, Payal Bajaj,
Saurabh Tiwary, Paul N. Bennett, Jiawei Han, Xia Song, “Pretraining Text
Encoders with Adversarial Mixture of Training Signal Generators”, in Proc. 2022
Int. Conf. on Learning Representations (ICLR’22), April 2022
37.
Minhao Jiang, Xiangchen Song, Jieyu
Zhang and Jiawei Han, “TaxoEnrich: Self-Supervised Taxonomy Completion via
Structure-Semantic Representations”, in Proc. The ACM Web Conf. 2022 (WWW’22),
April 2022, pp. 925-934
38.
Dongha Lee, Jiaming Shen, Seongku Kang,
Susik Yoon, Jiawei Han and Hwanjo Yu, “TaxoCom: Topic Taxonomy Completion with
Hierarchical Discovery of Novel Topic Clusters”, in Proc. The ACM Web Conf.
2022 (WWW’22), April 2022, pp. 2819-2829
39.
Yu Meng, Yunyi Zhang, Jiaxin Huang, Yu
Zhang and Jiawei Han, “Topic Discovery via Latent Space Clustering of Language
Model Embeddings”, in Proc. The ACM Web Conf. 2022 (WWW’22), April 2022, pp.
3143-3152
40.
Yiqing Xie, Zhen Wang, Carl Yang,
Yaliang Li, Bolin Ding, Hongbo Deng and Jiawei Han, “KoMen: Domain Knowledge
Guided Interaction Recommendation for Emerging Scenarios”, in Proc. The ACM Web
Conf. 2022 (WWW’22), April 2022, pp. 1301-1310
41.
Yu Zhang, Zhihong Shen, Chieh-Han Wu,
Boya Xie, Junheng Hao, Ye-Yi Wang, Kuansan Wang and Jiawei Han,
“Metadata-Induced Contrastive Learning for Zero-Shot Multi-Label Text
Classification”, in Proc. The ACM Web Conf. 2022 (WWW’22), April 2022, pp.
3162-3173
42.
Yu Zhang, Shweta Garg, Yu Meng, Xiusi
Chen, Jiawei Han, “MotifClass: Weakly Supervised Text Classification with
Higher-order Metadata Information”, in Proc. 2022 ACM Int. Conf. on Web Search
and Data Mining (WSDM’22), Feb. 2022, pp. 1357-1367
43.
Xiaotao
Gu, Zihan Wang, Zhenyu Bi, Yu Meng, Liyuan Liu, Jiawei Han, Jingbo Shang,
"UCPhrase: Unsupervised Context-aware Quality Phrase
Tagging", in Proc. of 2021 ACM SIGKDD Int. Conf. on Knowledge
Discovery and Data Mining (KDD'21), Aug. 2021
44.
Yu
Meng, Jiaxin Huang, Yu Zhang, Jiawei Han, "On the Power of
Pre-Trained Text Representations: Models and Applications in Text Mining" (Conference
Tutorial), in Proc. of 2021 ACM SIGKDD Int. Conf. on Knowledge Discovery and
Data Mining (KDD'21), Aug. 2021
45.
Sha
Li, Heng Ji and Jiawei Han, "Document-Level Event
Argument Extraction by Conditional Generation", in Proc.
2021 Annual Conf. of the North American Chapter of the Association for
Computational Linguistics (NAACL-HLT'21), June 2021
46.
Jiaming
Shen, Wenda Qiu, Yu Meng, Jingbo Shang, Xiang Ren and Jiawei Han, "TaxoClass: Hierarchical Multi-Label Text Classification
Using Only Class Names", in Proc. 2021 Annual Conf. of the North American Chapter
of the Association for Computational Linguistics (NAACL-HLT'21), June 2021
47.
Xinyang
Zhang, Chenwei Zhang, Xin Luna Dong, Jingbo Shang and Jiawei Han, “Minimally-Supervised Structure-Rich
Text Categorization via Learning on Text-Rich Networks”, in Proc. The Web
Conf. 2021 (WWW’21), April 2021
48.
Yu
Zhang, Zhihong Shen, Yuxiao Dong, Kuansan Wang and Jiawei Han, “MATCH: Metadata-Aware
Text Classification in a Large Hierarchy”, in Proc. The Web
Conf. 2021 (WWW’21), April 2021
49.
Qi
Zhu, Fang Guo, Jingjing Tian, Yuning Mao, Jiawei Han, "SUMDocS:
Surrounding-aware Unsupervised Multiple Document Summarization", in Proc.
2021 SIAM Int. Conf. on Data Mining (SDM'21), April 2021
50.
Yu
Zhang, Xiusi Chen, Yu Meng and Jiawei Han, "Hierarchical Metadata-Aware Document Categorization under
Weak Supervision", in Proc. 2021 ACM Int. Conf. on Web Search and Data
Mining (WSDM'21), Feb. 2021
51.
Di
Jin, Xiangchen Song, Zhizhi Yu, Ziyang Liu, Heling Zhang, Zhaomeng Cheng and
Jiawei Han, "BiTe-GCN: A New GCN
Architecture via Bidirectional Convolution of Topology and Features on
Text-Rich Networks", in Proc. 2021 ACM Int. Conf. on Web Search
and Data Mining (WSDM'21), Feb. 2021
52.
Carl
Yang, Yuxin Xiao, Yu Zhang, Yizhou Sun and Jiawei Han, "Heterogeneous Network
Representation Learning: A Unified Framework with Survey and Benchmark", IEEE
Transactions on Knowledge and Data Engineering, 2021
53.
Xuan
Wang, Xiangchen Song, Bangzheng Li, Kang Zhou, Qi Li, and Jiawei Han, "Fine-Grained Named
Entity Recognition with Distant Supervision in COVID-19 Literature", in Proc.
2020 IEEE Int. Conf. on Bioinformatics and Biomedicine (IEEE BIBM 2020),
Dec. 2020
54.
Xuan
Wang, Yu Zhang, Aabhas Chauhan, Qi Li, and Jiawei Han, "Textual Evidence Mining via Spherical Heterogeneous
Information Network Embedding", in Proc.
2020 IEEE Int. Conf. on Big Data (IEEE BigData'20), Dec. 2020
55.
Xuan Wang,
Yingjun Guan, Yu Zhang, Qi Li, and Jiawei Han, "Pattern-enhanced Named Entity Recognition with Distant
Supervision", in Proc. 2020 IEEE Int. Conf. on Big Data (IEEE
BigData'20), Dec. 2020
56.
Carl
Yang, Liyuan Liu, Mengxiong Liu, Zongyi Wang, Chao Zhang, and Jiawei Han,
"Graph Clustering with Embedding Propagation", in Proc.
2020 IEEE Int. Conf. on Big Data (IEEE BigData'20), Dec. 2020
57.
Jiaxin
Huang, Yu Meng, Fang Guo, Heng Ji and Jiawei Han, "Aspect-Based Sentiment Analysis by Aspect-Sentiment Joint
Embedding", in Proc. 2020 Conf. on Empirical Methods in Natural
Language Processing (EMNLP'20), Nov. 2020
58.
Yuning
Mao, Yanru Qu, Yiqing Xie, Xiang Ren and Jiawei Han, "Multi-document
Summarization with Maximal Marginal Relevance-guided Reinforcement Learning", in Proc.
2020 Conf. on Empirical Methods in Natural Language Processing (EMNLP'20), Nov.
2020
59.
Yu
Meng, Yunyi Zhang, Jiaxin Huang, Chenyan Xiong, Heng Ji, Chao Zhang and Jiawei
Han, "Text Classification Using Label Names Only: A Language
Model Self-Training Approach", in Proc.
2020 Conf. on Empirical Methods in Natural Language Processing (EMNLP'20), Nov.
2020
60.
Jiaming
Shen, Wenda Qiu, Jingbo Shang, Michelle Vanni, Xiang Ren and Jiawei Han, "SynSetExpan: An Iterative Framework for Joint Entity Set
Expansion and Synonym Discovery", in Proc.
2020 Conf. on Empirical Methods in Natural Language Processing (EMNLP'20), Nov.
2020
61.
Edouard
Fouche, Yu Meng, Fang Guo, Honglei Zhuang, Klemens Boehm, and Jiawei Han,
"Mining Text Outliers in Document Directories", in Proc.
2020 IEEE Int. Conf. on Data Mining (ICDM'20), Nov. 2020
62.
Carl
Yang, Jieyu Zhang, and Jiawei Han, "Co-Embedding Network
Nodes and Hierarchical Labels with Taxonomy Based Generative Adversarial
Networks", in Proc. 2020 IEEE Int. Conf. on Data Mining (ICDM'20),
Nov. 2020 (Best Paper Award)
63.
Yu
Meng, Jiaxin Huang, Jiawei Han, “Embedding-Driven Multi-Dimensional Topic Mining and Text
Analysis”, (Conference tutorial), 2020 ACM SIGKDD Int. Conf. on Knowledge
Discovery and Data Mining (KDD’20), San Diego, CA, August 2020
64.
Jiaxin
Huang, Yiqing Xie, Yu Meng, Yunyi Zhang and Jiawei Han, “CoRel: Seed-Guided
Topical Taxonomy Construction by Concept Learning and Relation Transferring”, in Proc. of 2020
ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD’20), San
Diego, CA, August 2020
65.
Yuning
Mao, Tong Zhao, Andrey Kan, Chenwei Zhang, Xin Luna Dong, Christos Faloutsos
and Jiawei Han, “Octet: Online Catalog
Taxonomy Enrichment with Self-Supervision”, in Proc. of 2020
ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD’20), San
Diego, CA, August 2020
66.
Yu
Meng, Yunyi Zhang, Jiaxin Huang, Yu Zhang, Chao Zhang and Jiawei Han, “Hierarchical Topic
Mining via Joint Spherical Tree and Text Embedding”, in Proc. of 2020
ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD’20), San
Diego, CA, August 2020
67.
Chanyoung
Park, Carl Yang, Qi Zhu, Donghyun Kim, Hwanjo Yu and Jiawei Han, “Unsupervised
Differentiable Multi-aspect Network Embedding”, in Proc. of 2020
ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD’20), San
Diego, CA, August 2020
68.
Carl
Yang, Aditya Pal, Andrew Zhai, Nikil Pancha, Jiawei Han, Chuck Rosenburg and
Jure Leskovec, “MultiSage: Empowering
GCN with Contextualized Multi-Embeddings on Web-Scale Multipartite Networks”, in Proc. of 2020
ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining (KDD’20), San
Diego, CA, August 2020
69.
Yiqing
Xie, Sha Li, Carl Yang, Raymond Chi-Wing Wong, Jiawei Han, “When Do GNNs Work:
Understanding and Improving Neighborhood Aggregation”, in Proc. of 2020
Int. Joint Conf. on Artificial Intelligence and Pacific Rim Int. Conf. on
Artificial Intelligence (IJCAI-PRICAI’20), Yokohama, Japan, July 2020
70.
Yu
Zhang, Yu Meng, Jiaxin Huang, Frank F. Xu, Xuan Wang and Jiawei Han, “Minimally Supervised Categorization of Text with Metadata”, in Proc. 2020 ACM
SIGIR Int. Conf. on Research and development in Information Retrieval
(SIGIR’20), Xi’an, China, July 2020
71.
Honglei
Zhuang, Fang Guo, Chao Zhang, Liyuan Liu and Jiawei Han, “Joint Aspect-Sentiment Analysis with Minimal User Guidance”, in Proc. 2020 ACM
SIGIR Int. Conf. on Research and development in Information Retrieval
(SIGIR’20), Xi’an, China, July 2020
72.
Carl
Yang, Jieyu Zhang, Haonan Wang, Bangzheng Li, Jiawei Han, "Neural Concept Map Generation for Effective Document
Classification with Interpretable Structured Summarization" (short
paper), in Proc. 2020 ACM SIGIR Int. Conf. on Research and development in
Information Retrieval (SIGIR'20), Xi'an, China, July 2020
73.
Yuning
Mao, Liyuan Liu, Qi Zhu, Xiang Ren and Jiawei Han, “Facet-Aware Evaluation
for Extractive Summarization”, in Proc. 2020
Annual Conf. of the Association for Computational Linguistics (ACL’20),
Seattle, WA, July 2020
74.
Yunyi
Zhang, Jiaming Shen, Jingbo Shang and Jiawei Han, “Empower Entity Set
Expansion via Language Model Probing”, in Proc. 2020
Annual Conf. of the Association for Computational Linguistics (ACL’20),
Seattle, WA, July 2020
75.
Xuan
Wang, Yingjun Guan, Weili Liu, Aabhas Chauhan, Enyi Jiang, Qi Li, David Liem,
Dibakar Sigdel, John Caufield, Peipei Ping and Jiawei Han, “EVIDENCEMINER: Textual
Evidence Discovery for Life Sciences”, in Proc. 2020
Annual Conf. of the Association for Computational Linguistics (ACL’20) (System
demo), Seattle, WA, July 2020
76.
Xiaotao
Gu, Yuning Mao, Jiawei Han, Jialu Liu, You Wu, Cong Yu, Daniel Finnie, Hongkun
Yu, Jiaqi Zhai and Nicholas Zukoski, “Generating
Representative Headlines for News Stories”, in Proc. 2020
Int. World Wide Web Conf. (WWW’20), Taipei, Taiwan, Apr. 2020
77.
Jiaxin
Huang, Yiqing Xie, Yu Meng, Jiaming Shen, Yunyi Zhang and Jiawei Han, “Guiding Corpus-based
Set Expansion by Auxiliary Sets Generation and Co-Expansion”, in Proc. 2020
Int. World Wide Web Conf. (WWW’20), Taipei, Taiwan, Apr. 2020
78.
Yu
Meng, Jiaxin Huang, Guangyuan Wang, Zihan Wang, Chao Zhang, Yu Zhang and Jiawei
Han, “Discriminative Topic
Mining via Category-Name Guided Text Embedding”, in Proc. 2020
Int. World Wide Web Conf. (WWW’20), Taipei, Taiwan, Apr. 2020
79.
Jingbo
Shang, Xinyang Zhang, Liyuan Liu, Sha Li and Jiawei Han, “NetTaxo: Automated
Topic Taxonomy Construction from Large-Scale Text-Rich Network”, in Proc. 2020
Int. World Wide Web Conf. (WWW’20), Taipei, Taiwan, Apr. 2020
80.
Jiaming
Shen, Zhihong Shen, Chenyan Xiong, Chi Wang, Kuansan Wang and Jiawei Han, “TaxoExpan:
Self-supervised Taxonomy Expansion with Position-Enhanced Graph Neural Network”, in Proc. 2020 Int.
World Wide Web Conf. (WWW’20), Taipei, Taiwan, Apr. 2020
81.
Qi
Zhu, Hao Wei, Bunyamin Sisman, Da Zheng, Christos Faloutsos, Xin Luna Dong and
Jiawei Han, “Collective Multi-type
Entity Alignment Between Knowledge Graphs”, in Proc. 2020
Int. World Wide Web Conf. (WWW’20), Taipei, Taiwan, Apr. 2020
82.
Liyuan
Liu, Haoming Jiang, Pengcheng He, Weizhu Chen, Xiaodong Liu, Jianfeng Gao,
and Jiawei Han, “On the Variance of the
Adaptive Learning Rate and Beyond”, in Proc.
2020 Int. Conf. on Learning Representations (ICLR’20), Addis Ababa,
Ethiopia, Apr. 2020
83.
Chanyoung
Park, Donghyun Kim, Hwanjo Yu, Jiawei Han, “Unsupervised Attributed
Multiplex Network Embedding”, in Proc. 2020 AAAI Int. Conf. on Artificial Intelligence
(AAAI’20), New York, NY, Feb. 2020
84.
Aravind
Sankar, Xinyang Zhang, Adit Krishnan and Jiawei Han, "A Deep Generative Approach to Integrate Social Homophily
and Temporal Influence in Diffusion Prediction", in Proc.
2020 ACM Int. Conf. on Web Search and Data Mining (WSDM'20), Houston, TX, Feb.
2020
85.
Carl
Yang, Jieyu Zhang, Haonan Wang, Sha Li, Myunghwan Kim, Matthew Walker, Yiou
Xiao and Jiawei Han, "Relation Learning on
Social Networks with Multi-Modal Graph Edge Variational Autoencoders", in Proc.
2020 ACM Int. Conf. on Web Search and Data Mining (WSDM'20), Houston, TX, Feb.
2020
86.
Yu
Meng, Jiaxin Huang, Guangyuan Wang, Zihan Wang, Chao Zhang, and Jiawei Han, “Unsupervised Word Embedding Learning by Incorporating
Local and Global Contexts”, Frontiers in Big Data, 3:9, 2020
Ph.D.
Dissertations
·
Xuan
Wang, Ph.D., Dec. 2022, thesis title: “Scientific Knowledge Extraction from
Massive Text Data”
·
Yuning
Mao, Ph.D., April 2022, thesis title: “Guided Text Summarization with Limited
Supervision”
·
Xiaotao
Gu, Ph.D., March 2022, thesis title: “Annotation-Free Knowledge Mining from
Massive Text Corpora”
·
Jiaming
Shen, Ph.D., Nov. 2021, thesis title: “Automated Taxonomy Discovery and
Exploration”
·
Carl Ji
Yang, Ph.D., Nov. 2020, thesis title: “Multi-Facet Graph Mining with
Contextualized Projections”
·
Shi Zhi,
Ph.D., Sept. 2020, thesis title: “Learning from Multiple Heterogeneous
Sources—Handling source trustworthiness and incompleteness”
·
Ahmed
El-Kishky, Ph.D., March 2020, thesis title: “Text Mining at Multiple
Granularity: Leveraging Subwords, Words, Phrases, and Sentences”
·
Jingbo Shang,
Ph.D., Nov. 2019, thesis title: “Constructing and
Mining Structured Heterogeneous Information Networks from Massive Text Corpora” (thesis
won the 2020 ACM SIGKDD Doctoral Dissertation Award Runner-Up)
Project Impact
§ Education: Parts of the new research results are used in
Data Mining courses (CS412, CS512, CS412 MCD-DS online Coursera courses) for
both undergraduate and graduate students taught in the Department of
Computer Science, the University of Illinois at Urbana-Champaign.
The research results have been and will continue to be published
in a timely manner in international conferences and journals and distributed worldwide
for education and research. Most of the software developed in this project
has been released as open source on GitHub. The new progress will also be
integrated into the new edition of our data mining textbook and other research
collections.
§ Collaborations: For this project we have
established collaborations with ARL, Google Research, Amazon, Adobe, IAI,
Microsoft Research, UCLA Medical School, LinkedIn, Facebook, and other industry
and research centers. Through such collaborations we expect to explore
many real applications and achieve greater research impact.
Current and Future Activities
The following are some of the highlights of our
ongoing work. Please refer to the
Publications and Products section for related references.
1.
Study effective and scalable embedding methods
for mining text and heterogeneous information networks
2.
Study effective and scalable embedding
and text mining methods for constructing heterogeneous knowledge cubes from
unstructured data
3. Study effective and scalable methods for exploring multidimensional
text and knowledge hypercubes to support new applications
Area Background
This project is
based on previous research on data mining, text mining, network
embedding, and data cube and multidimensional analysis.
There have been many research papers published on these themes. Several
textbooks on data mining, text mining, information retrieval and information
network analysis provide good overviews of the principles and algorithms.
Area References
·
Xiang Ren and Jiawei Han, Mining
Structures of Factual Knowledge from Text: An Effort-Light Approach, Morgan
& Claypool Publishers, 2018
·
Jialu Liu, Jingbo Shang and Jiawei
Han, Phrase
Mining from Massive Text and Its Applications, Morgan &
Claypool, 2017
·
Yizhou Sun and Jiawei Han, Mining Heterogeneous
Information Networks: Principles and Methodologies, Morgan &
Claypool, 2012
Potential Related Projects
Project Web site URL: http://hanj.cs.illinois.edu/projs/hypercube.htm
Online software: Online
software can be downloaded from GitHub by searching for the first authors of the corresponding
papers
Online resources: Research publications related to this project can be
downloaded at Selected
Publications