Jiawei Han, Co-PI (PI: Quanquan Gu, University of California at Los Angeles)
Department of Computer Science
University of Illinois, Urbana-Champaign
201 N. Goodwin Ave., Urbana, Illinois 61801 U.S.A.
Office: (217) 333-6903, Fax: (217) 265-6494
E-mail: hanj at Illinois.edu, URL: http://hanj.cs.illinois.edu
List of Supported Students and Staff
§ Chao Zhang, Ph.D. student, Department of Computer Science, University of
Illinois at Urbana-Champaign (duration working on this project: 2018-2019)
§ Qi Li, Postdoc research fellow, Department of Computer Science,
University of Illinois at Urbana-Champaign (duration working on this project:
2018-2019)
§ Yu Shi, Ph.D. student, Department of Computer Science, University of
Illinois at Urbana-Champaign (duration working on this project: 2018-2019)
§ Liyuan Liu, Ph.D. student, Department of Computer Science, University of
Illinois at Urbana-Champaign (duration working on this project: 2018-2019)
§ Xiaotao Gu, Ph.D. student, Department of Computer Science, University of
Illinois at Urbana-Champaign (duration working on this project: 2019-2020)
§ Jiaming Shen, Ph.D. student, Department of Computer Science, University
of Illinois at Urbana-Champaign (duration working on this project: 2019-2020)
Project Summary
In the Internet Age, information entities and objects are interconnected, thereby forming
gigantic information networks. In recent years, network embedding methods have
been shown to be greatly beneficial for many unsupervised and supervised
learning problems over networks, and are now often the methods of choice,
especially for Big Networks. However, existing network embedding methods lack
theoretical guarantees, and the field is still in its infancy. Particularly, most network embedding methods
are cast into nonconvex optimization problems and solved by ad hoc algorithms
without any convergence guarantee. Moreover, it is unclear under what
conditions the latent network representation is learnable, and what the
sample complexity of network embedding is. This prevents us from designing new
algorithms in a principled way. To bridge such a discrepancy between theory and
practice, the PI will develop a new generation of network embedding methods for
taming massive networks, from homogeneous networks to heterogeneous networks,
from transductive to inductive, from unsupervised to
supervised, and from stochastic to online. The new methods to be developed
will enjoy faster rates of convergence in optimization, lower computational
complexities, and statistical learning guarantees. To evaluate the proposed
algorithms, both theoretical analysis and extensive experimental evaluations on
real-world massive network datasets will be performed. The targeted
applications include, but are not limited to, semantic search and information retrieval
in social / information network analysis, expert finding in bibliographic
databases, and recommendation systems. The
progress of the project and the research results are also disseminated via the
project Web site (http://hanj.cs.illinois.edu/projs/embedding.htm).
Intellectual Merit:
The proposed research bridges the gap between the empirical success of network
embedding and existing statistical learning and optimization theories. The
core of this proposed research is the integration of modern network mining
techniques with sophisticated statistical learning and optimization tools,
which lays a foundation to design a new generation of network embedding
algorithms with strong theoretical guarantees, and to derive new theories for
various setups of network embedding. Extensive empirical evaluations ensure the
proposed algorithms' applicability in various application domains. The proposed
research is expected to advance the frontier of network embedding and enable it
to tame modern massive networks in the wild.
Broader Impacts:
The results
of this research have the potential to impact the machine learning, data
mining, information retrieval and many other communities. The proposal also has
the potential to reshape the way one approaches problems relating to graph
mining and network analysis, and their roles in a wide range of applications
with massive networks. Our education plan includes developing open course materials
that integrate information network analysis and machine learning and providing
research-based training opportunities for both undergraduate and graduate
students in engineering, arts, and sciences. The PIs will actively involve
underrepresented groups in research projects and train a new
generation of data scientists. This project also supports outreach activities
for K-12 students, to stimulate their interest and make the proposed research
accessible to a broader audience. This project will produce open-source
software tools; the PIs have a strong track record of developing and
supporting widely used tools.
Publications and Products: (Note: major publications
closely related to this project are in bold font)
Note: Please search and download all the papers in
PDF, if available, at our group’s publication website by following the link: Selected
research publications.
Books
Journal articles
·
Jingbo
Shang, Jialu Liu, Meng Jiang, Xiang Ren, Clare R
Voss, Jiawei Han, "Automated Phrase Mining from Massive Text Corpora",
IEEE Transactions on Knowledge and Data Engineering, 30(10):1825-1837 (2018)
·
Jingbo
Shang, Meng Jiang, Wenzhu Tong, Jinfeng
Xiao, Jian Peng, Jiawei Han. "DPPred: An Effective
Prediction Framework with Concise Discriminative Patterns",
IEEE Transactions on Knowledge and Data Engineering, 30(7): 1226-1239
(2018)
·
Chenguang
Wang, Yangqiu Song, Haoran
Li, Ming Zhang, Jiawei Han, "Unsupervised meta-path selection for text similarity
measure based on heterogeneous information networks", Data
Mining and Knowledge Discovery, 32(6): 1735-1767 (2018)
·
Chao
Zhang, Dongming Lei, Quan Yuan, Honglei
Zhuang, Lance M. Kaplan, Shaowen Wang, Jiawei Han,
"GeoBurst+: Effective
and Real-Time Local Event Detection in
Geo-Tagged Tweet Streams", ACM Transactions on
Intelligent Systems and Technology (TIST) 9(3): 34:1-24 (2018)
·
Wei
Shen, Jiawei Han, Jianyong Wang, Xiaojie
Yuan, Zhenglu Yang, "SHINE+: A General Framework for Domain-Specific Entity Linking with
Heterogeneous Information Networks", IEEE Transactions on
Knowledge and Data Engineering, 30(2): 353-366 (2018)
Refereed Conference Publications
·
Ahmed El-Kishky, Frank Xu, Aston Zhang, and Jiawei Han,
"Parsimonious Morpheme Segmentation with an
Application to Enriching Word Embeddings", in Proc. 2019 IEEE Int. Conf. on Big Data
(IEEE BigData'19), Los Angeles, CA, Dec. 2019
·
Hyungsul Kim, Ahmed El-Kishky, Xiang Ren, and
Jiawei Han, "Mining News Events from Comparable
News Corpora: A Multi-Attribute Proximity Network Modeling Approach", in Proc. 2019 IEEE Int. Conf. on
Big Data (IEEE BigData'19), Los Angeles, CA, Dec. 2019
·
Xuan Wang, Yu Zhang, Qi
Li, Jiawei Han, "Taming Unstructured Big Data: Automated
Information Extraction from Massive Text" (Conference tutorial), 2019
IEEE Int. Conf. on Big Data (IEEE BigData'19), Los Angeles, CA, Dec. 2019
·
Yu Meng, Jiaxin Huang, Guangyuan Wang,
Chao Zhang, Honglei Zhuang, Lance Kaplan and Jiawei
Han, "Spherical Text Embedding", in Proc. 2019 Conf. on
Neural Information Processing Systems (NeurIPS’19), Vancouver, Canada, Dec.
2019
·
Carl Yang, Peiye Zhuang, Wenhan Shi, Alan Luu and Pan Li, "Conditional Structure Generation
through Graph Variational Generative Adversarial Nets", in Proc. 2019 Conf. on
Neural Information Processing Systems (NeurIPS’19), Vancouver, Canada,
Dec. 2019
·
Xuan Wang, Yu Zhang,
Qi Li, Xiang Ren, Jingbo Shang, and Jiawei Han,
"Distantly Supervised Biomedical Named Entity Recognition with Dictionary
Expansion", in Proc. 2019 IEEE Int. Conf. on Bioinformatics and
Biomedicine (IEEE-BIBM'19), San Diego, CA, Nov. 2019
·
Yuning Mao, Jingjing Tian, Jiawei Han and
Xiang Ren, “Hierarchical Text Classification with Reinforced Label
Assignment”, in Proc. 2019 Conf. on Empirical Methods in Natural Language
Processing and Int. Joint Conf. on Natural Language Processing (EMNLP-IJCNLP'19),
Hong Kong, China, Nov. 2019
·
Zihan Wang, Jingbo Shang, Liyuan Liu, Lihao
Lu, Jiacheng Liu and Jiawei Han, “CrossWeigh: Training Named Entity Tagger from Imperfect
Annotations”, in Proc. 2019 Conf. on Empirical Methods in Natural Language
Processing and Int. Joint Conf. on Natural Language Processing (EMNLP-IJCNLP'19),
Hong Kong, China, Nov. 2019
·
Carl Yang, Mengxiong Liu, Frank He, Jian Peng, Jiawei Han, “cube2net: Efficient Quality Network
Construction with Data Cube Organization”, in Proc. of 2019 IEEE Int. Conf. on
Data Mining: PhD Forum, Beijing, Nov. 2019
·
Carl Yang, Jieyu Zhang, and Jiawei Han, “Neural Embedding Propagation on Heterogeneous Networks”, in Proc. of 2019 IEEE Int. Conf. on
Data Mining (ICDM’19), Beijing, Nov. 2019
·
Yu Zhang, Frank F. Xu,
Sha Li, Yu Meng, Xuan Wang, Qi Li, and Jiawei Han, “HiGitClass: Keyword-Driven Hierarchical Classification of
GitHub Repositories”, in Proc. of 2019 IEEE Int. Conf. on Data Mining (ICDM’19),
Beijing, Nov. 2019
·
Jiawei Han, “From Unstructured Text to TextCube:
Automated Construction and Multidimensional Exploration” (keynote speech), in Proc. 2019 ACM Int. Conf. on Information
and Knowledge Management (CIKM’19), Beijing, China, Nov. 2019
·
Chanyoung Park, Donghyun Kim, Qi Zhu, Jiawei
Han and Hwanjo Yu, “Task-Guided Pair Embedding in Heterogeneous Network”, in Proc. 2019 ACM Int. Conf. on Information
and Knowledge Management (CIKM’19), Beijing, China, Nov. 2019
·
Yu Shi, Jiaming Shen,
Yuchen Li, Naijing Zhang, Xinwei
He, Zhengzhi Lou, Qi Zhu, Matthew Walker, Myunghwan Kim and Jiawei Han, “Discovering Hypernymy in Text-Rich Heterogeneous Information
Network by Exploiting Context Granularity”, in Proc. 2019 ACM Int. Conf. on
Information and Knowledge Management (CIKM’19), Beijing, China, Nov. 2019
·
Carl Yang, Lingrui Gan, Zongyi Wang, Jiaming
Shen, Jinfeng Xiao and Jiawei Han, “Query-Specific Knowledge Summarization with
Entity Evolutionary Networks”, in Proc. 2019 ACM Int. Conf. on
Information and Knowledge Management (CIKM’19), Beijing, China, Nov. 2019
·
Yu Shi, Xinwei He, Naijing Zhang, Carl
Yang, and Jiawei Han, "User-Guided Clustering in Heterogeneous
Information Networks via Motif-Based Comprehensive Transcription", in Proc. 2019 European Conf. on
Machine Learning and Principles and Practice of Knowledge Discovery in
Databases (ECMLPKDD'19), Würzburg, Germany, Sept. 2019
·
Yu Meng, Jiaxin Huang, Jingbo Shang, and
Jiawei Han, “TextCube: Automated Construction and Multidimensional
Exploration”, Conference tutorial at 2019 Int. Conf. on Very Large Data
Bases (VLDB’19), Los Angeles, CA, Aug. 2019
·
Yu Meng, Jiaxin Huang, Zihan Wang, Chenyu Fan, Guangyuan Wang, Chao
Zhang, Jingbo Shang, Lance Kaplan, Jiawei Han, "TopicMine: User-Guided Topic Mining by
Category-Oriented Embedding", in Proc. of 2019 ACM SIGKDD Int. Conf. on Knowledge
Discovery and Data Mining (KDD'19), (demo paper), Anchorage, AK, August 2019
·
Carl Yang, Dai Teng, Siyang Liu, Sayantani Basu, Jieyu Zhang, Jingbo Shang, Chao Zhang, Jiaming Shen, Lance Kaplan,
Timothy Hanratty, Jiawei Han, "CubeNet: Multi-Facet Hierarchical Heterogeneous
Network Construction, Analysis, and Mining", in Proc. of 2019 ACM SIGKDD Int. Conf.
on Knowledge Discovery and Data Mining (KDD'19), (demo paper), Anchorage, AK,
August 2019
·
Ahmed El-Kishky, Xingyu Fu, Aseel Addawood, Nahil Sobh, Clare Voss and Jiawei Han, "Constrained Sequence-to-sequence Semitic Root
Extraction for Enriching Word Embeddings", in Proc. of the 4th Arabic
Natural Language Processing Workshop (WANLP 2019),
co-located with ACL 2019, Florence, Italy, July 2019
·
Jingbo Shang, Jiaming Shen, Liyuan Liu, and Jiawei Han, "Constructing and Mining Heterogeneous Information Networks from
Massive Text", Conference tutorial at 2019 ACM SIGKDD Int. Conf.
on Knowledge Discovery and Data Mining (KDD'19), Anchorage, AK, Aug. 2019
·
Liyuan Liu, Jingbo Shang, and Jiawei Han, "Arabic Named Entity Recognition: What Works and What's Next", in Proc. of the 4th Arabic Natural
Language Processing Workshop (WANLP 2019), co-located
with ACL 2019, Florence, Italy, July 2019
·
Diya Li, Lifu Huang, Heng Ji, Jiawei Han, "Biomedical Event Extraction based on Knowledge-driven Tree-LSTM", in Proc. 2019 Annual Conf. of the
North American Chapter of the Association for Computational Linguistics (NAACL-HLT'19), Minneapolis,
MN, June 2019
·
Shuochao Yao, Ailing Piao, Wenjun Jiang, Yiran
Zhao, Huajie Shao, Shengzhong
Liu, Dongxin Liu, Jinyang
Li, Tianshi Wang, Shaohan
Hu, Lu Su, Jiawei Han and Tarek Abdelzaher,
“STFNets: Learning Sensing Signals from the
Time-Frequency Perspective with Short-Time Fourier Neural Networks”, in Proc. the Web Conf. 2019 (WWW’19), San
Francisco, CA, May 2019
·
Carl Yang, Huy Hoang Do, Tomas Mikolov and
Jiawei Han, “Place Deduplication with Embeddings”, in Proc. the Web Conf. 2019
(WWW’19), San Francisco, CA, May 2019
·
Honglei Zhuang, Timothy Hanratty, and Jiawei Han, "Aspect-Based Sentiment Analysis with Minimal
Guidance", in Proc. 2019 SIAM Int. Conf. on Data Mining (SDM'19),
Calgary, Alberta, Canada, May 2019
·
Sha Li, Chao Zhang, Dongming Lei, Ji Li, Jiawei Han, "GeoAttn: Fine-Grained Localization of Social Media
Messages via Attentional Memory Network", in Proc. 2019 SIAM Int. Conf. on Data
Mining (SDM'19), Calgary, Alberta, Canada, May 2019
·
Qi Zhu, Xiang Ren, Jingbo Shang, Yu Zhang, Ahmed El-Kishky
and Jiawei Han, "Integrating Local Context and Global Cohesiveness for Open
Information Extraction", in Proc. 2019 ACM Int. Conf. on Web Search and Data
Mining (WSDM'19), Melbourne, Australia, Feb. 2019
·
Jiaming Shen, Ruiliang Lyu, Xiang Ren, Michelle
Vanni, Brian Sadler, Jiawei Han, “Mining Entity Synonyms with Efficient Neural Set Generation”, in Proc. 2019 AAAI Conf. on Artificial
Intelligence (AAAI-19), Honolulu, Hawaii, Jan. 2019
·
Yu Meng, Jiaming Shen,
Chao Zhang and Jiawei Han, “Weakly-Supervised Hierarchical Text Classification”, in Proc. 2019 AAAI Conf. on Artificial
Intelligence (AAAI-19), Honolulu, Hawaii, Jan. 2019
·
Xuan Wang, Yu Zhang,
Xiang Ren, Yuhao Zhang, Marinka
Zitnik, Jingbo Shang,
Curtis Langlotz, and Jiawei Han, "Cross-type Biomedical Named Entity
Recognition with Deep Multi-Task Learning", Bioinformatics 35(10): 1745-1752 (2019)
·
Xuan
Wang, Yu Zhang, Qi Li, Cathy Wu, and Jiawei Han, "PENNER:
Pattern-enhanced Nested Named Entity Recognition in Biomedical Literature", Proc. 2018 Int. Conf. on
Bioinformatics and Biomedicine (BIBM'18), Madrid, Spain, Dec. 2018
·
Qi
Li, Xuan Wang, Yu Zhang, Fei Ling, Cathy Wu, and Jiawei Han, "Pattern
Discovery for Wide-Window Open Information Extraction in Biomedical Literature", Proc. 2018 Int. Conf. on
Bioinformatics and Biomedicine (BIBM'18), Madrid, Spain, Dec. 2018
·
Shi Zhi, Fan Yang, Zheyi Zhu, Qi Li, Zhaoran Wang, and Jiawei Han, "Dynamic
Truth Discovery on Numerical Data", in Proc. of 2018 IEEE
Int. Conf. on Data Mining (ICDM'18), Singapore, Nov. 2018
·
Carl
Yang, Yichen Feng, Pan Li, Yu Shi, and Jiawei Han,
"Meta-Graph
Based HIN Spectral Embedding: Methods, Analyses, and Insights", in Proc. of 2018 IEEE Int.
Conf. on Data Mining (ICDM'18), Singapore, Nov. 2018
·
Fangbo
Tao, Chao Zhang, Xiusi Chen, Meng Jiang, Tim
Hanratty, Lance Kaplan, and Jiawei Han, "Doc2Cube:
Automated Document Allocation to Text Cube via Dimension-Aware Joint Embedding", in Proc. of 2018 IEEE Int.
Conf. on Data Mining (ICDM'18), Singapore, Nov. 2018
·
Doris
Xin, Ahmed El-Kishky, De Liao, Brandon Norick, and Jiawei Han, "Active
Learning on Heterogeneous Information Networks: A Multi-armed Bandit Approach", in Proc. of 2018 IEEE
Int. Conf. on Data Mining (ICDM'18), Singapore, Nov. 2018
·
Jingbo
Shang, Liyuan Liu, Xiaotao Gu, Xiang Ren, Teng Ren
and Jiawei Han, "Learning Named Entity Tagger using
Domain-Specific Dictionary",
in Proc. of 2018 Conf. on Empirical Methods in Natural Language Processing
(EMNLP'18), Brussels, Belgium, Oct. 2018
·
Liyuan
Liu, Xiang Ren, Jingbo Shang, Xiaotao
Gu, Jian Peng and Jiawei Han, "Efficient
Contextualized Representation: Language Model Pruning for Sequence Labeling", in Proc. of 2018 Conf. on
Empirical Methods in Natural Language Processing (EMNLP'18), Brussels, Belgium,
Oct. 2018
·
Quan
Yuan, Xiang Ren, Wenqi He, Chao Zhang, Xinhe Geng, Lifu
Huang, Heng Ji, Chin-Yew Lin and Jiawei Han, "Open-Schema
Event Profiling for Massive News Corpora", in Proc. of 2018 ACM
Int. Conf. on Information and Knowledge Management (CIKM'18), Turin, Italy, Oct. 2018
·
Yu
Meng, Jiaming Shen, Chao Zhang and Jiawei Han, "Weakly-Supervised
Neural Text Classification", in
Proc. of 2018 ACM Int. Conf. on Information and Knowledge Management
(CIKM'18), Turin,
Italy,
Oct. 2018
·
Jingbo
Shang, Jiaming Shen, Tianhang Sun, Xingbang Liu, Anja Gruenheid,
Flip Korn, Adam Lelkes, Cong Yu and Jiawei Han,
"Investigating Rumor News Using
Agreement-Aware Search", in
Proc. of 2018 ACM Int. Conf. on Information and Knowledge Management (CIKM'18), Turin, Italy, Oct. 2018
·
Carl
Yang, Mengxiong Liu, Frank He, Xikun
Zhang, Jian Peng, and Jiawei Han, "Similarity Modeling on Heterogeneous
Networks via Automatic Path Discovery", in Proc. of 2018 European
Conf. on Machine Learning and Principles and Practice of Knowledge Discovery in
Databases (ECMLPKDD'18), Dublin, Ireland, Sept. 2018
·
Carl
Yang, Mengxiong Liu, Vincent W. Zheng and Jiawei Han,
"Node, Motif and Subgraph: Leveraging
Network Functional Blocks Through Structural Convolution", in Proc. of 2018 IEEE/ACM
Int. Conf. on Social Networks Analysis and Mining (ASONAM'18), Barcelona,
Spain, Aug. 2018
·
Xuan
Wang, Yu Zhang, Qi Li, Yinyin Chen and Jiawei Han,
"Open
Information Extraction with Meta-pattern Discovery in Biomedical Literature", in Proc. of 2018 ACM Conf. on
Bioinformatics, Computational Biology, and Health Informatics (ACM-BCB'18),
Washington, DC, August 2018
·
Jingbo
Shang, Chao Zhang, Jiaming Shen, Jiawei Han, "Towards
Multidimensional Analysis of Text Corpora", Proc. of 2018 ACM SIGKDD Int.
Conf. on Knowledge Discovery and Data Mining (KDD'18), (Conference Tutorial),
London, UK, Aug. 2018
·
Jingbo
Shang, Qi Zhu, Jiaming Shen, Xuan Wang, Xiaotao Gu,
Lance Kaplan, Timothy Hanratty and Jiawei Han, "AutoNet: Automated Network Construction and
Exploration System from Domain-Specific Corpora", in Proc. of 2018 ACM SIGKDD Int.
Conf. on Knowledge Discovery and Data Mining (KDD'18), (demo paper), London,
UK, August 2018
·
Jiaming
Shen, Jinfeng Xiao, Yu Zhang, Carl Yang, Jingbo Shang, Jinda Han, Saurabh
Sinha, Peipei Ping, Richard Weinshilboum,
Zhiyong Lu and Jiawei Han, "SetSearch+: Entity-Set-Aware Search and Mining
for Scientific Literature",
in Proc. of 2018 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining
(KDD'18), (demo paper), London, UK, August 2018
·
Hanwen Zha, Jiaming Shen, Keqian Li,
Warren Greiff, Michelle Vanni,
Jiawei Han and Xifeng Yan, "FTS:
Faceted Taxonomy Construction and Search for Scientific Publications", in Proc. of 2018 ACM SIGKDD
Int. Conf. on Knowledge Discovery and Data Mining (KDD'18), (demo paper),
London, UK, August 2018
·
Carl
Yang, Xiaolin Shi, Jie Luo
and Jiawei Han, "I
Know You’ll Be Back: Interpretable New User Clustering and Churn Prediction on
a Mobile Social Application",
in Proc. of 2018 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining
(KDD'18), London, UK, August 2018
·
Chao
Zhang, Fangbo Tao, Xiusi
Chen, Jiaming Shen, Meng Jiang, Brian Sadler, Michelle Vanni
and Jiawei Han, "TaxoGen: Constructing Topical Concept
Taxonomy by Adaptive Term Embedding and Clustering", in Proc. of 2018 ACM SIGKDD
Int. Conf. on Knowledge Discovery and Data Mining (KDD'18), London, UK, August
2018
·
Qi
Li, Meng Jiang, Xikun Zhang, Meng Qu, Timothy
Hanratty, Jing Gao and Jiawei Han, "TruePIE: Discovering Reliable Patterns in
Pattern-Based Information Extraction", in Proc. of 2018 ACM SIGKDD Int.
Conf. on Knowledge Discovery and Data Mining (KDD'18), London, UK, August 2018
·
Jiaming
Shen, Zeqiu Wu, Dongming
Lei, Chao Zhang, Xiang Ren, Michelle T. Vanni, Brian
M. Sadler and Jiawei Han, "HiExpan: Task-Guided Taxonomy Construction
by Hierarchical Tree Expansion", in Proc. of 2018 ACM SIGKDD
Int. Conf. on Knowledge Discovery and Data Mining (KDD'18), London, UK, August
2018
·
Yu Shi,
Qi Zhu, Fang Guo, Chao Zhang and Jiawei Han, "Easing
Embedding Learning by Comprehensive Transcription of Heterogeneous Information
Networks",
in Proc. of 2018 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining
(KDD'18), London, UK, August 2018
·
Yuchen
Li, Zhengzhi Lou, Yu Shi and Jiawei Han,
"Temporal Motifs in Heterogeneous Information Networks", in Proc.
of 2018 Int. Workshop on Mining and Learning with Graphs (MLG'18),
co-located with KDD'18, London, UK, August 2018
·
Yuning
Mao, Xiang Ren, Jiaming Shen, Xiaotao Gu and Jiawei
Han, "End-to-End
Reinforcement Learning for Automatic Taxonomy Induction", in Proc. of 2018 Annual
Meeting of the Association for Computational Linguistics (ACL'18), Melbourne,
Australia, July 2018
·
Jiaming
Shen, Jinfeng Xiao, Xinwei
He, Jingbo Shang, Saurabh Sinha and Jiawei Han,
"Entity Set Search of Scientific
Literature: An Unsupervised Ranking Approach", in Proc. of 2018 Int.
ACM SIGIR Conf. on Research and Development in Information Retrieval
(SIGIR'18), Ann Arbor, MI, July 2018
·
Ahmed
El-Kishky, Frank Xu, Aston Zhang, Stephen Macke and
Jiawei Han, "Entropy-Based Subword
Mining for Word Embeddings",
in Proc. of the 2nd Workshop on Subword and Character
Level Models in NLP (SCLeM'18) (at NAACL 2018), New Orleans, LA, June
2018
·
Yu
Shi, Huan Gui, Qi Zhu, Lance Kaplan, Jiawei
Han, “AspEm: Embedding Learning by Aspects in
Heterogeneous Information Networks,” Proc. of 2018 SIAM Int. Conf. on
Data Mining (SDM’18), San Diego, CA, May 2018
·
Meng
Qu, Xiang Ren, Yu Zhang, and Jiawei Han, “Weakly-supervised
Relation Extraction by Pattern-enhanced Embedding Learning”, Proc. of 2018 Int. Conf. on World-Wide
Web (WWW’18), Lyon, France, Apr. 2018
·
Qi
Zhu, Xiang Ren, Jingbo Shang, Yu Zhang, Frank F. Xu
and Jiawei Han, "Open
Information Extraction with Global Structure Constraints", (poster paper), Proc. of 2018 Int.
Conf. on World-Wide Web (WWW’18), Lyon, France, Apr. 2018 (received WWW'18 best poster award
honorable mention)
·
Carl
Yang, Chao Zhang, Jiawei Han, Xuewen
Chen, Jieping Ye, "Did
You Enjoy the Ride: Understanding Passenger Experience via Heterogeneous
Network Embedding",
Proc. of 2018 IEEE Int. Conf. on Data Engineering (ICDE'18), Paris,
France, April 2018
·
Liyuan
Liu, Jingbo Shang, Frank Xu, Xiang Ren, Huan Gui, Jian Peng and Jiawei Han, "Empower
Sequence Labeling with Task-Aware Neural Language Model", in Proc. of
2018 AAAI Conf. on Artificial Intelligence (AAAI'18), New
Orleans, LA, Feb. 2018
·
Chao
Zhang, Mengxiong Liu, Zhengchao
Liu, Carl Yang, Luming Zhang, Jiawei Han, "Spatiotemporal Activity Modeling
Under Data Scarcity: A Graph-Regularized Cross-Modal Embedding Approach", in Proc. of
2018 AAAI Conf. on Artificial Intelligence (AAAI'18), New
Orleans, LA, Feb. 2018
·
Wanzheng
Zhu, Chao Zhang, Shuochao Yao, Xiaobin
Gao, Jiawei Han, "A
Spherical Hidden Markov Model for Semantics-Rich Human Mobility Modeling", in Proc. of
2018 AAAI Conf. on Artificial Intelligence (AAAI'18), New
Orleans, LA, Feb. 2018
·
Zeqiu Wu,
Xiang Ren, Frank F. Xu, Ji Li and Jiawei Han, "Indirect
Supervision for Relation Extraction using Question-Answer Pairs", in Proc. of 2018 ACM Int.
Conf. on Web Search and Data Mining (WSDM'18), Los Angeles, CA, Feb. 2018
·
Meng
Qu, Jian Tang, and Jiawei Han, "Curriculum
Learning for Heterogeneous Star Network Embedding via Deep Reinforcement
Learning", in Proc. of 2018 ACM Int.
Conf. on Web Search and Data Mining (WSDM'18), Los Angeles, CA, Feb. 2018
Ph.D. Dissertations
·
Xiang Ren, Ph.D.,
January 2018, thesis title: “Mining Entity and
Relation Structures from Text: An Effort-Light Approach”, Ph.D. Thesis won 2018 ACM SIGKDD Doctoral
Dissertation Award
·
Chao Zhang, Ph.D.,
Nov. 2018, thesis title: “Multi-dimensional
Mining of Unstructured Data with Limited Supervision”, Ph.D. Thesis won 2019 ACM SIGKDD
Doctoral Dissertation Award Runner-Up
·
Yu Shi, Ph.D., March
2019, thesis title: “Harnessing
Heterogeneous Association in Real-World Networks”
·
Honglei Zhuang, Ph.D., March 2019, thesis title: “Text Mining with Word Embedding for Outlier and
Sentiment Analysis”
·
Jingbo Shang, Ph.D., Nov. 2019, thesis title: “Constructing and Mining Structured
Heterogeneous Information Networks from Massive Text Corpora”
Project Impact
§ Education: Parts of the new
research results are used in Data Mining courses (CS412, CS512, CS412 MCS-DS
online Coursera courses) for both undergraduate and graduate students
in the Department of Computer Science at the University of Illinois
at Urbana-Champaign. The research results have been, and will
continue to be, published in a timely manner in international conferences and journals and
distributed worldwide for education and research. Most of the
software developed in this project has been released as open source on GitHub. The new progress will also be integrated into the
new edition of our data mining textbook and other research collections.
§ Collaborations: For this project
we have established collaborations with ARL, BBN, Adobe, IAI, MITRE, Microsoft
Research, Mayo Clinic, UCLA Medical School, LinkedIn, Facebook, and other
industry and research centers. Through such collaborations we expect to
explore many real-world applications and achieve greater research impact.
Current and Future Activities
The following are some of the highlights of our
ongoing work. Please refer to the
Publications and Products section for related references.
1. Study effective and scalable methods for embedding
and mining heterogeneous information networks
2. Study effective and scalable methods for embedding
and text mining for the construction of heterogeneous information networks from
unstructured data
3. Study effective and scalable methods for embedding
and mining for the construction of multidimensional text-cubes and cube networks to
support new applications
Area Background
This project builds on
previous research on data mining, text mining, embedding
in networks, and data cube and multidimensional analysis.
Many research papers have been published on these themes.
Several textbooks on data mining, text mining, information retrieval and
information network analysis provide good overviews of the principles and
algorithms.
Area References
Potential Related Projects
Project Web site URL: http://www.cs.uiuc.edu/~hanj/projs/embedding.htm
Online software: Online software can be downloaded at http://illimine.cs.uiuc.edu,
and an online system demo is at http://dm.cs.uiuc.edu/movemine
Online resources: Research publications related to this project can be downloaded at Selected
Publications