Publications by category, in reverse chronological order.
I have published 30+ papers in prestigious journals and conferences, including data mining (DM) venues (e.g., KDD*3, WWW*1, ICDM*5, TKDE*2, KAIS*2) and AI venues (e.g., AAAI*3). Two of these papers were best paper runner-ups, at SIGSPATIAL’20 and ICDM’21 respectively. The representative papers are categorized as follows:
@inproceedings{Ying2024,author={Ying, Wangyang and Wang, Dongjie and Hu, Xuanming and Zhou, Yuanchun and Aggarwal, Charu C. and Fu, Yanjie},title={Unsupervised Generative Feature Transformation via Graph Contrastive Pre-training and Multi-objective Fine-tuning},year={2024},booktitle={Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},pages={},numpages={},location={Barcelona, Spain},series={KDD '24},}
@article{10108019,author={Xiao, Meng and Wang, Dongjie and Fu, Yanjie and Liu, Kunpeng and Wu, Min and Xiong, Hui and Zhou, Yuanchun},journal={ACM Transactions on Knowledge Discovery from Data},title={Traceable Group-Wise Self-Optimizing Feature Transformation Learning: A Dual Optimization Perspective},year={2024},volume={},number={},pages={},}
2023
NeurIPS’23
Reinforcement-enhanced autoregressive feature transformation: Gradient-steered search in continuous space for postfix expressions
Dongjie Wang, Meng Xiao, Min Wu, Yuanchun Zhou, and Yanjie Fu
Advances in Neural Information Processing Systems, 2023
@article{wang2024reinforcement,title={Reinforcement-enhanced autoregressive feature transformation: Gradient-steered search in continuous space for postfix expressions},author={Wang, Dongjie and Xiao, Meng and Wu, Min and Zhou, Yuanchun and Fu, Yanjie},journal={Advances in Neural Information Processing Systems},volume={36},year={2023},}
KDD’23
Interdependent Causal Networks for Root Cause Localization
The goal of root cause analysis is to identify the underlying causes of system problems by discovering and analyzing the causal structure from system monitoring data. It is indispensable for maintaining the stability and robustness of large-scale complex systems. Existing methods mainly focus on the construction of a single effective isolated causal network, whereas many real-world systems are complex and exhibit interdependent structures (i.e., multiple networks of a system are interconnected by cross-network links). In interdependent networks, the malfunctioning effects of problematic system entities can propagate to other networks or different levels of system entities. Consequently, ignoring the interdependency results in suboptimal root cause analysis outcomes. In this paper, we propose REASON, a novel framework that enables the automatic discovery of both intra-level (i.e., within-network) and inter-level (i.e., across-network) causal relationships for root cause localization. REASON consists of Topological Causal Discovery (TCD) and Individual Causal Discovery (ICD). The TCD component aims to model the fault propagation in order to trace back to the root causes. To achieve this, we propose novel hierarchical graph neural networks to construct interdependent causal networks by modeling both intra-level and inter-level non-linear causal relations. Based on the learned interdependent causal networks, we then leverage random walk with restarts to model the network propagation of a system fault. The ICD component focuses on capturing abrupt change patterns of a single system entity. This component examines the temporal patterns of each entity’s metric data (i.e., time series), and estimates its likelihood of being a root cause based on the Extreme Value theory. Combining the topological and individual causal scores, the top K system entities are identified as root causes.
Extensive experiments on three real-world datasets validate the effectiveness of the proposed framework.
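The abstract above mentions a random walk with restarts over the learned causal networks. As a rough illustration only, the following sketch runs a generic random walk with restart on a small hand-made graph. It is not the REASON implementation: the function name, the toy adjacency matrix, the restart probability, and the choice to send dead-end nodes back to the start are all assumptions made for this example.

```python
def random_walk_with_restart(adj, start, restart=0.3, iters=100):
    """Return visiting scores of a walk that keeps restarting at `start`.

    adj[i][j] is a non-negative weight for stepping from node i to node j.
    All parameter names and defaults are illustrative assumptions.
    """
    n = len(adj)
    # Row-normalise weights into transition probabilities.
    trans = []
    for row in adj:
        total = sum(row)
        if total > 0:
            trans.append([w / total for w in row])
        else:
            # Assumption: a node with no outgoing edge jumps back to the start.
            trans.append([1.0 if j == start else 0.0 for j in range(n)])

    scores = [0.0] * n
    scores[start] = 1.0
    for _ in range(iters):
        nxt = [0.0] * n
        for i in range(n):
            for j in range(n):
                nxt[j] += (1.0 - restart) * scores[i] * trans[i][j]
        nxt[start] += restart  # restart mass always returns to the start node
        scores = nxt
    return scores


# Hypothetical toy graph: node 0 is the observed symptom, edges point
# from an effect towards its candidate causes.
toy_adj = [
    [0.0, 0.9, 0.1, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
scores = random_walk_with_restart(toy_adj, start=0)
# Rank candidate nodes (everything except the symptom) by score.
ranked = sorted(range(1, len(scores)), key=lambda i: scores[i], reverse=True)
print(ranked)  # [1, 3, 2]
```

In this toy setting, nodes that are reached more often from the symptom node receive higher scores, which is the intuition behind using the walk to rank candidate root causes.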
@inproceedings{10.1145/3580305.3599849,author={Wang, Dongjie and Chen, Zhengzhang and Ni, Jingchao and Tong, Liang and Wang, Zheng and Fu, Yanjie and Chen, Haifeng},title={Interdependent Causal Networks for Root Cause Localization},year={2023},isbn={9798400701030},publisher={Association for Computing Machinery},address={New York, NY, USA},doi={10.1145/3580305.3599849},booktitle={Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},pages={5051--5060},numpages={10},keywords={interdependent networks, network propagation, graph neural networks, causal structure learning, root cause analysis},location={Long Beach, CA, USA},series={KDD '23},}
KDD’23
Incremental Causal Graph Learning for Online Root Cause Analysis
The task of root cause analysis (RCA) is to identify the root causes of system faults/failures by analyzing system monitoring data. Efficient RCA can greatly accelerate system failure recovery and mitigate system damages or financial losses. However, previous research has mostly focused on developing offline RCA algorithms, which often require manually initiating the RCA process, need a significant amount of time and data to train a robust model, and must then be retrained from scratch for each new system fault. In this paper, we propose CORAL, a novel online RCA framework that can automatically trigger the RCA process and incrementally update the RCA model. CORAL consists of Trigger Point Detection, Incremental Disentangled Causal Graph Learning, and Network Propagation-based Root Cause Localization. The Trigger Point Detection component aims to detect system state transitions automatically and in near-real-time. To achieve this, we develop an online trigger point detection approach based on multivariate singular spectrum analysis and cumulative sum statistics. To efficiently update the RCA model, we propose an incremental disentangled causal graph learning approach to decouple the state-invariant and state-dependent information. After that, CORAL applies a random walk with restarts to the updated causal graph to accurately identify root causes. The online RCA process terminates when the causal graph and the generated root cause list converge. Extensive experiments on three real-world datasets demonstrate the effectiveness and superiority of the proposed framework.
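To give a feel for the cumulative sum statistics mentioned in the abstract above, here is a minimal sketch of a plain one-sided CUSUM detector on a single metric. It is a deliberate simplification and not CORAL's trigger point detection, which the abstract describes as combining multivariate singular spectrum analysis with cumulative sum statistics. The function name, the `drift` and `threshold` parameters, and the sample data are assumptions for illustration.

```python
def cusum_trigger(series, baseline_mean, drift=0.5, threshold=4.0):
    """Return the first index where the upward CUSUM statistic exceeds
    `threshold`, or None if it never does.

    `drift` is the slack subtracted at every step so that small
    fluctuations around the baseline do not accumulate.
    """
    stat = 0.0
    for t, x in enumerate(series):
        # Accumulate positive deviations from the baseline, floored at zero.
        stat = max(0.0, stat + (x - baseline_mean - drift))
        if stat > threshold:
            return t
    return None


# Hypothetical metric: stable around 0, then shifts upward at index 4.
metric = [0.1, -0.2, 0.0, 0.3, 3.0, 3.2, 2.9, 3.1]
print(cusum_trigger(metric, baseline_mean=0.0))  # 5
```

The detector fires one step after the shift begins because the statistic has to accumulate past the threshold; a lower threshold reacts sooner at the cost of more false alarms.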
@inproceedings{10.1145/3580305.3599392,author={Wang, Dongjie and Chen, Zhengzhang and Fu, Yanjie and Liu, Yanchi and Chen, Haifeng},title={Incremental Causal Graph Learning for Online Root Cause Analysis},year={2023},isbn={9798400701030},publisher={Association for Computing Machinery},address={New York, NY, USA},doi={10.1145/3580305.3599392},booktitle={Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},pages={2269--2278},numpages={10},keywords={causal structure learning, incremental learning, trigger point detection, disentangled graph learning, root cause analysis},location={Long Beach, CA, USA},series={KDD '23},}
AAAI’23
Human-instructed Deep Hierarchical Generative Learning for Automated Urban Planning
Dongjie Wang, Lingfei Wu, Denghui Zhang, Jingbo Zhou, Leilei Sun, and Yanjie Fu
In Proceedings of the AAAI Conference on Artificial Intelligence, 2023
@inproceedings{wang2023human,title={Human-instructed Deep Hierarchical Generative Learning for Automated Urban Planning},author={Wang, Dongjie and Wu, Lingfei and Zhang, Denghui and Zhou, Jingbo and Sun, Leilei and Fu, Yanjie},booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},volume={37},number={4},pages={4660--4667},year={2023},}
@article{10108018,author={Wang, Dongjie and Wang, Pengyang and Fu, Yanjie and Liu, Kunpeng and Xiong, Hui and Hughes, Charles E.},journal={IEEE Transactions on Knowledge and Data Engineering},title={Reinforced Imitative Graph Learning for Mobile User Profiling},year={2023},volume={},number={},pages={1--13},}
In this paper, we propose a single-agent Monte Carlo-based reinforced feature selection method, as well as two efficiency improvement strategies, i.e., an early stopping strategy and a reward-level interactive strategy. Feature selection is one of the most important technologies in data preprocessing, aiming to find the optimal feature subset for a given downstream machine learning task. Extensive research has been done to improve its effectiveness and efficiency. Recently, multi-agent reinforced feature selection (MARFS) has achieved great success in improving the performance of feature selection. However, MARFS suffers from a heavy computational cost, which greatly limits its application in real-world scenarios. In this paper, we propose an efficient reinforcement feature selection method, which uses one agent to traverse the whole feature set and decide, one feature at a time, whether to select it. Specifically, we first develop one behavior policy and use it to traverse the feature set and generate training data. Then, we evaluate the target policy based on the training data and improve the target policy by the Bellman equation. Besides, we conduct the importance sampling in an incremental way and propose an early stopping strategy to improve the training efficiency by removing skewed data. In the early stopping strategy, the behavior policy stops traversing with a probability inversely proportional to the importance sampling weight. In addition, we propose a reward-level and training-level interactive strategy to improve the training efficiency via external advice. Moreover, we propose an incremental descriptive statistics method to represent the state with low computational cost. Finally, we design extensive experiments on real-world data to demonstrate the superiority of the proposed method.
@article{10.1007/s10115-022-01812-3,author={Liu, Kunpeng and Wang, Dongjie and Du, Wan and Wu, Dapeng Oliver and Fu, Yanjie},title={Interactive Reinforced Feature Selection with Traverse Strategy},year={2023},issue_date={May 2023},publisher={Springer-Verlag},address={Berlin, Heidelberg},volume={65},number={5},issn={0219-1377},url={https://doi.org/10.1007/s10115-022-01812-3},journal={Knowl. Inf. Syst.},month=jan,pages={1935--1962},numpages={28},keywords={Monte Carlo, Reinforcement learning, Feature selection}}
@article{huang2023imufs,title={IMUFS: Complementary and Consensus Learning-Based Incomplete Multi-View Unsupervised Feature Selection},author={Huang, Yanyong and Shen, Zongxin and Cai, Yuxin and Yi, Xiuwen and Wang, Dongjie and Lv, Fengmao and Li, Tianrui},journal={IEEE Transactions on Knowledge and Data Engineering},year={2023},publisher={IEEE}}
@article{wang2023automated,title={Automated urban planning aware spatial hierarchies and human instructions},author={Wang, Dongjie and Liu, Kunpeng and Huang, Yanyong and Sun, Leilei and Du, Bowen and Fu, Yanjie},journal={Knowledge and Information Systems},volume={65},number={3},pages={1337--1364},issue_date={May},year={2023},publisher={Springer}}
AAAI’23
Dish-TS: A General Paradigm for Alleviating Distribution Shift in Time Series Forecasting
Wei Fan, Pengyang Wang, Dongkun Wang, Dongjie Wang, Yuanchun Zhou, and Yanjie Fu
In Proceedings of the AAAI Conference on Artificial Intelligence, 2023
@inproceedings{fan2023dish,title={Dish-TS: A General Paradigm for Alleviating Distribution Shift in Time Series Forecasting},author={Fan, Wei and Wang, Pengyang and Wang, Dongkun and Wang, Dongjie and Zhou, Yuanchun and Fu, Yanjie},booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},volume={37},number={6},pages={7522--7529},year={2023}}
ICDM’23
Beyond Discrete Selection: Continuous Embedding Space Optimization for Generative Feature Selection
Meng Xiao, Dongjie Wang, Min Wu, Pengfei Wang, Yuanchun Zhou, and Yanjie Fu
In 2023 IEEE International Conference on Data Mining (ICDM), 2023
@inproceedings{xiao2023beyond,title={Beyond Discrete Selection: Continuous Embedding Space Optimization for Generative Feature Selection},author={Xiao, Meng and Wang, Dongjie and Wu, Min and Wang, Pengfei and Zhou, Yuanchun and Fu, Yanjie},booktitle={2023 IEEE international conference on data mining (ICDM)},pages={in preprint},year={2023},organization={IEEE}}
ICDM’23
Self-optimizing Feature Generation via Categorical Hashing Representation and Hierarchical Reinforcement Crossing
Wangyang Ying, Dongjie Wang, Kunpeng Liu, Leilei Sun, and Yanjie Fu
In 2023 IEEE International Conference on Data Mining (ICDM), 2023
@inproceedings{ying2023self-optimizing,title={Self-optimizing Feature Generation via Categorical Hashing Representation and Hierarchical Reinforcement Crossing},author={Ying, Wangyang and Wang, Dongjie and Liu, Kunpeng and Sun, Leilei and Fu, Yanjie},booktitle={2023 IEEE international conference on data mining (ICDM)},pages={in preprint},year={2023},organization={IEEE}}
2022
KDD’22
Group-wise reinforcement feature generation for optimal and explainable representation space reconstruction
Dongjie Wang, Yanjie Fu, Kunpeng Liu, Xiaolin Li, and Yan Solihin
In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2022
@inproceedings{wang2022group,title={Group-wise reinforcement feature generation for optimal and explainable representation space reconstruction},author={Wang, Dongjie and Fu, Yanjie and Liu, Kunpeng and Li, Xiaolin and Solihin, Yan},booktitle={Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},pages={1826--1834},year={2022},}
WWW’22
Multi-level recommendation reasoning over knowledge graphs with reinforcement learning
Xiting Wang, Kunpeng Liu, Dongjie Wang, Le Wu, Yanjie Fu, and Xing Xie
In Proceedings of the ACM Web Conference 2022, 2022
@inproceedings{wang2022multi,title={Multi-level recommendation reasoning over knowledge graphs with reinforcement learning},author={Wang, Xiting and Liu, Kunpeng and Wang, Dongjie and Wu, Le and Fu, Yanjie and Xie, Xing},booktitle={Proceedings of the ACM Web Conference 2022},pages={2098--2108},year={2022}}
2021
AAAI’21
Reinforced imitative graph representation learning for mobile user profiling: An adversarial training perspective
Dongjie Wang, Pengyang Wang, Kunpeng Liu, Yuanchun Zhou, Charles E Hughes, and Yanjie Fu
In Proceedings of the AAAI Conference on Artificial Intelligence, 2021
@inproceedings{wang2021reinforced,title={Reinforced imitative graph representation learning for mobile user profiling: An adversarial training perspective},author={Wang, Dongjie and Wang, Pengyang and Liu, Kunpeng and Zhou, Yuanchun and Hughes, Charles E and Fu, Yanjie},booktitle={Proceedings of the AAAI Conference on Artificial Intelligence},volume={35},number={5},pages={4410--4417},year={2021},}
ICDM’21
Deep human-guided conditional variational generative modeling for automated urban planning
Dongjie Wang, Kunpeng Liu, Pauline Johnson, Leilei Sun, Bowen Du, and Yanjie Fu
In 2021 IEEE International Conference on Data Mining (ICDM), 2021
@inproceedings{wang2021deep,title={Deep human-guided conditional variational generative modeling for automated urban planning},author={Wang, Dongjie and Liu, Kunpeng and Johnson, Pauline and Sun, Leilei and Du, Bowen and Fu, Yanjie},booktitle={2021 IEEE international conference on data mining (ICDM)},pages={679--688},year={2021},organization={IEEE}}
ICDM’21
Efficient reinforced feature selection via early stopping traverse strategy
Kunpeng Liu, Pengfei Wang, Dongjie Wang, Wan Du, Dapeng Oliver Wu, and Yanjie Fu
In 2021 IEEE International Conference on Data Mining (ICDM), 2021
@inproceedings{liu2021efficient,title={Efficient reinforced feature selection via early stopping traverse strategy},author={Liu, Kunpeng and Wang, Pengfei and Wang, Dongjie and Du, Wan and Wu, Dapeng Oliver and Fu, Yanjie},booktitle={2021 IEEE International Conference on Data Mining (ICDM)},pages={399--408},year={2021},organization={IEEE}}
2020
SIGSPATIAL’20
Reimagining city configuration: Automated urban planning via adversarial learning
Dongjie Wang, Yanjie Fu, Pengyang Wang, Bo Huang, and Chang-Tien Lu
In Proceedings of the 28th International Conference on Advances in Geographic Information Systems, 2020
@inproceedings{wang2020reimagining,title={Reimagining city configuration: Automated urban planning via adversarial learning},author={Wang, Dongjie and Fu, Yanjie and Wang, Pengyang and Huang, Bo and Lu, Chang-Tien},booktitle={Proceedings of the 28th international conference on advances in geographic information systems},pages={497--506},year={2020}}
ICDM’20
Defending water treatment networks: Exploiting spatio-temporal effects for cyber attack detection
Dongjie Wang, Pengyang Wang, Jinbo Zhou, Leilei Sun, Bowen Du, and Yanjie Fu
In 2020 IEEE International Conference on Data Mining (ICDM), 2020
@inproceedings{wang2020defending,title={Defending water treatment networks: Exploiting spatio-temporal effects for cyber attack detection},author={Wang, Dongjie and Wang, Pengyang and Zhou, Jinbo and Sun, Leilei and Du, Bowen and Fu, Yanjie},booktitle={2020 IEEE International conference on data mining (ICDM)},pages={32--41},year={2020},organization={IEEE}}