Publications | Dongjie Wang

I have published 30+ papers in prestigious journals and conferences, including data mining (DM) venues (e.g., KDD*3, WWW*1, ICDM*5, TKDE*2, KAIS*2) and AI venues (e.g., AAAI*3). Two of them were best paper runner-ups, at SIGSPATIAL’20 and ICDM’21 respectively. The representative papers can be categorized as follows:

Data-centric AI
- Automated Feature Selection Learning: [SDM’23], [ICDM’21], [KAIS’23], [TKDE’23], [ICDM’23]
- Automated Feature Generation Learning: [KDD’22], [ICDM’23], [NeurIPS’23], [TKDD’24]
Automated Urban Planning
- Generative Adversarial Network based Urban Planner: [SIGSPATIAL’20], [TSAS’23]
- Variational Autoencoder based Urban Planner: [ICDM’21], [KAIS’23]
- Transformer based Urban Planner: [AAAI’23]
- Flow-based Urban Planner: [SDM’24]
Anomaly Detection and Root Cause Analysis
- Anomaly Detection: [ICDM’20]
- Interdependent Causal Graph based RCA: [KDD’23]
- Incremental Causal Update based RCA: [KDD’23]
User Profiling and Recommendation
- User Profiling and POI recommendation: [TKDE’23], [AAAI’21], [TBD’23]

Here are some selected papers. You can find all my publications on my [Google Scholar profile].

2025

KDD’25

Continuous Optimization for Feature Selection with Permutation-Invariant Embedding and Policy-Guided Search

Rui Liu, Rui Xie, Zijun Yao, Yanjie Fu, and Dongjie Wang

In Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2025

@inproceedings{Liu2025,
  author = {Liu, Rui and Xie, Rui and Yao, Zijun and Fu, Yanjie and Wang, Dongjie},
  title = {Continuous Optimization for Feature Selection with Permutation-Invariant Embedding and Policy-Guided Search},
  year = {2025},
  booktitle = {Proceedings of the 31st ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  pages = {},
  numpages = {},
  location = {Toronto, Canada},
  series = {KDD '25},
}

2024

KDD’24

Unsupervised Generative Feature Transformation via Graph Contrastive Pre-training and Multi-objective Fine-tuning

Wangyang Ying, Dongjie Wang, Xuanming Hu, Yuanchun Zhou, Charu C. Aggarwal, and Yanjie Fu

In Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2024

@inproceedings{Ying2024,
  author = {Ying, Wangyang and Wang, Dongjie and Hu, Xuanming and Zhou, Yuanchun and Aggarwal, Charu C. and Fu, Yanjie},
  title = {Unsupervised Generative Feature Transformation via Graph Contrastive Pre-training and Multi-objective Fine-tuning},
  year = {2024},
  booktitle = {Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  pages = {},
  numpages = {},
  location = {Barcelona, Spain},
  series = {KDD '24},
}

TKDD

Traceable Group-Wise Self-Optimizing Feature Transformation Learning: A Dual Optimization Perspective

Meng Xiao, Dongjie Wang, Yanjie Fu, Kunpeng Liu, Min Wu, Hui Xiong, and Yuanchun Zhou

ACM Transactions on Knowledge Discovery from Data, 2024

@article{10108019,
  author = {Xiao, Meng and Wang, Dongjie and Fu, Yanjie and Liu, Kunpeng and Wu, Min and Xiong, Hui and Zhou, Yuanchun},
  journal = {ACM Transactions on Knowledge Discovery from Data},
  title = {Traceable Group-Wise Self-Optimizing Feature Transformation Learning: A Dual Optimization Perspective},
  year = {2024},
  volume = {},
  number = {},
  pages = {},
}

2023

NeurIPS’23

Reinforcement-enhanced autoregressive feature transformation: Gradient-steered search in continuous space for postfix expressions

Dongjie Wang, Meng Xiao, Min Wu, Yuanchun Zhou, and Yanjie Fu

Advances in Neural Information Processing Systems, 2023

@article{wang2024reinforcement,
  title = {Reinforcement-enhanced autoregressive feature transformation: Gradient-steered search in continuous space for postfix expressions},
  author = {Wang, Dongjie and Xiao, Meng and Wu, Min and Zhou, Yuanchun and Fu, Yanjie},
  journal = {Advances in Neural Information Processing Systems},
  volume = {36},
  year = {2023},
}

KDD’23

Interdependent Causal Networks for Root Cause Localization

Dongjie Wang, Zhengzhang Chen, Jingchao Ni, Liang Tong, Zheng Wang, Yanjie Fu, and Haifeng Chen

In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

The goal of root cause analysis is to identify the underlying causes of system problems by discovering and analyzing the causal structure from system monitoring data. It is indispensable for maintaining the stability and robustness of large-scale complex systems. Existing methods mainly focus on the construction of a single effective isolated causal network, whereas many real-world systems are complex and exhibit interdependent structures (i.e., multiple networks of a system are interconnected by cross-network links). In interdependent networks, the malfunctioning effects of problematic system entities can propagate to other networks or different levels of system entities. Consequently, ignoring the interdependency results in suboptimal root cause analysis outcomes. In this paper, we propose REASON, a novel framework that enables the automatic discovery of both intra-level (i.e., within-network) and inter-level (i.e., across-network) causal relationships for root cause localization. REASON consists of Topological Causal Discovery (TCD) and Individual Causal Discovery (ICD). The TCD component aims to model the fault propagation in order to trace back to the root causes. To achieve this, we propose novel hierarchical graph neural networks to construct interdependent causal networks by modeling both intra-level and inter-level non-linear causal relations. Based on the learned interdependent causal networks, we then leverage random walk with restarts to model the network propagation of a system fault. The ICD component focuses on capturing abrupt change patterns of a single system entity. This component examines the temporal patterns of each entity’s metric data (i.e., time series), and estimates its likelihood of being a root cause based on the Extreme Value theory. Combining the topological and individual causal scores, the top K system entities are identified as root causes. Extensive experiments on three real-world datasets validate the effectiveness of the proposed framework.

@inproceedings{10.1145/3580305.3599849,
  author = {Wang, Dongjie and Chen, Zhengzhang and Ni, Jingchao and Tong, Liang and Wang, Zheng and Fu, Yanjie and Chen, Haifeng},
  title = {Interdependent Causal Networks for Root Cause Localization},
  year = {2023},
  isbn = {9798400701030},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  doi = {10.1145/3580305.3599849},
  booktitle = {Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  pages = {5051--5060},
  numpages = {10},
  keywords = {interdependent networks, network propagation, graph neural networks, causal structure learning, root cause analysis},
  location = {Long Beach, CA, USA},
  series = {KDD '23},
}

KDD’23

Incremental Causal Graph Learning for Online Root Cause Analysis

Dongjie Wang, Zhengzhang Chen, Yanjie Fu, Yanchi Liu, and Haifeng Chen

In Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, 2023

The task of root cause analysis (RCA) is to identify the root causes of system faults/failures by analyzing system monitoring data. Efficient RCA can greatly accelerate system failure recovery and mitigate system damages or financial losses. However, previous research has mostly focused on developing offline RCA algorithms, which often require manually initiating the RCA process, a significant amount of time and data to train a robust model, and then being retrained from scratch for a new system fault. In this paper, we propose CORAL, a novel online RCA framework that can automatically trigger the RCA process and incrementally update the RCA model. CORAL consists of Trigger Point Detection, Incremental Disentangled Causal Graph Learning, and Network Propagation-based Root Cause Localization. The Trigger Point Detection component aims to detect system state transitions automatically and in near-real-time. To achieve this, we develop an online trigger point detection approach based on multivariate singular spectrum analysis and cumulative sum statistics. To efficiently update the RCA model, we propose an incremental disentangled causal graph learning approach to decouple the state-invariant and state-dependent information. After that, CORAL applies a random walk with restarts to the updated causal graph to accurately identify root causes. The online RCA process terminates when the causal graph and the generated root cause list converge. Extensive experiments on three real-world datasets demonstrate the effectiveness and superiority of the proposed framework.

@inproceedings{10.1145/3580305.3599392,
  author = {Wang, Dongjie and Chen, Zhengzhang and Fu, Yanjie and Liu, Yanchi and Chen, Haifeng},
  title = {Incremental Causal Graph Learning for Online Root Cause Analysis},
  year = {2023},
  isbn = {9798400701030},
  publisher = {Association for Computing Machinery},
  address = {New York, NY, USA},
  doi = {10.1145/3580305.3599392},
  booktitle = {Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  pages = {2269--2278},
  numpages = {10},
  keywords = {causal structure learning, incremental learning, trigger point detection, disentangled graph learning, root cause analysis},
  location = {Long Beach, CA, USA},
  series = {KDD '23},
}

AAAI’23

Human-instructed Deep Hierarchical Generative Learning for Automated Urban Planning

Dongjie Wang, Lingfei Wu, Denghui Zhang, Jingbo Zhou, Leilei Sun, and Yanjie Fu

In Proceedings of the AAAI Conference on Artificial Intelligence, 2023

@inproceedings{wang2023human,
  title = {Human-instructed Deep Hierarchical Generative Learning for Automated Urban Planning},
  author = {Wang, Dongjie and Wu, Lingfei and Zhang, Denghui and Zhou, Jingbo and Sun, Leilei and Fu, Yanjie},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  volume = {37},
  number = {4},
  pages = {4660--4667},
  year = {2023},
}

TKDE

Reinforced Imitative Graph Learning for Mobile User Profiling

Dongjie Wang, Pengyang Wang, Yanjie Fu, Kunpeng Liu, Hui Xiong, and Charles E. Hughes

IEEE Transactions on Knowledge and Data Engineering, 2023

@article{10108018,
  author = {Wang, Dongjie and Wang, Pengyang and Fu, Yanjie and Liu, Kunpeng and Xiong, Hui and Hughes, Charles E.},
  journal = {IEEE Transactions on Knowledge and Data Engineering},
  title = {Reinforced Imitative Graph Learning for Mobile User Profiling},
  year = {2023},
  volume = {},
  number = {},
  pages = {1-13},
}

KAIS

Interactive Reinforced Feature Selection with Traverse Strategy

Kunpeng Liu, Dongjie Wang, Wan Du, Dapeng Oliver Wu, and Yanjie Fu

Knowl. Inf. Syst., Jan 2023

In this paper, we propose a single-agent Monte Carlo-based reinforced feature selection method, as well as two efficiency improvement strategies, i.e., early stopping strategy and reward-level interactive strategy. Feature selection is one of the most important technologies in data preprocessing, aiming to find the optimal feature subset for a given downstream machine learning task. Enormous research has been done to improve its effectiveness and efficiency. Recently, the multi-agent reinforced feature selection (MARFS) has achieved great success in improving the performance of feature selection. However, MARFS suffers from the heavy burden of computational cost, which greatly limits its application in real-world scenarios. In this paper, we propose an efficient reinforcement feature selection method, which uses one agent to traverse the whole feature set and decides to select or not select each feature one by one. Specifically, we first develop one behavior policy and use it to traverse the feature set and generate training data. And then, we evaluate the target policy based on the training data and improve the target policy by Bellman equation. Besides, we conduct the importance sampling in an incremental way and propose an early stopping strategy to improve the training efficiency by the removal of skew data. In the early stopping strategy, the behavior policy stops traversing with a probability inversely proportional to the importance sampling weight. In addition, we propose a reward-level and training-level interactive strategy to improve the training efficiency via external advice. What’s more, we propose an incremental descriptive statistics method to represent the state with low computational cost. Finally, we design extensive experiments on real-world data to demonstrate the superiority of the proposed method.

@article{10.1007/s10115-022-01812-3,
  author = {Liu, Kunpeng and Wang, Dongjie and Du, Wan and Wu, Dapeng Oliver and Fu, Yanjie},
  title = {Interactive Reinforced Feature Selection with Traverse Strategy},
  year = {2023},
  issue_date = {May 2023},
  publisher = {Springer-Verlag},
  address = {Berlin, Heidelberg},
  volume = {65},
  number = {5},
  issn = {0219-1377},
  url = {https://doi.org/10.1007/s10115-022-01812-3},
  journal = {Knowl. Inf. Syst.},
  month = jan,
  pages = {1935--1962},
  numpages = {28},
  keywords = {Monte Carlo, Reinforcement learning, Feature selection}
}

TKDE

IMUFS: Complementary and Consensus Learning-Based Incomplete Multi-View Unsupervised Feature Selection

Yanyong Huang, Zongxin Shen, Yuxin Cai, Xiuwen Yi, Dongjie Wang, Fengmao Lv, and Tianrui Li

IEEE Transactions on Knowledge and Data Engineering, Jan 2023

@article{huang2023imufs,
  title = {IMUFS: Complementary and Consensus Learning-Based Incomplete Multi-View Unsupervised Feature Selection},
  author = {Huang, Yanyong and Shen, Zongxin and Cai, Yuxin and Yi, Xiuwen and Wang, Dongjie and Lv, Fengmao and Li, Tianrui},
  journal = {IEEE Transactions on Knowledge and Data Engineering},
  year = {2023},
  publisher = {IEEE}
}

KAIS

Automated urban planning aware spatial hierarchies and human instructions

Dongjie Wang, Kunpeng Liu, Yanyong Huang, Leilei Sun, Bowen Du, and Yanjie Fu

Knowledge and Information Systems, Jan 2023

@article{wang2023automated,
  title = {Automated urban planning aware spatial hierarchies and human instructions},
  author = {Wang, Dongjie and Liu, Kunpeng and Huang, Yanyong and Sun, Leilei and Du, Bowen and Fu, Yanjie},
  journal = {Knowledge and Information Systems},
  volume = {65},
  number = {3},
  pages = {1337--1364},
  issue_date = {May},
  year = {2023},
  publisher = {Springer}
}

AAAI’23

Dish-TS: A General Paradigm for Alleviating Distribution Shift in Time Series Forecasting

Wei Fan, Pengyang Wang, Dongkun Wang, Dongjie Wang, Yuanchun Zhou, and Yanjie Fu

In Proceedings of the AAAI Conference on Artificial Intelligence, Jan 2023

@inproceedings{fan2023dish,
  title = {Dish-TS: A General Paradigm for Alleviating Distribution Shift in Time Series Forecasting},
  author = {Fan, Wei and Wang, Pengyang and Wang, Dongkun and Wang, Dongjie and Zhou, Yuanchun and Fu, Yanjie},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  volume = {37},
  number = {6},
  pages = {7522--7529},
  year = {2023}
}

ICDM’23

Beyond Discrete Selection: Continuous Embedding Space Optimization for Generative Feature Selection

Meng Xiao, Dongjie Wang, Min Wu, Pengfei Wang, Yuanchun Zhou, and Yanjie Fu

In 2023 IEEE International Conference on Data Mining (ICDM), Jan 2023

@inproceedings{xiao2023beyond,
  title = {Beyond Discrete Selection: Continuous Embedding Space Optimization for Generative Feature Selection},
  author = {Xiao, Meng and Wang, Dongjie and Wu, Min and Wang, Pengfei and Zhou, Yuanchun and Fu, Yanjie},
  booktitle = {2023 IEEE International Conference on Data Mining (ICDM)},
  pages = {in preprint},
  year = {2023},
  organization = {IEEE}
}

ICDM’23

Self-optimizing Feature Generation via Categorical Hashing Representation and Hierarchical Reinforcement Crossing

Wangyang Ying, Dongjie Wang, Kunpeng Liu, Leilei Sun, and Yanjie Fu

In 2023 IEEE International Conference on Data Mining (ICDM), Jan 2023

@inproceedings{ying2023self-optimizing,
  title = {Self-optimizing Feature Generation via Categorical Hashing Representation and Hierarchical Reinforcement Crossing},
  author = {Ying, Wangyang and Wang, Dongjie and Liu, Kunpeng and Sun, Leilei and Fu, Yanjie},
  booktitle = {2023 IEEE International Conference on Data Mining (ICDM)},
  pages = {in preprint},
  year = {2023},
  organization = {IEEE}
}

2022

KDD’22

Group-wise reinforcement feature generation for optimal and explainable representation space reconstruction

Dongjie Wang, Yanjie Fu, Kunpeng Liu, Xiaolin Li, and Yan Solihin

In Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, Jan 2022

@inproceedings{wang2022group,
  title = {Group-wise reinforcement feature generation for optimal and explainable representation space reconstruction},
  author = {Wang, Dongjie and Fu, Yanjie and Liu, Kunpeng and Li, Xiaolin and Solihin, Yan},
  booktitle = {Proceedings of the 28th ACM SIGKDD Conference on Knowledge Discovery and Data Mining},
  pages = {1826--1834},
  year = {2022},
}

WWW’22

Multi-level recommendation reasoning over knowledge graphs with reinforcement learning

Xiting Wang, Kunpeng Liu, Dongjie Wang, Le Wu, Yanjie Fu, and Xing Xie

In Proceedings of the ACM Web Conference 2022, Jan 2022

@inproceedings{wang2022multi,
  title = {Multi-level recommendation reasoning over knowledge graphs with reinforcement learning},
  author = {Wang, Xiting and Liu, Kunpeng and Wang, Dongjie and Wu, Le and Fu, Yanjie and Xie, Xing},
  booktitle = {Proceedings of the ACM Web Conference 2022},
  pages = {2098--2108},
  year = {2022}
}

2021

AAAI’21

Reinforced imitative graph representation learning for mobile user profiling: An adversarial training perspective

Dongjie Wang, Pengyang Wang, Kunpeng Liu, Yuanchun Zhou, Charles E Hughes, and Yanjie Fu

In Proceedings of the AAAI Conference on Artificial Intelligence, Jan 2021

@inproceedings{wang2021reinforced,
  title = {Reinforced imitative graph representation learning for mobile user profiling: An adversarial training perspective},
  author = {Wang, Dongjie and Wang, Pengyang and Liu, Kunpeng and Zhou, Yuanchun and Hughes, Charles E and Fu, Yanjie},
  booktitle = {Proceedings of the AAAI Conference on Artificial Intelligence},
  volume = {35},
  number = {5},
  pages = {4410--4417},
  year = {2021},
}

ICDM’21

Deep human-guided conditional variational generative modeling for automated urban planning

Dongjie Wang, Kunpeng Liu, Pauline Johnson, Leilei Sun, Bowen Du, and Yanjie Fu

In 2021 IEEE International Conference on Data Mining (ICDM), Jan 2021

@inproceedings{wang2021deep,
  title = {Deep human-guided conditional variational generative modeling for automated urban planning},
  author = {Wang, Dongjie and Liu, Kunpeng and Johnson, Pauline and Sun, Leilei and Du, Bowen and Fu, Yanjie},
  booktitle = {2021 IEEE International Conference on Data Mining (ICDM)},
  pages = {679--688},
  year = {2021},
  organization = {IEEE}
}

ICDM’21

Efficient reinforced feature selection via early stopping traverse strategy

Kunpeng Liu, Pengfei Wang, Dongjie Wang, Wan Du, Dapeng Oliver Wu, and Yanjie Fu

In 2021 IEEE International Conference on Data Mining (ICDM), Jan 2021

@inproceedings{liu2021efficient,
  title = {Efficient reinforced feature selection via early stopping traverse strategy},
  author = {Liu, Kunpeng and Wang, Pengfei and Wang, Dongjie and Du, Wan and Wu, Dapeng Oliver and Fu, Yanjie},
  booktitle = {2021 IEEE International Conference on Data Mining (ICDM)},
  pages = {399--408},
  year = {2021},
  organization = {IEEE}
}

2020

SIGSPATIAL’20

Reimagining city configuration: Automated urban planning via adversarial learning

Dongjie Wang, Yanjie Fu, Pengyang Wang, Bo Huang, and Chang-Tien Lu

In Proceedings of the 28th International Conference on Advances in Geographic Information Systems, Jan 2020

@inproceedings{wang2020reimagining,
  title = {Reimagining city configuration: Automated urban planning via adversarial learning},
  author = {Wang, Dongjie and Fu, Yanjie and Wang, Pengyang and Huang, Bo and Lu, Chang-Tien},
  booktitle = {Proceedings of the 28th International Conference on Advances in Geographic Information Systems},
  pages = {497--506},
  year = {2020}
}

ICDM’20

Defending water treatment networks: Exploiting spatio-temporal effects for cyber attack detection

Dongjie Wang, Pengyang Wang, Jinbo Zhou, Leilei Sun, Bowen Du, and Yanjie Fu

In 2020 IEEE International Conference on Data Mining (ICDM), Jan 2020

@inproceedings{wang2020defending,
  title = {Defending water treatment networks: Exploiting spatio-temporal effects for cyber attack detection},
  author = {Wang, Dongjie and Wang, Pengyang and Zhou, Jinbo and Sun, Leilei and Du, Bowen and Fu, Yanjie},
  booktitle = {2020 IEEE International Conference on Data Mining (ICDM)},
  pages = {32--41},
  year = {2020},
  organization = {IEEE}
}