About

    I am always looking for self-motivated undergraduate, master's, and Ph.D. students with strong programming skills to join as interns. If you are interested in working with me, please email me your CV.

Experience

  • Microsoft
    • Systems Innovation, Senior Researcher, Redmond, Feb 2024 - Now
    • DKI, Senior Researcher, Beijing, Jul 2023 - Jan 2024
    • MSRA, Researcher, Beijing, Jul 2021 - Jun 2023
  • Georgia Tech, Visiting Scholar, Atlanta, Sep 2019 - Aug 2020
  • Alibaba, Research Intern, Beijing, Dec 2018 - Aug 2019
  • Sogou, Research Intern, Beijing, Dec 2016 - Apr 2018

Publications

* denotes corresponding author and + indicates equal contribution.

Conference

  • FSE'24 MonitorAssistant: Simplifying Cloud Service Monitoring via Large Language Models
    Zhaoyang Yu, Minghua Ma, Chaoyun Zhang, Si Qin, Yu Kang, Chetan Bansal, Saravan Rajmohan, Yingnong Dang, Changhua Pei, Dan Pei, Qingwei Lin, Dongmei Zhang.
  • FSE'24 Automated Root Causing of Cloud Incidents using In-Context Learning with GPT-4
    Xuchao Zhang, Supriyo Ghosh, Chetan Bansal, Rujia Wang, Minghua Ma, Yu Kang, Saravan Rajmohan.
  • TheWebConf'24 Revisiting VAE for Unsupervised Time Series Anomaly Detection: A Frequency Perspective
    Zexin Wang, Changhua Pei, Minghua Ma, Xin Wang, Zhihan Li, Dan Pei, Saravan Rajmohan, Dongmei Zhang, Qingwei Lin, Haiming Zhang, Jianhui Li, Gaogang Xie.
  • VLDB'24 ImDiffusion: Imputed Diffusion Models for Multivariate Time Series Anomaly Detection [Paper]
    Yuhang Chen, Chaoyun Zhang, Minghua Ma, Yudong Liu, Ruomeng Ding, Bowen Li, Shilin He, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang.
  • ICSE'24 Xpert: Empowering Incident Management with Query Recommendations via Large Language Models [Paper]
    Yuxuan Jiang, Chaoyun Zhang, Shilin He, Zhihao Yang, Minghua Ma, Si Qin, Yu Kang, Yingnong Dang, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang.
  • EuroSys'24 Automatic Root Cause Analysis via Large Language Models for Cloud Incidents [Paper]
    Yinfang Chen, Huaibing Xie, Minghua Ma*, Yu Kang, Xin Gao, Liu Shi, Yunjie Cao, Xuedong Gao, Hao Fan, Ming Wen, Jun Zeng, Supriyo Ghosh, Xuchao Zhang, Chaoyun Zhang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang.
  • ISSRE'23 CODEC: Cost-Effective Duration Prediction System for Deadline Scheduling in the Cloud [Paper]
    Haozhe Li, Minghua Ma, Yudong Liu, Si Qin, Bo Qiao, Randolph Yao, Harshwardhan Chaturvedi, Tri Tran, Murali Chintalapati, Saravan Rajmohan, Qingwei Lin and Dongmei Zhang.
  • FSE'23 Assess and Summarize: Improve Outage Understanding with Large Language Models [Paper]
    Pengxiang Jin+, Shenglin Zhang+, Minghua Ma, Haozhe Li, Yu Kang, Liqun Li, Yudong Liu, Bo Qiao, Chaoyun Zhang, Pu Zhao, Shilin He, Federica Sarro, Yingnong Dang, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang.
  • FSE'23 Detection is Better Than Cure - A Cloud Incidents Perspective [Paper]
    Vaibhav Ganatra, Anjaly Parayil, Supriyo Ghosh, Yu Kang, Minghua Ma, Chetan Bansal, Suman Nath, Jonathan Mace.
  • FSE'23 TraceDiag: Adaptive, Interpretable and Efficient Root Cause Analysis on Large-Scale Microservice Systems [Paper]
    Ruomeng Ding, Chaoyun Zhang, Lu Wang, Yong Xu, Minghua Ma, Xiaomin Wu, Meng Zhang, Qingjun Chen, Xin Gao, Xuedong Gao, Hao Fan, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang.
  • KDD'23 Robust Multimodal Failure Detection for Microservice Systems [Paper]
    Chenyu Zhao+, Minghua Ma+, Zhenyu Zhong, Shenglin Zhang, Zhiyuan Tan, Xiao Xiong, LuLu Yu, Jiayi Feng, Yongqian Sun, Yuzhi Zhang, Dan Pei, Qingwei Lin, Dongmei Zhang.
  • DSN'23 Characterizing Large-Scale Private and Public Cloud Workloads [Paper]
    Xiaoting Qin, Minghua Ma, Yuheng Zhao, Jue Zhang, Chao Du, Yudong Liu, Anjaly Parayil, Chetan Bansal, Saravan Rajmohan, Inigo Goiri, Eli Cortez, Si Qin, Qingwei Lin, Dongmei Zhang.
  • TheWebConf'23 EDITS: An Easy-to-difficult Training Strategy for Cloud Failure Prediction [Paper]
    Tianci Li, Pu Zhao, Yudong Liu, Minghua Ma, Lingling Zheng, Murali Chintalapati, Bo Liu, Paul Wang, Hongyu Zhang, Yingnong Dang, Saravan Rajmohan, Qingwei Lin and Dongmei Zhang.
  • ICSE-SEIP'23 Aegis: Attribution of Control Plane Change Impact across Layers and Components for Cloud Systems [Paper]
    Xiaohan Yan, Ken Hsieh, Yasitha Liyanage, Minghua Ma, Murali Chintalapati, Qingwei Lin, Yingnong Dang and Dongmei Zhang.
  • ICSE-SEIP'23 TraceArk: Towards Actionable Performance Anomaly Alerting for Online Service Systems [Paper]
    Zhengran Zeng, Yuqun Zhang, Yong Xu, Minghua Ma, Bo Qiao, Wentao Zou, Qingjun Chen, Meng Zhang, Xu Zhang, Hongyu Zhang, Xuedong Gao, Hao Fan, Saravan Rajmohan, Qingwei Lin and Dongmei Zhang.
  • ICSE-SEIP'23 CONAN: Diagnosing Batch Failures for Cloud Systems [Paper]
    Liqun Li, Xu Zhang, Shilin He, Yu Kang, Hongyu Zhang, Minghua Ma, Yingnong Dang, Zhangwei Xu, Saravan Rajmohan, Qingwei Lin and Dongmei Zhang.
  • FSE'22 An Empirical Investigation of Missing Data Handling in Cloud Node Failure Prediction [Paper]
    Minghua Ma, Yudong Liu, Yuang Tong, Haozhe Li, Pu Zhao, Yong Xu, Hongyu Zhang, Shilin He, Lu Wang, Yingnong Dang, Saravan Rajmohan, Qingwei Lin.
  • FSE'22 An Empirical Study of Log Analysis at Microsoft [Paper]
    Shilin He, Xu Zhang, Pinjia He, Yong Xu, Liqun Li, Yu Kang, Minghua Ma, Yining Wei, Yingnong Dang, Saravan Rajmohan, Qingwei Lin.
  • KDD'22 Multi-task Hierarchical Classification for Disk Failure Prediction in Online Service Systems [Paper]
    Yudong Liu, Hailan Yang, Pu Zhao, Minghua Ma, Chengwu Wen, Hongyu Zhang, Chuan Luo, Qingwei Lin, Chang Yi, Jiaojian Wang, Chenjian Zhang, Paul Wang, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang.
  • TheWebConf'22 UniParser: A Unified Log Parser for Heterogeneous Log Data [Paper]
    Yudong Liu, Xu Zhang, Shilin He, Hongyu Zhang, Liqun Li, Yu Kang, Yong Xu, Minghua Ma, Qingwei Lin, Yingnong Dang, Saravan Rajmohan, Dongmei Zhang.
  • DEXA'22 Mining Fluctuation Propagation Graph among Time Series with Human-in-the-Loop [Paper]
    Mingjie Li, Minghua Ma, Xiaohui Nie, Kanglin Yin, Li Cao, Xidao Wen, Zhiyun Yuan, Duogang Wu, Guoying Li, Wei Liu, Xin Yang, Dan Pei.
  • ATC'21 Jump-Starting Multivariate Time Series Anomaly Detection for Online Service Systems [Paper][Code]
    Minghua Ma, Shenglin Zhang, Junjie Chen, Jim Xu, Haozhe Li, Yongliang Lin, Xiaohui Nie, Bo Zhou, Yong Wang, Dan Pei.
  • VLDB'20 Diagnosing Root Causes of Intermittent Slow Queries in Cloud Databases [Paper]
    Minghua Ma, Zheng Yin, Shenglin Zhang, Sheng Wang, Christopher Zheng, Xinhao Jiang, Hanwen Hu, Cheng Luo, Yilin Li, Nengjun Qiu, Feifei Li, Changcheng Chen, Dan Pei.
  • ISSRE'18 Robust and Rapid Adaption for Concept Drift in Software System Anomaly Detection [Paper]
    Minghua Ma, Shenglin Zhang, Dan Pei, Xin Huang, Hongwei Dai.
    🏆 Best Research Paper at the 29th IEEE ISSRE
  • IWQoS'17 You Can Hide, but Your Periodic Schedule Can’t [Paper]
    Minghua Ma, Kai Zhao, Kaixin Sui, Lei Xu, Yong Li, Dan Pei.
  • IWQoS'16 Your Trajectory Privacy Can Be Breached Even If You Walk in Groups [Paper]
    Kaixin Sui, Youjian Zhao, Dapeng Liu, Minghua Ma, Lei Xu, Li Zimu, Dan Pei.
  • UbiComp'16 EDUM: Classroom Education Measurements via Large-scale WiFi Networks [Paper]
    Mengyu Zhou, Minghua Ma, Yangkun Zhang, Kaixin Sui, Dan Pei, Thomas Moscibroda.
  • MobiSys'16 Characterizing and Improving WiFi Latency in Large-Scale Operational Networks [Paper]
    Kaixin Sui, Mengyu Zhou, Dapeng Liu, Minghua Ma, Dan Pei.
  • INFOCOM'16 WiFi Can Be the Weakest Link of Round Trip Network Latency [Paper]
    Changhua Pei, Youjian Zhao, Guo Chen, Ruming Tang, Yuan Meng, Minghua Ma, Ken Ling, Dan Pei.

Journal

  • TSC'23 Robust Failure Diagnosis of Microservice System through Multimodal Data [Paper]
    Shenglin Zhang, Pengxiang Jin, Zihan Lin, Yongqian Sun, Bicheng Zhang, Sibo Xia, Zhengdan Li, Zhenyu Zhong, Minghua Ma, Wa Jin, Dai Zhang, Zhenyu Zhu, Dan Pei.
  • TNSM'19 Automatic and Generic Periodicity Adaptation for KPI Anomaly Detection [Paper]
    Nengwen Zhao, Jing Zhu, Yao Wang, Minghua Ma, Wenchi Zhang, Dapeng Liu, Ming Zhang, Dan Pei.

Preprint

  • AllHands: Ask Me Anything on Large-scale Verbatim Feedback via Large Language Models [Paper]
    Chaoyun Zhang, Zicheng Ma, Yuhao Wu, Shilin He, Si Qin, Minghua Ma, Xiaoting Qin, Yu Kang, Yuyi Liang, Xiaoyu Gou, Yajie Xue, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang.
  • UFO: A UI-Focused Agent for Windows OS Interaction [Paper]
    Chaoyun Zhang, Liqun Li, Shilin He, Xu Zhang, Bo Qiao, Si Qin, Minghua Ma, Yu Kang, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang, Qi Zhang.
  • TaskWeaver: A Code-First Agent Framework [Paper]
    Bo Qiao, Liqun Li, Xu Zhang, Shilin He, Yu Kang, Chaoyun Zhang, Fangkai Yang, Hang Dong, Jue Zhang, Lu Wang, Minghua Ma, Pu Zhao, Si Qin, Xiaoting Qin, Chao Du, Yong Xu, Qingwei Lin, Saravan Rajmohan, Dongmei Zhang.
  • Everything of Thoughts: Defying the Law of Penrose Triangle for Thought Generation [Paper]
    Ruomeng Ding, Chaoyun Zhang, Lu Wang, Yong Xu, Minghua Ma, Wei Zhang, Si Qin, Saravan Rajmohan, Qingwei Lin, Dongmei Zhang.
  • A Survey of Time Series Anomaly Detection Methods in the AIOps Domain [Paper]
    Zhenyu Zhong, Qiliang Fan, Jiacheng Zhang, Minghua Ma, Shenglin Zhang, Yongqian Sun, Qingwei Lin, Yuzhi Zhang, Dan Pei.
  • Enhanced Fairness Testing via Generating Effective Initial Individual Discriminatory Instances [Paper]
    Minghua Ma, Zhao Tian, Max Hort, Federica Sarro, Hongyu Zhang, Qingwei Lin, Dongmei Zhang.
  • Constructing Large-Scale Real-World Benchmark Datasets for AIOps [Paper][Datasets]
    Zeyan Li, Nengwen Zhao, Shenglin Zhang, Yongqian Sun, Pengfei Chen, Xidao Wen, Minghua Ma, Dan Pei.
  • DockerMock: Pre-Build Detection of Dockerfile Faults through Mocking Instruction Execution [Paper]
    Mingjie Li, Xiaoying Bai, Minghua Ma, Dan Pei.

Services

  • Organizer
    • Challenge'21 Technical Chair
  • PC member
    • 2024: KDD, TheWebConf, FSE, ISSRE, APSEC
    • 2023: ASE, KDD, NLPCC, CCF AIOps Challenge, MiLeTS
  • Reviewer
    • TOSEM
    • Neurocomputing
  • AEC Member
    • 2022: OSDI, ATC