About
Building intelligent systems that keep the cloud reliable, from AIOps to autonomous AI agents.
Minghua Ma is currently a Senior Researcher in the Microsoft M365 Research Group. Before joining Microsoft, he received his Ph.D. from Tsinghua University in 2021, under the supervision of Prof. Dan Pei in the Netman Group. His research interests focus primarily on AIOps, AI4SE, and SE4AI. He is a Senior Member of CCF.
Internship Opportunities: I am seeking self-motivated undergraduate, master's, and Ph.D. intern students. If you are interested in working with me, please email me your CV.
Featured Research Highlights
Triangle
Multi-agent incident triage system for cloud services. Highlighted by Azure CTO Mark Russinovich in the Advancing Reliability blog. Published at ASE'25.
AIOpsLab
Holistic framework for evaluating AI agents that enable autonomous cloud operations. Open-sourced by Microsoft. Published at MLSys'25.
RCACopilot
Automatic root cause analysis via large language models for cloud incidents. Published at EuroSys'24, with 200+ citations.
TSGen
AI system that auto-generates Troubleshooting Guides from incident data. Deployed at Microsoft. Submitted to FSE.
