Wed 11 Nov 2020 01:13 - 01:14 at Virtual room 2 - Cloud / Services 1

The management of cloud service incidents (unplanned interruptions or outages of a service/product) greatly affects customer satisfaction and business revenue. After years of effort, cloud enterprises are able to resolve most incidents automatically and in a timely manner. In practice, however, we still observe critical service incidents that occur unexpectedly and that the orchestrated diagnosis workflow fails to mitigate. To accelerate the understanding of unprecedented incidents and provide actionable recommendations, modern incident management systems employ the strategy of AIOps (Artificial Intelligence for IT Operations). In this paper, to provide a broad view of industrial incident management and to understand modern incident management systems, we conduct a comprehensive empirical study spanning over two years of incident management practices at Microsoft. In particular, we identify two critical challenges (namely, incomplete service/resource dependencies and imprecise resource health assessment) and investigate the underlying reasons from the perspective of cloud system design and operations. We also present IcM BRAIN, our AIOps framework for intelligent incident management, and show the practical benefits it brings to the cloud services of Microsoft.

Materials of ESEC/FSE 2020 industry paper "Towards Intelligent Incident Management: Why We Need It and How We Make It" (fse20ind-p61-p_materials.zip, 4.0 MiB)

Wed 11 Nov
Times are displayed in time zone: (UTC) Coordinated Universal Time

01:00 - 01:02
Talk
Research Papers
Alex Cummaudo (Deakin University, Australia), Scott Barnett (Deakin University, Australia), Rajesh Vasa (Deakin University, Australia), John Grundy (Monash University, Australia), Mohamed Abdelrazek (Deakin University, Australia)
01:03 - 01:04
Talk
Industry Papers
Jiazhen Gu (Fudan University, China), Jiaqi Wen (Peking University, China), Zijian Wang (Fudan University, China), Pu Zhao (Microsoft Research, China), Chuan Luo (Microsoft Research, China), Yu Kang (Microsoft Research, China), Yangfan Zhou (Fudan University, China), Li Yang (Microsoft Azure, USA), Jeffrey Sun (Microsoft Azure, USA), Zhangwei Xu (Microsoft, China), Bo Qiao (Microsoft Research, China), Liqun Li (Microsoft Research, China), Qingwei Lin (Microsoft Research, China), Dongmei Zhang (Microsoft Research, China)
01:05 - 01:06
Talk
Industry Papers
Jiajun Jiang (Tianjin University, China), Weihai Lu (Peking University, China), Junjie Chen (Tianjin University, China), Qingwei Lin (Microsoft Research, China), Pu Zhao (Microsoft Research, China), Yu Kang (Microsoft Research, China), Hongyu Zhang (University of Newcastle, Australia), Yingfei Xiong (Peking University), Feng Gao (Microsoft, China), Zhangwei Xu (Microsoft, China), Yingnong Dang (Microsoft, USA), Dongmei Zhang (Microsoft Research, China)
01:07 - 01:08
Talk
Research Papers
Yujun Chen (Microsoft Research, China), Xian Yang (Hong Kong Baptist University, China), Hang Dong (Microsoft Research, China), Xiaoting He (Chinese Academy of Sciences, China), Hongyu Zhang (University of Newcastle, Australia), Qingwei Lin (Microsoft Research, China), Junjie Chen (Tianjin University, China), Pu Zhao (Microsoft Research, China), Yu Kang (Microsoft Research, China), Feng Gao (Microsoft, China), Zhangwei Xu (Microsoft, China), Dongmei Zhang (Microsoft Research, China)
01:09 - 01:10
Talk
Tool Demos
Anup K. Kalia (IBM Research, USA), Jin Xiao (IBM Research, USA), Chen Lin (IBM Research, USA), Saurabh Sinha (IBM Research), John Rofrano (IBM Research, USA), Maja Vukovic (IBM Research, USA), Debasish Banerjee (IBM)
01:11 - 01:12
Talk
Tool Demos
Alex Cummaudo (Deakin University, Australia), Scott Barnett (Deakin University, Australia), Rajesh Vasa (Deakin University, Australia), John Grundy (Monash University, Australia)
01:13 - 01:14
Talk
Industry Papers
Zhuangbin Chen (Chinese University of Hong Kong, China), Yu Kang (Microsoft Research, China), Liqun Li (Microsoft Research, China), Xu Zhang (Microsoft Research, China), Hongyu Zhang (University of Newcastle, Australia), Hui Xu (Fudan University, China), Yangfan Zhou (Fudan University, China), Li Yang (Microsoft Azure, USA), Jeffrey Sun (Microsoft Azure, USA), Zhangwei Xu (Microsoft, China), Yingnong Dang (Microsoft, USA), Feng Gao (Microsoft, China), Pu Zhao (Microsoft Research, China), Bo Qiao (Microsoft Research, China), Qingwei Lin (Microsoft Research, China), Dongmei Zhang (Microsoft Research, China), Michael Lyu (CUHK)
01:15 - 01:30
Talk
Paper Presentations
Alex Cummaudo (Deakin University, Australia), Anup K. Kalia (IBM Research, USA), Jiajun Jiang (Tianjin University, China), Zhuangbin Chen (Chinese University of Hong Kong, China), M: Satish Chandra (Facebook, USA)