Wed 11 Nov 2020 01:07 - 01:08 at Virtual room 2 - Cloud / Services 1

In large-scale online service systems, incidents occur frequently due to a variety of causes, from software and hardware updates to changes in the operating environment. These incidents can significantly degrade system availability and customer satisfaction.
Some incidents are linked because they are duplicates or are inter-related. Linked incidents can greatly help on-call engineers find mitigation solutions and identify root causes.
In this work, we investigate the incidents and their links in a representative real-world incident management (IcM) system.
Based on the identified indicators of linked incidents, we further propose LiDAR (Linked Incident identification with DAta-driven Representation), a deep-learning-based approach to incident linking.
More specifically, we incorporate the textual description of incidents and structural information extracted from historical linked incidents to identify possible links among a large number of incidents.
To evaluate its effectiveness, we apply our method to a real-world IcM system and find that it outperforms other state-of-the-art methods.

Conference Day
Wed 11 Nov

Displayed time zone: (UTC) Coordinated Universal Time

01:00 - 01:30
01:00
2m
Talk
Beware the Evolving ‘Intelligent’ Web Service! An Integration Architecture Tactic to Guard AI-First Components
Research Papers
Alex Cummaudo (Deakin University, Australia), Scott Barnett (Deakin University, Australia), Rajesh Vasa (Deakin University, Australia), John Grundy (Monash University, Australia), Mohamed Abdelrazek (Deakin University, Australia)
01:03
1m
Talk
Efficient Customer Incident Triage via Linking with System Incidents
Industry Papers
Jiazhen Gu (Fudan University, China), Jiaqi Wen (Peking University, China), Zijian Wang (Fudan University, China), Pu Zhao (Microsoft Research, China), Chuan Luo (Microsoft Research, China), Yu Kang (Microsoft Research, China), Yangfan Zhou (Fudan University, China), Li Yang (Microsoft Azure, USA), Jeffrey Sun (Microsoft Azure, USA), Zhangwei Xu (Microsoft, China), Bo Qiao (Microsoft Research, China), Liqun Li (Microsoft Research, China), Qingwei Lin (Microsoft Research, China), Dongmei Zhang (Microsoft Research, China)
01:05
1m
Talk
How to Mitigate the Incident? An Effective Troubleshooting Guide Recommendation Technique for Online Service Systems
Industry Papers
Jiajun Jiang (Tianjin University, China), Weihai Lu (Peking University, China), Junjie Chen (Tianjin University, China), Qingwei Lin (Microsoft Research, China), Pu Zhao (Microsoft Research, China), Yu Kang (Microsoft Research, China), Hongyu Zhang (University of Newcastle, Australia), Yingfei Xiong (Peking University), Feng Gao (Microsoft, China), Zhangwei Xu (Microsoft, China), Yingnong Dang (Microsoft, USA), Dongmei Zhang (Microsoft Research, China)
01:07
1m
Talk
Identifying Linked Incidents in Large-Scale Online Service Systems
Research Papers
Yujun Chen (Microsoft Research, China), Xian Yang (Hong Kong Baptist University, China), Hang Dong (Microsoft Research, China), Xiaoting He (Chinese Academy of Sciences, China), Hongyu Zhang (University of Newcastle, Australia), Qingwei Lin (Microsoft Research, China), Junjie Chen (Tianjin University, China), Pu Zhao (Microsoft Research, China), Yu Kang (Microsoft Research, China), Feng Gao (Microsoft, China), Zhangwei Xu (Microsoft, China), Dongmei Zhang (Microsoft Research, China)
01:09
1m
Talk
Mono2Micro: An AI-Based Toolchain for Evolving Monolithic Enterprise Applications to a Microservice Architecture
Tool Demos
Anup K. Kalia (IBM Research, USA), Jin Xiao (IBM Research, USA), Chen Lin (IBM Research, USA), Saurabh Sinha (IBM Research), John Rofrano (IBM Research, USA), Maja Vukovic (IBM Research, USA), Debasish Banerjee (IBM, n.n.)
01:11
1m
Talk
Threshy: Supporting Safe Usage of Intelligent Web Services
Tool Demos
Alex Cummaudo (Deakin University, Australia), Scott Barnett (Deakin University, Australia), Rajesh Vasa (Deakin University, Australia), John Grundy (Monash University, Australia)
01:13
1m
Talk
Towards Intelligent Incident Management: Why We Need It and How We Make It
Industry Papers
Zhuangbin Chen (Chinese University of Hong Kong, China), Yu Kang (Microsoft Research, China), Liqun Li (Microsoft Research, China), Xu Zhang (Microsoft Research, China), Hongyu Zhang (University of Newcastle, Australia), Hui Xu (Fudan University, China), Yangfan Zhou (Fudan University, China), Li Yang (Microsoft Azure, USA), Jeffrey Sun (Microsoft Azure, USA), Zhangwei Xu (Microsoft, China), Yingnong Dang (Microsoft, USA), Feng Gao (Microsoft, China), Pu Zhao (Microsoft Research, China), Bo Qiao (Microsoft Research, China), Qingwei Lin (Microsoft Research, China), Dongmei Zhang (Microsoft Research, China), Michael Lyu (CUHK)
01:15
15m
Talk
Conversations on Cloud / Services 1
Paper Presentations
Alex Cummaudo (Deakin University, Australia), Anup K. Kalia (IBM Research, USA), Jiajun Jiang (Tianjin University, China), Zhuangbin Chen (Chinese University of Hong Kong, China), M: Satish Chandra (Facebook, USA)