Graph-Based Trace Analysis for Microservice Architecture Understanding and Problem Diagnosis
Microservice systems are highly dynamic and complex. For such systems, operations engineers and developers rely heavily on trace analysis to understand architectures and diagnose problems such as service failures and quality degradation.
However, the huge number of traces produced at runtime makes it challenging to capture the required information in real time. To address these challenges, we propose GMTA, a graph-based microservice trace analysis approach for understanding architecture and diagnosing various problems.
Built on a graph-based representation, GMTA includes efficient processing of traces produced on the fly.
It abstracts traces into different paths and further groups them into business flows.
To support various analytical applications, GMTA includes an efficient storage and access mechanism by combining a graph database and a real-time analytics database and using a carefully designed storage structure.
Based on GMTA, we construct analytical applications for architecture understanding and problem diagnosis. These applications support various needs such as visualizing service dependencies, making architectural decisions, analyzing changes in service behavior, detecting performance issues, and locating root causes.
GMTA has been implemented and deployed at eBay.
An experimental study based on trace data produced by eBay demonstrates GMTA's effectiveness and efficiency for architecture understanding and problem diagnosis.
Case studies conducted in eBay's monitoring team and Site Reliability Engineering (SRE) team further confirm GMTA's substantial benefits in industrial-scale microservice systems.
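The abstract's idea of abstracting traces into paths and grouping paths into business flows can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not GMTA's actual implementation: the `Span` fields, the choice of a set of caller-to-callee operation edges as the path signature, and the rule that paths sharing an entry operation form one business flow are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Span:
    """Hypothetical span record; field names are assumptions, not GMTA's schema."""
    span_id: str
    parent_id: Optional[str]
    service: str
    operation: str


def path_signature(spans):
    """Reduce one trace to its set of (caller, callee) operation edges.

    Traces with the same edges get the same signature, so they collapse
    into one path regardless of span IDs or call counts.
    """
    by_id = {s.span_id: s for s in spans}
    edges = set()
    for s in spans:
        parent = by_id.get(s.parent_id) if s.parent_id else None
        caller = (parent.service, parent.operation) if parent else None
        edges.add((caller, (s.service, s.operation)))
    return frozenset(edges)


def group_into_paths(traces):
    """Map each distinct path signature to the trace IDs that follow it."""
    paths = defaultdict(list)
    for trace_id, spans in traces.items():
        paths[path_signature(spans)].append(trace_id)
    return dict(paths)


def group_into_flows(paths):
    """Group paths by entry operation (assumed stand-in for a business flow)."""
    flows = defaultdict(list)
    for signature in paths:
        roots = sorted(callee for caller, callee in signature if caller is None)
        flows[tuple(roots)].append(signature)
    return dict(flows)


def service_dependencies(paths):
    """Derive cross-service call edges, e.g. for a dependency view."""
    deps = set()
    for signature in paths:
        for caller, callee in signature:
            if caller is not None and caller[0] != callee[0]:
                deps.add((caller[0], callee[0]))
    return deps


# Toy data: t1 and t2 share a structure; t3 skips the payment call.
traces = {
    "t1": [Span("a", None, "web", "checkout"),
           Span("b", "a", "cart", "getCart"),
           Span("c", "a", "pay", "charge")],
    "t2": [Span("x", None, "web", "checkout"),
           Span("y", "x", "cart", "getCart"),
           Span("z", "x", "pay", "charge")],
    "t3": [Span("p", None, "web", "checkout"),
           Span("q", "p", "cart", "getCart")],
}

paths = group_into_paths(traces)
flows = group_into_flows(paths)
deps = service_dependencies(paths)
```

In this toy data, the three traces reduce to two paths, both of which fall under a single flow entered through the `checkout` operation. Aggregating at the path level rather than the trace level is what keeps the amount of data to store and query manageable as trace volume grows.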
Conference Day: Wed 11 Nov (displayed time zone: UTC, Coordinated Universal Time)
01:30 - 02:00
|A Principled Approach to GraphQL Query Cost Analysis (ACM SIGSOFT Distinguished Paper Award)|
Alan Cha (IBM Research, USA), Erik Wittern (IBM, USA), Guillaume Baudart (IBM Research, USA), James C. Davis (Purdue University, USA), Louis Mandel (IBM Research, USA), Jim A. Laredo (IBM Research, USA)
|Block Public Access: Trust Safety Verification of Access Control Policies|
Malik Bouchet (Amazon, USA), Byron Cook (Amazon), Bryant Cutler (Amazon, USA), Anna Druzkina (Amazon, USA), Andrew Gacek (Amazon, USA), Liana Hadarean (Amazon), Ranjit Jhala (Amazon, USA), Brad Marshall (Amazon, USA), Dan Peebles (Amazon, USA), Neha Rungta (Amazon Web Services), Cole Schlesinger (Amazon, USA), Chriss Stephens (Amazon, USA), Carsten Varming (Amazon, USA), Andy Warfield (Amazon, USA)
|Efficient Incident Identification from Multi-dimensional Issue Reports via Meta-heuristic Search|
Jiazhen Gu (Fudan University, China), Chuan Luo (Microsoft Research, China), Si Qin (Microsoft Research), Bo Qiao (Microsoft Research, China), Qingwei Lin (Microsoft Research, China), Hongyu Zhang (University of Newcastle, Australia), Ze Li (Microsoft, USA), Yingnong Dang (Microsoft, USA), Shaowei Cai (Institute of Software at Chinese Academy of Sciences, China), Wei-Cheng Wu (University of Southern California, USA), Yangfan Zhou (Fudan University, China), Murali Chintalapati (Microsoft), Dongmei Zhang (Microsoft Research, China)
|Graph-Based Trace Analysis for Microservice Architecture Understanding and Problem Diagnosis|
Xiaofeng Guo (Fudan University, China), Xin Peng (Fudan University, China), Hanzhang Wang (eBay), Wanxue Li (eBay, USA), Huai Jiang (eBay, USA), Dan Ding (Fudan University, China), Tao Xie (Peking University), Liangfei Su (eBay, USA)
|Real-Time Incident Prediction for Online Service Systems|
Nengwen Zhao (Tsinghua University), Junjie Chen (Tianjin University, China), Zhou Wang (BizSeer, China), Xiao Peng (Beijing University of Posts and Telecommunications, China), Gang Wang (China EverBright Bank), Yong Wu (China EverBright Bank), Fang Zhou (China EverBright Bank), Zhen Feng (EverBright Bank, China), Xiaohui Nie (EverBright Bank, China), Wenchi Zhang (Tsinghua University, China), Kaixin Sui (BizSeer), Dan Pei (BizSeer, China)
|Scaling Static Taint Analysis to Industrial SOA Applications: A Case Study at Alibaba|
Jie Wang (Peking University, China / Ant Group, China / Alibaba Group, China), Yunguang Wu (Ant Group, China), Gang Zhou (Ant Group, China), Yiming Yu (Ant Group, China), Zhenyu Guo (Ant Group, China), Yingfei Xiong (Peking University)
|Conversations on Cloud / Services 2|