Finding
Paper
Abstract
Modern cloud-based applications commonly comprise multiple services and microservices in a distributed architecture, which makes debugging and isolating application faults a challenge. Currently available APM tools are intrusive: they rely on instrumentation that impacts application performance, and they do not combine analysis of KPIs with logs. Logs of distributed cloud applications contain massive amounts of data about application behavior and can potentially provide signals and insights into application faults. In this paper, we propose Sequencis, a machine-learning-based, automated, and non-intrusive solution for discovering distributed application fault characteristics. Sequencis efficiently discerns sequential patterns in large volumes of logged events in distributed services and correlates those patterns with application errors and KPI indications such as high CPU utilization or other faults. In extensive experiments on several distributed applications, Sequencis demonstrates the ability to capture correlations between errors and log sequence patterns, including text-based, URL, and SQL query patterns.
Authors
Shay Horovitz, Ben Boren, Ben Wizen
Journal
2021 6th International Conference on Big Data and Computing