Saudi Journal of Engineering and Technology (SJEAT)
Volume-11 | Issue-05 | 501-510
Original Research Article
Service Failure Detection in Distributed Microservice Platforms
Farhan Tariq, Mabu Hussain Shaik, Shujath Baig Mirza, Md Ariful Islam
Published: May 26, 2026
Abstract
Service failure detection in distributed microservice platforms remains difficult because fault symptoms often appear in services other than the one where the problem begins. Traditional monitoring methods usually examine metrics, logs, or traces separately, which limits their ability to identify partial degradation, fault propagation, and cascading disruption. This paper proposes a multi-source, dependency-aware framework for service failure detection in such platforms. The method integrates distributed traces, service-level metrics, and structured log events into a unified service state representation and interprets these signals through a dynamic service dependency graph. A hybrid failure scoring model identifies degraded or failed services, while a root cause ranking stage estimates the most likely origin of the incident. The framework captures both local anomalies and their propagated effects across connected services. Experimental analysis compares the proposed method with metrics-only, trace-only, and log-only baselines under latency inflation, timeout propagation, service crash, resource exhaustion, and silent degradation scenarios. Results show that the proposed approach achieves higher detection accuracy, lower detection delay, and better root cause ranking performance, particularly in cascading failure cases, where single-source methods often misidentify affected services as the source of the incident. These findings indicate that observability fusion combined with dependency-aware analysis provides a more reliable basis for service-level diagnosis in cloud-native microservice systems.
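For illustration only, the pipeline outlined in the abstract (signal fusion, dependency-aware interpretation, root cause ranking) might be sketched as follows. This is a minimal sketch and not the authors' model: the fusion weights, the 0.5 threshold, the service names, the input scores, and the rule of discounting a service whose callees are also anomalous are all assumptions introduced here.

```python
from typing import Dict, List, Tuple

# Hypothetical per-source anomaly scores in [0, 1] for each service.
Signals = Dict[str, Dict[str, float]]

# Assumed fusion weights; the abstract does not specify any.
WEIGHTS = {"metrics": 0.4, "traces": 0.4, "logs": 0.2}


def fuse(signals: Signals) -> Dict[str, float]:
    """Combine per-source scores into one local anomaly score per service."""
    return {
        svc: sum(WEIGHTS[src] * score for src, score in per_source.items())
        for svc, per_source in signals.items()
    }


def rank_root_causes(
    local: Dict[str, float],
    calls: Dict[str, List[str]],
    threshold: float = 0.5,
) -> List[Tuple[str, float]]:
    """Rank anomalous services as root cause candidates.

    A service whose callees are also anomalous is treated as a likely
    victim of propagation, so its score is discounted (assumed heuristic).
    """
    ranked = []
    for svc, score in local.items():
        if score < threshold:
            continue  # service looks healthy; not a candidate
        downstream = max(
            (local.get(dep, 0.0) for dep in calls.get(svc, [])), default=0.0
        )
        ranked.append((svc, score * (1.0 - downstream)))
    return sorted(ranked, key=lambda item: item[1], reverse=True)


# Hypothetical call graph: gateway -> orders -> db, gateway -> search.
calls = {"gateway": ["orders", "search"], "orders": ["db"], "db": [], "search": []}

# Hypothetical scores for a fault that starts in db and propagates upstream.
signals = {
    "gateway": {"metrics": 0.7, "traces": 0.8, "logs": 0.2},
    "orders": {"metrics": 0.8, "traces": 0.9, "logs": 0.5},
    "db": {"metrics": 0.9, "traces": 0.9, "logs": 0.9},
    "search": {"metrics": 0.1, "traces": 0.1, "logs": 0.0},
}

local = fuse(signals)
ranked = rank_root_causes(local, calls)
print(ranked)
```

With these made-up inputs, all three services on the failing call path exceed the threshold, yet only db keeps its full score because none of its dependencies is anomalous. A single-source view without the graph would see three degraded services and have no principled way to choose among them.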