A Smart Learning Approach for Predicting Job Failures in Cloud Systems

M Hema Priyanka; K Keerthana

doi:10.64751/

Keywords:

Cloud computing, job failure prediction, ensemble learning, stacking classifier, voting classifier, XGBoost, explainable AI, fault detection.

Abstract

In modern cloud data centers, accurate prediction of job failures is important for maintaining performance, using resources efficiently, and improving fault tolerance. This research introduces a multilayer ensemble framework for predicting cloud task failures, trained on the Google Cluster 2019 dataset. The framework combines several classification techniques, including Decision Tree, K-Nearest Neighbors (KNN), Artificial Neural Network, Extreme Gradient Boosting (XGBoost), and Adaptive Boosting (AdaBoost). These models are combined through a hard Voting Classifier to improve prediction stability and robustness. A Stacking Classifier uses Random Forest, KNN, and Multilayer Perceptron as base learners, with Logistic Regression as the meta-estimator, to further improve predictive accuracy. In the reported experiments, the Voting Classifier attains 99.98% accuracy and the Stacking Classifier 100% accuracy in predicting cloud job outcomes. Explainable Artificial Intelligence methods, such as LIME and SHAP, are used to explain individual predictions and highlight feature contributions, which supports transparency and reliability. The trained models are deployed in a Flask-based web application for real-world use, with secure user registration and authentication backed by SQLite, real-time processing of user input, and interactive visualization of results. The system returns a definitive output, either “job will complete successfully” or “job failure predicted,” giving dependable, interpretable, and user-friendly support for cloud job failure prediction.
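
As an illustration of the hard-voting step described in the abstract, the following is a minimal sketch of how per-model predictions could be combined into one label per job. It is not the authors' implementation: the function names `hard_vote` and `vote_all`, the 0/1 label encoding, the tie-breaking rule, and the sample predictions are all assumptions made for this example, and no trained models or real data are involved.

```python
from collections import Counter

# Hypothetical label encoding, chosen for this sketch only.
LABELS = {0: "job will complete successfully", 1: "job failure predicted"}


def hard_vote(predictions):
    """Return the majority label among one job's per-model predictions.

    Ties are broken in favour of the smallest label (an assumption of
    this sketch, not something stated in the source).
    """
    counts = Counter(predictions)
    top = max(counts.values())
    return min(label for label, n in counts.items() if n == top)


def vote_all(per_model_predictions):
    """Combine per-model prediction lists (one list per model) job by job."""
    return [hard_vote(votes) for votes in zip(*per_model_predictions)]


# Made-up predictions from five models for four jobs.
per_model = [
    [0, 1, 1, 0],  # Decision Tree
    [0, 1, 0, 0],  # K-Nearest Neighbors
    [0, 0, 1, 0],  # Artificial Neural Network
    [1, 1, 1, 0],  # XGBoost
    [0, 1, 0, 1],  # AdaBoost
]

final = vote_all(per_model)
messages = [LABELS[y] for y in final]
```

With five voters and two classes a tie cannot occur, so the tie-breaking rule only matters if the number of models is even. A stacking ensemble differs in that the base-learner outputs are fed to a trained meta-estimator (Logistic Regression in the abstract) rather than counted.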
