Publication No.: US11693727B2
Publication date: 2023-07-04
Application No.: US17195016
Filing date: 2021-03-08
Inventors: Ashok Vardhan Rao Bolagani, Vijaya Kadiyala, Alina D Rodean, Jason Bocz, Rajesh Chekuri, Gaurav Bhatia
IPC classification: G06F11/07, G06F16/245, G06Q10/0631
CPC classification: G06F11/0793, G06F11/079, G06F11/0724, G06F11/0751, G06F16/245, G06Q10/06311
Abstract: Various methods, apparatuses/systems, and media for identifying production incidents and implementing automated preventive and corrective measures are disclosed. A processor automatically triggers a self-healing service in response to an incident generated by a job, process, or host failure. The processor identifies the application to which the generated event belongs by accessing a database that stores application and host details; fetches the functional identification (ID) of the application from the database; identifies the type of job failure or service degradation; automatically executes, by utilizing predefined micro services, the steps required for mitigation; records, in response to the execution, the outcome of the mitigation in the database along with the output at each stage of execution; evaluates the outcome of the mitigation by executing health checks using micro services to determine whether the failed job, process, or host is healthy; and closes the incident based on a determination that it is healthy.
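
The sequence of steps in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the in-memory `APP_DB`, the `Incident` class, the `restart_job` and `clear_queue` functions standing in for micro services, the `MITIGATIONS` table, and the `self_heal` function are all hypothetical names introduced here for illustration, and the incident's own log stands in for the database record of outcomes.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the database of application and host details.
APP_DB = {"host-01": {"application": "billing", "functional_id": "svc_billing"}}


@dataclass
class Incident:
    incident_id: str
    host: str
    failure_type: str  # assumed values: "job_failure" or "service_degradation"
    status: str = "open"
    log: list = field(default_factory=list)  # stands in for the recorded outcomes


# Hypothetical micro services; real ones would call out to actual infrastructure.
def restart_job(functional_id: str) -> str:
    return f"restarted job as {functional_id}"


def clear_queue(functional_id: str) -> str:
    return f"cleared queue as {functional_id}"


# Assumed mapping from failure type to the mitigation steps to run, in order.
MITIGATIONS = {
    "job_failure": [restart_job],
    "service_degradation": [clear_queue, restart_job],
}


def self_heal(incident, health_check, db=APP_DB):
    # Identify the application for the failing host and fetch its functional ID.
    functional_id = db[incident.host]["functional_id"]
    # Execute each mitigation step and record its output at every stage.
    for step in MITIGATIONS[incident.failure_type]:
        incident.log.append((step.__name__, step(functional_id)))
    # Evaluate the outcome with a health check, and close only if healthy.
    healthy = health_check(incident.host)
    incident.log.append(("health_check", healthy))
    if healthy:
        incident.status = "closed"
    return incident
```

In this sketch the caller supplies the health check, for example `self_heal(Incident("INC-1", "host-01", "job_failure"), health_check=lambda host: True)`. An incident whose health check fails stays open, so it can be escalated for manual handling.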