1.
Publication No.: US20250004896A1
Publication Date: 2025-01-02
Application No.: US18217245
Filing Date: 2023-06-30
Applicant: Intel Corporation
Inventor: Sridharan SAKTHIVELU, Kaushik BALASUBRAMANIAN, Krishna SURYA
IPC: G06F11/27
Abstract: Methods and apparatus to implement proactive hardware error screening are disclosed. In one embodiment, a computer processing system includes a plurality of computational units to execute tasks for one or more applications; a plurality of sensors to collect measurement data of the plurality of computational units; a data structure, stored in a storage, indicating hardware health statuses of the plurality of computational units determined based on the measurement data; and the plurality of computational units is scheduled to perform task execution on the computer processing system for the one or more applications based on the hardware health statuses indicated in the data structure, wherein a first computational unit is excluded from the task execution when a corresponding first hardware health status of the first computational unit indicates an impending hardware failure.
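
The scheduling behavior described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract does not say what the sensors measure or how a health status is derived, so the corrected-error count, the `MAX_CORRECTED_ERRORS` threshold, the `Health` enum, the round-robin policy, and all function names here are hypothetical and chosen only for illustration.

```python
from enum import Enum
from typing import Dict, List


class Health(Enum):
    """Hypothetical health statuses; the abstract only names 'impending hardware failure'."""
    HEALTHY = "healthy"
    IMPENDING_FAILURE = "impending_failure"


# Assumed threshold on a per-unit corrected-error count (not from the source).
MAX_CORRECTED_ERRORS = 10


def assess(corrected_errors: int) -> Health:
    """Map one unit's measurement to a health status using the assumed threshold."""
    if corrected_errors > MAX_CORRECTED_ERRORS:
        return Health.IMPENDING_FAILURE
    return Health.HEALTHY


def build_health_table(measurements: Dict[str, int]) -> Dict[str, Health]:
    """Build the data structure that records a health status for every unit."""
    return {unit: assess(count) for unit, count in measurements.items()}


def schedule(tasks: List[str], health_table: Dict[str, Health]) -> Dict[str, str]:
    """Assign tasks round-robin to healthy units, excluding units flagged as failing."""
    eligible = sorted(u for u, h in health_table.items() if h is Health.HEALTHY)
    if not eligible:
        raise RuntimeError("no healthy computational unit available")
    return {task: eligible[i % len(eligible)] for i, task in enumerate(tasks)}


if __name__ == "__main__":
    # Illustrative sensor readings: core1 exceeds the assumed threshold.
    table = build_health_table({"core0": 2, "core1": 37, "core2": 0})
    print(schedule(["t0", "t1", "t2"], table))
```

In this sketch `core1` never receives a task because its status is `IMPENDING_FAILURE`, which mirrors the exclusion of the "first computational unit" in the abstract. A real system would likely refresh the table as new measurements arrive.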