METHOD AND DEVICE FOR DETERMINING AN ESTIMATED TIME BEFORE A TECHNICAL INCIDENT IN A COMPUTING INFRASTRUCTURE FROM VALUES OF PERFORMANCE INDICATORS

    Publication No.: US20210026725A1

    Publication date: 2021-01-28

    Application No.: US16927162

    Filing date: 2020-07-13

    Applicant: BULL SAS

    IPC classification: G06F11/07

    Abstract: The invention relates to a method and a device for determining an estimated duration before a technical incident, said method comprising: a step (120) of receiving performance indicator values, a step (140) of identifying anomalous performance indicators, a step (150) of identifying first at-risk indicators, a step (160) of identifying other at-risk indicators, and a step (170) of determining an estimated duration before a technical incident comprising a calculation, from the anomalous indicators and at-risk indicators identified, of a shortest path leading to a risk of technical incident, and a calculation of an estimated duration before a technical incident, said estimated duration being calculated from the values of duration before becoming anomalous between correlated performance indicators for each of the performance indicators constituting the shortest path calculated.
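    Illustrative sketch: the shortest-path calculation of step (170) could look roughly like the following. This is only a sketch under stated assumptions, not the claimed implementation: the correlation graph, the indicator names, and the function `estimate_time_to_incident` are hypothetical, and Dijkstra's algorithm is chosen here as one common way to compute a shortest path over non-negative durations.

```python
import heapq

def estimate_time_to_incident(graph, anomalous, incident_nodes):
    """Return (duration, path) for the shortest path from any anomalous
    indicator to any incident-risk node, or (None, []) if none is reachable.

    graph: indicator -> {correlated indicator: duration before it becomes
    anomalous}. All names and durations here are hypothetical.
    """
    dist = {n: 0.0 for n in anomalous}
    prev = {}
    heap = [(0.0, n) for n in sorted(anomalous)]
    heapq.heapify(heap)
    while heap:
        d, node = heapq.heappop(heap)
        if d > dist.get(node, float("inf")):
            continue  # stale queue entry
        if node in incident_nodes:
            # Rebuild the path by walking predecessors back to a source.
            path = [node]
            while path[-1] in prev:
                path.append(prev[path[-1]])
            return d, path[::-1]
        for nxt, delay in graph.get(node, {}).items():
            nd = d + delay
            if nd < dist.get(nxt, float("inf")):
                dist[nxt] = nd
                prev[nxt] = node
                heapq.heappush(heap, (nd, nxt))
    return None, []

# Hypothetical correlation graph; durations in minutes.
graph = {
    "cpu_load": {"mem_swap": 5.0, "disk_io": 12.0},
    "mem_swap": {"disk_io": 3.0, "response_time": 20.0},
    "disk_io": {"response_time": 4.0},
}
duration, path = estimate_time_to_incident(graph, {"cpu_load"}, {"response_time"})
```

    The estimated duration is the sum of the per-edge durations along the returned path, which mirrors the abstract's description of summing the durations before becoming anomalous along the shortest path.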

    SYSTEM FOR DETERMINING A CARBON FOOTPRINT OF A COMPUTING INFRASTRUCTURE IN REAL TIME

    Publication No.: US20230195587A1

    Publication date: 2023-06-22

    Application No.: US18081602

    Filing date: 2022-12-14

    Applicant: BULL SAS

    IPC classification: G06F11/30

    Abstract: A device for determining a carbon footprint of a computing infrastructure in real time, including a collector that collects equipment data that includes, for each computing equipment, a measurement of its energy consumption and its time-stamped location, a receiver that receives, for each site, a carbon intensity value provided by an electricity supplier of that site, and a consolidator that associates the carbon intensity value of a site with each computing equipment according to the time-stamped location of the computing equipment. The consolidator calculates the carbon footprint of each computing equipment from the associated carbon intensity value and the measured energy consumption.
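    Illustrative sketch: the consolidation described above amounts to multiplying each energy measurement by the carbon intensity of the site where the equipment was located at that time. The sketch below assumes hypothetical names (`Measurement`, `carbon_footprint`), energy in kWh, intensity in gCO2 per kWh, and that the applicable intensity is the latest value published at or before the measurement; none of these details come from the abstract.

```python
from dataclasses import dataclass

@dataclass
class Measurement:
    equipment_id: str
    site: str          # time-stamped location of the equipment
    timestamp: int     # epoch seconds (assumed unit)
    energy_kwh: float  # measured energy consumption (assumed unit)

def carbon_footprint(measurements, intensity_by_site):
    """Return equipment_id -> footprint in gCO2.

    intensity_by_site: site -> list of (timestamp, gCO2 per kWh),
    sorted by timestamp (hypothetical supplier feed).
    """
    totals = {}
    for m in measurements:
        # Latest intensity value published at or before the measurement.
        applicable = [v for t, v in intensity_by_site[m.site] if t <= m.timestamp]
        if not applicable:
            continue  # no intensity known yet for this site
        footprint = m.energy_kwh * applicable[-1]
        totals[m.equipment_id] = totals.get(m.equipment_id, 0.0) + footprint
    return totals

# Hypothetical data: node1 moves from one site to another.
measurements = [
    Measurement("node1", "paris", 100, 2.0),
    Measurement("node1", "berlin", 200, 1.0),
]
intensity = {"paris": [(0, 50.0)], "berlin": [(0, 400.0), (150, 300.0)]}
totals = carbon_footprint(measurements, intensity)
```

    Keying the intensity lookup on the time-stamped site is what lets the footprint follow equipment that is moved between sites.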

    METHOD AND SYSTEM FOR RELEASING RESOURCES OF HIGH-PERFORMANCE COMPUTATION SYSTEM

    Publication No.: US20240330064A1

    Publication date: 2024-10-03

    Application No.: US18618897

    Filing date: 2024-03-27

    Applicant: BULL SAS

    IPC classification: G06F9/50

    CPC classification: G06F9/5044 G06F9/5022

    Abstract: A method for releasing resources in a high-performance computer, the high-performance computer including at least one node, wherein each node includes at least one resource and is associated with at least one metric, each metric taking values within a range of values divided into a plurality of sub-ranges of values. The method includes, for each node including a resource allocated to a job and for each metric associated with the node, obtaining a set of samples and counting, for each sub-range of values, the number of samples whose values fall within said sub-range, in order to obtain a plurality of counted numbers. The method includes determining, using at least one machine learning model, from the plurality of counted numbers, whether the job is active or inactive and, if the job is inactive, emitting a termination command to terminate the job and release each resource allocated to the job.
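    Illustrative sketch: the counting step is essentially a histogram over the sub-ranges of a metric. The sketch below assumes hypothetical names (`count_per_subrange`, `job_is_inactive`, `toy_model`) and half-open sub-ranges with the upper bound included in the last one. `toy_model` is only a stand-in so the example runs; the abstract calls for a trained machine learning model, which is not reproduced here.

```python
from bisect import bisect_right

def count_per_subrange(samples, edges):
    """Count samples in each sub-range [edges[i], edges[i+1]).

    The last sub-range also includes its upper bound; samples outside
    the overall range are ignored (an assumption of this sketch).
    """
    counts = [0] * (len(edges) - 1)
    for s in samples:
        if s < edges[0] or s > edges[-1]:
            continue
        idx = min(bisect_right(edges, s) - 1, len(counts) - 1)
        counts[idx] += 1
    return counts

def job_is_inactive(counted_numbers, model):
    """Delegate the active/inactive decision to a model (any callable)."""
    return bool(model(counted_numbers))

def toy_model(counts):
    # Stand-in for a trained classifier: "inactive" when at least 90% of
    # the samples sit in the lowest sub-range. Purely hypothetical.
    total = sum(counts)
    return total > 0 and counts[0] / total >= 0.9

# Hypothetical CPU-usage samples (percent) for one node of a job.
edges = [0, 25, 50, 75, 100]
counts = count_per_subrange([1, 0, 2, 1, 0, 3, 1, 0, 2, 1], edges)
if job_is_inactive(counts, toy_model):
    pass  # here a scheduler-specific termination command would be emitted
```

    Feeding counts rather than raw samples gives the model a fixed-size input regardless of how many samples were collected.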

    METHOD FOR SCHEDULING A SET OF COMPUTING TASKS IN A SUPERCOMPUTER

    Publication No.: US20230085116A1

    Publication date: 2023-03-16

    Application No.: US17943385

    Filing date: 2022-09-13

    Applicant: BULL SAS

    Inventor: Pierre SEROUL

    IPC classification: G06F9/48 G06N20/00

    Abstract: A method for scheduling computing tasks on a supercomputer, the method including offline reinforcement learning (OFRL) of a scheduler on a database (LDB). The database includes at least one execution history (HIST) that includes the state (LHPCS) of a learning supercomputer at several moments (T, T-1); the actions (LACT) related to the scheduling of learning tasks on the learning supercomputer at those moments (T); and a reward (REW) related to each task. The method also includes the use of the trained scheduler on the computing tasks to be scheduled.
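    Illustrative sketch: offline reinforcement learning means the scheduler is trained only from a logged history, without interacting with the machine during training. The sketch below uses tabular batch Q-learning as a simple stand-in; the abstract does not say which learning algorithm is used, and the state labels, action names, and functions `offline_q_learning` and `schedule` are all hypothetical.

```python
from collections import defaultdict

def offline_q_learning(history, actions, alpha=0.5, gamma=0.9, epochs=50):
    """Learn Q-values from logged (state, action, reward, next_state) tuples.

    next_state is None when the logged episode ends. No new interaction
    with the supercomputer takes place: only the history is replayed.
    """
    q = defaultdict(float)
    for _ in range(epochs):
        for state, action, reward, next_state in history:
            if next_state is None:
                best_next = 0.0
            else:
                best_next = max(q[(next_state, a)] for a in actions)
            target = reward + gamma * best_next
            q[(state, action)] += alpha * (target - q[(state, action)])
    return q

def schedule(q, state, actions):
    """Use the trained table: pick the action with the highest Q-value."""
    return max(actions, key=lambda a: q[(state, a)])

# Hypothetical execution history: in state "idle", starting the short
# task earned a higher reward than starting the long one.
actions = ["run_short", "run_long"]
history = [
    ("idle", "run_short", 1.0, None),
    ("idle", "run_long", 0.2, None),
]
q = offline_q_learning(history, actions)
chosen = schedule(q, "idle", actions)
```

    A real supercomputer state is far too large for a table, so a practical scheduler would replace the table with a function approximator; the table is used here only to keep the example self-contained.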

    METHOD FOR REPRESENTING A DISTRIBUTED COMPUTING SYSTEM BY GRAPH EMBEDDING

    Publication No.: US20230055902A1

    Publication date: 2023-02-23

    Application No.: US17891095

    Filing date: 2022-08-18

    Applicant: BULL SAS

    IPC classification: G06F15/80

    Abstract: A method of representing a distributed computing system, the distributed computing system comprising a plurality of processing devices connected together according to a predefined topology. The method comprises receiving at least one piece of data from an activity log file relating to at least one processing device among the plurality of processing devices, receiving at least one metric relating to at least one processing device among the plurality of processing devices, receiving at least the predefined topology of the distributed computing system, constructing a graph representative of the operation of the distributed computing system, the graph comprising the piece of data from the received log file, the received metric, and the received topology, and embedding at least one part of the graph to obtain at least one state vector representing the at least one part of the embedded graph.
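    Illustrative sketch: one simple way to turn such a graph into a state vector is a single round of neighbour averaging followed by mean pooling. This is a swapped-in, deliberately minimal technique chosen for illustration; the abstract does not specify the embedding method, and the function `embed_graph` and the two-element node features (a metric value and a log event count) are hypothetical.

```python
def embed_graph(topology, features):
    """Return one state vector for the (sub)graph.

    topology: node -> list of neighbouring nodes (the predefined topology).
    features: node -> [metric value, log event count] (hypothetical layout).
    """
    node_vectors = []
    for node, feats in features.items():
        neighbours = [features[n] for n in topology.get(node, [])]
        if neighbours:
            # Column-wise mean of the neighbours' features.
            mean = [sum(col) / len(neighbours) for col in zip(*neighbours)]
        else:
            mean = [0.0] * len(feats)
        # Node representation: own features followed by neighbourhood mean.
        node_vectors.append(list(feats) + mean)
    # Mean-pool all node representations into a single state vector.
    return [sum(col) / len(node_vectors) for col in zip(*node_vectors)]

# Hypothetical three-device chain a - b - c.
topology = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
features = {"a": [1.0, 0.0], "b": [3.0, 2.0], "c": [5.0, 4.0]}
state_vector = embed_graph(topology, features)
```

    Passing only a subset of nodes in `features` and `topology` yields a state vector for part of the graph, in the spirit of embedding "at least one part" of it.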

    METHOD AND DEVICE FOR DETERMINING A TECHNICAL INCIDENT RISK VALUE IN A COMPUTING INFRASTRUCTURE FROM PERFORMANCE INDICATOR VALUES

    Publication No.: US20210026719A1

    Publication date: 2021-01-28

    Application No.: US16927114

    Filing date: 2020-07-13

    Applicant: BULL SAS

    IPC classification: G06F11/07 G06N20/00

    Abstract: The invention relates to a device and a method (100) for determining a technical incident risk value in an infrastructure (5), said method comprising: a step of receiving (120) performance indicator values, a step of identifying (140) anomalous performance indicators, so as to identify abnormal values and the performance indicators associated with these abnormal values, a step of determining (150) at-risk indicators, comprising an identification of performance indicators of the computing infrastructure that are correlated with the identified anomalous indicators, a step of creating (160) an augmented anomalies vector, comprising the identifiers of the identified anomalous indicators and the identifiers of the determined at-risk indicators, and a determination step (170), comprising the comparison of the augmented anomalies vector with predetermined technical incident reference data.
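    Illustrative sketch: steps (160) and (170) can be pictured as building a set of indicator identifiers and scoring it against known incident signatures. The sketch below assumes that the reference data is a mapping from incident name to a set of indicator identifiers and uses Jaccard similarity as the comparison; both choices, the function `risk_value`, and all identifiers are hypothetical and not taken from the abstract.

```python
def risk_value(anomalous, at_risk, reference_incidents):
    """Return (best matching incident, similarity in [0, 1]).

    The augmented anomalies vector is modelled as the union of the
    anomalous and at-risk indicator identifiers.
    """
    augmented = set(anomalous) | set(at_risk)
    best_name, best_score = None, 0.0
    for name, signature in reference_incidents.items():
        union = augmented | signature
        # Jaccard similarity: shared identifiers over all identifiers.
        score = len(augmented & signature) / len(union) if union else 0.0
        if score > best_score:
            best_name, best_score = name, score
    return best_name, best_score

# Hypothetical reference data for two past incidents.
reference = {
    "storage_saturation": {"cpu_load", "mem_swap", "disk_io", "iowait"},
    "network_outage": {"packet_loss", "latency"},
}
incident, score = risk_value({"cpu_load"}, {"mem_swap", "disk_io"}, reference)
```

    Including the at-risk indicators in the vector lets a partially developed incident match its reference signature before every indicator has actually become anomalous.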