-
Publication No.: US10540227B2
Publication Date: 2020-01-21
Application No.: US15861381
Filing Date: 2018-01-03
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Charles Johnson, Onkar Patil, Mesut Kuscu, Tuan Tran, Joseph Tucek, Harumi Kuno, Milind Chabbi, William Scherer
Abstract: A high performance computing system including processing circuitry and a shared fabric memory is disclosed. The processing circuitry includes processors coupled to local storages. The shared fabric memory includes memory devices and is coupled to the processing circuitry. The shared fabric memory executes a first sweep of a stencil code by sequentially retrieving data stripes. Further, for each retrieved data stripe, a set of values of the retrieved data stripe are updated substantially simultaneously. For each retrieved data stripe, the updated set of values are stored in a free memory gap adjacent to the retrieved data stripe. For each retrieved data stripe, the free memory gap is advanced to an adjacent memory location. A sweep status indicator is incremented from the first sweep to a second sweep.
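The sweep described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: plain Python lists stand in for the shared fabric memory, the three-point averaging stencil is an arbitrary choice, and the names update_stripe, run_sweep and run_sweeps are hypothetical. Reversing the sweep direction once the gap reaches the far end is also an assumption made here to keep the example short.

```python
def update_stripe(prev_old, old, next_old):
    """Average each value with its neighbours in the adjacent stripes.

    A missing neighbour (grid edge) is replaced by the stripe itself.
    The whole stripe is recomputed in one pass, standing in for the
    "substantially simultaneous" update mentioned in the abstract.
    """
    prev_old = old if prev_old is None else prev_old
    next_old = old if next_old is None else next_old
    return [(a + b + c) / 3.0 for a, b, c in zip(prev_old, old, next_old)]


def run_sweep(slots, gap):
    """Run one sweep over `slots`, which holds the stripes plus one free slot.

    `gap` is the index of the free slot and must be at either end.
    Each updated stripe is written into the gap next to the stripe it
    came from, and the gap then advances onto that stripe's old slot.
    The old values left behind in the gap are still readable, so the
    next stripe can use them as its neighbour.
    """
    n = len(slots)
    if gap not in (0, n - 1):
        raise ValueError("gap must start at one end of the buffer")
    step = 1 if gap == 0 else -1          # direction in which the gap moves
    for _ in range(n - 1):
        src = gap + step                  # stripe adjacent to the gap
        nxt = src + step                  # stripe not yet processed
        next_old = slots[nxt] if 0 <= nxt < n else None
        slots[gap] = update_stripe(slots[gap], slots[src], next_old)
        gap = src                         # advance the gap
    slots[gap] = None                     # stale data is no longer needed
    return gap


def run_sweeps(slots, sweeps):
    """Run several sweeps, counting them in a sweep status indicator."""
    gap = 0                               # assumes slots[0] is the free slot
    sweep_status = 0
    while sweep_status < sweeps:
        gap = run_sweep(slots, gap)
        sweep_status += 1                 # first sweep -> second sweep, ...
    return gap, sweep_status
```

Because the new values always land in the gap, only one spare stripe of memory is needed rather than a second full copy of the grid.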
-
Publication No.: US10565037B2
Publication Date: 2020-02-18
Application No.: US15847067
Filing Date: 2017-12-19
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Charles Johnson, Mesut Kuscu, Onkar Patil, James Hyungsun Park, Harumi Kuno, Robert Schreiber
Abstract: A high performance computing system that includes a shared fabric memory and a plurality of processors is disclosed. A first processor is coupled to a local storage and executes a first process that, in combination with other processes, causes the plurality of processors to perform certain actions including transferring, from the shared fabric memory to the local storage, a first value corresponding to a first cell of a first set of cells and a first sweep of a stencil code. The actions further include transferring, from a first logical partition in the shared fabric memory associated with the first cell to the local storage, a second value corresponding to a second cell related to the first cell and not in the first set of cells. Further, these actions include updating, by the first process, the first value based on at least the first value and the second value.
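The data movement in this abstract can be pictured with a small sketch. Everything below is an assumption for illustration: dictionaries stand in for the shared fabric memory, its logical partitions and the local storage, the update rule is a plain two-value average, and the function name update_cell is hypothetical.

```python
def update_cell(fabric_cells, partitions, cell_to_partition, cell, neighbor, local):
    """Update one cell of a process's own set using one outside neighbour.

    fabric_cells      : cell id -> value for the current sweep
    partitions        : partition id -> {cell id -> value}
    cell_to_partition : cell id -> id of the partition associated with it
    local             : the process's local storage (a dict)
    """
    # Transfer the first value from shared fabric memory to local storage.
    local[cell] = fabric_cells[cell]
    # Transfer the second value from the partition associated with the
    # first cell; the neighbour is not in this process's own set of cells.
    partition = partitions[cell_to_partition[cell]]
    local[neighbor] = partition[neighbor]
    # Update the first value from both values (illustrative rule only).
    local[cell] = 0.5 * (local[cell] + local[neighbor])
    return local[cell]
```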
-
Publication No.: US20180293144A1
Publication Date: 2018-10-11
Application No.: US15764040
Filing Date: 2015-09-24
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Charles Johnson, Harumi Kuno, Al Davis
IPC: G06F11/20
Abstract: In some examples, a node of a computing system may include a failure identification engine and a failure response engine. The failure identification engine may identify a failure condition for a system function of the node and the failure response engine may store a failure indication in a shared memory to trigger takeover of the system function by a different node of the computing system.
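A minimal sketch of the takeover mechanism in this abstract follows. It assumes a dictionary as the shared memory and a polling peer node. The Node class, its method names and the layout of the failure indication are hypothetical, chosen only to show the two engines handing work over through shared state.

```python
class Node:
    """A node that owns some system functions and shares a memory region."""

    def __init__(self, name, shared_memory, functions):
        self.name = name
        self.shared = shared_memory          # dict standing in for shared memory
        self.functions = set(functions)      # system functions this node runs

    def identify_failure(self, function, healthy):
        """Failure identification engine: is one of our functions failing?"""
        return function in self.functions and not healthy

    def respond_to_failure(self, function):
        """Failure response engine: store a failure indication in shared memory."""
        self.shared[function] = {"failed_node": self.name, "taken_over_by": None}
        self.functions.discard(function)

    def poll_and_take_over(self):
        """Take over any function another node has flagged as failed."""
        taken = []
        for function, indication in self.shared.items():
            if indication["taken_over_by"] is None and indication["failed_node"] != self.name:
                indication["taken_over_by"] = self.name
                self.functions.add(function)
                taken.append(function)
        return taken
```

In this sketch the failing node never contacts its peer directly; the indication in shared memory is the only trigger for the takeover.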
-
Publication No.: US10922137B2
Publication Date: 2021-02-16
Application No.: US16073573
Filing Date: 2016-04-27
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Qiong Cai, Charles Johnson, Paolo Faraboschi
IPC: G06F9/50, G06F11/30, G06F12/1027, G06F9/38, G06F9/46, G06F9/48, G06F9/52, G06F12/0811
Abstract: In one example, a central processing unit (CPU) with dynamic thread mapping includes a set of multiple cores each with a set of multiple threads. A set of registers for each of the multiple threads monitors for in-flight memory requests the number of loads from and stores to at least a first memory interface and a second memory interface by each respective thread. The second memory interface has a greater latency than the first memory interface. The CPU further has logic to map and migrate each thread to respective CPU cores where the number of cores accessing only one of the at least first and second memory interfaces is maximized.
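The mapping goal in this abstract can be sketched in software, although the patent describes hardware registers and logic. The sketch below is a simple greedy heuristic, not the claimed mechanism and not guaranteed to be optimal: it reads per-thread counters, groups threads by which memory interface they use, and fills cores group by group. The names classify, map_threads and pure_cores are hypothetical.

```python
def classify(first_count, second_count):
    """Classify a thread by its in-flight request counters.

    0: uses only the first interface (idle threads are counted here too)
    1: uses only the second, higher-latency interface
    2: uses both interfaces
    """
    if second_count == 0:
        return 0
    if first_count == 0:
        return 1
    return 2


def map_threads(counters, threads_per_core):
    """Group threads of the same class onto the same cores.

    counters maps thread id -> (first-interface count, second-interface count).
    Returns a list of cores, each a list of thread ids.
    """
    order = sorted(counters, key=lambda t: (classify(*counters[t]), t))
    return [order[i:i + threads_per_core]
            for i in range(0, len(order), threads_per_core)]


def pure_cores(mapping, counters):
    """Count the cores whose threads all access only one interface."""
    count = 0
    for core in mapping:
        classes = {classify(*counters[t]) for t in core}
        if classes == {0} or classes == {1}:
            count += 1
    return count
```

With four threads that alternate between interfaces and two threads per core, a naive in-order placement leaves no core using a single interface, while the grouped placement yields two such cores.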
-
Publication No.: US20190034239A1
Publication Date: 2019-01-31
Application No.: US16073573
Filing Date: 2016-04-27
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Qiong Cai, Charles Johnson, Paolo Faraboschi
IPC: G06F9/50, G06F9/48, G06F9/52, G06F12/0811
Abstract: In one example, a central processing unit (CPU) with dynamic thread mapping includes a set of multiple cores each with a set of multiple threads. A set of registers for each of the multiple threads monitors for in-flight memory requests the number of loads from and stores to at least a first memory interface and a second memory interface by each respective thread. The second memory interface has a greater latency than the first memory interface. The CPU further has logic to map and migrate each thread to respective CPU cores where the number of cores accessing only one of the at least first and second memory interfaces is maximized.
-
Publication No.: US20190205205A1
Publication Date: 2019-07-04
Application No.: US15861381
Filing Date: 2018-01-03
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Charles Johnson, Onkar Patil, Mesut Kuscu, Tuan Tran, Joseph Tucek, Harumi Kuno, Milind Chabbi, William Scherer
Abstract: A high performance computing system including processing circuitry and a shared fabric memory is disclosed. The processing circuitry includes processors coupled to local storages. The shared fabric memory includes memory devices and is coupled to the processing circuitry. The shared fabric memory executes a first sweep of a stencil code by sequentially retrieving data stripes. Further, for each retrieved data stripe, a set of values of the retrieved data stripe are updated substantially simultaneously. For each retrieved data stripe, the updated set of values are stored in a free memory gap adjacent to the retrieved data stripe. For each retrieved data stripe, the free memory gap is advanced to an adjacent memory location. A sweep status indicator is incremented from the first sweep to a second sweep.
-
Publication No.: US20190187924A1
Publication Date: 2019-06-20
Application No.: US15847067
Filing Date: 2017-12-19
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Charles Johnson, Mesut Kuscu, Onkar Patil, James Hyungsun Park, Harumi Kuno, Robert Schreiber
IPC: G06F3/06
Abstract: A high performance computing system that includes a shared fabric memory and a plurality of processors is disclosed. A first processor is coupled to a local storage and executes a first process that, in combination with other processes, causes the plurality of processors to perform certain actions including transferring, from the shared fabric memory to the local storage, a first value corresponding to a first cell of a first set of cells and a first sweep of a stencil code. The actions further include transferring, from a first logical partition in the shared fabric memory associated with the first cell to the local storage, a second value corresponding to a second cell related to the first cell and not in the first set of cells. Further, these actions include updating, by the first process, the first value based on at least the first value and the second value.