-
Publication Number: US20210201136A1
Publication Date: 2021-07-01
Application Number: US17044633
Application Date: 2018-04-30
Applicant: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
Inventor: Sai Rahul Chalamalasetti , Paolo Faraboschi , Martin Foltin , Catherine Graves , Dejan S. Milojicic , John Paul Strachan , Sergey Serebryakov
Abstract: A crossbar array includes a number of memory elements, a vector input register, and a vector output register. An analog-to-digital converter (ADC) is electronically coupled to the vector output register. A digital-to-analog converter (DAC) is electronically coupled to the vector input register. A processor is electronically coupled to the ADC and to the DAC. The processor may be configured to determine whether division of input vector data by output vector data from the crossbar array is within a threshold value and, if not, determine the changed data values between the output vector data and the input vector data, and write the changed data values to the memory elements of the crossbar array.
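The write-verify loop this abstract describes can be sketched digitally as follows; the ratio test, the threshold semantics, and all names are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def refresh_elements(written, read_back, threshold=0.05):
    """Flag memory elements whose read-back value has drifted from the
    written value (checked as the ratio written / read_back), and return
    the values to rewrite -- a digital model of the analog check."""
    ratio = np.divide(written, read_back,
                      out=np.ones_like(written), where=read_back != 0)
    drifted = np.abs(ratio - 1.0) > threshold
    return drifted, written[drifted]

written = np.array([1.0, 2.0, 3.0, 4.0])
read_back = np.array([1.01, 2.5, 2.99, 4.0])   # element 1 has drifted
drifted, rewrite = refresh_elements(written, read_back)
```

Only the drifted element would be rewritten, matching the abstract's step of writing the changed data values back to the memory elements.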
-
Publication Number: US10922137B2
Publication Date: 2021-02-16
Application Number: US16073573
Application Date: 2016-04-27
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Qiong Cai , Charles Johnson , Paolo Faraboschi
IPC: G06F9/50 , G06F11/30 , G06F12/1027 , G06F9/38 , G06F9/46 , G06F9/48 , G06F9/52 , G06F12/0811
Abstract: In one example, a central processing unit (CPU) with dynamic thread mapping includes a set of multiple cores, each with a set of multiple threads. A set of registers for each of the multiple threads monitors, for in-flight memory requests, the number of loads from and stores to at least a first memory interface and a second memory interface by each respective thread. The second memory interface has a greater latency than the first memory interface. The CPU further has logic to map and migrate each thread to respective CPU cores such that the number of cores accessing only one of the at least first and second memory interfaces is maximized.
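A greedy version of the mapping policy in this abstract might look like the following sketch; the per-thread hardware registers are modeled as a dict, and all names and the packing rule are hypothetical:

```python
def map_threads(thread_counts, threads_per_core=2):
    # thread_counts: thread id -> (near_interface_accesses, far_interface_accesses),
    # standing in for the per-thread monitoring registers in the abstract.
    near = sorted(t for t, (n, f) in thread_counts.items() if n >= f)
    far = sorted(t for t, (n, f) in thread_counts.items() if n < f)
    placement, core = {}, 0
    for group in (near, far):
        for i, t in enumerate(group):
            placement[t] = core + i // threads_per_core   # pack the group densely
        if group:
            core += -(-len(group) // threads_per_core)    # ceil-divide to next free core
    return placement

counts = {0: (10, 1), 1: (2, 9), 2: (8, 0), 3: (1, 7)}
placement = map_threads(counts)
```

Packing threads dominated by the same interface onto the same cores maximizes the number of cores that touch only one of the two interfaces.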
-
Publication Number: US20200091930A1
Publication Date: 2020-03-19
Application Number: US16131722
Application Date: 2018-09-14
Applicant: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
Inventor: Anirban Nag , Naveen Muralimanohar , Paolo Faraboschi
Abstract: Computer-implemented methods, systems, and devices to perform lossless compression of floating point format time-series data are disclosed. A first data value may be obtained in floating point format representative of an initial time-series parameter. For example, an output checkpoint of a computer simulation of a real-world event such as weather prediction or nuclear reaction simulation. A first predicted value may be determined representing the parameter at a first checkpoint time. A second data value may be obtained from the simulation. A prediction error may be calculated. Another predicted value may be generated for a next point in time and may be adjusted by the previously determined prediction error (e.g., to increase accuracy of the subsequent prediction). When a third data value is obtained, the adjusted prediction value may be used to generate a difference (e.g., XOR) for storing in a compressed data store to represent the third data value.
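The predict-correct-XOR pipeline in this abstract can be sketched with a last-value predictor; the predictor choice and the 64-bit IEEE-754 packing are assumptions, since the abstract fixes neither:

```python
import struct

def f2bits(x):
    return struct.unpack('<Q', struct.pack('<d', x))[0]

def bits2f(b):
    return struct.unpack('<d', struct.pack('<Q', b))[0]

def compress(series):
    # Last-value predictor, corrected by the previous prediction error;
    # close predictions give XOR residues with long runs of zero bits,
    # which a real coder would then entropy-pack.
    residues, err = [], 0.0
    for prev, cur in zip(series, series[1:]):
        pred = prev + err            # predict from last value + last error
        residues.append(f2bits(cur) ^ f2bits(pred))
        err = cur - pred             # remember the new prediction error
    return series[0], residues

def decompress(first, residues):
    out, err = [first], 0.0
    for r in residues:
        pred = out[-1] + err         # regenerate the identical prediction
        cur = bits2f(r ^ f2bits(pred))
        err = cur - pred
        out.append(cur)
    return out
```

Because the decompressor regenerates bit-identical predictions, the XOR residues reconstruct every value exactly, making the scheme lossless.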
-
Publication Number: US20200073755A1
Publication Date: 2020-03-05
Application Number: US16115100
Application Date: 2018-08-28
Applicant: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
Inventor: John Paul Strachan , Catherine Graves , Dejan S. Milojicic , Paolo Faraboschi , Martin Foltin , Sergey Serebryakov
Abstract: A computer system includes multiple memory array components that include respective analog memory arrays which are sequenced to implement a multi-layer process. An error array data structure is obtained for at least a first memory array component, and from which a determination is made as to whether individual nodes (or cells) of the error array data structure are significant. A determination can be made as to any remedial operations that can be performed to mitigate errors of significance.
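A minimal sketch of the significance scan over an error array; the threshold test and all names are assumed for illustration:

```python
def significant_cells(error_array, threshold):
    # Scan the error array obtained for a memory array component and
    # report (row, col) positions whose error magnitude is significant,
    # i.e. candidates for a remedial operation.
    return [(r, c)
            for r, row in enumerate(error_array)
            for c, e in enumerate(row)
            if abs(e) > threshold]
```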
-
Publication Number: US20200042287A1
Publication Date: 2020-02-06
Application Number: US16052218
Application Date: 2018-08-01
Applicant: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
Inventor: Sai Rahul Chalamalasetti , Paolo Faraboschi , Martin Foltin , Catherine Graves , Dejan S. Milojicic , Sergey Serebryakov , John Paul Strachan
Abstract: Disclosed techniques provide for dynamically changing the precision of a multi-stage compute process, for example, changing neural network (NN) parameters on a per-layer basis depending on properties of incoming data streams and per-layer performance of an NN, among other considerations. NNs include multiple layers that may each be calculated with a different degree of accuracy and, therefore, compute resource overhead (e.g., memory, processor resources, etc.). NNs are usually trained with 32-bit or 16-bit floating-point numbers. Once trained, an NN may be deployed in production. One approach to reducing compute overhead is to reduce the parameter precision of NNs to 16-bit or 8-bit for deployment. The conversion to an acceptable lower precision is usually determined manually before deployment, and precision levels are fixed while deployed. Disclosed techniques and implementations address automatic rather than manual determination of precision levels for different stages and dynamically adjusting precision for each stage at run-time.
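One plausible shape for the run-time, per-layer precision policy; the doubling/halving rule and the bit-width bounds are illustrative assumptions, not the patent's mechanism:

```python
def adjust_precision(layer_bits, layer_error, tolerance, min_bits=8, max_bits=32):
    # Per layer: if the observed error exceeds the tolerance, raise the
    # bit width; otherwise try a lower one. A real controller would also
    # damp oscillation, omitted here for brevity.
    new_bits = {}
    for layer, bits in layer_bits.items():
        if layer_error[layer] > tolerance:
            new_bits[layer] = min(bits * 2, max_bits)
        else:
            new_bits[layer] = max(bits // 2, min_bits)
    return new_bits
```

Run after each batch, such a policy lets each stage settle at the lowest precision whose error stays acceptable, instead of fixing one precision before deployment.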
-
Publication Number: US10282302B2
Publication Date: 2019-05-07
Application Number: US15199285
Application Date: 2016-06-30
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Qiong Cai , Paolo Faraboschi
IPC: G06F12/00 , G06F12/0875 , G06F12/0893 , G06F9/46 , G06F12/0866
Abstract: Examples disclosed herein relate to programmable memory-side cache management. Some examples disclosed herein may include a programmable memory-side cache and a programmable memory-side cache controller. The programmable memory-side cache may locally store data of a system memory. The programmable memory-side cache controller may include programmable processing cores, each of the programmable processing cores configurable by cache configuration codes to manage the programmable memory-side cache for different applications.
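The pluggable-management idea can be modeled in software by making the eviction policy a callable, standing in for a cache configuration code loaded onto one of the programmable cores; this is a loose sketch, not the patent's hardware design:

```python
class MemorySideCache:
    # The eviction policy is a pluggable callable, standing in for a
    # per-application cache configuration code on a programmable core.
    def __init__(self, capacity, evict_policy):
        self.capacity, self.evict = capacity, evict_policy
        self.store = {}                      # address -> value, insertion-ordered

    def access(self, addr, memory):
        if addr in self.store:               # hit: serve locally
            return self.store[addr]
        if len(self.store) >= self.capacity: # miss on a full cache: make room
            del self.store[self.evict(self.store)]
        self.store[addr] = memory[addr]      # fill from system memory
        return self.store[addr]

def fifo(store):
    return next(iter(store))                 # evict the oldest insertion
```

Swapping `fifo` for a different callable reconfigures the cache for a different application without touching the cache itself.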
-
Publication Number: US20190034239A1
Publication Date: 2019-01-31
Application Number: US16073573
Application Date: 2016-04-27
Applicant: Hewlett Packard Enterprise Development LP
Inventor: Qiong Cai , Charles Johnson , Paolo Faraboschi
IPC: G06F9/50 , G06F9/48 , G06F9/52 , G06F12/0811
Abstract: In one example, a central processing unit (CPU) with dynamic thread mapping includes a set of multiple cores, each with a set of multiple threads. A set of registers for each of the multiple threads monitors, for in-flight memory requests, the number of loads from and stores to at least a first memory interface and a second memory interface by each respective thread. The second memory interface has a greater latency than the first memory interface. The CPU further has logic to map and migrate each thread to respective CPU cores such that the number of cores accessing only one of the at least first and second memory interfaces is maximized.
-
Publication Number: US20190012222A1
Publication Date: 2019-01-10
Application Number: US15645105
Application Date: 2017-07-10
Applicant: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
Inventor: Derek Alan Sherlock , Shawn Walker , Paolo Faraboschi
Abstract: In some examples, a controller includes a counter to track errors associated with a group of memory access operations, and processing logic to detect an error associated with the group of memory access operations, determine whether the detected error causes an error state change of the group of memory access operations, and cause advancing of the counter responsive to determining that the detected error causes the error state change of the group of memory access operations.
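The state-change counting rule reads as follows in a few lines of Python; this is a sketch, and modeling the group's error state as just the kind of the last error is a simplification:

```python
class GroupErrorCounter:
    # Advance the counter only when a detected error changes the group's
    # error state, so repeated reports of the same condition count once.
    def __init__(self):
        self.count, self.state = 0, None     # None models the no-error state

    def report(self, error_kind):
        if error_kind != self.state:         # state change detected
            self.state = error_kind
            self.count += 1
```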
-
Publication Number: US20180204024A1
Publication Date: 2018-07-19
Application Number: US15746494
Application Date: 2015-07-29
Applicant: Hewlett Packard Enterprise Development Lp
Inventor: Mark Lillibridge , Paolo Faraboschi , Chris I. Dalton
CPC classification number: G06F21/70 , G06F3/0622 , G06F3/0659 , G06F3/0673 , G06F21/53 , G06F21/74
Abstract: Techniques for a firewall to determine access to a portion of memory are provided. In one aspect, an access request to access a portion of memory within a pool of shared memory may be received at a firewall. The firewall may determine whether the access request to access the portion of memory is allowed. The access request may be allowed to proceed based on the determination. The operation of the firewall may not utilize address translation.
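The translation-free check can be sketched as plain range-and-permission matching on untranslated addresses; the requester names, rule format, and operation set are all hypothetical:

```python
def firewall_allows(request, rules):
    # Pure range-and-permission matching on the untranslated address:
    # no page tables and no address translation, as the abstract notes.
    for base, limit, perms in rules.get(request['requester'], []):
        if base <= request['addr'] < limit and request['op'] in perms:
            return True
    return False

rules = {'nodeA': [(0x1000, 0x2000, {'read', 'write'})]}
```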
-
Publication Number: US20230418792A1
Publication Date: 2023-12-28
Application Number: US17851546
Application Date: 2022-06-28
Applicant: HEWLETT PACKARD ENTERPRISE DEVELOPMENT LP
Inventor: Annmary Justine KOOMTHANAM , Suparna Bhattacharya , Aalap Tripathy , Sergey Serebryakov , Martin Foltin , Paolo Faraboschi
IPC: G06F16/215 , G06F16/25 , G06F16/27 , G06N20/00 , G06K9/62
CPC classification number: G06F16/215 , G06F16/254 , G06F16/27 , G06N20/00 , G06K9/6256
Abstract: Systems and methods are provided for automatically constructing data lineage representations for distributed data processing pipelines. These data lineage representations (which are constructed and stored in a central repository shared by the multiple data processing sites) can be used to, among other things, clone the distributed data processing pipeline for quality assurance or debugging purposes. Examples of the presently disclosed technology are able to construct data lineage representations for distributed data processing pipelines by (1) generating a hash content value for universally identifying each data artifact of the distributed data processing pipeline across the multiple processing stages/processing sites of the distributed data processing pipeline; and (2) creating a data processing pipeline abstraction hierarchy for associating each data artifact with input and output events for given executions of given data processing stages (performed by the multiple data processing sites).
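The two mechanisms the abstract names, a content hash for universal artifact identification and a per-stage input/output record, can be sketched as follows; the choice of SHA-256 and the record layout are assumptions:

```python
import hashlib

def artifact_id(content: bytes) -> str:
    # A content hash identifies a data artifact the same way at every
    # processing site, with no coordination between sites.
    return hashlib.sha256(content).hexdigest()

def record_stage(lineage, stage, inputs, outputs):
    # Append one execution of a processing stage to the shared lineage
    # log, keying input/output events by artifact content ids.
    lineage.append({'stage': stage,
                    'inputs': [artifact_id(a) for a in inputs],
                    'outputs': [artifact_id(a) for a in outputs]})
```

Because ids are derived from content alone, the output of one site's stage matches the input id recorded by the next site, which is what lets the lineage be stitched together centrally and the pipeline cloned.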