-
Publication No.: US12175351B2
Publication Date: 2024-12-24
Application No.: US15994144
Application Date: 2018-05-31
Applicant: Google LLC
Inventor: Milad Olia Hashemi, Parthasarathy Ranganathan, Jamie Alexander Smith, Kevin Jordan Swersky
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for pre-fetching data from memory using neural networks. One example system receives a sequence of prior program counter addresses of a computer program and corresponding delta values. The system creates an input representation based on the sequence. The system provides the input representation as input to a recurrent neural network. The system receives from the recurrent neural network an output that defines a probability distribution over future delta values. Each probability in the distribution represents a likelihood that execution of a future instruction of the computer program will cause data to be fetched from a particular future memory address.
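Illustrative sketch (not from the patent): the pipeline in this abstract, sequence of program counters and deltas to input representation to recurrent network to probability distribution over future deltas, can be mocked up in a few lines. Everything here is an assumption made for illustration: a delta is taken to be the difference between consecutive memory addresses, the network is a tiny untrained Elman-style RNN with random weights, and names such as `TinyDeltaRNN` and `to_ids` are hypothetical.

```python
import math
import random


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def matvec(matrix, vector):
    """Multiply a list-of-rows matrix by a vector."""
    return [sum(w * v for w, v in zip(row, vector)) for row in matrix]


def to_ids(values):
    """Map each distinct value to a small integer id, in order of first appearance."""
    vocab = {}
    return [vocab.setdefault(v, len(vocab)) for v in values], vocab


class TinyDeltaRNN:
    """Untrained recurrent network; weights are random, so outputs are illustrative only."""

    def __init__(self, n_pcs, n_deltas, embed=4, hidden=8, seed=0):
        rng = random.Random(seed)

        def rand(rows, cols):
            return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

        self.pc_embed = rand(n_pcs, embed)
        self.delta_embed = rand(n_deltas, embed)
        self.w_in = rand(hidden, 2 * embed)
        self.w_rec = rand(hidden, hidden)
        self.w_out = rand(n_deltas, hidden)
        self.hidden = hidden

    def predict(self, pc_ids, delta_ids):
        """Return a probability for each known delta id given the access history."""
        h = [0.0] * self.hidden
        for pc, delta in zip(pc_ids, delta_ids):
            # Input representation: PC embedding concatenated with delta embedding.
            x = self.pc_embed[pc] + self.delta_embed[delta]
            pre = [a + b for a, b in zip(matvec(self.w_in, x), matvec(self.w_rec, h))]
            h = [math.tanh(p) for p in pre]
        return softmax(matvec(self.w_out, h))


# Hypothetical trace: four loads and the addresses they touched.
pcs = [0x400A10, 0x400A18, 0x400A10, 0x400A18]
addresses = [0x1000, 0x1040, 0x1048, 0x1088, 0x1090]
deltas = [b - a for a, b in zip(addresses, addresses[1:])]

pc_ids, pc_vocab = to_ids(pcs)
delta_ids, delta_vocab = to_ids(deltas)
model = TinyDeltaRNN(len(pc_vocab), len(delta_vocab))
probs = model.predict(pc_ids, delta_ids)
```

A prefetcher built on such a model would add the most probable deltas to the current address to choose what to fetch; a useful model would first have to be trained on real traces, which this sketch omits.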
-
Publication No.: US20240231667A1
Publication Date: 2024-07-11
Application No.: US18152428
Application Date: 2023-01-10
Applicant: Google LLC
Inventor: Sheng Li, Sridhar Lakshmanamurthy, Norman Paul Jouppi, Martin Guy Dixon, Daniel Stodolsky, Quoc V. Le, Liqun Cheng, Erik Karl Norden, Parthasarathy Ranganathan
IPC: G06F3/06
CPC classification number: G06F3/0647, G06F3/0611, G06F3/067
Abstract: Aspects of the disclosure are directed to a heterogeneous machine learning accelerator system with compute and memory nodes connected by high speed chip-to-chip interconnects. While existing remote/disaggregated memory may require memory expansion via remote processing units, aspects of the disclosure add memory nodes into machine learning accelerator clusters via the chip-to-chip interconnects without needing assistance from remote processing units to achieve higher performance, simpler software stack, and/or lower cost. The memory nodes may support prefetch and intelligent compression to enable the use of low cost memory without performance degradation.
-
Publication No.: US11960936B2
Publication Date: 2024-04-16
Application No.: US17150285
Application Date: 2021-01-15
Applicant: Google LLC
Inventor: David Lo, Liqun Cheng, Parthasarathy Ranganathan, Sundar Jayakumar Dev
CPC classification number: G06F9/5027, G06N20/00
Abstract: The subject matter described herein provides systems and techniques to address the challenges of growing hardware and workload heterogeneity using a Warehouse-Scale Computer (WSC) design that improves the efficiency and utilization of WSCs. The WSC design may include an abstraction layer and an efficiency layer in the software stack of the WSC. The abstraction layer and the efficiency layer may be designed to improve job scheduling, simplify resource management, and drive hardware-software co-optimization using machine learning techniques and automation in order to customize the WSC for applications at scale. The abstraction layer may embrace platform/hardware and workload diversity through greater coordination between hardware and higher layers of the WSC software stack in the WSC design. The efficiency layer may employ machine learning techniques at scale to realize hardware/software co-optimizations as a part of the autonomous WSC design.
-
Publication No.: US11275744B2
Publication Date: 2022-03-15
Application No.: US16840699
Application Date: 2020-04-06
Applicant: Google LLC
Inventor: Milad Olia Hashemi, Parthasarathy Ranganathan, Harsh Satija
IPC: G06F16/00, G06F16/2455, G06N3/08, G06N3/04, G05B13/00, G06F16/176, G06F16/2453, G06N20/00
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disaggregating latent causes for computer system optimization. In one aspect, a method includes accessing a data stream for data values resulting from operations performed by a computer system; providing the data values as input to a data disaggregation machine learning model that generates descriptors of latent causes of the data values; providing the data values and the descriptors of the latent causes of the data values as inputs to a control system model that generates embedded representations of commands to modify the operations performed by the computer system; determining commands to modify the operations performed by the computer system based on the embedded representations of commands to modify the operations performed by the computer system; and providing the commands to the computer system.
-
Publication No.: US20190163381A1
Publication Date: 2019-05-30
Application No.: US16242669
Application Date: 2019-01-08
Applicant: Google LLC
Inventor: Rama Krishna Govindaraju, Liqun Cheng, Parthasarathy Ranganathan, Michael R. Marty, Andrew Gallatin
IPC: G06F3/06, G06F12/1081
Abstract: An example method includes, during execution of a software application by a processor, receiving, by a copy processor separate from the processor, a request for an asynchronous data copy operation to copy data within a memory accessible by the copy processor, wherein the request is received from a copy manager accessible by the software application in a user space of an operating system managing execution of the software application; in response to the request, initiating, by the copy processor, the asynchronous data copy operation; continuing execution of the software application by the processor; determining, by the copy processor, that the asynchronous data copy operation has completed; and in response to determining that the asynchronous data copy operation has completed, selectively notifying, by the copy processor, the software application that the asynchronous data copy operation has completed.
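Illustrative sketch (not from the patent): the request, continue, complete, notify flow described in this abstract can be imitated in software. This is a loose analogy under stated assumptions: a single worker thread stands in for the separate copy processor, a `bytearray` stands in for the shared memory, and the `CopyManager` class, its `copy_async` method, and the optional `notify` callback are hypothetical names.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class CopyManager:
    """User-space front end that hands copy requests to a background copy engine."""

    def __init__(self):
        # One worker thread plays the role of the separate copy processor.
        self._engine = ThreadPoolExecutor(max_workers=1)

    def copy_async(self, memory, src, dst, length, notify=None):
        """Queue a copy of memory[src:src+length] to dst and return immediately."""

        def do_copy():
            memory[dst:dst + length] = memory[src:src + length]
            # Selective notification: only callers that asked for it are told.
            if notify is not None:
                notify()

        return self._engine.submit(do_copy)

    def shutdown(self):
        self._engine.shutdown(wait=True)


memory = bytearray(b"abcdefgh" + bytes(8))
done = threading.Event()
manager = CopyManager()

future = manager.copy_async(memory, src=0, dst=8, length=8, notify=done.set)
checksum = sum(range(1000))  # the application keeps working while the copy runs
done.wait(timeout=5)         # later, it checks for the completion notice
manager.shutdown()
```

Passing `notify=None` models the case where the application does not want to be interrupted and instead polls the returned future.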
-
Publication No.: US10303604B2
Publication Date: 2019-05-28
Application No.: US15429579
Application Date: 2017-02-10
Applicant: Google LLC
Inventor: Richard Yoo, Liqun Cheng, Benjamin C. Serebrin, Parthasarathy Ranganathan, Rama Krishna Govindaraju
IPC: G06F12/00, G06F12/0811, G06F12/0871, G06F12/0897, G06F1/3234, G06F9/4401
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for caching data not frequently accessed. One of the methods includes receiving a request for data from a component of a device, determining that the data satisfies an infrequency condition, in response to determining that the data satisfies the infrequency condition: determining a target cache level which defines a cache level within a cache level hierarchy of a particular cache at which to store infrequently accessed data, the target cache level being lower than a highest cache level in the cache level hierarchy, requesting and receiving the data from a memory that is not a cache of the device, and storing the data in a level of the particular cache that is at or below the target cache level in the cache level hierarchy, and providing the data to the component.
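Illustrative sketch (not from the patent): the placement rule in this abstract, where infrequently accessed data is filled only at or below a target cache level so that it does not displace hot data in the highest level, can be expressed as a small simulation. The assumptions are loud ones: level 1 is treated as the highest level, capacity and eviction are ignored, and the `is_infrequent` predicate, the `CacheHierarchy` class, and the address ranges are hypothetical.

```python
class CacheHierarchy:
    """Toy multi-level cache; level 1 is the highest (closest to the requester)."""

    def __init__(self, is_infrequent, num_levels=3, target_level=3):
        if not 1 < target_level <= num_levels:
            raise ValueError("target level must be below the highest level")
        self.levels = {lvl: {} for lvl in range(1, num_levels + 1)}
        self.is_infrequent = is_infrequent
        self.target_level = target_level

    def read(self, addr, memory):
        """Return the data at addr, filling the caches according to the policy."""
        for lvl in sorted(self.levels):
            if addr in self.levels[lvl]:
                return self.levels[lvl][addr]

        value = memory[addr]  # miss everywhere: go to backing memory
        if self.is_infrequent(addr):
            fill_levels = [self.target_level]   # keep it out of the higher levels
        else:
            fill_levels = list(self.levels)     # ordinary data fills every level
        for lvl in fill_levels:
            self.levels[lvl][addr] = value
        return value


memory = {0x10: "hot", 0x2000: "cold"}
# Hypothetical infrequency condition: addresses at or above 0x1000 are rarely used.
cache = CacheHierarchy(is_infrequent=lambda addr: addr >= 0x1000)

hot = cache.read(0x10, memory)
cold = cache.read(0x2000, memory)
```

After these two reads the hot line sits in every level, while the cold line occupies only level 3, so a later lookup of the cold address still hits in cache without ever having entered level 1 or 2.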
-
Publication No.: US10218779B1
Publication Date: 2019-02-26
Application No.: US15055300
Application Date: 2016-02-26
Applicant: Google LLC
Inventor: Liqun Cheng, Rama Krishna Govindaraju, Parthasarathy Ranganathan
IPC: H04L29/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for machine level resource distribution are disclosed. In one aspect, a method is implemented in a data processing apparatus, which includes, for each server computer in a set of two or more server computers within a data center, wherein each server computer includes a plurality of processing cores, receiving wear data describing, for each processing core of the server computer, a wear level for the processing core that is indicative of accumulated wear of the processing core, and moderating accumulation of wear in the processor cores based on the wear level of the processing cores from at least two different server computers.
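Illustrative sketch (not from the patent): one simple way to moderate wear across cores on different servers is to place each new task on whichever core currently reports the least accumulated wear. The abstract does not say this is the policy used; the greedy rule, the integer wear scale, and the names `least_worn_core` and `assign_tasks` are assumptions made for illustration.

```python
def least_worn_core(wear_data):
    """Return the (server, core) pair with the lowest reported wear level."""
    candidates = (
        (server, core)
        for server, cores in wear_data.items()
        for core in cores
    )
    return min(candidates, key=lambda sc: wear_data[sc[0]][sc[1]])


def assign_tasks(wear_data, task_costs):
    """Greedily place tasks on the least worn core, tracking the added wear."""
    wear = {server: dict(cores) for server, cores in wear_data.items()}
    placement = []
    for cost in task_costs:
        server, core = least_worn_core(wear)
        wear[server][core] += cost
        placement.append((server, core))
    return placement, wear


# Hypothetical wear levels (arbitrary units) for two cores on each of two servers.
wear_data = {
    "server-a": {0: 90, 1: 40},
    "server-b": {0: 20, 1: 70},
}
placement, updated = assign_tasks(wear_data, task_costs=[30, 30])
```

Because the comparison spans both servers, the first task lands on `server-b` core 0 and the second on `server-a` core 1, rather than both piling onto one machine.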
-
Publication No.: US10908964B2
Publication Date: 2021-02-02
Application No.: US16198583
Application Date: 2018-11-21
Applicant: Google LLC
Inventor: Liqun Cheng, Rama Krishna Govindaraju, Haishan Zhu, David Lo, Parthasarathy Ranganathan, Nishant Patil
Abstract: Methods, systems, and computer storage media storing instructions for managing processing system efficiency. One of the methods includes obtaining data splitting a plurality of general-purpose processing units in a processing system into a high-priority domain and a low-priority domain, wherein the general-purpose processing units in the high-priority domain are assigned to perform one or more tasks comprising one or more high-priority tasks, and the general-purpose processing units in the low-priority domain are assigned to perform one or more low-priority tasks; and during runtime of the processing system, obtaining memory usage measurements that characterize usage of system memory by the high-priority domain and the low-priority domain; and adjusting, based on the memory usage measurements, a configuration of (i) the high-priority domain, (ii) the low-priority domain, or (iii) both to adjust utilization of the system memory by the general-purpose processing units.
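Illustrative sketch (not from the patent): the runtime loop in this abstract, measuring memory usage per domain and then reconfiguring a domain, could look like the following feedback step. The specifics are assumptions: usage is expressed as a fraction of memory bandwidth, the only knob adjusted is the number of active low-priority cores, and the thresholds and the names `Domain` and `adjust_low_priority` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Domain:
    name: str
    total_cores: int
    active_cores: int


def adjust_low_priority(low, high_usage, low_usage, saturation=0.9, headroom=0.6):
    """Throttle or restore the low-priority domain based on measured memory usage.

    Usage values are fractions of available memory bandwidth (an assumed metric).
    """
    total = high_usage + low_usage
    if total > saturation and low.active_cores > 0:
        low.active_cores -= 1   # memory is contended: take a core away
    elif total < headroom and low.active_cores < low.total_cores:
        low.active_cores += 1   # plenty of headroom: give a core back
    return low.active_cores


low = Domain(name="low-priority", total_cores=4, active_cores=4)

after_contention = adjust_low_priority(low, high_usage=0.7, low_usage=0.3)
after_idle = adjust_low_priority(low, high_usage=0.3, low_usage=0.1)
after_steady = adjust_low_priority(low, high_usage=0.5, low_usage=0.25)
```

The gap between the two thresholds keeps the controller from oscillating when usage hovers near a single cutoff.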
-
Publication No.: US20200233871A1
Publication Date: 2020-07-23
Application No.: US16840699
Application Date: 2020-04-06
Applicant: Google LLC
Inventor: Milad Olia Hashemi, Parthasarathy Ranganathan, Harsh Satija
IPC: G06F16/2455, G05B13/00, G06N3/04, G06N3/08, G06F16/2453, G06F16/176
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disaggregating latent causes for computer system optimization. In one aspect, a method includes accessing a data stream for data values resulting from operations performed by a computer system; providing the data values as input to a data disaggregation machine learning model that generates descriptors of latent causes of the data values; providing the data values and the descriptors of the latent causes of the data values as inputs to a control system model that generates embedded representations of commands to modify the operations performed by the computer system; determining commands to modify the operations performed by the computer system based on the embedded representations of commands to modify the operations performed by the computer system; and providing the commands to the computer system.
-
Publication No.: US20190108261A1
Publication Date: 2019-04-11
Application No.: US15726130
Application Date: 2017-10-05
Applicant: Google LLC
Inventor: Milad Olia Hashemi, Parthasarathy Ranganathan, Harsh Satija
IPC: G06F17/30
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for disaggregating latent causes for computer system optimization. In one aspect, a method includes accessing a data stream for data values resulting from operations performed by a computer system; providing the data values as input to a data disaggregation machine learning model that generates descriptors of latent causes of the data values; providing the data values and the descriptors of the latent causes of the data values as inputs to a control system model that generates embedded representations of commands to modify the operations performed by the computer system; determining commands to modify the operations performed by the computer system based on the embedded representations of commands to modify the operations performed by the computer system; and providing the commands to the computer system.