-
Publication Number: US11188494B2
Publication Date: 2021-11-30
Application Number: US16524964
Application Date: 2019-07-29
Applicant: Google LLC
Inventor: Nishant Patil, Liqun Cheng
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, are described for performing asymmetric data communication at a host-device interface of a system. The methods include identifying devices coupled to a host of the system and generating a system topology that identifies a connectivity of the devices and identifies bus lanes that enable data transfers at the system. The host determines that a first connection between the host and a first device of the multiple devices has an asymmetric bandwidth requirement. The host configures a set of bus lanes of a data bus connecting the first device and the host to allocate a different number of the bus lanes to data egress from the host than to data ingress to the host. The bus lanes are configured to allocate the differing number of bus lanes based on the asymmetric bandwidth requirement of the first connection.
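A minimal Python sketch of the lane-splitting idea described in this abstract, assuming a fixed pool of bus lanes divided in proportion to the required egress and ingress bandwidth; the function name, parameters, and rounding rule are illustrative assumptions, not taken from the patent.

```python
# Minimal sketch (not the patented implementation): splitting a fixed pool of
# bus lanes between host egress and host ingress in proportion to an
# asymmetric bandwidth requirement. All names are illustrative.

def split_lanes(total_lanes: int, egress_gbps: float, ingress_gbps: float) -> tuple[int, int]:
    """Allocate more lanes to the busier direction, keeping at least one lane per direction."""
    if egress_gbps <= 0 and ingress_gbps <= 0:
        half = total_lanes // 2
        return half, total_lanes - half
    egress_lanes = round(total_lanes * egress_gbps / (egress_gbps + ingress_gbps))
    egress_lanes = min(max(egress_lanes, 1), total_lanes - 1)
    return egress_lanes, total_lanes - egress_lanes

# Example: a 16-lane bus for a device that mostly consumes data from the host.
print(split_lanes(16, egress_gbps=12.0, ingress_gbps=4.0))  # -> (12, 4)
```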
-
Publication Number: US11042416B2
Publication Date: 2021-06-22
Application Number: US16381951
Application Date: 2019-04-11
Applicant: Google LLC
Inventor: Nishant Patil, Xiang Zhou, Andrew Swing
IPC: G06F9/50, H04L12/751, H04L12/715, H04L12/803, H04L29/08, H04Q11/00
Abstract: Methods, systems, and apparatus, including an apparatus for generating clusters of building blocks of compute nodes using an optical network. In one aspect, a method includes receiving request data specifying requested compute nodes for a computing workload. The request data specifies a target n-dimensional arrangement of the compute nodes. A selection is made, from a superpod that includes a set of building blocks that each include an m-dimensional arrangement of compute nodes, a subset of the building blocks that, when combined, match the target n-dimensional arrangement specified by the request data. The set of building blocks are connected to an optical network that includes one or more optical circuit switches. A workload cluster of compute nodes that includes the subset of the building blocks is generated. The generating includes configuring, for each dimension of the workload cluster, respective routing data for the one or more optical circuit switches.
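A hedged sketch of the building-block selection step, assuming cube-shaped building blocks with a single edge length and a target shape that tiles evenly; the patent's selection and optical-circuit-switch routing are more general, and all names here are hypothetical.

```python
# Illustrative sketch only, assuming fixed-size cubic building blocks; the
# patent's selection and optical-switch routing logic is more general.

from math import prod

def blocks_needed(target_shape: tuple[int, ...], block_edge: int) -> tuple[int, ...]:
    """Return how many building blocks are tiled along each target dimension."""
    if any(d % block_edge for d in target_shape):
        raise ValueError("target shape must be a multiple of the block edge")
    return tuple(d // block_edge for d in target_shape)

def select_blocks(free_blocks: list[str], target_shape: tuple[int, ...], block_edge: int) -> list[str]:
    """Pick enough free building blocks from the superpod to tile the target arrangement."""
    count = prod(blocks_needed(target_shape, block_edge))
    if count > len(free_blocks):
        raise RuntimeError("superpod does not have enough free building blocks")
    return free_blocks[:count]

# Example: an 8x8x4 workload built from 4x4x4 blocks needs 2*2*1 = 4 blocks.
print(select_blocks([f"bb{i}" for i in range(16)], (8, 8, 4), 4))
```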
-
Publication Number: US20200341931A1
Publication Date: 2020-10-29
Application Number: US16393425
Application Date: 2019-04-24
Applicant: Google LLC
Inventor: Pankaj Makhija, Nishant Patil
IPC: G06F13/42, H04L12/937, H04L12/931
Abstract: Methods and systems for facilitating an equitable bandwidth distribution across downstream devices in asymmetrical switch topologies, and in particular asymmetrical PCIe switch topologies. The equitable distribution of bandwidth is achieved in asymmetrical topologies using virtual switch partitioning. An upstream switch that is connected to the root complex via an upstream port and that receives bandwidth B from the upstream port, is virtualized into two or more virtual switches. Each virtual switch equally shares the bandwidth. Each virtual switch is allocated to downstream devices that are connected to the upstream switch as well as to one or more downstream switches that are connected to the upstream switch. Each downstream switch may be connected to one or more additional downstream devices.
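A toy Python model of the equal-share partitioning described here, assuming one virtual switch per downstream subtree and an even split among the devices inside each subtree; it illustrates the arithmetic only and is not a PCIe implementation.

```python
# Toy model (not the PCIe implementation): virtualizing an upstream switch
# that receives bandwidth B into one virtual switch per downstream subtree,
# so each subtree is granted an equal B/k share.

def partition_bandwidth(total_bw_gbps: float, subtrees: list[list[str]]) -> dict[str, float]:
    """Split upstream bandwidth equally across virtual switches, then evenly
    across the devices inside each subtree."""
    share_per_vswitch = total_bw_gbps / len(subtrees)
    allocation = {}
    for devices in subtrees:
        per_device = share_per_vswitch / len(devices)
        for dev in devices:
            allocation[dev] = per_device
    return allocation

# Example: one directly attached device vs. a downstream switch with three devices.
print(partition_bandwidth(32.0, [["gpu0"], ["nic0", "nic1", "ssd0"]]))
# -> gpu0 gets 16.0; each device behind the downstream switch gets ~5.33
```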
-
Publication Number: US10073817B1
Publication Date: 2018-09-11
Application Number: US15792077
Application Date: 2017-10-24
Applicant: Google LLC
Inventor: Nishant Patil, Matthew Sarett, Rama Krishna Govindaraju, Benoit Steiner, Vincent O. Vanhoucke
CPC classification number: G06F17/16
Abstract: The present disclosure relates to optimized matrix multiplication using vector multiplication of interleaved matrix values. Two matrices to be multiplied are organized into specially ordered vectors, which are multiplied together to produce a portion of a product matrix.
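A short Python sketch of the general interleaving idea: a 2x2 block of the product matrix is accumulated with one elementwise vector multiply per step along the shared dimension. The specific ordering used below is an illustrative assumption rather than the patented ordering.

```python
# Hedged sketch of the general idea (not the patented ordering): computing a
# 2x2 block of C = A @ B with a single elementwise vector multiply per k-step,
# by interleaving the needed A and B values into two flat vectors.

import numpy as np

def matmul_2x2_block(A: np.ndarray, B: np.ndarray, i: int, j: int) -> np.ndarray:
    """Compute C[i:i+2, j:j+2] using interleaved vector multiplies."""
    acc = np.zeros(4)
    for k in range(A.shape[1]):
        a_vec = np.array([A[i, k], A[i, k], A[i + 1, k], A[i + 1, k]])
        b_vec = np.array([B[k, j], B[k, j + 1], B[k, j], B[k, j + 1]])
        acc += a_vec * b_vec  # one vector multiply-accumulate per k
    return acc.reshape(2, 2)

A = np.arange(8).reshape(2, 4).astype(float)
B = np.arange(12).reshape(4, 3).astype(float)
assert np.allclose(matmul_2x2_block(A, B, 0, 0), (A @ B)[0:2, 0:2])
```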
-
Publication Number: US11915139B2
Publication Date: 2024-02-27
Application Number: US17672163
Application Date: 2022-02-15
Applicant: Google LLC
Inventor: Doe Hyun Yoon, Nishant Patil, Norman Paul Jouppi
Abstract: Methods, systems, and apparatus for updating machine learning models to improve locality are described. In one aspect, a method includes receiving data of a machine learning model. The data represents operations of the machine learning model and data dependencies between the operations. Data specifying characteristics of a memory hierarchy for a machine learning processor on which the machine learning model is going to be deployed is received. The memory hierarchy includes multiple memories at multiple memory levels for storing machine learning data used by the machine learning processor when performing machine learning computations using the machine learning model. An updated machine learning model is generated by modifying the operations and control dependencies of the machine learning model to account for the characteristics of the memory hierarchy. Machine learning computations are performed using the updated machine learning model.
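As a loose illustration only: the sketch below replaces the patent's graph rewriting (modifying operations and control dependencies) with a much simpler placement heuristic over a two-level memory hierarchy and a linear schedule; every name, field, and threshold is a hypothetical assumption.

```python
# Illustrative only, assuming a two-level memory hierarchy (small fast on-chip
# memory plus larger slow memory) and a linear schedule of operations; the
# patent describes a more general transformation of the model graph.

from dataclasses import dataclass

@dataclass
class Op:
    name: str
    output_bytes: int
    next_use: int  # index of the operation that next reads this output

def place_outputs(ops: list[Op], fast_capacity: int) -> dict[str, str]:
    """Keep outputs with a short reuse distance in fast memory while it has
    room; send long-lived or oversized outputs to slow memory."""
    placement, used = {}, 0
    for idx, op in enumerate(ops):
        reuse_distance = op.next_use - idx
        fits = used + op.output_bytes <= fast_capacity
        if fits and reuse_distance <= 2:
            placement[op.name] = "fast"
            used += op.output_bytes
        else:
            placement[op.name] = "slow"
    return placement

ops = [Op("conv1", 4096, 1), Op("conv2", 4096, 5), Op("add", 1024, 3)]
print(place_outputs(ops, fast_capacity=8192))
# -> {'conv1': 'fast', 'conv2': 'slow', 'add': 'fast'}
```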
-
Publication Number: US11841817B2
Publication Date: 2023-12-12
Application Number: US18071773
Application Date: 2022-11-30
Applicant: Google LLC
Inventor: Pankaj Makhija, Nishant Patil
IPC: G06F13/42, H04L49/253, H04L49/00
CPC classification number: G06F13/4226, H04L49/253, H04L49/70, G06F2213/0026
Abstract: Methods and systems for facilitating an equitable bandwidth distribution across downstream devices in asymmetrical switch topologies, and in particular asymmetrical PCIe switch topologies. The equitable distribution of bandwidth is achieved in asymmetrical topologies using virtual switch partitioning. An upstream switch that is connected to the root complex via an upstream port and that receives bandwidth B from the upstream port, is virtualized into two or more virtual switches. Each virtual switch equally shares the bandwidth. Each virtual switch is allocated to downstream devices that are connected to the upstream switch as well as to one or more downstream switches that are connected to the upstream switch. Each downstream switch may be connected to one or more additional downstream devices.
-
Publication Number: US11704158B2
Publication Date: 2023-07-18
Application Number: US17162682
Application Date: 2021-01-29
Applicant: Google LLC
Inventor: Liqun Cheng, Rama Krishna Govindaraju, Haishan Zhu, David Lo, Parthasarathy Ranganathan, Nishant Patil
CPC classification number: G06F9/5038, G06F9/4881, G06F9/505, G06F9/5016, G06F9/5061, G06F9/5083, G06N20/00
Abstract: Methods, systems, and computer storage media storing instructions for managing processing system efficiency. One of the methods includes obtaining data splitting a plurality of general-purpose processing units in a processing system into a high-priority domain and a low-priority domain, wherein the general-purpose processing units in the high-priority domain are assigned to perform one or more tasks comprising one or more high-priority tasks, and the general-purpose processing units in the low-priority domain are assigned to perform one or more low-priority tasks; and during runtime of the processing system, obtaining memory usage measurements that characterize usage of system memory by the high-priority domain and the low-priority domain; and adjusting, based on the memory usage measurements, a configuration of (i) the high-priority domain, (ii) the low-priority domain, or (iii) both to adjust utilization of the system memory by the general-purpose processing units.
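A simplified control-loop sketch of the adjustment step, assuming the configuration knob is the number of general-purpose CPUs in the low-priority domain and that the memory usage measurement is a single high-priority utilization fraction; thresholds and names are assumptions, not values from the patent.

```python
# Simplified control loop (names and thresholds are assumptions, not from the
# patent): shrink the low-priority domain when the high-priority domain's use
# of system memory gets too high, and grow it back when there is headroom.

def adjust_low_priority_cpus(hi_mem_util: float, low_cpus: int,
                             min_cpus: int = 1, max_cpus: int = 16) -> int:
    """Return the new number of general-purpose CPUs granted to the
    low-priority domain, based on high-priority memory utilization."""
    if hi_mem_util > 0.80:           # high-priority tasks are memory-starved
        return max(min_cpus, low_cpus - 2)
    if hi_mem_util < 0.50:           # plenty of headroom: give low-priority more
        return min(max_cpus, low_cpus + 1)
    return low_cpus                  # within the target band: no change

# Example: a sequence of measurement cycles where memory pressure climbs then falls.
cpus = 8
for util in (0.55, 0.85, 0.90, 0.45):
    cpus = adjust_low_priority_cpus(util, cpus)
    print(util, "->", cpus, "low-priority CPUs")
```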
-
Publication Number: US20230088346A1
Publication Date: 2023-03-23
Application Number: US18071773
Application Date: 2022-11-30
Applicant: Google LLC
Inventor: Pankaj Makhija, Nishant Patil
IPC: G06F13/42, H04L49/253, H04L49/00
Abstract: Methods and systems for facilitating an equitable bandwidth distribution across downstream devices in asymmetrical switch topologies, and in particular asymmetrical PCIe switch topologies. The equitable distribution of bandwidth is achieved in asymmetrical topologies using virtual switch partitioning. An upstream switch that is connected to the root complex via an upstream port and that receives bandwidth B from the upstream port, is virtualized into two or more virtual switches. Each virtual switch equally shares the bandwidth. Each virtual switch is allocated to downstream devices that are connected to the upstream switch as well as to one or more downstream switches that are connected to the upstream switch. Each downstream switch may be connected to one or more additional downstream devices.
-
Publication Number: US20220172060A1
Publication Date: 2022-06-02
Application Number: US17672163
Application Date: 2022-02-15
Applicant: Google LLC
Inventor: Doe Hyun Yoon, Nishant Patil, Norman Paul Jouppi
Abstract: Methods, systems, and apparatus for updating machine learning models to improve locality are described. In one aspect, a method includes receiving data of a machine learning model. The data represents operations of the machine learning model and data dependencies between the operations. Data specifying characteristics of a memory hierarchy for a machine learning processor on which the machine learning model is going to be deployed is received. The memory hierarchy includes multiple memories at multiple memory levels for storing machine learning data used by the machine learning processor when performing machine learning computations using the machine learning model. An updated machine learning model is generated by modifying the operations and control dependencies of the machine learning model to account for the characteristics of the memory hierarchy. Machine learning computations are performed using the updated machine learning model.
-
Publication Number: US20220083493A1
Publication Date: 2022-03-17
Application Number: US17537366
Application Date: 2021-11-29
Applicant: Google LLC
Inventor: Nishant Patil, Liqun Cheng
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, are described for performing asymmetric data communication at a host-device interface of a system. The methods include identifying devices coupled to a host of the system and generating a system topology that identifies a connectivity of the devices and identifies bus lanes that enable data transfers at the system. The host determines that a first connection between the host and a first device of the multiple devices has an asymmetric bandwidth requirement. The host configures a set of bus lanes of a data bus connecting the first device and the host to allocate a different number of the bus lanes to data egress from the host than to data ingress to the host. The bus lanes are configured to allocate the differing number of bus lanes based on the asymmetric bandwidth requirement of the first connection.