-
Publication No.: US20230419079A1
Publication Date: 2023-12-28
Application No.: US18244171
Filing Date: 2023-09-08
Applicant: Google LLC
Inventor: Noam M. Shazeer , Azalia Mirhoseini , Krzysztof Stanislaw Maziarz
Abstract: A system includes a neural network that includes a Mixture of Experts (MoE) subnetwork between a first neural network layer and a second neural network layer. The MoE subnetwork includes multiple expert neural networks. Each expert neural network is configured to process a first layer output generated by the first neural network layer to generate a respective expert output. The MoE subnetwork further includes a gating subsystem that selects, based on the first layer output, one or more of the expert neural networks and determines a respective weight for each selected expert neural network, provides the first layer output as input to each of the selected expert neural networks, combines the expert outputs generated by the selected expert neural networks in accordance with the weights for the selected expert neural networks to generate an MoE output, and provides the MoE output as input to the second neural network layer.
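A minimal NumPy sketch of the routing idea described in this abstract; the toy dimensions, the single-matrix "experts", the softmax-over-selected-experts gate, and the moe_forward helper are illustrative assumptions, not the claimed architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

d_model, n_experts, k = 16, 4, 2          # toy sizes (assumed)
experts = [rng.standard_normal((d_model, d_model)) * 0.1 for _ in range(n_experts)]
w_gate = rng.standard_normal((d_model, n_experts)) * 0.1

def moe_forward(first_layer_output):
    """Route a first-layer output through k selected experts and combine
    their outputs using the gate's weights."""
    logits = first_layer_output @ w_gate               # gating score per expert
    top = np.argsort(logits)[-k:]                      # select the k highest-scoring experts
    weights = np.exp(logits[top] - logits[top].max())
    weights /= weights.sum()                           # softmax over the selected experts
    expert_outs = [first_layer_output @ experts[i] for i in top]
    moe_output = sum(w * o for w, o in zip(weights, expert_outs))
    return moe_output                                  # would feed the second layer

x = rng.standard_normal(d_model)                       # stand-in first-layer output
print(moe_forward(x).shape)                            # (16,)
```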
-
Publication No.: US20230176840A1
Publication Date: 2023-06-08
Application No.: US17921933
Filing Date: 2021-06-07
Applicant: Google LLC
Inventor: Yanqi Zhou , Sudip Roy , Amirali Abdolrashidi , Daniel Lin-Kit Wong , Chao Ma , Qiumin Xu , Hanxiao Liu , Phitchaya Mangpo Phothilimthana , Shen Wang , Anna Darling Goldie , Azalia Mirhoseini , James Laudon
IPC: G06F8/41
CPC classification number: G06F8/443
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for compiler optimizations using a compiler optimization network. One of the methods includes receiving an input program, wherein the input program defines a graph of operation modules, wherein each node in the graph is a respective operation module, and each edge between nodes in the graph represents one operation module receiving the output generated by another operation module. The input program is processed by a compiler optimization network comprising a graph-embedding network that is configured to encode operation features and operation dependencies of the operation modules of the input program into a graph embedding representation and a policy network that is configured to generate an optimization action for each of one or more nodes encoded in the graph embedding representation. The compiler optimization network generates an output optimization plan comprising one or more optimization actions for the input program.
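A rough NumPy sketch of the two-stage idea (a graph embedding of the operation modules, then a per-node policy); the toy graph, the simple message-passing encoder, and the graph_embedding and optimization_plan helpers are assumptions for illustration, not the network described in the filing:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy operation graph (assumed): 4 operation modules, edge (i, j) means
# module j consumes the output produced by module i.
n_ops, feat_dim, n_actions = 4, 8, 3
features = rng.standard_normal((n_ops, feat_dim))
edges = [(0, 1), (1, 2), (1, 3)]

adj = np.zeros((n_ops, n_ops))
for src, dst in edges:
    adj[dst, src] = 1.0                                # dst aggregates from src

w_embed = rng.standard_normal((feat_dim, feat_dim)) * 0.1
w_policy = rng.standard_normal((feat_dim, n_actions)) * 0.1

def graph_embedding(features, adj, hops=2):
    """Very simple message passing: each node mixes its own features with
    those of its producers for a few hops."""
    h = features
    for _ in range(hops):
        h = np.tanh((h + adj @ h) @ w_embed)
    return h

def optimization_plan(features, adj):
    """Policy head: pick one optimization action per node from its embedding."""
    h = graph_embedding(features, adj)
    logits = h @ w_policy
    return logits.argmax(axis=1)                       # one action index per op module

print(optimization_plan(features, adj))                # e.g. [2 0 1 1]
```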
-
Publication No.: US11556690B2
Publication Date: 2023-01-17
Application No.: US17555085
Filing Date: 2021-12-17
Applicant: Google LLC
Inventor: Anna Darling Goldie , Azalia Mirhoseini , Ebrahim Songhori , Wenjie Jiang , Shen Wang , Roger David Carpenter , Young-Joon Lee , Mustafa Nazim Yazgan , Chian-min Richard Ho , Quoc V. Le , James Laudon , Jeffrey Adgate Dean , Kavya Srinivasa Setty , Omkar Pathak
IPC: G06F30/39 , G06F30/392 , G06F30/398 , G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a computer chip placement. One of the methods includes obtaining netlist data for a computer chip; and generating a computer chip placement, comprising placing a respective macro node at each time step in a sequence comprising a plurality of time steps, the placing comprising, for each time step: generating an input representation for the time step; processing the input representation using a node placement neural network having a plurality of network parameters, wherein the node placement neural network is configured to process the input representation in accordance with current values of the network parameters to generate a score distribution over a plurality of positions on the surface of the computer chip; and assigning the macro node to be placed at the time step to a position from the plurality of positions using the score distribution.
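A toy NumPy sketch of sequential placement with a masked score distribution over grid positions; the 4x4 grid, the single-matrix scorer, and the place_macros helper are illustrative assumptions, not the node placement neural network itself:

```python
import numpy as np

rng = np.random.default_rng(0)

grid_h, grid_w = 4, 4                                  # toy placement grid (assumed)
n_macros, feat_dim = 3, 8
n_cells = grid_h * grid_w
macro_feats = rng.standard_normal((n_macros, feat_dim))   # stand-in netlist features
w_place = rng.standard_normal((feat_dim, n_cells)) * 0.1  # stand-in placement scorer

def place_macros(macro_feats):
    """Place one macro per time step: score every grid cell, mask cells that
    are already occupied, and sample a position from the distribution."""
    occupied = np.zeros(n_cells, dtype=bool)
    placement = {}
    for t, feats in enumerate(macro_feats):
        scores = feats @ w_place
        scores[occupied] = -np.inf                     # forbid reusing an occupied cell
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        cell = int(rng.choice(n_cells, p=probs))       # draw from the score distribution
        occupied[cell] = True
        placement[t] = divmod(cell, grid_w)            # (row, col) on the chip surface
    return placement

print(place_macros(macro_feats))                       # e.g. {0: (2, 1), 1: (0, 3), 2: (3, 0)}
```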
-
Publication No.: US20220108058A1
Publication Date: 2022-04-07
Application No.: US17555085
Filing Date: 2021-12-17
Applicant: Google LLC
Inventor: Anna Darling Goldie , Azalia Mirhoseini , Ebrahim Songhori , Wenjie Jiang , Shen Wang , Roger David Carpenter , Young-Joon Lee , Mustafa Nazim Yazgan , Chian-min Richard Ho , Quoc V. Le , James Laudon , Jeffrey Adgate Dean , Kavya Srinivasa Setty , Omkar Pathak
IPC: G06F30/392 , G06F30/398 , G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a computer chip placement. One of the methods includes obtaining netlist data for a computer chip; and generating a computer chip placement, comprising placing a respective macro node at each time step in a sequence comprising a plurality of time steps, the placing comprising, for each time step: generating an input representation for the time step; processing the input representation using a node placement neural network having a plurality of network parameters, wherein the node placement neural network is configured to process the input representation in accordance with current values of the network parameters to generate a score distribution over a plurality of positions on the surface of the computer chip; and assigning the macro node to be placed at the time step to a position from the plurality of positions using the score distribution.
-
Publication No.: US20200279163A1
Publication Date: 2020-09-03
Application No.: US16878720
Filing Date: 2020-05-20
Applicant: Google LLC
Inventor: Samuel Bengio , Mohammad Norouzi , Benoit Steiner , Jeffrey Adgate Dean , Hieu Hy Pham , Azalia Mirhoseini , Quoc V. Le , Naveen Kumar , Yuefeng Zhou , Rasmus Munk Larsen
Abstract: A method for determining a placement for machine learning model operations across multiple hardware devices is described. The method includes receiving data specifying a machine learning model to be placed for distributed processing on multiple hardware devices; generating, from the data, a sequence of operation embeddings, each operation embedding in the sequence characterizing respective operations necessary to perform the processing of the machine learning model; processing the sequence of operation embeddings using a placement recurrent neural network in accordance with first values of a plurality of network parameters of the placement recurrent neural network to generate a network output that defines a placement of the operations characterized by the operation embeddings in the sequence across the plurality of devices; and scheduling the machine learning model for processing by the multiple hardware devices by placing the operations on the multiple devices according to the placement defined by the network output.
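A small NumPy sketch of reading a sequence of operation embeddings with a recurrent state and emitting one device per operation; the toy sizes, the plain tanh recurrence, and the placement_rnn helper are assumptions for illustration only:

```python
import numpy as np

rng = np.random.default_rng(0)

n_ops, embed_dim, hidden_dim, n_devices = 5, 8, 16, 4   # toy sizes (assumed)
op_embeddings = rng.standard_normal((n_ops, embed_dim))  # one embedding per operation

w_in = rng.standard_normal((embed_dim, hidden_dim)) * 0.1
w_rec = rng.standard_normal((hidden_dim, hidden_dim)) * 0.1
w_out = rng.standard_normal((hidden_dim, n_devices)) * 0.1

def placement_rnn(op_embeddings):
    """Read the operation embeddings in sequence and emit, for each
    operation, a distribution over devices; take the argmax as its placement."""
    h = np.zeros(hidden_dim)
    placement = []
    for emb in op_embeddings:
        h = np.tanh(emb @ w_in + h @ w_rec)             # recurrent state update
        logits = h @ w_out
        placement.append(int(logits.argmax()))          # device index for this op
    return placement

print(placement_rnn(op_embeddings))                     # e.g. [1, 3, 3, 0, 2]
```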
-
Publication No.: US20190392294A1
Publication Date: 2019-12-26
Application No.: US16554217
Filing Date: 2019-08-28
Applicant: Google LLC
Inventor: Benoit Steiner , Anna Darling Goldie , Jeffrey Adgate Dean , Hieu Hy Pham , Azalia Mirhoseini , Quoc V. Le
Abstract: A method for determining a placement for machine learning model operations across multiple hardware devices includes receiving data specifying machine learning operations, and determining a placement that assigns each of the operations specified by the data to a respective device from the multiple hardware devices. Determining the placement includes: generating, from the data, a respective operation embedding for each of the operations; grouping the operations into multiple operation groups, comprising processing each of the respective operation embeddings using a grouper neural network having multiple grouper parameters, in which the grouper neural network is configured to, for each of the operations, process the operation embedding for the operation in accordance with first values of the grouper parameters to generate a grouper output that assigns the operation to an operation group from the multiple operation groups; and assigning each of the operation groups to a respective device from the multiple hardware devices.
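A compact NumPy sketch of the two-stage assignment (operations to groups, then groups to devices); the toy sizes, the mean-pooled group embedding, and the hierarchical_placement helper are illustrative assumptions rather than the patented method:

```python
import numpy as np

rng = np.random.default_rng(0)

n_ops, embed_dim, n_groups, n_devices = 6, 8, 3, 2       # toy sizes (assumed)
op_embeddings = rng.standard_normal((n_ops, embed_dim))

w_grouper = rng.standard_normal((embed_dim, n_groups)) * 0.1
w_placer = rng.standard_normal((embed_dim, n_devices)) * 0.1

def hierarchical_placement(op_embeddings):
    """Stage 1: a grouper assigns every operation to a group.
    Stage 2: each group, summarized by its mean embedding, is assigned to a device."""
    group_of_op = (op_embeddings @ w_grouper).argmax(axis=1)      # stage 1
    device_of_op = np.empty(n_ops, dtype=int)
    for g in range(n_groups):
        members = group_of_op == g
        if not members.any():
            continue
        group_emb = op_embeddings[members].mean(axis=0)
        device = int((group_emb @ w_placer).argmax())             # stage 2
        device_of_op[members] = device
    return group_of_op, device_of_op

groups, devices = hierarchical_placement(op_embeddings)
print(groups, devices)                                   # group index and device index per op
```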
-
Publication No.: US20190303761A1
Publication Date: 2019-10-03
Application No.: US16445330
Filing Date: 2019-06-19
Applicant: Google LLC
Inventor: Samy Bengio , Mohammad Edward Norouzi , Benoit Steiner , Jeffrey Adgate Dean , Hieu Hy Pham , Azalia Mirhoseini , Quoc V. Le , Naveen Kumar , Yuefeng Zhou , Rasmus Munk Larsen
Abstract: A method for determining a placement for machine learning model operations across multiple hardware devices is described. The method includes receiving data specifying a machine learning model to be placed for distributed processing on multiple hardware devices; generating, from the data, a sequence of operation embeddings, each operation embedding in the sequence characterizing respective operations necessary to perform the processing of the machine learning model; processing the sequence of operation embeddings using a placement recurrent neural network in accordance with first values of a plurality of network parameters of the placement recurrent neural network to generate a network output that defines a placement of the operations characterized by the operation embeddings in the sequence across the plurality of devices; and scheduling the machine learning model for processing by the multiple hardware devices by placing the operations on the multiple devices according to the placement defined by the network output.
-
Publication No.: US20190251423A1
Publication Date: 2019-08-15
Application No.: US16393063
Filing Date: 2019-04-24
Applicant: Google LLC
Inventor: Noam M. Shazeer , Azalia Mirhoseini , Krzysztof Stanislaw Maziarz
Abstract: A system includes a neural network that includes a Mixture of Experts (MoE) subnetwork between a first neural network layer and a second neural network layer. The MoE subnetwork includes multiple expert neural networks. Each expert neural network is configured to process a first layer output generated by the first neural network layer to generate a respective expert output. The MoE subnetwork further includes a gating subsystem that selects, based on the first layer output, one or more of the expert neural networks and determines a respective weight for each selected expert neural network, provides the first layer output as input to each of the selected expert neural networks, combines the expert outputs generated by the selected expert neural networks in accordance with the weights for the selected expert neural networks to generate an MoE output, and provides the MoE output as input to the second neural network layer.
-
Publication No.: US20190026624A1
Publication Date: 2019-01-24
Application No.: US16040186
Filing Date: 2018-07-19
Applicant: Google LLC
Inventor: Benoit Steiner , Anna Darling Goldie , Jeffrey Adgate Dean , Hieu Hy Pham , Azalia Mirhoseini , Quoc V. Le
CPC classification number: G06N3/0454 , G06F9/5066 , G06F16/9024 , G06N3/0445 , G06N3/063 , G06N3/084 , G06N5/045 , G06N20/00
Abstract: A method for determining a placement for machine learning model operations across multiple hardware devices includes receiving data specifying machine learning operations, and determining a placement that assigns each of the operations specified by the data to a respective device from the multiple hardware devices. Determining the placement includes: generating, from the data, a respective operation embedding for each of the operations; grouping the operations into multiple operation groups, comprising processing each of the respective operation embeddings using a grouper neural network having multiple grouper parameters, in which the grouper neural network is configured to, for each of the operations, process the operation embedding for the operation in accordance with first values of the grouper parameters to generate a grouper output that assigns the operation to an operation group from the multiple operation groups; and assigning each of the operation groups to a respective device from the multiple hardware devices.
-
Publication No.: US12248745B2
Publication Date: 2025-03-11
Application No.: US18395251
Filing Date: 2023-12-22
Applicant: Google LLC
Inventor: Anna Darling Goldie , Azalia Mirhoseini , Ebrahim Songhori , Wenjie Jiang , Shen Wang , Roger David Carpenter , Young-Joon Lee , Mustafa Nazim Yazgan , Chian-min Richard Ho , Quoc V. Le , James Laudon , Jeffrey Adgate Dean , Kavya Srinivasa Setty , Omkar Pathak
IPC: G06F30/392 , G06F30/398 , G06N3/08
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating a computer chip placement. One of the methods includes obtaining netlist data for a computer chip; and generating a computer chip placement, comprising placing a respective macro node at each time step in a sequence comprising a plurality of time steps, the placing comprising, for each time step: generating an input representation for the time step; processing the input representation using a node placement neural network having a plurality of network parameters, wherein the node placement neural network is configured to process the input representation in accordance with current values of the network parameters to generate a score distribution over a plurality of positions on the surface of the computer chip; and assigning the macro node to be placed at the time step to a position from the plurality of positions using the score distribution.