On-chip interconnect for memory channel controllers

    Publication number: US12007913B2

    Publication date: 2024-06-11

    Application number: US17707849

    Application date: 2022-03-29

    Applicant: Google LLC

    CPC classification number: G06F13/1668 G06F12/0238 G06F13/1621 G06F13/1642

    Abstract: Methods, systems, and apparatus, including computer-readable media, are described for an integrated circuit that accelerates machine-learning computations. The circuit includes processor cores that each include: multiple channel controllers; an interface controller for coupling each channel controller to any memory channel of a system memory; and a fetch unit in each channel controller. Each fetch unit is configured to: receive channel data that encodes addressing information; obtain, based on the addressing information, data from any memory channel of the system memory using the interface controller; and write the obtained data to a vector memory of the processor core via the corresponding channel controller that includes the respective fetch unit.
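The data path in the abstract (fetch unit decodes channel data, reads any memory channel through the interface controller, writes to vector memory) can be modeled in a few lines. This is a minimal illustrative sketch, not the patented circuit; all class and variable names (InterfaceController, FetchUnit, the channel-data tuple layout) are assumptions introduced here.

```python
class InterfaceController:
    """Routes a read from any channel controller to any memory channel (assumed model)."""
    def __init__(self, memory_channels):
        self.memory_channels = memory_channels  # list of address -> data maps

    def read(self, channel_id, address):
        return self.memory_channels[channel_id][address]

class FetchUnit:
    """Fetch unit inside a channel controller (assumed model)."""
    def __init__(self, interface, vector_memory):
        self.interface = interface
        self.vector_memory = vector_memory

    def fetch(self, channel_data):
        # channel_data encodes addressing information; here it is assumed to be
        # (memory channel id, address within that channel, vector-memory slot).
        channel_id, address, slot = channel_data
        data = self.interface.read(channel_id, address)
        self.vector_memory[slot] = data  # write via the owning channel controller

# Usage: one fetch unit reading from either of two memory channels.
channels = [{0: "a0"}, {0: "b0", 4: "b4"}]
vmem = {}
unit = FetchUnit(InterfaceController(channels), vmem)
unit.fetch((1, 4, 0))  # channel 1, address 4 -> vector-memory slot 0
unit.fetch((0, 0, 1))  # channel 0, address 0 -> vector-memory slot 1
print(vmem)  # {0: 'b4', 1: 'a0'}
```

The point the sketch makes is the decoupling: the fetch unit does not own a memory channel; any controller can reach any channel through the shared interface controller.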

    Matrix processing apparatus
    Invention grant

    Publication number: US10417303B2

    Publication date: 2019-09-17

    Application number: US15695144

    Application date: 2017-09-05

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including a system for transforming sparse elements to a dense matrix. The system is configured to receive a request for an output matrix based on sparse elements including sparse elements associated with a first dense matrix and sparse elements associated with a second dense matrix; obtain the sparse elements associated with the first dense matrix fetched by a first group of sparse element access units; obtain the sparse elements associated with the second dense matrix fetched by a second group of sparse element access units; and transform the sparse elements associated with the first dense matrix and the sparse elements associated with the second dense matrix to generate the output dense matrix that includes the sparse elements associated with the first dense matrix and the sparse elements associated with the second dense matrix.
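The transformation the abstract describes (two groups of access units fetch sparse elements from two source matrices, which are then combined into one dense output) can be sketched as a scatter. Purely illustrative assumptions: sparse elements are (row, col, value) triples, and each access-unit group is a plain function.

```python
def fetch_group(sparse_elements):
    """Stand-in for a group of sparse element access units (assumed model)."""
    return list(sparse_elements)

def to_dense(first, second, rows, cols):
    """Scatter sparse elements from both source matrices into one dense output."""
    out = [[0] * cols for _ in range(rows)]
    for r, c, v in fetch_group(first) + fetch_group(second):
        out[r][c] = v
    return out

# Elements fetched for the first and second dense matrices, respectively.
dense = to_dense([(0, 0, 1), (1, 2, 5)],
                 [(2, 1, 7)],
                 rows=3, cols=3)
print(dense)  # [[1, 0, 0], [0, 0, 5], [0, 7, 0]]
```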

    Sparse SIMD Cross-lane Processing Unit
    Invention publication

    Publication number: US20240211269A1

    Publication date: 2024-06-27

    Application number: US18597005

    Application date: 2024-03-06

    Applicant: Google LLC

    CPC classification number: G06F9/3887 G06F9/30036

    Abstract: Aspects of the disclosure are directed to a cross-lane processing unit (XPU) for performing data-dependent operations across multiple data processing lanes of a processor. Rather than implementing operation-specific circuits for each data-dependent operation, the XPU can be configured to perform different operations in response to input signals configuring individual operations performed by processing cells and crossbars arranged as a stacked network in the XPU. Each processing cell can receive and process data across multiple data processing lanes. Aspects of the disclosure include configuring the XPU to use a vector sort network to perform a duplicate count, eliminating the need to configure the XPU separately for sorting and duplicate counting.
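The key reuse idea in the abstract is that once a vector has been sorted, duplicates sit in contiguous runs, so duplicate counting falls out of a single pass over the sorted vector. The sketch below substitutes Python's sort for the XPU's crossbar sorting network; it illustrates the algorithmic idea only, not the hardware.

```python
def duplicate_count(lane_values):
    """Count duplicates by sorting first, then measuring run lengths."""
    ordered = sorted(lane_values)  # stands in for the vector sort network
    counts = {}
    for v in ordered:              # duplicates are now adjacent runs
        counts[v] = counts.get(v, 0) + 1
    return counts

print(duplicate_count([3, 1, 3, 7, 1, 3]))  # {1: 2, 3: 3, 7: 1}
```

This is why one sort-network configuration can serve both operations: sorting is the expensive cross-lane step, and the count is a cheap local follow-up.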

    Accelerated embedding layer computations

    Publication number: US11948086B2

    Publication date: 2024-04-02

    Application number: US18305297

    Application date: 2023-04-21

    Applicant: Google LLC

    CPC classification number: G06N3/08 G06F1/03 G06N3/063 G06N20/10

    Abstract: Methods, systems, and apparatus, including computer-readable media, are described for performing neural network computations using a system configured to implement a neural network on a hardware circuit. The system includes a host that receives a batch of inputs to a neural network layer. Each of the inputs is stored in a memory location identified by an address. The system identifies one or more duplicate addresses in a listing of addresses for one or more inputs. For each duplicate address: the system generates a unique identifier that identifies the duplicate address in the listing of addresses. The system (i) obtains first inputs from memory locations identified by addresses corresponding to the unique identifiers and (ii) generates an output of the layer from the obtained first inputs.
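The deduplication scheme in the abstract (assign each distinct address a unique identifier, fetch each distinct address once, then expand the fetched rows back to the full input batch) can be sketched as follows. Names and the dict-based memory model are illustrative assumptions, not the patented hardware mechanism.

```python
def dedup_gather(addresses, memory):
    """Fetch each distinct address once; map every input back to its row."""
    unique_ids = {}                 # address -> unique identifier
    for addr in addresses:
        if addr not in unique_ids:
            unique_ids[addr] = len(unique_ids)
    # One memory read per distinct address, not per input.
    fetched = [memory[addr] for addr in unique_ids]
    # Layer output: one row per input; duplicate addresses share a fetch.
    return [fetched[unique_ids[addr]] for addr in addresses]

# Usage: address 10 appears twice in the batch but is read only once.
memory = {10: [0.1, 0.2], 42: [0.3, 0.4]}
out = dedup_gather([10, 42, 10], memory)
print(out)  # [[0.1, 0.2], [0.3, 0.4], [0.1, 0.2]]
```

For embedding lookups this matters because duplicate indices are common within a batch, so collapsing them cuts memory traffic without changing the layer's output.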

    Accelerated embedding layer computations

    Publication number: US11651209B1

    Publication date: 2023-05-16

    Application number: US16659527

    Application date: 2019-10-21

    Applicant: Google LLC

    CPC classification number: G06N3/08 G06F1/03 G06N3/063 G06N20/10

    Abstract: Methods, systems, and apparatus, including computer-readable media, are described for performing neural network computations using a system configured to implement a neural network on a hardware circuit. The system includes a host that receives a batch of inputs to a neural network layer. Each of the inputs is stored in a memory location identified by an address. The system identifies one or more duplicate addresses in a listing of addresses for one or more inputs. For each duplicate address: the system generates a unique identifier that identifies the duplicate address in the listing of addresses. The system (i) obtains first inputs from memory locations identified by addresses corresponding to the unique identifiers and (ii) generates an output of the layer from the obtained first inputs.

    Matrix processing apparatus
    Invention application

    Publication number: US20220391472A1

    Publication date: 2022-12-08

    Application number: US17842420

    Application date: 2022-06-16

    Applicant: Google LLC

    Abstract: Methods, systems, and apparatus, including a system for transforming sparse elements to a dense matrix. The system is configured to receive a request for an output matrix based on sparse elements including sparse elements associated with a first dense matrix and sparse elements associated with a second dense matrix; obtain the sparse elements associated with the first dense matrix fetched by a first group of sparse element access units; obtain the sparse elements associated with the second dense matrix fetched by a second group of sparse element access units; and transform the sparse elements associated with the first dense matrix and the sparse elements associated with the second dense matrix to generate the output dense matrix that includes the sparse elements associated with the first dense matrix and the sparse elements associated with the second dense matrix.
