Patent search ap:("Xilinx Page Inc.") AND inv:"Rajeev Patwari"

1.

Invention publication
COMPRESSION OF SPARSE TENSORS (status: under examination, published)

Publication (announcement) number: US20230185451A1

Publication (announcement) date: 2023-06-15

Application number: US17643999

Application date: 2021-12-13

Applicant: Xilinx, Inc.

Inventor: Vamsi Krishna Nalluri, Sai Lalith Chaitanya Ambatipudi, Mrinal J. Sarmah, Rajeev Patwari, Shreyas Manjunath, Sandeep Jayant Sathe

IPC: G06F3/06

CPC classification number: G06F3/0608, G06F3/064, G06F3/0673

Abstract: Approaches for data compression involve a compression circuit packing non-zero data elements of a succession of words of a plurality of blocks into packed words by packing non-zero data elements of one or more words of the succession in each packed word, and restricting each packed word to data elements of one uncompressed block. The compression circuit writes each packed word in a RAM and within a compressed address range associated with the uncompressed block when the packed word is full of non-zero data elements, or before the packed word is full if the next input word is of another uncompressed block.
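The packing rule in this abstract can be illustrated with a minimal software sketch. It is not the patented circuit: the packed-word size of four elements, the (block_id, word) input format, and the function name pack_blocks are assumptions made for illustration, and the sketch omits the RAM addressing and any metadata needed to decompress.

```python
from typing import Iterable, List, Tuple

PACKED_WORD_SIZE = 4  # assumed number of data elements per packed word


def pack_blocks(words: Iterable[Tuple[int, List[int]]],
                word_size: int = PACKED_WORD_SIZE) -> List[Tuple[int, List[int]]]:
    """Pack the non-zero elements of (block_id, word) pairs into packed words.

    A packed word is emitted when it is full, or early (partially filled)
    when the next input word belongs to a different block, so that no
    packed word mixes elements from two blocks.
    """
    packed: List[Tuple[int, List[int]]] = []
    current: List[int] = []
    current_block = None
    for block_id, word in words:
        # Flush a partially filled packed word at a block boundary.
        if current and block_id != current_block:
            packed.append((current_block, current))
            current = []
        current_block = block_id
        for element in word:
            if element == 0:
                continue  # zeros are dropped
            current.append(element)
            if len(current) == word_size:
                packed.append((current_block, current))
                current = []
    if current:
        packed.append((current_block, current))
    return packed


stream = [(0, [1, 0, 0, 2]), (0, [0, 3, 4, 5]), (1, [0, 0, 6, 0]), (1, [7, 0, 0, 0])]
result = pack_blocks(stream)
```

With the sample stream, block 0 yields one full packed word and one partial word flushed at the block boundary, and block 1 yields one partial word.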

2.

Invention grant
Compression of sparse tensors (status: in force)

Publication (announcement) number: US11941248B2

Publication (announcement) date: 2024-03-26

Application number: US17643999

Application date: 2021-12-13

Applicant: Xilinx, Inc.

Inventor: Vamsi Krishna Nalluri, Sai Lalith Chaitanya Ambatipudi, Mrinal J. Sarmah, Rajeev Patwari, Shreyas Manjunath, Sandeep Jayant Sathe

IPC: G06F3/06

CPC classification number: G06F3/0608, G06F3/064, G06F3/0673

Abstract: Approaches for data compression involve a compression circuit packing non-zero data elements of a succession of words of a plurality of blocks into packed words by packing non-zero data elements of one or more words of the succession in each packed word, and restricting each packed word to data elements of one uncompressed block. The compression circuit writes each packed word in a RAM and within a compressed address range associated with the uncompressed block when the packed word is full of non-zero data elements, or before the packed word is full if the next input word is of another uncompressed block.

3.

Invention publication
PROGRAMMABLE NON-LINEAR ACTIVATION ENGINE FOR NEURAL NETWORK ACCELERATION (status: under examination, published)

Publication (announcement) number: US20230297824A1

Publication (announcement) date: 2023-09-21

Application number: US17655489

Application date: 2022-03-18

Applicant: Xilinx, Inc.

Inventor: Rajeev Patwari, Chaithanya Dudha, Jorn Tuyls, Kaushik Barman, Aaron Ng

IPC: G06N3/08

CPC classification number: G06N3/08

Abstract: A programmable, non-linear (PNL) activation engine for a neural network is capable of receiving input data within a circuit. In response to receiving an instruction corresponding to the input data, the PNL activation engine is capable of selecting a first non-linear activation function from a plurality of non-linear activation functions by decoding the instruction. The PNL activation engine is capable of fetching a first set of coefficients corresponding to the first non-linear activation function from a memory. The PNL activation engine is capable of performing a polynomial approximation of the first non-linear activation function on the input data using the first set of coefficients. The PNL activation engine is capable of outputting a result from the polynomial approximation of the first non-linear activation function.
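The decode, fetch, and evaluate steps named in this abstract can be sketched in software as follows. The 4-bit opcode field, the coefficient table, and the use of truncated Taylor series are assumptions for illustration only; the abstract does not state how instructions are encoded or how the coefficients are derived.

```python
from typing import Dict, Sequence

# Hypothetical coefficient memory: opcode -> polynomial coefficients,
# highest degree first. The values are truncated Taylor series around 0.
COEFFICIENT_MEMORY: Dict[int, Sequence[float]] = {
    0x0: (-1.0 / 48.0, 0.0, 0.25, 0.5),  # sigmoid(x) ~ 0.5 + x/4 - x^3/48
    0x1: (-1.0 / 3.0, 0.0, 1.0, 0.0),    # tanh(x)    ~ x - x^3/3
}


def decode(instruction: int) -> int:
    """Extract the function selector (assumed to be the low 4 bits)."""
    return instruction & 0xF


def horner(coefficients: Sequence[float], x: float) -> float:
    """Evaluate a polynomial with Horner's rule."""
    result = 0.0
    for coefficient in coefficients:
        result = result * x + coefficient
    return result


def pnl_activate(instruction: int, x: float) -> float:
    """Select a function by decoding, fetch its coefficients, evaluate."""
    opcode = decode(instruction)
    coefficients = COEFFICIENT_MEMORY[opcode]
    return horner(coefficients, x)


sigmoid_at_half = pnl_activate(0x0, 0.5)
tanh_at_half = pnl_activate(0x1, 0.5)
```

Adding another activation function in this sketch only requires a new entry in the coefficient table, which is the sense in which the engine is programmable here. The cubic approximations are only accurate near zero.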

4.

Invention grant
On-chip memory access pattern detection for power and resource reduction (status: in force)

Publication (announcement) number: US11188697B1

Publication (announcement) date: 2021-11-30

Application number: US17141983

Application date: 2021-01-05

Applicant: Xilinx, Inc.

Inventor: Chaithanya Dudha, Rajeev Patwari, Nithin Kumar Guggilla, Ashish Sirasao, Krishna Garlapati

IPC: G06F30/333, G06F30/343, G06F30/3308, G06F30/398, G06F11/00, G06F9/34, G06F9/26, G06F13/00, G01R31/28, G11C7/10, G11C29/04, G11B27/36, G11B7/00, G11B11/00, H01L21/00, G06F11/32, G06F12/00

Abstract: Determining on-chip memory access patterns can include modifying a circuit design to include a profiler circuit for a random-access memory (RAM) of the circuit design, wherein the profiler circuit is configured to monitor an address bus of the RAM, and modifying the circuit design to include a debug circuit connected to the profiler circuit. Usage data for the RAM can be generated by detecting, using the profiler circuit, addresses of the RAM accessed during a test of the circuit design, as implemented in an integrated circuit. The usage data for the RAM can be output using the debug circuit.
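A software model can illustrate the kind of usage data such a profiler might collect. The class name AddressProfiler, the report fields, and the sample address trace are hypothetical; the abstract only states that accessed addresses are detected and the usage data is output through a debug circuit.

```python
from typing import Dict, Optional, Set


class AddressProfiler:
    """Software stand-in for a profiler watching a RAM address bus."""

    def __init__(self, depth: int) -> None:
        self.depth = depth              # number of addressable RAM words
        self.accessed: Set[int] = set()

    def observe(self, address: int) -> None:
        """Record one address seen on the bus during the test."""
        if not 0 <= address < self.depth:
            raise ValueError(f"address {address} is outside the RAM")
        self.accessed.add(address)

    def usage_report(self) -> Dict[str, Optional[float]]:
        """Summarize which part of the RAM the test actually touched."""
        highest = max(self.accessed) if self.accessed else None
        return {
            "depth": self.depth,
            "unique_addresses": len(self.accessed),
            "highest_address": highest,
            "utilization": len(self.accessed) / self.depth,
        }


profiler = AddressProfiler(depth=1024)
for address in [0, 1, 2, 2, 3, 255, 0]:
    profiler.observe(address)
report = profiler.usage_report()
```

In this sample trace only addresses up to 255 are touched, which is the kind of observation that could justify a smaller RAM, in line with the "power and resource reduction" goal in the title.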

5.

Invention grant
Instruction set architecture for data processing array control (status: in force)

Publication (announcement) number: US12248786B2

Publication (announcement) date: 2025-03-11

Application number: US17818309

Application date: 2022-08-08

Applicant: Xilinx, Inc.

Inventor: Xiao Teng, Tejus Siddagangaiah, Bryan Lozano, Ehsan Ghasemi, Rajeev Patwari, Elliott Delaye, Jorn Tuyls, Aaron Ng, Sanket Pandit, Pramod Peethambaran, Satyaprakash Pareek

IPC: G06F9/30, G06F9/38, G06F9/46

Abstract: Controlling a data processing (DP) array includes creating a replica of a register address space of the DP array based on the design and the DP array. A sequence of instructions, including write instructions and read instructions, is received. The write instructions correspond to buffer descriptors specifying runtime data movements for a design for a DP array. The write instructions are converted into transaction instructions and the read instructions are converted into wait instructions based on the replica of the register address space. The transaction instructions and the wait instructions are included in an instruction buffer. The instruction buffer is provided to a microcontroller configured to execute the transaction instructions and the wait instructions to implement the runtime data movements for the design as implemented in the DP array. In another aspect, the instruction buffer is stored in a file for subsequent execution by the microcontroller.
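The conversion step can be sketched under one possible reading of this abstract. The instruction tuples, the names "txn" and "wait", and the choice that a wait instruction polls for the value last recorded in the replica are all assumptions; the abstract does not define the contents of a transaction or wait instruction.

```python
from typing import Dict, List, Tuple

Instruction = Tuple[str, int, int]  # (kind, register address, value)


def convert(instructions: List[Instruction],
            replica: Dict[int, int]) -> List[Instruction]:
    """Build an instruction buffer from write/read instructions.

    Assumed behavior: a write updates the replica of the register
    address space and becomes a transaction; a read becomes a wait
    for the value the replica currently holds at that address.
    """
    buffer: List[Instruction] = []
    for kind, address, value in instructions:
        if address not in replica:
            raise KeyError(f"address {address:#x} is not in the replica")
        if kind == "write":
            replica[address] = value
            buffer.append(("txn", address, value))
        elif kind == "read":
            buffer.append(("wait", address, replica[address]))
        else:
            raise ValueError(f"unknown instruction kind: {kind}")
    return buffer


replica = {0x10: 0, 0x14: 0}
program = [("write", 0x10, 7), ("read", 0x10, 0), ("write", 0x14, 1)]
instruction_buffer = convert(program, replica)
```

The resulting buffer could be handed to a microcontroller model or serialized to a file, matching the two aspects the abstract mentions.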

6.

Invention publication
INSTRUCTION GENERATION AND PROGRAMMING MODEL FOR A DATA PROCESSING ARRAY AND MICROCONTROLLER (status: under examination, published)

Publication (announcement) number: US20240069511A1

Publication (announcement) date: 2024-02-29

Application number: US17823902

Application date: 2022-08-31

Applicant: Xilinx, Inc.

Inventor: Jorn Tuyls, Xiao Teng, Sanket Pandit, Rajeev Patwari, Qian Zhou, Ehsan Ghasemi, Ephrem C. Wu, Elliott Delaye, Aaron Ng

IPC: G05B19/042

CPC classification number: G05B19/042, G05B2219/25255, G05B2219/25257

Abstract: Instruction generation for a data processing array and microcontroller includes generating a tensor-level intermediate representation from a machine learning model using kernel expressions. Statements of the tensor-level intermediate representation are partitioned into a first set of statements and a second set of statements. From the first set of statements, kernel instructions are generated based on a reconfigurable neural engine model. The kernel instructions are executable by a compute tile of a data processing array to implement compute functions of the machine learning model. From the set of second statements, microcontroller instructions are generated based on a super-graph model. The microcontroller instructions are executable by a microcontroller of the data processing array to move data into and out from the data processing array.

7.

Invention publication
RECONFIGURABLE NEURAL ENGINE WITH EXTENSIBLE INSTRUCTION SET ARCHITECTURE (status: under examination, published)

Publication (announcement) number: US20240028556A1

Publication (announcement) date: 2024-01-25

Application number: US17814817

Application date: 2022-07-25

Applicant: Xilinx, Inc.

Inventor: Sanket Pandit, Jorn Tuyls, Xiao Teng, Rajeev Patwari, Ehsan Ghasemi, Elliott Delaye, Aaron Ng

IPC: G06F15/80, G06F9/455

CPC classification number: G06F15/8053, G06F9/45533

Abstract: An integrated circuit includes a plurality of kernels and a virtual machine coupled to the plurality of kernels. The virtual machine is configured to interpret instructions directed to different ones of the plurality of kernels. The virtual machine is configured to control operation of the different ones of the plurality of kernels responsive to the instructions.
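The relationship between the virtual machine and the kernels can be sketched as a simple dispatch loop. The opcodes, the two example kernels, and the list-of-opcodes program format are hypothetical; the abstract describes hardware kernels on an integrated circuit, not Python functions.

```python
from typing import Callable, Dict, List

Kernel = Callable[[List[float]], List[float]]


def relu_kernel(data: List[float]) -> List[float]:
    return [max(0.0, value) for value in data]


def scale_kernel(data: List[float]) -> List[float]:
    return [2.0 * value for value in data]


class VirtualMachine:
    """Interprets instructions and routes each one to a kernel."""

    def __init__(self) -> None:
        self.kernels: Dict[int, Kernel] = {}

    def register(self, opcode: int, kernel: Kernel) -> None:
        """Extend the instruction set with a new opcode/kernel pair."""
        self.kernels[opcode] = kernel

    def run(self, program: List[int], data: List[float]) -> List[float]:
        """Execute each instruction on the kernel it is directed to."""
        for opcode in program:
            if opcode not in self.kernels:
                raise ValueError(f"no kernel registered for opcode {opcode}")
            data = self.kernels[opcode](data)
        return data


vm = VirtualMachine()
vm.register(0, relu_kernel)
vm.register(1, scale_kernel)
output = vm.run([0, 1], [-1.0, 2.0])
```

Registering a further kernel adds an opcode without changing the interpreter loop, which is one way to read "extensible instruction set architecture" in the title.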

8.

Invention grant
Scalable acceleration of reentrant compute operations (status: in force)

Publication (announcement) number: US12147379B2

Publication (announcement) date: 2024-11-19

Application number: US18089780

Application date: 2022-12-28

Applicant: XILINX, INC.

Inventor: Rajeev Patwari, Jorn Tuyls, Elliott Delaye, Xiao Teng, Ephrem Wu

IPC: G06F15/78, G06F13/28

Abstract: Examples herein describe techniques for performing parallel processing using a plurality of processing elements (PEs) and a controller for data that has data dependencies. For example, a calculation may require an entire row or column to be summed, or to determine its mean. The PEs can be assigned different chunks of a data set (e.g., a tensor set, a column, or a row) for processing. The PEs can use one or more tokens to inform the controller when they are done with partial processing of their data chunks. The controller can then gather the partial results and determine an intermediate value for the data set. The controller can then distribute this intermediate value to the PEs which then re-process their respective data chunks using the intermediate value to generate final results.
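The two-pass flow in this abstract can be sketched with mean-centering of a row as the example operation. The function names and the choice of mean-centering are assumptions, and the sequential loops stand in for parallel PEs; the token signalling between PEs and the controller is not modeled.

```python
from typing import List, Tuple


def partial_sum(chunk: List[float]) -> Tuple[float, int]:
    """Pass 1 on a PE: partial result for its own chunk."""
    return sum(chunk), len(chunk)


def controller_mean(partials: List[Tuple[float, int]]) -> float:
    """Controller: gather partial results into the intermediate value."""
    total = sum(s for s, _ in partials)
    count = sum(n for _, n in partials)
    return total / count


def reprocess(chunk: List[float], mean: float) -> List[float]:
    """Pass 2 on a PE: re-process the same chunk with the shared mean."""
    return [value - mean for value in chunk]


def center_row(row: List[float], num_pes: int) -> List[float]:
    """Subtract the row mean from every element using chunked passes."""
    if not row:
        return []
    chunk_size = -(-len(row) // num_pes)  # ceiling division
    chunks = [row[i:i + chunk_size] for i in range(0, len(row), chunk_size)]
    partials = [partial_sum(chunk) for chunk in chunks]
    mean = controller_mean(partials)
    return [value for chunk in chunks for value in reprocess(chunk, mean)]


centered = center_row([1.0, 2.0, 3.0, 4.0, 5.0, 6.0], num_pes=3)
```

Each chunk is visited twice, once before and once after the controller computes the intermediate value, which is the data dependency the abstract describes.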

9.

Invention grant
Reconfigurable neural engine with extensible instruction set architecture (status: in force)

Publication (announcement) number: US12079158B2

Publication (announcement) date: 2024-09-03

Application number: US17814817

Application date: 2022-07-25

Applicant: Xilinx, Inc.

Inventor: Sanket Pandit, Jorn Tuyls, Xiao Teng, Rajeev Patwari, Ehsan Ghasemi, Elliott Delaye, Aaron Ng

IPC: G06F15/76, G06F9/455, G06F15/80

CPC classification number: G06F15/8053, G06F9/45533

Abstract: An integrated circuit includes a plurality of kernels and a virtual machine coupled to the plurality of kernels. The virtual machine is configured to interpret instructions directed to different ones of the plurality of kernels. The virtual machine is configured to control operation of the different ones of the plurality of kernels responsive to the instructions.

10.

Invention publication
INSTRUCTION SET ARCHITECTURE FOR DATA PROCESSING ARRAY CONTROL (status: under examination, published)

Publication (announcement) number: US20240045692A1

Publication (announcement) date: 2024-02-08

Application number: US17818309

Application date: 2022-08-08

Applicant: Xilinx, Inc.

Inventor: Xiao Teng, Tejus Siddagangaiah, Bryan Lozano, Ehsan Ghasemi, Rajeev Patwari, Elliott Delaye, Jorn Tuyls, Aaron Ng, Sanket Pandit, Pramod Peethambaran, Satyaprakash Pareek

IPC: G06F9/38, G06F9/46, G06F9/30

CPC classification number: G06F9/3814, G06F9/467, G06F9/3004

Abstract: Controlling a data processing (DP) array includes creating a replica of a register address space of the DP array based on the design and the DP array. A sequence of instructions, including write instructions and read instructions, is received. The write instructions correspond to buffer descriptors specifying runtime data movements for a design for a DP array. The write instructions are converted into transaction instructions and the read instructions are converted into wait instructions based on the replica of the register address space. The transaction instructions and the wait instructions are included in an instruction buffer. The instruction buffer is provided to a microcontroller configured to execute the transaction instructions and the wait instructions to implement the runtime data movements for the design as implemented in the DP array. In another aspect, the instruction buffer is stored in a file for subsequent execution by the microcontroller.
