Patent search ap:("Xilinx Page Inc.") AND inv:"Ashish Sirasao"

1.

Granted invention patent
Method and apparatus for enhancing performance by moving or adding a pipelined register stage in a cascaded chain [In force]

Publication (announcement) number: US10430539B1

Publication (announcement) date: 2019-10-01

Application number: US15382439

Application date: 2016-12-16

Applicant: Xilinx, Inc.

Inventor: Chaithanya Dudha, Zhao Ma, Krishna Garlapati, Ashish Sirasao

IPC: G06F17/50

Abstract: Methods and apparatus relating generally to synthesis are described. In such a method, a directed graph for a circuit design is generated. A cascaded chain is identified in the directed graph with a timing violation. A pipeline register stage of the cascaded chain is moved (or added) to remove the timing violation. The circuit design is transformed to provide a netlist including the pipeline register stage.
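
To make the idea of moving a pipeline register along a cascaded chain concrete, here is a minimal Python sketch. It is not the patented method: it assumes a chain modelled as a list of per-stage delays with a single movable register, and the function names (worst_segment_delay, best_register_position) are hypothetical. It only illustrates how a register position that balances the two segments can bring the worst path under a clock period.

```python
def worst_segment_delay(delays, reg_after):
    """Longest combinational delay when one register sits after stage `reg_after`."""
    before = sum(delays[: reg_after + 1])
    after = sum(delays[reg_after + 1 :])
    return max(before, after)


def best_register_position(delays):
    """Try every legal position and keep the one with the smallest worst-case delay."""
    positions = range(len(delays) - 1)  # a register between stage i and stage i + 1
    return min(positions, key=lambda i: worst_segment_delay(delays, i))


# Hypothetical chain: three fast stages followed by one slow stage.
chain = [1, 1, 1, 5]
clock_period = 5

original = worst_segment_delay(chain, 0)      # register right after the first stage
moved_to = best_register_position(chain)      # position that balances the segments
improved = worst_segment_delay(chain, moved_to)
```

In this toy chain the original placement leaves a segment of delay 7, which violates the assumed period of 5; moving the register to just before the slow stage brings the worst segment down to 5.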

2.

Granted invention patent
Loop optimization for implementing circuit designs in hardware [In force]

Publication (announcement) number: US10331836B1

Publication (announcement) date: 2019-06-25

Application number: US15730431

Application date: 2017-10-11

Applicant: Xilinx, Inc.

Inventor: Anup Hosangadi, Sumanta Datta, Aman Gayasen, Ashish Sirasao

IPC: G06F17/50 , H03K19/173

Abstract: Implementing a circuit design can include determining a chain of a plurality of loop elements of a circuit design, wherein each loop element includes a bit select node configured to perform a bit assignment operation and a corresponding address calculation node, wherein the address calculation nodes use a common variable to calculate a starting bit location provided to the corresponding bit select node. In response to the determining, the chain is replicated resulting in one chain for each value of the common variable and transforming each chain into a plurality of wires. A multiplexer is inserted into the circuit design. The plurality of wires for each chain is coupled to inputs of the multiplexer and the common variable is provided to the multiplexer as a select signal.
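
The transformation in this abstract can be sketched in Python as follows. This is an illustration under assumptions, not the patented flow: the address formula sel * stride + i, the parameter names, and the function names are all hypothetical. The first function models the original chain, where every element computes its bit address from the common variable; the second replicates the chain once per value of that variable so each copy collapses to fixed wire indices; the third models the inserted multiplexer.

```python
def dynamic_select(data_bits, sel, width, stride):
    """Original chain: each loop element computes its bit address from `sel`."""
    return [data_bits[sel * stride + i] for i in range(width)]


def build_mux_inputs(num_sel_values, width, stride):
    """Replicate the chain per value of `sel`; each copy becomes constant wire indices."""
    return [[s * stride + i for i in range(width)] for s in range(num_sel_values)]


def mux_select(data_bits, sel, mux_inputs):
    """Inserted multiplexer: `sel` picks one pre-wired bundle, no address arithmetic."""
    return [data_bits[w] for w in mux_inputs[sel]]


# Hypothetical 16-bit bus, 4-bit slices, four possible values of the common variable.
bus = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0, 0, 1, 0, 1]
wires = build_mux_inputs(num_sel_values=4, width=4, stride=4)
```

Both forms return the same bits for every select value; the difference is that the multiplexed form needs no per-element address calculation at run time.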

3.

Invention application
MACHINE LEARNING RUNTIME LIBRARY FOR NEURAL NETWORK ACCELERATION [Under examination, published]

Publication (announcement) number: US20190114533A1

Publication (announcement) date: 2019-04-18

Application number: US15785679

Application date: 2017-10-17

Applicant: Xilinx, Inc.

Inventor: Aaron Ng, Jindrich Zejda, Elliott Delaye, Xiao Teng, Sonal Santan, Soren T. Soe, Ashish Sirasao, Ehsan Ghasemi, Sean Settle

IPC: G06N3/063 , G06N3/10 , G06N3/04 , G06N3/08

Abstract: Embodiments herein describe techniques for interfacing a neural network application with a neural network accelerator using a library. The neural network application may execute on a host computing system while the neural network accelerator executes on a massively parallel hardware system, e.g., an FPGA. The library operates a pipeline for submitting the tasks received from the neural network application to the neural network accelerator. In one embodiment, the pipeline includes a pre-processing stage, an FPGA execution stage, and a post-processing stage, each of which corresponds to a different thread. When receiving a task from the neural network application, the library generates a packet that includes the information required for the different stages in the pipeline to perform the tasks. Because the stages correspond to different threads, the library can process multiple packets in parallel, which can increase the utilization of the neural network accelerator on the hardware system.
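
A three-stage, thread-per-stage pipeline of the kind this abstract describes might look like the Python sketch below. It is only an illustration: the packet layout, the run_pipeline function, and the use of queue.Queue between stages are assumptions, and the "execute" stage is an ordinary function standing in for the FPGA execution stage.

```python
import queue
import threading

_STOP = object()  # sentinel that tells each stage to shut down


def _stage(fn, src, dst):
    """Run one pipeline stage: pull packets, apply fn, forward the result."""
    while True:
        packet = src.get()
        if packet is _STOP:
            dst.put(_STOP)  # propagate shutdown to the next stage
            return
        packet["data"] = fn(packet["data"])
        dst.put(packet)


def run_pipeline(tasks, preprocess, execute, postprocess):
    """Push tasks through three stages, each running on its own thread."""
    stages = [preprocess, execute, postprocess]
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=_stage, args=(fn, queues[i], queues[i + 1]))
        for i, fn in enumerate(stages)
    ]
    for t in threads:
        t.start()
    # Wrap each task in a packet carrying what the stages need.
    for task_id, payload in enumerate(tasks):
        queues[0].put({"id": task_id, "data": payload})
    queues[0].put(_STOP)
    results = []
    while True:
        packet = queues[-1].get()
        if packet is _STOP:
            break
        results.append(packet["data"])
    for t in threads:
        t.join()
    return results


# Stand-in stage functions; real stages would format tensors and drive hardware.
outputs = run_pipeline([1, 2, 3], lambda x: x + 1, lambda x: x * 2, lambda x: x - 1)
```

Because each stage owns a thread, a second packet can be pre-processed while the first is still in the execution stage, which is the source of the utilization gain the abstract mentions.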

4.

Granted invention patent
Folding duplicate instances of modules in a circuit design [In force]

Publication (announcement) number: US09875330B2

Publication (announcement) date: 2018-01-23

Application number: US14960176

Application date: 2015-12-04

Applicant: Xilinx, Inc.

Inventor: Ilya K. Ganusov, Henri Fraisse, Ashish Sirasao, Alireza S. Kaviani

IPC: G06F17/50

CPC classification number: G06F17/5072 , G06F17/5045 , G06F17/505 , G06F17/5054

Abstract: Disclosed approaches for processing a circuit design include identifying duplicate instances of a module in a representation of the circuit design. A processor circuit performs folding operations for at least one pair of the duplicate instances of the module. One instance of the duplicates is removed from the circuit design, and a multiplexer is inserted. The multiplexer receives and selects one of the input signals to the duplicate instances and provides the selected input signal to the remaining instance. For each flip-flop in the remaining instance, a pipelined flip-flop is inserted. Connections to a first clock signal in the remaining instance are replaced with connections to a second clock signal having twice the frequency of the first clock signal. An alignment circuit is inserted to receive the output signal from the first instance and provide concurrent first and second output signals.

5.

Granted invention patent
Neural network processing system having host controlled kernel acclerators [In force]

Publication (announcement) number: US11568218B2

Publication (announcement) date: 2023-01-31

Application number: US15786288

Application date: 2017-10-17

Applicant: Xilinx, Inc.

Inventor: Aaron Ng, Jindrich Zejda, Elliott Delaye, Xiao Teng, Ashish Sirasao

IPC: G06N3/063 , G06N3/04

Abstract: A disclosed neural network processing system includes a host computer system, RAMs coupled to the host computer system, and neural network accelerators coupled to the RAMs, respectively. The host computer system is configured with software that, when executed, causes the host computer system to write input data and work requests to the RAMs. Each work request specifies a subset of neural network operations to perform and memory locations in a RAM of the input data and parameters. A graph of dependencies among neural network operations is built and additional dependencies added. The operations are partitioned into coarse grain tasks and fine grain subtasks for optimal scheduling for parallel execution. The subtasks are scheduled to accelerator kernels of matching capabilities. Each neural network accelerator is configured to read a work request from the respective RAM and perform the subset of neural network operations on the input data using the parameters.

6.

Granted invention patent
Image preprocessing for generalized image processing [In force]

Publication (announcement) number: US11386644B2

Publication (announcement) date: 2022-07-12

Application number: US15786267

Application date: 2017-10-17

Applicant: Xilinx, Inc.

Inventor: Elliott Delaye, Ashish Sirasao, Aaron Ng, Yongjun Wu, Jindrich Zejda

IPC: G06V10/94 , G06F3/06 , G06F17/15 , G06N3/04 , G06N3/063 , G06T1/20 , G06T1/60 , G06V10/44

Abstract: An example preprocessor circuit includes: a first buffer configured to store rows of image data and output a row thereof; a second buffer, coupled to the first buffer, including storage locations to store respective image samples of the row output by the first buffer; shift registers; an interconnect network including connections, each connection coupling a respective one of the shift registers to more than one of the storage locations, one or more of the storage locations being coupled to more than one of the connections; and a control circuit configured to load the shift registers with the image samples based on the connections and shift the shift registers to output streams of image samples.

7.

Granted invention patent
Re-targetable interface for data exchange between heterogeneous systems and accelerator abstraction into software instructions [In force]

Publication (announcement) number: US11204747B1

Publication (announcement) date: 2021-12-21

Application number: US15786395

Application date: 2017-10-17

Applicant: Xilinx, Inc.

Inventor: Jindrich Zejda, Elliott Delaye, Yongjun Wu, Aaron Ng, Ashish Sirasao, Khang K. Dao, Christopher J. Case

IPC: G06F9/45 , G06F8/41 , G06N3/02 , G06F13/28 , G06F8/30 , G06F9/451 , G06F9/50 , G06F13/362

Abstract: Embodiments herein describe techniques for interfacing a neural network application with a neural network accelerator that operate on two heterogeneous computing systems. For example, the neural network application may execute on a central processing unit (CPU) in a computing system while the neural network accelerator executes on an FPGA. As a result, when moving a software-hardware boundary between the two heterogeneous systems, changes may be made to both the neural network application (using software code) and to the accelerator (using RTL). The embodiments herein describe a software-defined approach where shared interface code is used to express both sides of the interface between the two heterogeneous systems in a single abstraction (e.g., a software class).

8.

Granted invention patent
Circuit arrangements and methods for traversing input feature maps [In force]

Publication (announcement) number: US11106968B1

Publication (announcement) date: 2021-08-31

Application number: US15989075

Application date: 2018-05-24

Applicant: Xilinx, Inc.

Inventor: Ehsan Ghasemi, Elliott Delaye, Ashish Sirasao

IPC: G06N3/04

Abstract: A circuit arrangement includes a buffer, a height traversal circuit configured to generate a sequence of IFM height values in response to first control signals, a width traversal circuit configured to generate a sequence of IFM width values in response to second control signals, a control circuit, and an address generation circuit. The control circuit is configured to input an OFM height, an OFM width, a kernel height, and a kernel width; generate the first control signals at times based on the OFM height and the kernel height; and generate the second control signals at times based on the OFM width and the kernel width. The address generation circuit is configured to generate a sequence of addresses based on the sequences of IFM height values and IFM width values, provide the sequence of addresses to the buffer, and enable reading from the buffer.
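
The address sequence such a traversal produces can be sketched in software. The Python below is a rough model under assumptions the abstract does not state: a row-major IFM buffer, no padding, a configurable stride defaulting to 1, and hypothetical names (ifm_addresses, ifm_width). The two outer loops play the role of the OFM-driven control, and the inner loops step the IFM height and width values across the kernel window.

```python
def ifm_addresses(ofm_height, ofm_width, kernel_height, kernel_width, stride=1):
    """Return buffer addresses in the order a convolution window visits the IFM."""
    # IFM width implied by the OFM width, kernel width, and stride (no padding assumed).
    ifm_width = (ofm_width - 1) * stride + kernel_width
    addresses = []
    for oh in range(ofm_height):                  # one pass per OFM row
        for ow in range(ofm_width):               # one pass per OFM column
            for kh in range(kernel_height):       # IFM height values for this window
                for kw in range(kernel_width):    # IFM width values for this window
                    h = oh * stride + kh
                    w = ow * stride + kw
                    addresses.append(h * ifm_width + w)  # row-major address
    return addresses


# A 2x2 OFM with a 2x2 kernel reads a 3x3 IFM stored at addresses 0..8.
seq = ifm_addresses(ofm_height=2, ofm_width=2, kernel_height=2, kernel_width=2)
```

Overlapping windows revisit the same addresses, which is why generating addresses into a buffer avoids re-fetching the IFM from external memory.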

9.

Granted invention patent
Systems for optimization of read-only memory (ROM) [In force]

Publication (announcement) number: US10726175B1

Publication (announcement) date: 2020-07-28

Application number: US16291952

Application date: 2019-03-04

Applicant: Xilinx, Inc.

Inventor: Chaithanya Dudha, Satyaprakash Pareek, Bing Tian, Ashish Sirasao

IPC: G06F30/30 , H01L27/02 , G06F30/327 , G06F30/398 , G06F30/00

Abstract: A memory optimization method includes identifying, within a circuit design, a memory having an arithmetic operator at an output side and/or an input side of the memory. The memory may include a read-only memory (ROM). In some examples, an input of the arithmetic operator includes a constant value. In some embodiments, the memory optimization method further includes absorbing a function of the arithmetic operator into the memory. By way of example, the absorbing the function includes modifying contents of the memory based on the function of the arithmetic operator to provide an updated memory and removing the arithmetic operator from the circuit design.
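
Absorbing an operator into a ROM amounts to precomputing its effect on the stored contents. The Python sketch below illustrates both cases under assumptions: the ROM is a plain list, the operator is an addition with a constant, the address wraps modulo the ROM depth, and the function names are hypothetical.

```python
def absorb_output_operator(rom, op):
    """Output side: rom[a] followed by op(...) becomes a lookup in a rewritten ROM."""
    return [op(value) for value in rom]


def absorb_input_add(rom, constant):
    """Input side: rom[(a + constant) mod depth] becomes a lookup in a rotated ROM."""
    depth = len(rom)
    return [rom[(addr + constant) % depth] for addr in range(depth)]


rom = [3, 1, 4, 1, 5, 9, 2, 6]

# Original circuit: ROM lookup followed by an adder with constant 10.
rom_out = absorb_output_operator(rom, lambda v: v + 10)

# Original circuit: adder with constant 2 on the address, then ROM lookup.
rom_in = absorb_input_add(rom, 2)
```

After either rewrite, a direct lookup in the updated ROM gives the same result as the original ROM plus operator, so the operator can be removed from the design.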

10.

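Granted invention patent
Sparse matrix processing circuitry [In force]

Publication (announcement) number: US10572409B1

Publication (announcement) date: 2020-02-25

Application number: US15976722

Application date: 2018-05-10

Applicant: Xilinx, Inc.

Inventor: Jindrich Zejda, Ling Liu, Yifei Zhou, Ashish Sirasao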

IPC: G06F3/00 , G06F5/00 , G06F13/20 , G06N3/08

Abstract: A memory arrangement can store a matrix of matrix data elements specified as index-value pairs that indicate row and column indices and associated values. First split-and-merge circuitry is coupled between the memory arrangement and a first set of FIFO buffers for reading the matrix data elements from the memory arrangement and putting the matrix data elements in the first set of FIFO buffers based on column indices. A pairing circuit is configured to read vector data elements, pair the vector data elements with the matrix data elements, and put the paired matrix and vector data elements in a second set of FIFO buffers based on column indices. Second split-and-merge circuitry is configured to read paired matrix and vector data elements from the second set of FIFO buffers and put the paired matrix and vector data elements in a third set of FIFO buffers based on row indices.
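
The data flow in this abstract (route by column, pair with vector elements, route by row) can be modelled in Python. This is a software analogy only: the FIFOs are plain lists, routing uses index modulo the number of lanes, and the final multiply-accumulate step is an assumption added to show what the routed pairs would be used for; the abstract itself stops at the third set of FIFO buffers. All names are hypothetical.

```python
def sparse_matvec(entries, vector, num_rows, num_lanes=2):
    """Multiply a sparse matrix, given as (row, col, value) triples, by a dense vector."""
    # First split-and-merge: distribute matrix elements to FIFOs by column index.
    col_fifos = [[] for _ in range(num_lanes)]
    for row, col, value in entries:
        col_fifos[col % num_lanes].append((row, col, value))

    # Pairing: attach the vector element that matches each column index.
    paired_fifos = [
        [(row, value, vector[col]) for row, col, value in fifo] for fifo in col_fifos
    ]

    # Second split-and-merge: redistribute the paired elements by row index.
    row_fifos = [[] for _ in range(num_lanes)]
    for fifo in paired_fifos:
        for row, value, x in fifo:
            row_fifos[row % num_lanes].append((row, value, x))

    # Assumed consumer: multiply each pair and accumulate per output row.
    result = [0] * num_rows
    for fifo in row_fifos:
        for row, value, x in fifo:
            result[row] += value * x
    return result


# 3x3 sparse matrix [[2, 0, 1], [0, 3, 0], [4, 0, 5]] times the vector [1, 2, 3].
entries = [(0, 0, 2), (0, 2, 1), (1, 1, 3), (2, 0, 4), (2, 2, 5)]
product = sparse_matvec(entries, [1, 2, 3], num_rows=3)
```

Grouping by column first means each lane only needs the vector elements for its own columns, and regrouping by row means each accumulator only sees products for its own rows.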
