Patent search ap:("Xilinx Page Inc.") AND inv:"Aaron Ng"

11.

Granted invention patent
Software-driven design optimization for mapping between floating-point and fixed-point multiply accumulators (Active)

Publication (announcement) number: US10678509B1

Publication (announcement) date: 2020-06-09

Application number: US16106743

Application date: 2018-08-21

Applicant: Xilinx, Inc.

Inventor: Sean Settle, Elliott Delaye, Aaron Ng, Ehsan Ghasemi, Ashish Sirasao, Xiao Teng, Jindrich Zejda

IPC: G06F7/544, G06F7/533, G06N3/08, G06N3/04

Abstract: An example multiply-accumulate (MACC) circuit includes a multiply-accumulator having an accumulator output register, a scaler coupled to the multiply-accumulator, and a control circuit coupled to the multiply-accumulator and the scaler. The control circuit is configured to provide control data to the scaler, the control data indicative of: a most-significant bit (MSB) to least-significant bit (LSB) range for selecting bit indices from the accumulator output register for implementing a first right shift; a multiplier; and a second right shift.
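
Illustrative note: the three pieces of control data named in this abstract (an MSB-to-LSB bit range, a multiplier, and a second right shift) can be read as a fixed-point approximation of multiplying the accumulator by a floating-point scale. The following is a minimal Python sketch of that arithmetic only. The function name, parameter names, and the signed two's-complement interpretation of the selected field are assumptions made for illustration and are not taken from the patent.

```python
def scale_accumulator(acc: int, msb: int, lsb: int, multiplier: int, shift2: int) -> int:
    """Hypothetical scaler: select bits [msb:lsb] of acc, multiply, shift right again.

    Assumption: the selected field is treated as a signed two's-complement value.
    """
    if lsb < 0 or msb < lsb or shift2 < 0:
        raise ValueError("expected 0 <= lsb <= msb and shift2 >= 0")
    width = msb - lsb + 1
    # First right shift: drop the bits below lsb, then keep `width` bits.
    field = (acc >> lsb) & ((1 << width) - 1)
    # Reinterpret the kept bits as a signed value (assumption, see lead-in).
    if field & (1 << (width - 1)):
        field -= 1 << width
    # Integer multiply followed by the second right shift.
    return (field * multiplier) >> shift2


# Example: (1000 >> 2) = 250, 250 * 3 = 750, 750 >> 1 = 375
result = scale_accumulator(1000, msb=15, lsb=2, multiplier=3, shift2=1)
```

Together, the multiplier and the two shifts approximate a scale factor of `multiplier / 2**(lsb + shift2)`, here 3/8.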

12.

Granted invention patent
Inline image preprocessing for convolution operations using a matrix multiplier on an integrated circuit (Active)

Publication (announcement) number: US10460416B1

Publication (announcement) date: 2019-10-29

Application number: US15786244

Application date: 2017-10-17

Applicant: Xilinx, Inc.

Inventor: Ashish Sirasao, Elliott Delaye, Aaron Ng, Ehsan Ghasemi

IPC: G06T1/20, G06T1/60, H04N21/2381, G06F3/03, H03K19/177, H04N5/30, G06F12/00

Abstract: An example preprocessor circuit for formatting image data into a plurality of streams of image samples includes: a plurality of memory banks configured to store the image data; multiplexer circuitry coupled to the memory banks; a first plurality of registers coupled to the multiplexer circuitry; a second plurality of registers coupled to the first plurality of registers, outputs of the second plurality of registers configured to provide the plurality of streams of image samples; and control circuitry configured to generate addresses for the plurality of memory banks, control the multiplexer circuitry to select among outputs of the plurality of memory banks, control the first plurality of registers to store outputs of the second plurality of multiplexers, and control the second plurality of registers to store outputs of the first plurality of registers.

13.

Granted invention patent
Software-defined memory bandwidth reduction by hierarchical stream buffering for general matrix multiplication in a programmable IC (Active)

Publication (announcement) number: US10354733B1

Publication (announcement) date: 2019-07-16

Application number: US15786321

Application date: 2017-10-17

Applicant: Xilinx, Inc.

Inventor: Jindrich Zejda, Elliott Delaye, Ashish Sirasao, Yongjun Wu, Aaron Ng

IPC: G11C8/00, G11C16/10, G06N3/04, G06F12/06, G06F13/16, G06N20/00

Abstract: Methods and apparatus are described for partitioning and reordering block-based matrix multiplications for high-speed data streaming in general matrix multiplication (GEMM), which may be implemented by a programmable integrated circuit (IC). By preloading and hierarchically caching the blocks, examples of the present disclosure reduce the double data rate (DDR) memory intake bandwidth for software-defined GEMM accelerators.
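
Illustrative note: the abstract describes partitioning a matrix multiplication into blocks and caching blocks so that less data has to be fetched from external memory. The sketch below is a generic tiled matrix multiplication in plain Python, not the patented scheme. It only shows the basic idea that a loaded block of A can be reused across a whole row of B blocks, and that larger blocks mean fewer block fetches. The function name and the `loads` counter are hypothetical.

```python
def blocked_matmul(a, b, block):
    """Multiply a (n x k) by b (k x m) in block-sized tiles.

    Returns (c, loads), where `loads` counts how many tiles were "fetched";
    it is a stand-in for memory traffic, not a hardware model.
    """
    n, k, m = len(a), len(b), len(b[0])
    c = [[0] * m for _ in range(n)]
    loads = 0
    for i0 in range(0, n, block):
        for k0 in range(0, k, block):
            # Fetch one tile of A and reuse it for every tile of B in this row.
            a_blk = [row[k0:k0 + block] for row in a[i0:i0 + block]]
            loads += 1
            for j0 in range(0, m, block):
                b_blk = [row[j0:j0 + block] for row in b[k0:k0 + block]]
                loads += 1
                # Accumulate the partial product of the two tiles into c.
                for i, a_row in enumerate(a_blk):
                    for kk, a_val in enumerate(a_row):
                        for j, b_val in enumerate(b_blk[kk]):
                            c[i0 + i][j0 + j] += a_val * b_val
    return c, loads


a = [[1, 2], [3, 4]]
b = [[5, 6], [7, 8]]
c_small, loads_small = blocked_matmul(a, b, block=1)  # many small tiles
c_large, loads_large = blocked_matmul(a, b, block=2)  # one tile per matrix
```

Both calls produce the same product; only the number of tile fetches differs.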

14.

Granted invention patent
Programmable integrated circuit design flow using timing-driven pipeline analysis (Active)

Publication (announcement) number: US09836568B1

Publication (announcement) date: 2017-12-05

Application number: US15069524

Application date: 2016-03-14

Applicant: Xilinx, Inc.

Inventor: Ilya K. Ganusov, Aaron Ng, Ronald E. Plyler, Sabyasachi Das, Frederic Revenu

IPC: G06F9/455, G06F17/50

CPC classification number: G06F17/5072, G06F17/5081

Abstract: Improving timing of a circuit design may include determining, using a processor, critical feed-forward paths of the circuit design, determining, using the processor, a sequential loop having a largest loop delay within the circuit design, and iteratively cutting, using the processor, the critical feed-forward paths and feed-forward paths parallel to the cut critical feed-forward paths until a stopping condition is met. The stopping condition may be determined according to the largest loop delay. The circuit design may be modified by inserting a register at each cut feed-forward path.

15.

Granted invention patent
Post-routing structural netlist optimization for circuit designs (Active)

Publication (announcement) number: US09646126B1

Publication (announcement) date: 2017-05-09

Application number: US14671920

Application date: 2015-03-27

Applicant: Xilinx, Inc.

Inventor: Ruibing Lu, Zhiyong Wang, Aaron Ng, Sabyasachi Das

IPC: G06F9/455, G06F17/50

CPC classification number: G06F17/5068, G06F17/5031, G06F17/5077, G06F2217/84

Abstract: Post-routing processing of a circuit design may include determining, using a processor, a baseline delay for a path of a routed circuit design, comparing, using the processor, the baseline delay of the path with a timing constraint of the path, and selectively applying, according to the comparing, a structural netlist optimization to the path resulting in an optimized path using a processor.

16.

Granted invention patent
Reconfigurable neural engine with extensible instruction set architecture (Active)

Publication (announcement) number: US12079158B2

Publication (announcement) date: 2024-09-03

Application number: US17814817

Application date: 2022-07-25

Applicant: Xilinx, Inc.

Inventor: Sanket Pandit, Jorn Tuyls, Xiao Teng, Rajeev Patwari, Ehsan Ghasemi, Elliott Delaye, Aaron Ng

IPC: G06F15/76, G06F9/455, G06F15/80

CPC classification number: G06F15/8053, G06F9/45533

Abstract: An integrated circuit includes a plurality of kernels and a virtual machine coupled to the plurality of kernels. The virtual machine is configured to interpret instructions directed to different ones of the plurality of kernels. The virtual machine is configured to control operation of the different ones of the plurality of kernels responsive to the instructions.

17.

Published invention application
INSTRUCTION SET ARCHITECTURE FOR DATA PROCESSING ARRAY CONTROL (Under examination, published)

Publication (announcement) number: US20240045692A1

Publication (announcement) date: 2024-02-08

Application number: US17818309

Application date: 2022-08-08

Applicant: Xilinx, Inc.

Inventor: Xiao Teng, Tejus Siddagangaiah, Bryan Lozano, Ehsan Ghasemi, Rajeev Patwari, Elliott Delaye, Jorn Tuyls, Aaron Ng, Sanket Pandit, Pramod Peethambaran, Satyaprakash Pareek

IPC: G06F9/38, G06F9/46, G06F9/30

CPC classification number: G06F9/3814, G06F9/467, G06F9/3004

Abstract: Controlling a data processing (DP) array includes creating a replica of a register address space of the DP array based on the design and the DP array. A sequence of instructions, including write instructions and read instructions, is received. The write instructions correspond to buffer descriptors specifying runtime data movements for a design for a DP array. The write instructions are converted into transaction instructions and the read instructions are converted into wait instructions based on the replica of the register address space. The transaction instructions and the wait instructions are included in an instruction buffer. The instruction buffer is provided to a microcontroller configured to execute the transaction instructions and the wait instructions to implement the runtime data movements for the design as implemented in the DP array. In another aspect, the instruction buffer is stored in a file for subsequent execution by the microcontroller.

18.

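Published invention application
MACHINE LEARNING DEPLOYMENT PLATFORM (Under examination, published)

Publication (announcement) number: US20230244966A1

Publication (announcement) date: 2023-08-03

Application number: US17649912

Application date: 2022-02-03

Applicant: Xilinx, Inc.

Inventor: Varun Sharma, Aaron Ng

IPC: G06N5/04, G06N20/00

CPC classification number: G06N5/043, G06N20/00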

Abstract: An inference server is capable of receiving a plurality of inference requests from one or more client systems. Each inference request specifies one of a plurality of different endpoints. The inference server can generate a plurality of batches each including one or more of the plurality of inference requests directed to a same endpoint. The inference server also can process the plurality of batches using a plurality of workers executing in an execution layer therein. Each batch is processed by a worker of the plurality of workers indicated by the endpoint of the batch.
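
Illustrative note: the abstract describes grouping inference requests by endpoint into batches and handing each batch to the worker associated with its endpoint. A minimal Python sketch of that flow follows. The function names, the `(endpoint, payload)` request shape, the `max_batch_size` limit, and the two toy workers are all assumptions for illustration and are not described in the patent application.

```python
from collections import defaultdict


def make_batches(requests, max_batch_size):
    """Group (endpoint, payload) requests into per-endpoint batches."""
    by_endpoint = defaultdict(list)
    for endpoint, payload in requests:
        by_endpoint[endpoint].append(payload)
    batches = []
    for endpoint, payloads in by_endpoint.items():
        # Split each endpoint's payloads into chunks of at most max_batch_size.
        for start in range(0, len(payloads), max_batch_size):
            batches.append((endpoint, payloads[start:start + max_batch_size]))
    return batches


def process_batches(batches, workers):
    """Run each batch on the worker registered for the batch's endpoint."""
    return [workers[endpoint](payloads) for endpoint, payloads in batches]


# Hypothetical workers keyed by endpoint name.
workers = {
    "echo": lambda payloads: list(payloads),
    "upper": lambda payloads: [p.upper() for p in payloads],
}
requests = [("echo", "a"), ("upper", "b"), ("echo", "c"), ("echo", "d")]
batches = make_batches(requests, max_batch_size=2)
results = process_batches(batches, workers)
```

In this sketch the workers run sequentially; a real server would more likely dispatch batches to workers running concurrently.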

19.

Granted invention patent
Neural network processing system having host controlled kernel acclerators (Active)

Publication (announcement) number: US11568218B2

Publication (announcement) date: 2023-01-31

Application number: US15786288

Application date: 2017-10-17

Applicant: Xilinx, Inc.

Inventor: Aaron Ng, Jindrich Zejda, Elliott Delaye, Xiao Teng, Ashish Sirasao

IPC: G06N3/063, G06N3/04

Abstract: A disclosed neural network processing system includes a host computer system, RAMs coupled to the host computer system, and neural network accelerators coupled to the RAMs, respectively. The host computer system is configured with software that when executed causes the host computer system to write input data and work requests to the RAMs. Each work request specifies a subset of neural network operations to perform and memory locations in a RAM of the input data and parameters. A graph of dependencies among neural network operations is built and additional dependencies added. The operations are partitioned into coarse grain tasks and fine grain subtasks for optimal scheduling for parallel execution. The subtasks are scheduled to accelerator kernels of matching capabilities. Each neural network accelerator is configured to read a work request from the respective RAM and perform the subset of neural network operations on the input data using the parameters.

20.

Granted invention patent
Image preprocessing for generalized image processing (Active)

Publication (announcement) number: US11386644B2

Publication (announcement) date: 2022-07-12

Application number: US15786267

Application date: 2017-10-17

Applicant: Xilinx, Inc.

Inventor: Elliott Delaye, Ashish Sirasao, Aaron Ng, Yongjun Wu, Jindrich Zejda

IPC: G06V10/94, G06F3/06, G06F17/15, G06N3/04, G06N3/063, G06T1/20, G06T1/60, G06V10/44

Abstract: An example preprocessor circuit includes: a first buffer configured to store rows of image data and output a row thereof; a second buffer, coupled to the first buffer, including storage locations to store respective image samples of the row output by the first buffer; shift registers; an interconnect network including connections, each connection coupling a respective one of the shift registers to more than one of the storage locations, one or more of the storage locations being coupled to more than one of the connections; and a control circuit configured to load the shift registers with the image samples based on the connections and shift the shift registers to output streams of image samples.
