-
1.
Publication No.: US12008066B2
Publication date: 2024-06-11
Application No.: US17816487
Application date: 2022-08-01
Inventors: Frederick A. Ware, Cheng C. Wang
Abstract: An integrated circuit including a multiplier-accumulator execution pipeline including a plurality of multiplier-accumulator circuits to process the data, using filter weights, via a plurality of multiply and accumulate operations. The integrated circuit includes first conversion circuitry, coupled to the pipeline, having inputs to receive a plurality of sets of data, wherein each set of data includes a plurality of data, Winograd conversion circuitry to convert each set of data to a corresponding Winograd set of data, floating point format conversion circuitry, coupled to the Winograd conversion circuitry, to convert the data of each Winograd set of data to a floating point data format. In operation, the multiplier-accumulator circuits are configured to perform the plurality of multiply and accumulate operations using the data of the plurality of Winograd sets of data from the first conversion circuitry and the filter weights, and generate output data based on the multiply and accumulate operations.
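
The data flow described above (Winograd conversion, then floating point conversion, then multiplication with the filter weights) can be illustrated with a small software sketch. This is not the patented circuit: the one-dimensional F(2,3) tile size, the function names, and the use of Python floats are all assumptions made for illustration, since the abstract does not state a tile size or number format.

```python
def winograd_f23(data, weights):
    """Sketch: two outputs of a 3-tap filter over 4 inputs, using 4 multiplies.

    Assumes the 1-D Winograd F(2,3) form; the abstract names no tile size.
    """
    d0, d1, d2, d3 = data
    g0, g1, g2 = weights
    # Winograd conversion of the input data (additions and subtractions only).
    wd = [d0 - d2, d1 + d2, d2 - d1, d1 - d3]
    # Floating point format conversion of the Winograd-domain data.
    wd = [float(v) for v in wd]
    # Filter weights converted to the Winograd domain (can be done once, offline).
    wg = [g0, (g0 + g1 + g2) / 2, (g0 - g1 + g2) / 2, g2]
    # Element-wise multiplies: the work a MAC pipeline would carry out.
    m = [a * b for a, b in zip(wd, wg)]
    # Output transform back to ordinary filter outputs.
    return [m[0] + m[1] + m[2], m[1] - m[2] - m[3]]


def direct_f23(data, weights):
    """Reference result: plain sliding dot product (6 multiplies)."""
    return [sum(data[i + k] * weights[k] for k in range(3)) for i in range(2)]
```

Both functions should agree on the same inputs, for example `winograd_f23([1, 2, 3, 4], [0.5, 1.0, 0.25])` and `direct_f23([1, 2, 3, 4], [0.5, 1.0, 0.25])`; the Winograd form trades two multiplies for extra additions.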
-
2.
Publication No.: US11960886B2
Publication date: 2024-04-16
Application No.: US17728829
Application date: 2022-04-25
IPC classification: G06F9/30
CPC classification: G06F9/3001
Abstract: An integrated circuit including a plurality of processing components to process image data of a plurality of image frames, wherein each image frame includes a plurality of stages. Each processing component includes a plurality of execution pipelines, wherein each pipeline includes a plurality of multiplier-accumulator circuits configurable to perform multiply and accumulate operations using image data and filter weights, wherein: (i) a first processing component is configurable to process all of the data associated with a first plurality of stages of each image frame, and (ii) a second processing component of the plurality of processing components is configurable to process all of the data associated with a second plurality of stages of each image frame. The first and second processing components process data associated with the first and second plurality of stages, respectively, of a first image frame concurrently.
-
3.
Publication No.: US20240004612A1
Publication date: 2024-01-04
Application No.: US18215993
Application date: 2023-06-29
Inventors: Frederick A. Ware, Cheng C. Wang
IPC classification: G06F7/544
CPC classification: G06F7/5443
Abstract: An integrated circuit device includes broadcast data paths, a weighting-value memory, multiply-accumulate (MAC) units, and shared shift-out circuitry. The MAC units are coupled in common to each of the broadcast data paths and coupled to receive respective weighting values from the weighting-value memory via respective weighting-value paths. Each of the MAC units includes MAC circuits that each receive an input data value via a respective one of the broadcast data paths and a shared one of the weighting values via a shared one of the respective weighting-value paths; generate a sequence of multiplication products by multiplying the input data value with the shared one of the weighting values; accumulate a sum of the multiplication products; and output the sum of the multiplication products to a respective one of a plurality of serially coupled storage elements within the shared shift-out path.
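
As a rough behavioral sketch of the arrangement described above: every MAC unit sees the same broadcast input values, each unit applies its own weighting value per cycle, and the accumulated sums are then drained through a serial chain. The function names, the list-based data layout, and the drain order are assumptions for illustration only and are not taken from the application.

```python
from collections import deque


def broadcast_mac(inputs, weights):
    """Accumulate sums for every (unit, broadcast path) pair.

    inputs[t][b]  : value on broadcast path b at cycle t (hypothetical layout)
    weights[u][t] : weighting value delivered to MAC unit u at cycle t
    Returns acc[u][b], the accumulated sum held by circuit b of unit u.
    """
    num_units = len(weights)
    num_paths = len(inputs[0])
    acc = [[0] * num_paths for _ in range(num_units)]
    for t, row in enumerate(inputs):
        for u in range(num_units):
            w = weights[u][t]  # one weight shared by all circuits in unit u
            for b, x in enumerate(row):
                acc[u][b] += x * w  # multiply, then accumulate
    return acc


def shift_out(acc):
    """Drain the sums one per cycle through a serial chain (order assumed)."""
    chain = deque(s for unit in acc for s in unit)
    drained = []
    while chain:
        drained.append(chain.popleft())
    return drained
```

Under these assumptions the accumulation amounts to a small matrix product: `broadcast_mac([[1, 2], [3, 4]], [[10, 1], [0, 2]])` yields one row of sums per MAC unit, and `shift_out` flattens those rows into the serial output stream.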
-
4.
Publication No.: US11663016B2
Publication date: 2023-05-30
Application No.: US17701749
Application date: 2022-03-23
Inventors: Cheng C. Wang
IPC classification: G06F9/38, G06F7/544, G06F9/30, H03K19/17736, G06F13/40, H03K19/17724, G06F5/16
CPC classification: G06F9/3893, G06F5/16, G06F7/5443, G06F9/30079, G06F13/4068, H03K19/17724, H03K19/17736
Abstract: An integrated circuit including configurable multiplier-accumulator circuitry, wherein, during processing operations, a plurality of the multiplier-accumulator circuits are serially connected into pipelines to perform concatenated multiply and accumulate operations. The integrated circuit includes a first memory and a second memory, and a switch interconnect network, including configurable multiplexers arranged in a plurality of switch matrices. The first and second memories are configurable as either a dedicated read memory or a dedicated write memory and connected to a given pipeline, via the switch interconnect network, during a processing operation performed thereby; wherein, during a first processing operation, the first memory is dedicated to write data to a first pipeline and the second memory is dedicated to read data therefrom and, during a second processing operation, the first memory is dedicated to read data from a second pipeline and the second memory is dedicated to write data thereto.
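
The role swap between the two memories resembles a ping-pong buffering scheme, which can be sketched in software as follows. This is only an analogy under stated assumptions: `run_operations` and its arguments are hypothetical names, each "pipeline" is modeled as a plain per-element function, and the switch interconnect network is not modeled at all.

```python
def run_operations(data, pipelines):
    """Ping-pong sketch: two buffers swap source/sink roles per operation.

    `pipelines` is a list of per-element functions standing in for MAC
    pipelines (an assumption; real pipelines are not element-wise maps).
    """
    mem_first, mem_second = list(data), []
    source, sink = mem_first, mem_second
    for pipeline in pipelines:
        sink.clear()
        # The source memory feeds the pipeline; the sink captures its output.
        sink.extend(pipeline(x) for x in source)
        # Swap roles for the next processing operation.
        source, sink = sink, source
    return list(source)
```

For example, `run_operations([1, 2, 3], [lambda x: x * 2, lambda x: x + 1])` passes the data through two operations while only ever allocating two buffers.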
-
5.
Publication No.: US20220391343A1
Publication date: 2022-12-08
Application No.: US17819958
Application date: 2022-08-16
Inventors: Frederick A. Ware, Cheng C. Wang
Abstract: An integrated circuit including control/configure circuitry which interfaces with a plurality of interconnected MACs and/or one or more rows of interconnected MACs. The control/configure circuitry may include a plurality of control/configure circuits, each control/configure circuit interfaces with at least one MAC pipeline, wherein each pipeline includes a plurality of linearly connected multiplier-accumulator circuits. Each control/configure circuit may include one or more of (i) a configurable input data signal path to provide data to the MACs of the pipeline during the execution sequence(s) and (ii) a configurable output data path for the output data generated by the execution sequence (i.e., input data that was processed via the multiplier-accumulator circuits of the pipeline). In one embodiment, the sum data generated by the accumulator during an execution cycle is stored in the associated MAC for use in the subsequent execution cycle as the second data by the same accumulator of the associated MAC.
-
6.
Publication No.: US11314504B2
Publication date: 2022-04-26
Application No.: US16816164
Application date: 2020-03-11
IPC classification: G06F9/30
Abstract: An integrated circuit including a plurality of processing components, including first and second processing components, wherein each processing component includes first memory to store image data and a plurality of multiplier-accumulator execution pipelines, wherein each multiplier-accumulator execution pipeline includes a plurality of multiplier-accumulator circuits to, in operation, perform multiply and accumulate operations using data from the first memory and filter weights. The first processing component is configured to process all of the data associated with all of the stages of a first image frame via the plurality of multiplier-accumulator execution pipelines of the first processing component. The second processing component is configured to process all of the data associated with all of the stages of a second image frame via the plurality of multiplier-accumulator execution pipelines of the second processing component, wherein the first image frame and the second image frame are successive image frames.
-
7.
Publication No.: US20220027152A1
Publication date: 2022-01-27
Application No.: US17376415
Application date: 2021-07-15
Inventors: Frederick A. Ware, Cheng C. Wang
Abstract: An integrated circuit comprising a plurality of multiplier-accumulator circuits connected in series in a linear pipeline to perform a plurality of concatenated multiply and accumulate operations, wherein each multiplier-accumulator circuit of the plurality of multiplier-accumulator circuits includes: a multiplier to multiply first data by a multiplier weight data and generate a product data, and an accumulator, coupled to the multiplier of the associated multiplier-accumulator circuit, to add second data and the product data of the associated multiplier to generate sum data. The integrated circuit also includes a plurality of granularity configuration circuits, wherein each granularity configuration circuit is associated with a different multiplier-accumulator circuit of the plurality of multiplier-accumulator circuits to operationally (i) disconnect the multiplier and accumulator of the associated multiplier-accumulator circuit from the linear pipeline during operation or (ii) connect the multiplier and accumulator of the associated multiplier-accumulator circuit to the linear pipeline during operation.
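
A minimal behavioral sketch of a linear MAC pipeline with per-stage connect/disconnect control might look like the following. The class and function names are hypothetical, and modeling a disconnected stage as a pass-through of the running sum is an assumption; the abstract does not say how a bypassed stage forwards data.

```python
from dataclasses import dataclass


@dataclass
class MacStage:
    """One multiplier-accumulator stage with a granularity (bypass) flag."""
    weight: float
    connected: bool = True  # hypothetical stand-in for the configuration circuit

    def step(self, first_data, second_data):
        if not self.connected:
            # Disconnected: the stage is skipped and the running sum passes through.
            return second_data
        product = first_data * self.weight  # multiplier
        return second_data + product        # accumulator


def run_pipeline(stages, inputs, initial=0.0):
    """Chain the stages so each accumulator adds to the previous stage's sum."""
    total = initial
    for stage, x in zip(stages, inputs):
        total = stage.step(x, total)
    return total
```

With three connected stages of weights 2, 3, and 4 and inputs of 1, the pipeline computes a three-term dot product; setting `connected=False` on the middle stage shortens it to a two-term one, which is the kind of length adjustment the granularity configuration is described as providing.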
-
8.
Publication No.: US20210326286A1
Publication date: 2021-10-21
Application No.: US17212411
Application date: 2021-03-25
Inventors: Frederick A. Ware, Cheng C. Wang
Abstract: An integrated circuit including control/configure circuitry which interfaces with a plurality of interconnected (e.g., serially) multiplier-accumulator circuits and/or one or more rows of interconnected (e.g., serially) multiplier-accumulator circuits. The control/configure circuitry may include a plurality of control/configure circuits, each control/configure circuit interfaces with at least one multi-bit MAC execution pipeline, wherein each pipeline includes a plurality of interconnected (e.g., serially) multiplier-accumulator circuits. Each control/configure circuit may include one or more (or all) of (i) a configurable input data signal path to provide data to the MACs of the pipeline during the execution sequence(s), (ii) a configurable accumulation data path for the ongoing/accumulating MAC accumulation totals generated by the MACs during an execution sequence, and (iii) a configurable output data path for the output data generated by the execution sequence (i.e., input data that was processed via the multiplier-accumulator circuits or MAC processors of the execution pipeline).
-
9.
Publication No.: US20210273641A1
Publication date: 2021-09-02
Application No.: US17219952
Application date: 2021-04-01
Inventors: Cheng C. Wang
IPC classification: H03K19/1776, H03K19/17724
Abstract: An integrated circuit comprising a plurality of multiply-accumulator circuits, connected in series, wherein the plurality of multiply-accumulator circuits includes a first MAC circuit, including a multiplier to multiply first data and first multiplier weight data and output first product data, and an accumulator, coupled to the multiplier of the first MAC circuit, to add second data and the first product data and output first sum data. The plurality of multiply-accumulator circuits further includes a second MAC circuit including a multiplier to multiply third data and second multiplier weight data and output second product data, and an accumulator, coupled to the multiplier of the second MAC circuit and the accumulator of the first MAC circuit, to generate and output second sum data. A first load-store register is coupled to an output of the accumulator of the first MAC circuit and an input of the accumulator of the second MAC circuit.
-
10.
Publication No.: US10855284B1
Publication date: 2020-12-01
Application No.: US16579766
Application date: 2019-09-23
Inventors: Yongning Liu, Fan Mo, Cheng C. Wang
IPC classification: H03K19/17736, H03K19/17796, H03K19/17732
Abstract: A method of routing interconnects of a field programmable gate array including: a plurality of logic tiles, and a tile-to-tile interconnect network, having a plurality of tile-to-tile interconnects to interconnect logic tile networks of the logic tiles, the method comprises: routing a first plurality of tile-to-tile interconnects in a first plurality of logic tiles. After routing the first plurality of tile-to-tile interconnects, routing a second plurality of tile-to-tile interconnects in a second plurality of logic tiles. The start/end point of each tile-to-tile interconnect in the first plurality and the second plurality of tiles is independent of the start/end point of the other tile-to-tile interconnects in the first and second plurality, respectively. Routing the second plurality of tile-to-tile interconnects includes connecting at least one start/end point of each tile-to-tile interconnect in the second plurality of tiles to at least one start/end point of each interconnect in the first plurality of tiles.