DISCRETE COSINE TRANSFORM/INVERSE DISCRETE COSINE TRANSFORM (DCT/IDCT) SYSTEMS AND METHODS

    公开(公告)号:US20190228049A1

    公开(公告)日:2019-07-25

    申请号:US16370955

    申请日:2019-03-30

    Abstract: The present disclosure is directed to systems and methods for performing discrete cosine transforms and inverse discrete cosine transforms (DCT/IDCT) using a CORDIC algorithm implemented in systolic array circuitry that includes a plurality cells or nodes, each containing circuitry to implement the CORDIC algorithm. DCT/IDCT control circuitry multiplies the systolic array output matrix generated by the systolic array circuitry by a scaling factor that may include a defined scaling value or an actual cosine value. The DCT/IDCT control circuitry causes the transfer of the scaled systolic array output matrix to combination circuitry where the DCT/IDCT input matrix is combined with the scaled systolic array output matrix to provide the DCT/IDCT output matrix. The DCT/IDCT control circuitry also transfers bypass information to at least a portion of the cells or nodes in the systolic array circuitry.

    HARDWARE-SOFTWARE CO-DESIGNED MULTI-CAST FOR IN-MEMORY COMPUTING ARCHITECTURES

    公开(公告)号:US20220113974A1

    公开(公告)日:2022-04-14

    申请号:US17561029

    申请日:2021-12-23

    Abstract: A memory architecture includes processing circuits co-located with memory subarrays for performing computations within the memory architecture. The memory architecture includes a plurality of decoders in hierarchical levels that include a multicast capability for distributing data or compute operations to individual subarrays. The multicast may be configurable with respect to individual fan-outs at each hierarchical level. A computation workflow may be organized into a compute supertile representing one or more “supertiles” of input data to be processed in the compute supertile. The individual data tiles of the input data supertile may be used by multiple compute tiles executed by the processing circuits of the subarrays, and the data tiles multicast to the respective processing circuits for efficient data loading and parallel computation.

    Systems and methods for reconfigurable systolic arrays

    公开(公告)号:US11169957B2

    公开(公告)日:2021-11-09

    申请号:US16371017

    申请日:2019-03-31

    Abstract: Systems and techniques are provided for hardware architecture used in parallel computing applications to improve computation efficiency. An integrated circuit system may include a data store that stores data for processing and a reconfigurable systolic array that may process the data. The reconfigurable systolic array may include a first row of processing elements (PE) that process the data according to a first function and a second row of PE that process the data according to a second function. The reconfigurable systolic array may also include a routing block coupled to the first row of PE, the second row of PE, and the data store. Further, the reconfigurable systolic array may receive data from the first row of PE, transmit the data received from the first row of PE to the second row of PE, and transmit data output by the second row of PE to the first row of PE.

    Discrete cosine transform/inverse discrete cosine transform (DCT/IDCT) systems and methods

    公开(公告)号:US11138290B2

    公开(公告)日:2021-10-05

    申请号:US16370955

    申请日:2019-03-30

    Abstract: The present disclosure is directed to systems and methods for performing discrete cosine transforms and inverse discrete cosine transforms (DCT/IDCT) using a CORDIC algorithm implemented in systolic array circuitry that includes a plurality cells or nodes, each containing circuitry to implement the CORDIC algorithm. DCT/IDCT control circuitry multiplies the systolic array output matrix generated by the systolic array circuitry by a scaling factor that may include a defined scaling value or an actual cosine value. The DCT/IDCT control circuitry causes the transfer of the scaled systolic array output matrix to combination circuitry where the DCT/IDCT input matrix is combined with the scaled systolic array output matrix to provide the DCT/IDCT output matrix. The DCT/IDCT control circuitry also transfers bypass information to at least a portion of the cells or nodes in the systolic array circuitry.

    System and Method for Configurable Systolic Array with Partial Read/Write

    公开(公告)号:US20210200711A1

    公开(公告)日:2021-07-01

    申请号:US16729381

    申请日:2019-12-28

    Abstract: A system is provided that includes a reconfigurable systolic array circuitry. The reconfigurable systolic array circuitry includes a first circuit block comprising one or more groups of processing elements and a second circuit block comprising one or more groups of processing elements. The reconfigurable systolic array circuitry further includes a first bias addition with accumulation circuitry configured to add a matrix bias to an accumulated value, to a multiplication product, or to a combination thereof. The reconfigurable systolic array circuitry additionally includes a first routing circuitry configured to route derivations from the first circuit block into the second circuit block, from the first circuit block into the first bias addition with accumulation circuitry, or into a combination thereof.

    SYSTEMS AND METHODS FOR RECONFIGURABLE SYSTOLIC ARRAYS

    公开(公告)号:US20200311021A1

    公开(公告)日:2020-10-01

    申请号:US16371017

    申请日:2019-03-31

    Abstract: Systems and techniques are provided for hardware architecture used in parallel computing applications to improve computation efficiency. An integrated circuit system may include a data store that stores data for processing and a reconfigurable systolic array that may process the data. The reconfigurable systolic array may include a first row of processing elements (PE) that process the data according to a first function and a second row of PE that process the data according to a second function. The reconfigurable systolic array may also include a routing block coupled to the first row of PE, the second row of PE, and the data store. Further, the reconfigurable systolic array may receive data from the first row of PE, transmit the data received from the first row of PE to the second row of PE, and transmit data output by the second row of PE to the first row of PE.

Patent Agency Ranking