SYSTEMS AND METHODS FOR SPEECH OR TEXT PROCESSING USING MATRIX OPERATIONS

    Publication number: US20240152575A1

    Publication date: 2024-05-09

    Application number: US18414901

    Application date: 2024-01-17

    CPC classification number: G06F17/16 G06F7/5443

    Abstract: Disclosed herein are a system, a method, and a device for processing and converting data using matrix operations. Circuitry can partition an input of a first data format across a plurality of lookup tables, each residing in a respective memory. The circuitry can access weight information from a load store memory and the partitioned input on a per-column basis from the plurality of lookup tables. The circuitry can perform a number of multiply-accumulate (MAC) operations per cycle between the weight information from the load store memory and the partitioned input read on a per-column basis from the plurality of lookup tables. The number of MAC operations performed per cycle can correspond to the total number of columns of the plurality of lookup tables. The circuitry can generate, responsive to the MAC operations on the partitioned input, a plurality of outputs in a second data format.
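The data flow in this abstract can be illustrated with a small software sketch. It is not the patented circuit: the function names `partition_columns` and `mac_over_tables`, the contiguous column split, and the choice of one weight per row are all assumptions made for illustration. Each simulated cycle reads one row position from every table and performs one MAC per column, so the MAC count per cycle equals the total number of columns across the tables.

```python
def partition_columns(matrix, num_tables):
    """Split the columns of `matrix` (a list of rows) into contiguous tables.

    Hypothetical helper: the abstract does not say how the input is partitioned.
    """
    num_cols = len(matrix[0])
    base, extra = divmod(num_cols, num_tables)
    tables, start = [], 0
    for t in range(num_tables):
        width = base + (1 if t < extra else 0)
        tables.append([row[start:start + width] for row in matrix])
        start += width
    return tables


def mac_over_tables(tables, weights):
    """Accumulate weight * value for every column of every table.

    One loop iteration over `weights` stands in for one hardware cycle
    (an assumption); each cycle performs one MAC per column.
    """
    total_cols = sum(len(table[0]) for table in tables)
    acc = [0] * total_cols
    macs_per_cycle = []
    for r, w in enumerate(weights):
        col = 0
        macs = 0
        for table in tables:
            for value in table[r]:
                acc[col] += w * value  # one multiply-accumulate
                col += 1
                macs += 1
        macs_per_cycle.append(macs)
    return acc, macs_per_cycle


matrix = [[1, 2, 3, 4],
          [5, 6, 7, 8]]
tables = partition_columns(matrix, num_tables=2)
outputs, macs = mac_over_tables(tables, weights=[10, 1])
print(outputs, macs)
```

With two tables of two columns each, the sketch performs four MACs in every simulated cycle, matching the total column count.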

    Efficient multiply-accumulation based on sparse matrix

    Publication number: US11429394B2

    Publication date: 2022-08-30

    Application number: US16997460

    Application date: 2020-08-19

    Abstract: Disclosed herein are approaches to improving the computational efficiency of multiply-accumulate (MAC) operations. In one aspect, a computing device identifies a first vector including non-zero elements of a base matrix, and a second vector indicating a location of each of the non-zero elements of the base matrix. In one aspect, the device determines a first element and a second element of the first vector. In one aspect, the device determines a third element and a fourth element of the second vector. In one aspect, the device determines i) a fifth element of an input vector according to the third element of the second vector, and ii) a sixth element of the input vector according to the fourth element of the second vector. In one aspect, the device causes MAC circuitry to perform a dot product according to the first element, the second element, the fifth element, and the sixth element.
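The element-by-element description above can be sketched in software for a single row of the base matrix. This is an illustrative reading, not the claimed hardware: `compress_row`, `dot2`, and `sparse_row_dot` are hypothetical names, the two-lane width of the MAC unit is inferred from the two value/location pairs in the abstract, and the zero padding for an odd number of non-zeros is an added assumption.

```python
def compress_row(row):
    """Return (values, locations): the non-zero entries and their column indices."""
    values = [v for v in row if v != 0]
    locations = [i for i, v in enumerate(row) if v != 0]
    return values, locations


def dot2(a0, a1, b0, b1):
    """Stand-in for a two-lane MAC unit: a two-element dot product."""
    return a0 * b0 + a1 * b1


def sparse_row_dot(values, locations, x):
    """Dot product of a compressed row with input vector x, two non-zeros at a time."""
    if len(values) % 2:
        # Assumption: pad with a zero weight so the last pair is complete.
        values = values + [0]
        locations = locations + [0]
    acc = 0
    for k in range(0, len(values), 2):
        # First/second elements come from `values`; third/fourth from `locations`;
        # the locations select the fifth/sixth elements from the input vector.
        acc += dot2(values[k], values[k + 1],
                    x[locations[k]], x[locations[k + 1]])
    return acc


row = [0, 3, 0, 0, 2, 5]
x = [1, 2, 3, 4, 5, 6]
values, locations = compress_row(row)
print(sparse_row_dot(values, locations, x))
```

Only the three non-zero weights are multiplied, so the zero entries of the row cost no MAC operations; the result equals the dense dot product of `row` and `x`.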

    DATA COMPRESSION USING INSTRUCTION SET ARCHITECTURE

    Publication number: US20240220259A1

    Publication date: 2024-07-04

    Application number: US18525083

    Application date: 2023-11-30

    CPC classification number: G06F9/30178 G06F9/30038 G06F9/30134

    Abstract: In one embodiment, a computing system may set data to a first group of registers. The first group of registers may be configured to be accessed during a single operation cycle. The system may set a number of patterns to a second group of registers. Each pattern of the number of patterns may include an array of indices for the data stored in the first group of registers. The system may select, for a first vector register associated with a vector engine, a first pattern from the patterns stored in the second group of registers. The system may load a first portion of the data from the first group of registers to the first vector register based on the first pattern selected for the first vector register from the patterns stored in the second group of registers.
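The pattern-driven load described above behaves like an indexed gather. The sketch below models it with plain lists; `load_vector_register`, the register contents, and the three example patterns are hypothetical and are not taken from the patent.

```python
def load_vector_register(data_regs, pattern_regs, pattern_id):
    """Fill a vector register by gathering from data_regs using one stored pattern.

    data_regs    -- models the first group of registers (the data)
    pattern_regs -- models the second group (each entry is an array of indices)
    pattern_id   -- which pattern is selected for this vector register
    """
    pattern = pattern_regs[pattern_id]
    return [data_regs[i] for i in pattern]


# Illustrative contents only.
data_regs = [10, 20, 30, 40, 50, 60, 70, 80]
pattern_regs = [
    [0, 1, 2, 3],  # contiguous load
    [0, 2, 4, 6],  # stride-2 load
    [7, 7, 7, 7],  # broadcast of one element
]

print(load_vector_register(data_regs, pattern_regs, pattern_id=1))
```

Because a pattern may repeat an index, a small set of stored values can be expanded into a full vector, which is one plausible way such a scheme relates to the compression named in the title.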

    Systems and methods for speech or text processing using matrix operations

    Publication number: US11899745B1

    Publication date: 2024-02-13

    Application number: US16997401

    Application date: 2020-08-19

    CPC classification number: G06F17/16 G06F7/5443

    Abstract: Disclosed herein are a system, a method, and a device for processing and converting data using matrix operations. Circuitry can partition an input of a first data format across a plurality of lookup tables, each residing in a respective memory. The circuitry can access weight information from a load store memory and the partitioned input on a per-column basis from the plurality of lookup tables. The circuitry can perform a number of multiply-accumulate (MAC) operations per cycle between the weight information from the load store memory and the partitioned input read on a per-column basis from the plurality of lookup tables. The number of MAC operations performed per cycle can correspond to the total number of columns of the plurality of lookup tables. The circuitry can generate, responsive to the MAC operations on the partitioned input, a plurality of outputs in a second data format.
