METHOD AND APPARATUS FOR EFFICIENT MATRIX TRANSPOSE

    公开(公告)号:US20190042248A1

    公开(公告)日:2019-02-07

    申请号:US15941526

    申请日:2018-03-30

    Abstract: Disclosed embodiments relate to a method and apparatus for efficient matrix transpose. In one example, a processor to execute a matrix transpose instruction includes fetch circuitry to fetch the matrix transpose instruction specifying a destination matrix and a source matrix having (N×M) elements and (M×N) elements, respectively, a (N×M) load buffer, decode circuitry to decode the fetched matrix transpose instruction, and execution circuitry, responsive to the decoded matrix transpose instruction to, for each row X of M rows of the specified source matrix: fetch and buffer N elements of the row in a load register, and cause the N buffered elements to be written, in the same relative order as in the row, to column X of M columns of the load buffer, and the execution circuitry subsequently to write each of N rows of the load buffer to a same row of the load buffer.

    INSTRUCTIONS FOR VECTOR MULTIPLICATION OF SIGNED WORDS WITH ROUNDING

    公开(公告)号:US20210081200A1

    公开(公告)日:2021-03-18

    申请号:US16642766

    申请日:2017-09-27

    Abstract: Disclosed embodiments relate to executing a vector multiplication instruction. In one example, a processor includes fetch circuitry to fetch the vector multiplication instruction having fields for an opcode, first and second source identifiers, and a destination identifier, decode circuitry to decode the fetched instruction, execution circuitry to, on each of a plurality of corresponding pairs of fixed-sized elements of the identified first and second sources, execute the decoded instruction to generate a double-sized product of each pair of fixed-sized elements, the double-sized product being represented by at least twice a number of bits of the fixed size, and generate a signed fixed-sized result by rounding the most significant fixed-sized portion of the double-sized product to fit into the identified destination.

    INSTRUCTIONS FOR VECTOR MULTIPLICATION OF UNSIGNED WORDS WITH ROUNDING

    公开(公告)号:US20210072985A1

    公开(公告)日:2021-03-11

    申请号:US16642778

    申请日:2017-09-27

    Abstract: Disclosed embodiments relate to executing a vector multiplication instruction. In one example, a processor includes fetch circuitry to fetch the vector multiplication instruction having fields for an opcode, first and second source identifiers, and a destination identifier, decode circuitry to decode the fetched instruction, execution circuitry to, on each of a plurality of corresponding pairs of fixed-sized elements of the identified first and second sources, execute the decoded instruction to generate a double-sized product of each pair of fixed-sized elements, the double-sized product being represented by at least twice a number of bits of the fixed size, and generate an unsigned fixed-sized result by rounding the most significant fixed-sized portion of the double-sized product to fit into the identified destination.

Patent Agency Ranking