Sparse matrix calculations utilizing tightly coupled memory and gather/scatter engine

    Publication No.: US11836489B2

    Publication date: 2023-12-05

    Application No.: US17973466

    Filing date: 2022-10-25

    Inventor: Fei Sun

    Abstract: A processor for sparse matrix calculation includes an on-chip memory, a cache, a gather/scatter engine, and a core. The on-chip memory stores a first matrix or vector, and the cache stores a compressed sparse second matrix data structure. The compressed sparse second matrix data structure includes a value array including non-zero element values of the sparse second matrix, where each entry includes a given number of element values; and a column index array where each entry includes the given number of offsets matching the value array. The gather/scatter engine gathers element values of the first matrix or vector using the column index array of the sparse second matrix. In a hybrid horizontal/vertical implementation, the gather/scatter engine gathers sets of element values from sets of rows and from different sub-banks within the same rows based on the column index array of the sparse matrix.
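The data layout described in this abstract can be illustrated with a minimal software sketch. It assumes a group width of four element values per entry, pads short groups with zeros, and adds a row-pointer array so that rows can be delimited; none of these details come from the abstract, and the names `compress`, `gather`, and `spmv` are hypothetical. The `gather` function only stands in for the hardware gather/scatter engine.

```python
GROUP = 4  # assumed "given number" of element values per entry


def compress(dense_rows, group=GROUP):
    """Build value and column-index arrays with `group` elements per entry."""
    values, col_idx, row_ptr = [], [], [0]
    for row in dense_rows:
        nz = [(c, v) for c, v in enumerate(row) if v != 0]
        # Pad with (column 0, value 0) so every entry is full.
        # A zero value contributes nothing to the dot product.
        while len(nz) % group:
            nz.append((0, 0))
        for i in range(0, len(nz), group):
            chunk = nz[i:i + group]
            col_idx.append([c for c, _ in chunk])
            values.append([v for _, v in chunk])
        row_ptr.append(len(values))  # assumed helper array, not in the abstract
    return values, col_idx, row_ptr


def gather(vector, offsets):
    """Software stand-in for the gather engine: fetch elements at the offsets."""
    return [vector[o] for o in offsets]


def spmv(values, col_idx, row_ptr, vector):
    """Multiply the compressed sparse matrix by a dense vector."""
    out = []
    for r in range(len(row_ptr) - 1):
        acc = 0
        for e in range(row_ptr[r], row_ptr[r + 1]):
            gathered = gather(vector, col_idx[e])
            acc += sum(v * x for v, x in zip(values[e], gathered))
        out.append(acc)
    return out
```

Each entry of `col_idx` holds exactly as many offsets as the matching entry of `values`, so one gather request supplies all operands for one group of multiplications.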

    INFORMATION PROCESSING DEVICE AND SEARCH METHOD

    Publication No.: US20230376315A1

    Publication date: 2023-11-23

    Application No.: US18110607

    Filing date: 2023-02-16

    Applicant: Fujitsu Limited

    IPC classification: G06F9/38 G06F9/30

    CPC classification: G06F9/3887 G06F9/3001

    Abstract: A non-transitory computer-readable recording medium stores a program for causing a computer to execute a process. The process includes, in a search for combinations of conditions that allow extraction of sample data groups that have n or more attribute pairs whose correlation coefficients exceed a threshold value, when the number of combinations of the conditions is equal to or greater than a number capable of being parallelized, parallelizing processing for the combinations of the conditions per the number capable of being parallelized to calculate the correlation coefficients of respective attribute pairs for each of the combinations of the conditions, in addition to a single instruction multiple data (SIMD) conversion process that uses as many predicate registers as the number capable of being parallelized; and searching for the combinations of conditions using the correlation coefficients of the respective attribute pairs for each of the combinations of the conditions.
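The search described in this abstract can be approximated in plain Python as follows. This is a sequential sketch only: the boolean mask built for each combination plays the role of a predicate register, and `LANES` is an assumed value for the number capable of being parallelized. The function names, the dictionary-based sample format, and the choice to treat degenerate groups as having zero correlation are all assumptions made for illustration.

```python
from itertools import combinations
from math import sqrt

LANES = 4  # assumed number of condition combinations handled per batch


def pearson(xs, ys):
    """Pearson correlation; returns 0.0 for degenerate inputs (an assumption)."""
    n = len(xs)
    if n < 2:
        return 0.0
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


def search(samples, attrs, conditions, combos, threshold, n_pairs, lanes=LANES):
    """Return the combinations whose selected samples have at least `n_pairs`
    attribute pairs with a correlation coefficient above `threshold`."""
    hits = []
    for start in range(0, len(combos), lanes):
        batch = combos[start:start + lanes]
        # One mask per lane, standing in for one predicate register each.
        masks = [[all(conditions[c](s) for c in combo) for s in samples]
                 for combo in batch]
        for combo, mask in zip(batch, masks):
            selected = [s for s, keep in zip(samples, mask) if keep]
            strong = sum(
                1
                for a, b in combinations(attrs, 2)
                if pearson([s[a] for s in selected],
                           [s[b] for s in selected]) > threshold
            )
            if strong >= n_pairs:
                hits.append(combo)
    return hits
```

On real SIMD hardware the per-lane work inside each batch would run concurrently under the predicate masks; here the batching only mirrors the grouping.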

    INFORMATION PROCESSING DEVICE, INFORMATION PROCESSING METHOD, AND RECORDING MEDIUM

    Publication No.: US20230237097A1

    Publication date: 2023-07-27

    Application No.: US18010948

    Filing date: 2020-06-25

    Applicant: NEC Corporation

    Inventor: Osamu DAIDO

    IPC classification: G06F16/901 G06F9/38 G06F9/30

    Abstract: An information processing device performs processing based on a decision tree that has condition determination nodes and leaf nodes. In the information processing device, an instruction unification means generates a unified instruction by unifying the instructions executed by the condition determination nodes included in the decision tree so that they are suitable for parallel processing. An acquisition means acquires a plurality of pieces of input data. A condition determination means performs, by the parallel processing, a condition determination with respect to the plurality of pieces of input data for each of the condition determination nodes.
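One way to picture instruction unification is to rewrite every node condition into a single comparison form and then apply that one comparison to the whole input batch at each node. The sketch below assumes nodes that use either `<` or `>=`, and unifies them to `<` by swapping the branches of `>=` nodes; the dictionary-based tree format and the function names are hypothetical and are not taken from the abstract.

```python
def unify(node):
    """Rewrite a node so that every condition reads 'feature < threshold'."""
    if "leaf" in node:
        return node
    yes, no = unify(node["yes"]), unify(node["no"])
    if node["op"] == ">=":
        # x >= t is the negation of x < t, so swap the two branches.
        yes, no = no, yes
    elif node["op"] != "<":
        raise ValueError("unsupported operator: " + node["op"])
    return {"feature": node["feature"], "threshold": node["threshold"],
            "yes": yes, "no": no}


def collect(node, nodes):
    """Assign an id to each condition node and list the nodes in preorder."""
    if "leaf" not in node:
        node["id"] = len(nodes)
        nodes.append(node)
        collect(node["yes"], nodes)
        collect(node["no"], nodes)
    return nodes


def predict_batch(tree, batch):
    """Evaluate the unified comparison for all inputs at every node, then
    follow the precomputed outcomes to a leaf for each input."""
    root = unify(tree)
    nodes = collect(root, [])
    # The same comparison is applied to the whole batch, node by node.
    table = [[x[n["feature"]] < n["threshold"] for x in batch] for n in nodes]
    results = []
    for i in range(len(batch)):
        node = root
        while "leaf" not in node:
            node = node["yes"] if table[node["id"]][i] else node["no"]
        results.append(node["leaf"])
    return results
```

Because every node now performs the same operation, the inner list comprehension maps naturally onto a vector compare over the batch, which is the property the unification step is meant to provide.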

    Implementing 128-bit SIMD operations on a 64-bit datapath

    Publication No.: US11709674B2

    Publication date: 2023-07-25

    Application No.: US17072378

    Filing date: 2020-10-16

    IPC classification: G06F9/30 G06F9/38

    Abstract: A method of implementing a processor architecture and corresponding system includes operands of a first size and a datapath of a second size. The second size is different from the first size. Given a first array of registers and a second array of registers, each register of the first and second arrays being of the second size, the method selects a first register and a corresponding second register from the first array and the second array, respectively, to perform operations of the first size. This allows a user, who is interfacing with the hardware processor through software, to provide data of the datapath bit-width instead of the register bit-width. Advantageously, the user is agnostic to the size of the registers.
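The register pairing described in this abstract can be modeled in a few lines. The sketch assumes that the first array holds bits 0 to 63 and the second array holds bits 64 to 127 of each 128-bit operand, and it uses a vector add of four 32-bit lanes as the example operation; the class and function names are hypothetical, and the abstract does not specify which array holds which half.

```python
MASK32 = (1 << 32) - 1
MASK64 = (1 << 64) - 1


class RegisterFile:
    """Two arrays of 64-bit registers that together hold 128-bit operands."""

    def __init__(self, count=8):
        self.lo = [0] * count  # first array: assumed to hold bits 0..63
        self.hi = [0] * count  # second array: assumed to hold bits 64..127

    def write128(self, idx, value):
        self.lo[idx] = value & MASK64
        self.hi[idx] = (value >> 64) & MASK64

    def read128(self, idx):
        return (self.hi[idx] << 64) | self.lo[idx]


def add_2x32(a, b):
    """One pass through the 64-bit datapath: add two 32-bit lanes
    independently, with no carry from the low lane into the high lane."""
    low = ((a & MASK32) + (b & MASK32)) & MASK32
    high = ((a >> 32) + (b >> 32)) & MASK32
    return (high << 32) | low


def vadd_4x32(rf, dst, src1, src2):
    """128-bit add of four 32-bit lanes as two 64-bit datapath operations."""
    rf.lo[dst] = add_2x32(rf.lo[src1], rf.lo[src2])
    rf.hi[dst] = add_2x32(rf.hi[src1], rf.hi[src2])


def pack(lanes):
    """Pack four 32-bit lanes (lane 0 in the lowest bits) into one integer."""
    return sum((lane & MASK32) << (32 * i) for i, lane in enumerate(lanes))


def unpack(value):
    """Split a 128-bit integer into four 32-bit lanes."""
    return [(value >> (32 * i)) & MASK32 for i in range(4)]
```

Software addresses a single register index (`dst`, `src1`, `src2`), while the model selects the matching register from each array and issues the operation twice, once per half.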