Computing array and processor having the same

    Publication No.: US12013809B2

    Publication Date: 2024-06-18

    Application No.: US17483395

    Filing Date: 2021-09-23

    IPC Classification: G06F15/80 G06F9/30 G06F9/38

    Abstract: A computing array includes a plurality of process element groups. Each process element group includes a merging unit and four process elements arranged in two rows and two columns. Each of the four process elements includes an input subunit; a fetch and decode subunit configured to obtain and compile an instruction to output a logic computing type; an operation subunit configured to obtain computing result data according to the logic computing type and operation data; and an output subunit configured to output the computing result data. The merging unit is connected to the output subunit of each of the four process elements and is configured to receive the computing result data output by each output subunit, merge the computing result data, and output the merged computing result data.
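The process element group described in this abstract can be modeled in a few lines. This is only a sketch under stated assumptions: the opcode table `OPS`, the 8-bit result width, and the concatenating merge policy are hypothetical, since the abstract does not say which operations exist or how results are merged.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical opcode table; the abstract names no concrete operations.
OPS: Dict[str, Callable[[int, int], int]] = {
    "AND": lambda a, b: a & b,
    "OR": lambda a, b: a | b,
    "XOR": lambda a, b: a ^ b,
}


def process_element(instruction: str, operands: Tuple[int, int]) -> int:
    """One process element: decode the instruction, then compute."""
    operation = OPS[instruction]   # fetch and decode: instruction -> logic computing type
    return operation(*operands)    # operation subunit; the return value plays the output subunit


def merge(results: List[int], width: int = 8) -> int:
    """Assumed merge policy: concatenate the results into one word, first result highest."""
    merged = 0
    for result in results:
        merged = (merged << width) | (result & ((1 << width) - 1))
    return merged


def pe_group(work: List[Tuple[str, Tuple[int, int]]]) -> int:
    """A 2x2 group: four process elements feeding one merging unit."""
    if len(work) != 4:
        raise ValueError("a group holds exactly four process elements")
    return merge([process_element(instr, ops) for instr, ops in work])
```

Calling `pe_group` with four `(instruction, operands)` pairs returns a single merged word, which mirrors the merging unit collecting one result from each of the four output subunits.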

    PIPELINE ARCHITECTURE FOR BITWISE MULTIPLIER-ACCUMULATOR (MAC)

    Publication No.: US20240192962A1

    Publication Date: 2024-06-13

    Application No.: US18444695

    Filing Date: 2024-02-18

    Inventor: Avidan AKERIB

    IPC Classification: G06F9/38 G06F7/544 G06F9/30

    Abstract: A unit for accumulating a plurality of multiplied bit values includes a first row and a second row of input units, a bit-wise multiplier and a bit-wise accumulator. The first row receives a pipeline of the bits of a multiplicand A and the second row, to the left of the first row, receives a pipeline of the bits of a multiplicand B. The bit-wise multiplier, below the first row of input units, includes multiplier bit-line processors formed into rows and columns. Some rows of the bit-wise multiplier bit-wise multiply bits of a current multiplicand A with one bit of a current multiplicand B, and some rows of the bit-wise multiplier handle sum and carry values between the bits. The bit-wise accumulator, to the right of the bit-wise multiplier, includes a column of accumulator bit-line processors. Each accumulator bit-line processor accumulates the output of a row of the bit-wise multiplier.
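The arithmetic behind a bit-wise multiply-accumulate can be illustrated in software. The sketch below is not the patented pipeline: it is a plain shift-and-add multiplier in which each outer iteration stands in for a row that ANDs the bits of A with one bit of B, and the inner loop stands in for the sum and carry handling. All function names are hypothetical, and the accumulator is simplified to an ordinary integer sum.

```python
from typing import Iterable, List, Tuple


def to_bits(value: int, width: int) -> List[int]:
    """Little-endian bit list (least significant bit first)."""
    return [(value >> i) & 1 for i in range(width)]


def from_bits(bits: List[int]) -> int:
    return sum(bit << i for i, bit in enumerate(bits))


def bitwise_multiply(a_bits: List[int], b_bits: List[int]) -> List[int]:
    """Shift-and-add multiplication using only 1-bit AND, sum and carry."""
    width = len(a_bits) + len(b_bits)
    acc = [0] * width
    for j, b in enumerate(b_bits):            # one "row" per bit of B
        carry = 0
        for i in range(width - j):
            partial = (a_bits[i] & b) if i < len(a_bits) else 0
            total = acc[i + j] + partial + carry
            acc[i + j] = total & 1            # sum bit
            carry = total >> 1                # carry into the next column
    return acc


def mac(pairs: Iterable[Tuple[int, int]], width: int) -> int:
    """Accumulate the products of (A, B) pairs; the accumulator is a plain int here."""
    total = 0
    for a, b in pairs:
        total += from_bits(bitwise_multiply(to_bits(a, width), to_bits(b, width)))
    return total
```

For 4-bit operands, `mac([(3, 5), (7, 2)], 4)` accumulates 15 and 14 to give 29.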

    Debug Trace Circuitry Configured to Generate a Record Including an Address Pair and a Counter Value

    Publication No.: US20240192960A1

    Publication Date: 2024-06-13

    Application No.: US18502270

    Filing Date: 2023-11-06

    Applicant: SiFive, Inc.

    Inventor: Bruce Ableidinger

    IPC Classification: G06F9/38 G06F9/30

    CPC Classification: G06F9/3861 G06F9/30058

    Abstract: Systems and methods are disclosed for debug path profiling. For example, a processor pipeline may execute instructions. Debug trace circuitry may, responsive to an indication of a non-sequential execution of an instruction by the processor pipeline, generate a record including an address pair and one or more counter values. The address pair may include a first address corresponding to a first instruction before the non-sequential execution and a second address corresponding to a second instruction resulting in the non-sequential execution. The one or more counter values may indicate, for example, a count of instructions executed, a type of instruction executed, cache misses, cycles consumed by cache misses, translation lookaside buffer misses, cycles consumed by translation lookaside buffer misses, and/or processor stalls.
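A software analogue may make the record format easier to picture. The sketch below assumes one possible reading of the address pair: the address of the last instruction executed before the discontinuity, followed by the address of the instruction executed next. It also assumes fixed 4-byte instructions and keeps a single counter (instructions executed since the previous record). The names `TraceRecord` and `trace` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TraceRecord:
    from_addr: int      # assumed: last instruction before the non-sequential step
    to_addr: int        # assumed: instruction executed after the non-sequential step
    instr_count: int    # instructions executed since the previous record


def trace(pcs: List[int], instr_size: int = 4) -> List[TraceRecord]:
    """Emit one record each time the program counter does not advance sequentially."""
    records: List[TraceRecord] = []
    count = 0
    for prev, cur in zip(pcs, pcs[1:]):
        count += 1                          # the instruction at `prev` has executed
        if cur != prev + instr_size:        # branch, jump, trap, and so on
            records.append(TraceRecord(prev, cur, count))
            count = 0                       # restart the counter for the next path segment
    return records
```

Feeding the function a program-counter sequence such as `[0, 4, 8, 100, 104, 8]` yields one record per discontinuity, each carrying the instruction count for the straight-line segment that preceded it.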

    Information processing system and information processing method

    Publication No.: US12008376B2

    Publication Date: 2024-06-11

    Application No.: US17691212

    Filing Date: 2022-03-10

    Applicant: Hitachi, Ltd.

    IPC Classification: G06F9/38 G06F9/48 G06F9/50

    Abstract: One or more information processing apparatuses that process information are provided. Each information processing apparatus includes: a division function that divides processing information into a plurality of pieces, under a division condition that designates parallel processing among the information processing apparatuses, the processing information indicating a data processing procedure from a plurality of start points to one or more end points; a determination function that uniquely determines an assignee of each piece of the processing information divided by the division function, as any of the information processing apparatuses; and an execution function that executes a process in the information processing apparatus determined by the determination function.
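One way to picture the division and determination functions is a deterministic partition-and-assign scheme in which every apparatus runs the same code and reaches the same answer without coordination. The sketch below is an assumption, not the claimed method: it flattens the procedure to a list of step names, uses a fixed piece size as the division condition, and picks the assignee with a CRC32 hash of the piece identifier. The names `divide` and `assign` are hypothetical.

```python
import zlib
from typing import List


def divide(procedure: List[str], piece_size: int) -> List[List[str]]:
    """Split a list of processing steps into consecutive pieces of at most piece_size."""
    if piece_size < 1:
        raise ValueError("piece_size must be positive")
    return [procedure[i:i + piece_size] for i in range(0, len(procedure), piece_size)]


def assign(piece_id: str, apparatuses: List[str]) -> str:
    """Map a piece to exactly one apparatus; the same inputs always give the same result."""
    index = zlib.crc32(piece_id.encode("utf-8")) % len(apparatuses)
    return apparatuses[index]
```

Because `zlib.crc32` is stable across processes and machines, each apparatus can call `assign` locally and execute only the pieces whose assignee is itself.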

    Load instruction fusion
    Granted invention patent

    Publication No.: US12008369B1

    Publication Date: 2024-06-11

    Application No.: US17652501

    Filing Date: 2022-02-25

    Applicant: Apple Inc.

    IPC Classification: G06F9/30 G06F9/38

    Abstract: Techniques are disclosed that relate to executing fused instructions. A processor may include a decoder circuit and a load/store circuit. The decoder circuit may detect a load/store instruction to load a value from a memory and detect a non-load/store instruction that depends on the value to be loaded. The decoder circuit may fuse the load/store instruction and the non-load/store instruction such that one or more operations that the non-load/store instruction is defined to perform are to be executed within the load/store circuit. The load/store circuit may receive an indication of the fused load/store and non-load/store instructions and then execute one or more operations of the load/store instruction and the one or more operations of the non-load/store instruction using a circuit included in the load/store circuit.
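The decode-time detection step can be sketched as a pass over an instruction list. This is a toy model under stated assumptions: the `Instr` record, the `FUSIBLE` set, and the rule that the dependent instruction must overwrite the load's destination register are all hypothetical choices made so that fusing does not lose a value that later code might read. The abstract does not specify these conditions.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Hypothetical set of non-load/store operations simple enough to run in the load/store unit.
FUSIBLE = {"add", "sext"}


@dataclass(frozen=True)
class Instr:
    op: str                  # e.g. "load", "add", "store"
    dst: str                 # destination register
    srcs: Tuple[str, ...]    # source registers


def fuse(program: List[Instr]) -> List[Instr]:
    """Fuse a load with the next instruction when that instruction consumes the loaded value."""
    out: List[Instr] = []
    i = 0
    while i < len(program):
        cur = program[i]
        nxt = program[i + 1] if i + 1 < len(program) else None
        if (
            cur.op == "load"
            and nxt is not None
            and nxt.op in FUSIBLE
            and cur.dst in nxt.srcs      # the next instruction depends on the loaded value
            and nxt.dst == cur.dst       # assumed: the raw loaded value is not needed afterwards
        ):
            extra = tuple(s for s in nxt.srcs if s != cur.dst)
            out.append(Instr("load+" + nxt.op, nxt.dst, cur.srcs + extra))
            i += 2                       # both instructions are consumed by the fused op
        else:
            out.append(cur)
            i += 1
    return out
```

In this model a `load x1, (x2)` followed by `add x1, x1, x3` becomes a single `load+add` operation with sources `x2` and `x3`, which stands in for the fused pair being dispatched to the load/store circuit as one unit.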