Reach matrix scheduler circuit for scheduling instructions to be executed in a processor

    Publication No.: US11803389B2

    Publication date: 2023-10-31

    Application No.: US16738362

    Filing date: 2020-01-09

    IPC classification: G06F9/38

    CPC classification: G06F9/3838 G06F9/3836

    Abstract: A reach matrix scheduler circuit for scheduling instructions to be executed in a processor is disclosed. The scheduler circuit includes an N×R matrix wake-up circuit, where ‘N’ is the instruction window size of the scheduler circuit, and ‘R’ is the “reach” within the instruction window of the matrix wake-up circuit, with ‘R’ being less than ‘N’. A grant line associated with each instruction request entry in the N×R matrix wake-up circuit is coupled to ‘R’ other instruction entries among the ‘N’ instruction entries. When a producer instruction in an instruction request entry is ready for issuance, the grant line associated with the instruction request entry is activated so that any other instruction entries coupled to the grant line (i.e., within the “reach” of the instruction request entry) that consume the produced value generated by the producer instruction are “woken up” and subsequently indicated as ready to be issued.
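The wake-up behavior described in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented circuit: the function name `woken_entries`, the `consumers_of` mapping, and the choice to wire each grant line to the next R entries in circular order are all hypothetical, since the abstract does not say which R entries a grant line is coupled to.

```python
def woken_entries(granted, consumers_of, n, r):
    """Model which entries one grant line wakes in an N x R wake-up matrix.

    granted      -- index of the producer entry whose grant line is activated
    consumers_of -- hypothetical map: producer index -> set of consumer indices
    n, r         -- window size N and reach R (R must be less than N)
    """
    if not 0 < r < n:
        raise ValueError("reach R must satisfy 0 < R < N")
    # Assumption: the grant line of entry i is coupled to the next R entries,
    # wrapping around the window. The abstract does not specify the wiring.
    reach = {(granted + k) % n for k in range(1, r + 1)}
    # Only consumers that are within reach are woken by this grant line.
    return consumers_of.get(granted, set()) & reach


# Entry 0 produces a value consumed by entries 1, 2 and 5 in an 8-entry window.
consumers_of = {0: {1, 2, 5}}
print(sorted(woken_entries(0, consumers_of, n=8, r=3)))  # [1, 2]
```

In this model entry 5 is outside the reach of entry 0, so the grant line of entry 0 does not wake it.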

    Hardware node with position-dependent memories for neural network processing

    Publication No.: US11144820B2

    Publication date: 2021-10-12

    Application No.: US15637426

    Filing date: 2017-06-29

    Abstract: Processors and methods for neural network processing are provided. A method in a processor including a pipeline having a matrix vector unit (MVU), a first multifunction unit connected to receive an input from the matrix vector unit, a second multifunction unit connected to receive an output from the first multifunction unit, and a third multifunction unit connected to receive an output from the second multifunction unit is provided. The method includes decoding a chain of instructions received via an input queue, where the chain of instructions comprises a first instruction that can only be processed by the matrix vector unit and a sequence of instructions that can only be processed by a multifunction unit. The method includes processing the first instruction using the MVU and processing each instruction in the sequence of instructions depending upon the position of that instruction in the sequence.
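The position-dependent processing described in this abstract can be sketched as a simple software pipeline. The sketch assumes that the k-th instruction after the first is handled by the k-th multifunction unit; the abstract only says processing depends on position, so this mapping, the function `run_chain`, and the toy unit behaviors below are hypothetical.

```python
def run_chain(chain, vector, mvu, mfus):
    """Run a chain: the first instruction on the MVU, the rest on MFUs by position."""
    first, rest = chain[0], chain[1:]
    if len(rest) > len(mfus):
        raise ValueError("more instructions than multifunction units")
    value = mvu(first, vector)
    # Assumption: the instruction at position k is processed by MFU k.
    for position, instr in enumerate(rest):
        value = mfus[position](instr, value)
    return value


# Toy units (hypothetical): the MVU multiplies a matrix by a vector,
# the three MFUs add an operand, apply ReLU, and scale by an operand.
mvu = lambda matrix, v: [sum(w * x for w, x in zip(row, v)) for row in matrix]
mfus = [
    lambda operand, v: [x + operand for x in v],
    lambda operand, v: [max(0, x) for x in v],
    lambda operand, v: [x * operand for x in v],
]

chain = [[[1, 0], [0, -1]], 1, None, 2]
print(run_chain(chain, [2, 3], mvu, mfus))  # [6, 0]
```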

    Handling tenant requests in a system that uses hardware acceleration components

    Publication No.: US11099906B2

    Publication date: 2021-08-24

    Application No.: US16128224

    Filing date: 2018-09-11

    IPC classification: G06F9/46 G06F9/50 H04L29/08

    Abstract: A service mapping component (SMC) is described herein for processing requests by instances of tenant functionality that execute on software-driven host components (or some other components) in a data processing system. The SMC is configured to apply at least one rule to determine whether a service requested by an instance of tenant functionality is to be satisfied by at least one of: a local host component, a local hardware acceleration component which is locally coupled to the local host component, and/or at least one remote hardware acceleration component that is indirectly accessible to the local host component via the local hardware acceleration component. In performing its analysis, the SMC can take into account various factors, such as whether or not the service corresponds to a line-rate service, latency-related considerations, security-related considerations, and so on.
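A rule of the kind this abstract describes might look like the following sketch. The abstract names the three possible targets and the factors considered, but it does not give the actual rules, so the `ServiceRequest` fields, the function `map_service`, and the ordering of the checks are assumptions made purely for illustration.

```python
from dataclasses import dataclass


@dataclass
class ServiceRequest:
    name: str
    line_rate: bool            # must the service keep up with network line rate?
    max_latency_us: float      # latency budget of the requesting tenant
    security_sensitive: bool   # must the data stay on the local machine?


def map_service(req, local_accel_available, remote_latency_us):
    """Pick where to satisfy a request (hypothetical rule set)."""
    # Assumed rule 1: line-rate services stay on the locally coupled accelerator.
    if req.line_rate:
        return "local_acceleration"
    # Assumed rule 2: prefer the local accelerator whenever it is available.
    if local_accel_available:
        return "local_acceleration"
    # Assumed rule 3: go remote only if security and latency constraints allow.
    if not req.security_sensitive and remote_latency_us <= req.max_latency_us:
        return "remote_acceleration"
    # Fallback: run the service in software on the local host.
    return "local_host"


req = ServiceRequest("ranking", line_rate=False, max_latency_us=50.0,
                     security_sensitive=False)
print(map_service(req, local_accel_available=False, remote_latency_us=20.0))
# remote_acceleration
```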

    Changing between different roles at acceleration components

    Publication No.: US10511478B2

    Publication date: 2019-12-17

    Application No.: US14752807

    Filing date: 2015-06-26

    IPC classification: G06F9/50 H04L12/24

    Abstract: Aspects extend to methods, systems, and computer program products for changing between different roles at acceleration components. Changing roles at an acceleration component can be facilitated without loading an image file to configure or partially reconfigure the acceleration component. At configuration time, an acceleration component can be configured with a framework and a plurality of selectable roles. The framework also provides a mechanism for loading different selectable roles for execution at the acceleration component (e.g., the framework can include a superset of instructions for providing any of a plurality of different roles). The framework can receive requests for specified roles from other components and switch to a subset of instructions for the specified roles. Switching between subsets of instructions at an acceleration component is a lower overhead operation relative to reconfiguring or partially reconfiguring an acceleration component by loading an image file.
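The role-switching idea in this abstract can be modeled in software as a framework that is configured once with every selectable role and later switches among them without a reload. The class `AccelerationFramework`, its `image_loads` counter, and the example roles are hypothetical names introduced for this sketch; real acceleration components are hardware, and this only mirrors the control flow.

```python
class AccelerationFramework:
    """Toy model: all roles are loaded once, switching selects a subset."""

    def __init__(self, roles):
        # Configuration time: the superset of role logic is loaded exactly once.
        self._roles = dict(roles)
        self.image_loads = 1
        self._active = None

    def switch_role(self, name):
        # Switching only changes which preloaded role is active;
        # no image file is loaded, so image_loads is unchanged.
        if name not in self._roles:
            raise KeyError(f"role {name!r} was not configured")
        self._active = name

    def run(self, data):
        if self._active is None:
            raise RuntimeError("no role selected")
        return self._roles[self._active](data)


fw = AccelerationFramework({
    "reverse": lambda b: b[::-1],
    "upper": lambda b: b.upper(),
})
fw.switch_role("reverse")
print(fw.run(b"abc"))   # b'cba'
fw.switch_role("upper")
print(fw.run(b"abc"))   # b'ABC'
print(fw.image_loads)   # 1
```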

    Decoupled processor instruction window and operand buffer

    Publication No.: US20190310852A1

    Publication date: 2019-10-10

    Application No.: US16450172

    Filing date: 2019-06-24

    Abstract: A processor core in an instruction block-based microarchitecture is configured so that an instruction window and operand buffers are decoupled for independent operation in which instructions in the block are not tied to resources such as control bits and operands that are maintained in the operand buffers. Instead, pointers are established among instructions in the block and the resources so that control state can be established for a refreshed instruction block (i.e., an instruction block that is reused without re-fetching it from an instruction cache) by following the pointers. Such decoupling of the instruction window from the operand space can provide greater processor efficiency, particularly in multiple core arrays where refreshing is utilized (for example when executing program code that uses tight loops), because the operands and control bits are pre-validated.

    Verifying branch targets
    Granted invention patent

    Publication No.: US10409606B2

    Publication date: 2019-09-10

    Application No.: US14752356

    Filing date: 2015-06-26

    IPC classification: G06F9/30 G06F9/38 G06F9/32

    Abstract: Apparatus and methods are disclosed for implementing bad jump detection in block-based processor architectures. In one example of the disclosed technology, a block-based processor includes one or more block-based processing cores configured to fetch and execute atomic blocks of instructions and a control unit configured to, based at least in part on receiving from one of the instruction blocks a branch signal indicating a target location, verify that the target location is a valid branch target.
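The verification step in this abstract can be sketched as a check performed by a control unit when a branch signal arrives. The abstract does not say how validity is decided, so this sketch assumes a target is valid only if it is the start address of a known instruction block; `verify_branch_target`, `BadJumpError`, and the `block_starts` set are hypothetical names.

```python
class BadJumpError(Exception):
    """Raised when a branch targets a location that is not a valid block start."""


def verify_branch_target(target, block_starts):
    # Assumption: a target is valid only if an instruction block begins there.
    if target not in block_starts:
        raise BadJumpError(f"invalid branch target 0x{target:x}")
    return target


block_starts = {0x1000, 0x1080, 0x1100}
print(hex(verify_branch_target(0x1080, block_starts)))  # 0x1080

try:
    verify_branch_target(0x1084, block_starts)  # lands in the middle of a block
except BadJumpError as err:
    print(err)  # invalid branch target 0x1084
```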