EARLY PREDICATE LOOK-UP
    Invention application

    Publication number: US20180307491A1

    Publication date: 2018-10-25

    Application number: US15493492

    Filing date: 2017-04-21

    Applicant: ARM Limited

    CPC classification number: G06F9/384 G06F9/30036 G06F9/3016 G06F9/3867

    Abstract: A processing pipeline has at least one front end stage for issuing micro-operations for execution in response to program instructions, and an execute stage for performing data processing in response to the micro-operations. At least one predicate register stores at least one predicate value. In response to a predicated vector instruction for triggering execution of two or more lanes of processing, the at least one front end stage issues at least one micro-operation to control the execute stage to mask an effect of a lane of processing indicated as disabled by a target predicate value. One of the front end stages may perform an early predicate lookup of the target predicate value to vary, in dependence on the early predicate lookup, which micro-operations are issued to the execute stage for a predicated vector instruction.
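
The idea in this abstract can be illustrated with a small software model. The sketch below is only an assumption-laden illustration, not the patented design: it assumes each micro-operation covers a fixed chunk of lanes (`lanes_per_uop`, a hypothetical parameter), that the front end skips chunks whose lanes are all disabled, and that masked lanes keep the old destination value. The function names are invented for this example.

```python
from typing import List, Tuple

MicroOp = Tuple[int, List[bool]]  # (first lane covered, per-lane enable mask)


def issue_micro_ops(predicate: List[bool], lanes_per_uop: int) -> List[MicroOp]:
    """Front end: look up the predicate early and issue only useful micro-ops.

    Assumption: a chunk whose lanes are all disabled needs no micro-op at all.
    """
    uops: List[MicroOp] = []
    for start in range(0, len(predicate), lanes_per_uop):
        mask = predicate[start:start + lanes_per_uop]
        if any(mask):  # skip chunks in which every lane is disabled
            uops.append((start, mask))
    return uops


def execute_add(a: List[int], b: List[int], dest: List[int],
                uops: List[MicroOp]) -> List[int]:
    """Execute stage: add lane by lane, masking the effect of disabled lanes."""
    result = list(dest)  # disabled lanes keep the old destination value
    for start, mask in uops:
        for offset, enabled in enumerate(mask):
            if enabled:
                lane = start + offset
                result[lane] = a[lane] + b[lane]
    return result


if __name__ == "__main__":
    pred = [True, True, False, False, False, False, False, False]
    uops = issue_micro_ops(pred, lanes_per_uop=4)
    print(len(uops))  # one micro-op instead of two
    print(execute_add(list(range(1, 9)), [10] * 8, [0] * 8, uops))
```

In this model the saving comes from the second chunk never reaching the execute stage; a real pipeline could vary the issued micro-operations in other ways that the abstract does not spell out.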

    APPARATUS AND METHOD OF MODIFICATION OF STORED DATA

    Publication number: US20200142826A1

    Publication date: 2020-05-07

    Application number: US16182741

    Filing date: 2018-11-07

    Applicant: Arm Limited

    Abstract: Aspects of the present disclosure relate to an apparatus comprising a requester master processing device having an associated private cache storage to store data for access by the requester master processing device. The requester master processing device is arranged to issue a request to modify data that is associated with a given memory address and stored in a private cache storage associated with a recipient master processing device. The private cache storage associated with the recipient master processing device is arranged to store data for access by the recipient master processing device. The apparatus further comprises the recipient master processing device having its private cache storage. One of the recipient master processing device and its associated private cache storage is arranged to perform the requested modification of the data while the data is stored in the cache storage associated with the recipient master processing device.
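
As a rough illustration of the request flow described in this abstract, the sketch below models two master devices, each with a private cache. It assumes the cache can be represented as a dictionary keyed by address and that the modification is passed as a function; the class and method names (`Master`, `request_modify`, `handle_modify`) are hypothetical and do not come from the patent.

```python
from typing import Callable, Dict


class Master:
    """A master processing device with its own private cache (toy model)."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.private_cache: Dict[int, int] = {}

    def handle_modify(self, address: int, modify: Callable[[int], int]) -> int:
        """Recipient side: modify the data while it stays in this cache."""
        self.private_cache[address] = modify(self.private_cache[address])
        return self.private_cache[address]

    def request_modify(self, recipient: "Master", address: int,
                       modify: Callable[[int], int]) -> int:
        """Requester side: ask the recipient to modify the data in place,
        rather than pulling the line into the requester's own cache."""
        return recipient.handle_modify(address, modify)


if __name__ == "__main__":
    requester, recipient = Master("A"), Master("B")
    recipient.private_cache[0x40] = 5
    requester.request_modify(recipient, 0x40, lambda v: v + 1)
    print(recipient.private_cache[0x40])        # value updated at the recipient
    print(0x40 in requester.private_cache)      # line never moved to the requester
```

The point of the model is that the line never migrates: after the request, the updated value is still held only in the recipient's private cache.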

    APPARATUS AND METHOD IN WHICH CONTROL FUNCTIONS AND SYNCHRONIZATION EVENTS ARE PERFORMED

    Publication number: US20230385127A1

    Publication date: 2023-11-30

    Application number: US17824438

    Filing date: 2022-05-25

    Applicant: Arm Limited

    CPC classification number: G06F9/52 G06F9/30087 G06F9/3818 G06F9/3802

    Abstract: Apparatus comprises a plurality of processing elements; and control circuitry to communicate with the plurality of processing elements by a data communication path; the control circuitry being configured, in response to a request issued by a given processing element of the plurality of processing elements, to initiate a hybrid operation by issuing a command defining the hybrid operation to a group of processing elements comprising at least a subset of the plurality of processing elements, the hybrid operation comprising performance of a control function selected from a predetermined set of one or more control functions and initiation of performance of a synchronization event, the synchronization event comprising each of the group of processing elements providing confirmation that any control functions pending at that processing element have reached at least a predetermined stage of execution; in which the given processing element is configured to inhibit the issuance of any further requests to the control circuitry until each of the group of processing elements has provided such confirmation.
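
The hybrid operation in this abstract combines a control function with a synchronization event. The single-threaded sketch below is one possible reading, under stated assumptions: the set of control functions is a pair of placeholder names (`function_a`, `function_b`), "reaching a predetermined stage" is modelled as draining the pending list, and the requester's inhibition is a simple `blocked` flag. All names are hypothetical.

```python
from typing import List

# Placeholder names; the abstract does not say what the control functions are.
CONTROL_FUNCTIONS = frozenset({"function_a", "function_b"})


class ProcessingElement:
    def __init__(self, pe_id: int) -> None:
        self.pe_id = pe_id
        self.pending: List[str] = []   # control functions not yet performed
        self.log: List[str] = []       # control functions already performed
        self.blocked = False           # True while further requests are inhibited

    def receive(self, control_function: str) -> None:
        self.pending.append(control_function)

    def drain_and_confirm(self) -> bool:
        """Perform every pending control function, then confirm."""
        while self.pending:
            self.log.append(self.pending.pop(0))
        return True


class ControlCircuitry:
    def __init__(self, elements: List[ProcessingElement]) -> None:
        self.elements = elements

    def hybrid_operation(self, requester: ProcessingElement,
                         control_function: str,
                         group: List[ProcessingElement]) -> List[bool]:
        if control_function not in CONTROL_FUNCTIONS:
            raise ValueError("unknown control function")
        if requester.blocked:
            raise RuntimeError("requester must wait for earlier confirmations")
        if not all(pe in self.elements for pe in group):
            raise ValueError("group must be drawn from the known elements")
        requester.blocked = True                    # inhibit further requests
        for pe in group:
            pe.receive(control_function)            # one command, whole group
        confirmations = [pe.drain_and_confirm() for pe in group]
        if all(confirmations):
            requester.blocked = False               # every element confirmed
        return confirmations
```

A single command therefore does the work that would otherwise need a control request followed by a separate synchronization request, which is the apparent motivation for calling it "hybrid".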

    DEFER BUFFER
    Invention application
    Status: under examination, published

    Publication number: US20180253299A1

    Publication date: 2018-09-06

    Application number: US15450430

    Filing date: 2017-03-06

    Applicant: ARM Limited

    Abstract: An apparatus comprises processing circuitry for executing instructions of two or more threads of processing, hardware registers to store context data for the two or more threads concurrently, and commit circuitry to commit results of executed instructions of the threads, where for each thread the commit circuitry commits the instructions of that thread in program order. At least one defer buffer is provided to buffer at least one blocked instruction for which execution by the processing circuitry is complete but execution of an earlier instruction of the same thread in the program order is incomplete. This can help to resolve inter-thread blocking and hence improve performance.
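
The effect of a defer buffer can be seen in a small commit-stage model. This is a minimal sketch under assumptions that the abstract does not state: completed instructions arrive in a single shared queue as `(thread, sequence_number)` pairs, sequence numbers start at 0 per thread, and the defer buffer is unbounded. The function name is hypothetical.

```python
from collections import deque
from typing import List, Tuple

Instr = Tuple[int, int]  # (thread id, per-thread program-order sequence number)


def commit_with_defer_buffer(completed: List[Instr],
                             num_threads: int) -> Tuple[List[Instr], List[Instr]]:
    """Commit completed instructions, keeping program order within each thread.

    An instruction that has finished executing while an earlier instruction of
    the same thread has not is parked in the defer buffer, so it does not
    block instructions of other threads behind it in the shared queue.
    Returns (commit order, instructions still in the defer buffer).
    """
    queue = deque(completed)
    next_seq = [0] * num_threads      # next sequence number to commit per thread
    defer_buffer: List[Instr] = []
    committed: List[Instr] = []

    while queue:
        thread, seq = queue.popleft()
        if seq != next_seq[thread]:
            defer_buffer.append((thread, seq))   # blocked: park it
            continue
        committed.append((thread, seq))
        next_seq[thread] += 1
        # Release deferred instructions that have now become next in order.
        progressed = True
        while progressed:
            progressed = False
            for item in list(defer_buffer):
                t, s = item
                if s == next_seq[t]:
                    defer_buffer.remove(item)
                    committed.append(item)
                    next_seq[t] += 1
                    progressed = True
    return committed, defer_buffer


if __name__ == "__main__":
    # Thread 0's second instruction completes first and would block thread 1.
    order, left = commit_with_defer_buffer([(0, 1), (1, 0), (0, 0), (1, 1)], 2)
    print(order, left)
```

In the usage example, thread 1's first instruction commits while thread 0's out-of-order instruction waits in the buffer, which is the inter-thread blocking the abstract says the defer buffer helps to resolve.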
