Intermodal calling branch instruction

    Publication number: US12067400B2

    Publication date: 2024-08-20

    Application number: US17757197

    Application date: 2020-11-05

    Applicant: Arm Limited

    CPC classification number: G06F9/3861 G06F9/30054 G06F9/30189

    Abstract: Processing circuitry has a handler mode and a thread mode. In response to an exception condition, a switch to handler mode is made. In response to an intermodal calling branch instruction specifying a branch target address when the processing circuitry is in the handler mode, an instruction decoder controls the processing circuitry to save a function return address to a function return address storage location; switch a current mode of the processing circuitry to the thread mode; and branch to an instruction identified by the branch target address. This can be useful for deprivileging of exceptions.
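The three steps named in this abstract (save a return address, switch to thread mode, branch) can be sketched as a toy state machine. This is an illustration only: `ToyCore`, `link_register`, `take_exception`, `intermodal_call`, and the 4-byte instruction size are hypothetical names and values, not taken from the patent or from any Arm architecture specification.

```python
from dataclasses import dataclass
from enum import Enum


class Mode(Enum):
    THREAD = "thread"
    HANDLER = "handler"


@dataclass
class ToyCore:
    """Hypothetical core model; field names are illustrative only."""
    mode: Mode = Mode.THREAD
    pc: int = 0
    link_register: int = 0  # stands in for the function return address storage location

    def take_exception(self, handler_address: int) -> None:
        # An exception condition switches the core to handler mode.
        self.mode = Mode.HANDLER
        self.pc = handler_address

    def intermodal_call(self, target: int, instruction_size: int = 4) -> None:
        # Only the handler-mode case described in the abstract is modelled.
        if self.mode is not Mode.HANDLER:
            raise RuntimeError("behaviour outside handler mode is not modelled here")
        self.link_register = self.pc + instruction_size  # save the function return address
        self.mode = Mode.THREAD                          # switch the current mode to thread mode
        self.pc = target                                 # branch to the target instruction


core = ToyCore()
core.take_exception(handler_address=0x100)
core.intermodal_call(target=0x2000)
```

After the call, the toy core runs in thread mode at the branch target while `link_register` still points back into the handler, which is one way to picture the deprivileging use case the abstract mentions.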

    DATA COMPUTING SYSTEM
    Invention publication

    Publication number: US20240264836A1

    Publication date: 2024-08-08

    Application number: US18624870

    Application date: 2024-04-02

    Abstract: The present disclosure provides a data computing system. The data computing system comprises: a memory, a processor and an accelerator, wherein the memory is communicatively coupled to the processor and configured to store data to be computed and a computed result, the data being written by the processor; the processor is communicatively coupled to the accelerator and configured to control the accelerator; and the accelerator is communicatively coupled to the memory and configured to access the memory according to pre-configured control information, implement a computing process to produce the computed result and write the computed result back to the memory. The present disclosure also provides an accelerator and a method performed by an accelerator of a data computing system. The present disclosure can improve the execution efficiency of the processor and reduce the computing overhead of the processor.
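The division of labour described in this abstract might be pictured with the following minimal sketch. All names (`ControlInfo`, `Accelerator`, `configure`, `run`) are hypothetical, the flat list standing in for memory is a simplification, and the summation is an arbitrary placeholder for the unspecified computing process.

```python
from dataclasses import dataclass


@dataclass
class ControlInfo:
    """Hypothetical pre-configured control information."""
    src: int     # offset of the first operand in memory
    length: int  # number of operands to read
    dst: int     # offset at which to write the result


class Accelerator:
    def __init__(self, memory: list):
        self.memory = memory  # shared memory the accelerator can access
        self.control = None

    def configure(self, control: ControlInfo) -> None:
        # The processor controls the accelerator by supplying control information.
        self.control = control

    def run(self) -> None:
        # The accelerator reads memory itself, computes, and writes the result back.
        c = self.control
        operands = self.memory[c.src:c.src + c.length]
        self.memory[c.dst] = sum(operands)  # placeholder computation (an assumption)


memory = [0] * 16
memory[0:4] = [1, 2, 3, 4]  # data written by the processor
acc = Accelerator(memory)
acc.configure(ControlInfo(src=0, length=4, dst=8))
acc.run()
```

In this sketch the processor only writes the input data and the control information; the operand reads and the write-back happen inside `run`, which is the offloading the abstract credits with reducing processor overhead.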

    REMOTE ATOMIC OPERATIONS FOR CLUSTERED PROCESSING ARCHITECTURE

    Publication number: US20240211258A1

    Publication date: 2024-06-27

    Application number: US18145770

    Application date: 2022-12-22

    CPC classification number: G06F9/30047 G06F9/30189 G06F11/3409 G06F12/0246

    Abstract: Remote atomics for clustered processing operations are described. An example of a graphics processor includes a clustered processing architecture including multiple clusters and one or more memory elements, including a first memory element containing a home agent, the apparatus to receive, at a first caching agent for a first cluster, a request for performance of an atomic operation requiring a data stored in a cacheline at a memory address associated with the home agent; evaluate one or more factors including a current ownership of the memory address; and, based at least in part on the factors, determine whether to perform the atomic operation at the first caching agent or to forward the atomic operation to the home agent for performance of the atomic operation.
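The placement decision in this abstract can be sketched as a small function. The abstract names current ownership as one factor but does not state the policy, so the rule below (execute locally only when the requesting cluster already owns the address) is an assumption, and `Placement` and `place_atomic` are hypothetical names.

```python
from enum import Enum
from typing import Optional


class Placement(Enum):
    CACHING_AGENT = "perform at the first caching agent"
    HOME_AGENT = "forward to the home agent"


def place_atomic(requesting_cluster: int, current_owner: Optional[int]) -> Placement:
    """Decide where an atomic operation runs, using ownership as the only factor.

    Assumed policy: run locally if the requesting cluster owns the cacheline,
    otherwise forward to the home agent. Real hardware may weigh more factors.
    """
    if current_owner is not None and current_owner == requesting_cluster:
        return Placement.CACHING_AGENT
    return Placement.HOME_AGENT


local = place_atomic(requesting_cluster=0, current_owner=0)
remote = place_atomic(requesting_cluster=0, current_owner=1)
unowned = place_atomic(requesting_cluster=0, current_owner=None)
```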

    Data computing system
    Invention grant

    Publication number: US11972262B2

    Publication date: 2024-04-30

    Application number: US17648659

    Application date: 2022-01-21

    Abstract: The present disclosure provides a data computing system. The data computing system comprises: a memory, a processor and an accelerator, wherein the memory is communicatively coupled to the processor and configured to store data to be computed and a computed result, the data being written by the processor; the processor is communicatively coupled to the accelerator and configured to control the accelerator; and the accelerator is communicatively coupled to the memory and configured to access the memory according to pre-configured control information, implement a computing process to produce the computed result and write the computed result back to the memory. The present disclosure also provides an accelerator and a method performed by an accelerator of a data computing system. The present disclosure can improve the execution efficiency of the processor and reduce the computing overhead of the processor.

    Saving and restoring registers
    Invention grant

    Publication number: US11907720B2

    Publication date: 2024-02-20

    Application number: US17759978

    Application date: 2020-11-26

    Applicant: ARM LIMITED

    Abstract: There is provided a data processing apparatus comprising a plurality of registers, each of the registers having data bits to store data and metadata bits to store metadata. Each of the registers is adapted to operate in a metadata mode in which the metadata bits and the data bits are valid, and a data mode in which the data bits are valid and the metadata bits are invalid. Mode bit storage circuitry indicates whether each of the registers is in the data mode or the metadata mode. Execution circuitry is responsive to a memory operation that is a store operation on one or more given registers.

    Out-of-order input / output write

    Publication number: US11847461B2

    Publication date: 2023-12-19

    Application number: US17748066

    Application date: 2022-05-19

    Abstract: A System-On-Chip (SoC) includes a set of registers, a processor, and Out-Of-Order Write (OOOW) circuitry. The processor is to execute instructions including write instructions. After issuing a first write instruction to any of the registers in the set, the processor is to await an acknowledgement for the first write instruction before issuing a second write instruction to any of the registers in the set. The OOOW circuitry is to identify the write instructions issued by the processor to the registers in the set, to perform the identified write instructions in the registers irrespective of acknowledgements from the registers, and to send to the processor imitated acknowledgements for the identified write instructions.
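The imitated-acknowledgement idea in this abstract can be sketched as follows. The class `OOOWModel`, its methods, and the use of a queue to hold writes until they are applied are all assumptions for illustration; the abstract does not say how the circuitry buffers or orders the writes.

```python
from collections import deque


class OOOWModel:
    """Hypothetical software stand-in for the OOOW circuitry."""

    def __init__(self, register_names):
        self.registers = {name: 0 for name in register_names}
        self.pending = deque()  # writes accepted but not yet applied (an assumption)

    def write(self, name, value) -> bool:
        """Called by the processor; returns an imitated acknowledgement at once."""
        if name not in self.registers:
            raise KeyError(name)
        self.pending.append((name, value))
        return True  # imitated ack, sent without waiting for the register itself

    def drain(self) -> None:
        # Apply the accepted writes to the registers, independent of the acks.
        while self.pending:
            name, value = self.pending.popleft()
            self.registers[name] = value


ooow = OOOWModel(["ctrl", "status"])
acks = [ooow.write("ctrl", 1), ooow.write("status", 7)]
ooow.drain()
```

Because `write` returns immediately, a processor that must see an acknowledgement before issuing its next write is never stalled by a slow register, which is the effect the abstract describes.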

    LOW POWER HARDWARE ARCHITECTURE FOR HANDLING ACCUMULATION OVERFLOWS IN A CONVOLUTION OPERATION

    Publication number: US20230401433A1

    Publication date: 2023-12-14

    Application number: US17806143

    Application date: 2022-06-09

    Applicant: Recogni Inc.

    CPC classification number: G06N3/063 G06F9/30189

    Abstract: In a low power hardware architecture for handling accumulation overflows in a convolver unit, an accumulator of the convolver unit computes a running total by successively summing dot products from a dot product computation module during an accumulation cycle. In response to the running total overflowing the maximum or minimum value of a data storage element, the accumulator transmits an overflow indicator to a controller and sets its output equal to a positive or negative overflow value. In turn, the controller disables the dot product computation module by clock gating, clamping one of its inputs to zero and/or holding its inputs to constant values. At the end of the accumulation cycle, the output of the accumulator is sampled. In response to a clear signal being asserted, the dot product computation module is enabled, and the running total is set to zero for the start of the next accumulation cycle.
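The overflow handling in this abstract can be modelled in a few lines. The 16-bit signed range, the class name `Accumulator`, and the `dot_products_computed` counter are assumptions for illustration; the abstract specifies neither the width of the data storage element nor any software interface.

```python
INT16_MAX = 32767   # assumed maximum of the data storage element
INT16_MIN = -32768  # assumed minimum of the data storage element


class Accumulator:
    def __init__(self):
        self.total = 0
        self.enabled = True             # models the dot product module being enabled
        self.dot_products_computed = 0  # counts how much work was actually done

    def step(self, compute_dot_product):
        if not self.enabled:
            return self.total           # module disabled: no work, output held
        self.dot_products_computed += 1
        candidate = self.total + compute_dot_product()
        if candidate > INT16_MAX:
            self.total, self.enabled = INT16_MAX, False  # positive overflow value
        elif candidate < INT16_MIN:
            self.total, self.enabled = INT16_MIN, False  # negative overflow value
        else:
            self.total = candidate
        return self.total

    def clear(self):
        # Clear signal: re-enable the module and restart the running total.
        self.total = 0
        self.enabled = True


acc = Accumulator()
for value in [30000, 5000, -40000, 1]:
    acc.step(lambda v=value: v)
```

In this run the second addition overflows, so the remaining two dot products are never computed; skipping that work is where the power saving would come from.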
