Decoupled access-execute processing

    Publication No.: US12001845B2

    Publication date: 2024-06-04

    Application No.: US17755130

    Application date: 2020-10-15

    Applicant: Arm Limited

    Abstract: An apparatus comprises first instruction execution circuitry, second instruction execution circuitry, and a decoupled access buffer. Instructions of an ordered sequence of instructions are issued to one of the first and second instruction execution circuitry for execution in dependence on whether the instruction has a first type label or a second type label. An instruction with the first type label is an access-related instruction which determines at least one characteristic of a load operation to retrieve a data value from a memory address. Instruction execution by the first instruction execution circuitry of instructions having the first type label is prioritised over instruction execution by the second instruction execution circuitry of instructions having the second type label. Data values retrieved from memory as a result of execution of the first type instructions are stored in the decoupled access buffer.
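The flow in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the tuple encoding of instructions, the `load` and `add` operations, and the function name `run_decoupled` are hypothetical and not taken from the patent, and "prioritised" is simplified to "all access instructions run first".

```python
from collections import deque

def run_decoupled(program, memory):
    """Toy model of decoupled access-execute processing.

    program: list of (label, op, arg) tuples in program order, where
    label is "access" or "execute" (hypothetical encoding).
    """
    # Issue each instruction to one of two queues according to its label.
    access_queue = deque(i for i in program if i[0] == "access")
    execute_queue = deque(i for i in program if i[0] == "execute")
    access_buffer = deque()  # stands in for the decoupled access buffer
    results = []

    # Simplification: access instructions are fully drained first.
    while access_queue:
        _, op, addr = access_queue.popleft()
        if op == "load":
            access_buffer.append(memory[addr])

    # Execute instructions consume loaded values from the buffer in order.
    while execute_queue:
        _, op, operand = execute_queue.popleft()
        if op == "add":
            results.append(access_buffer.popleft() + operand)
    return results
```

For example, with `memory = [5, 7]`, two loads from addresses 0 and 1 interleaved with `add 1` and `add 10` leave the execute side reading both values from the buffer without waiting on memory.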

    Replicate elements instruction
    42.
    Granted invention patent

    Publication No.: US11977884B2

    Publication date: 2024-05-07

    Application No.: US16468108

    Application date: 2017-11-10

    Applicant: ARM LIMITED

    CPC classification number: G06F9/30032 G06F9/30018 G06F9/30036 G06F9/30109

    Abstract: A replicate elements instruction defining a plurality of variable length segments in a result vector controls processing circuitry (80) to generate a result vector in which, in each respective segment, a repeating value is repeated throughout that segment of the result vector, the repeating value comprising a data value or element index of a selected data element of a source vector. This instruction is useful for accelerating processing of data structures smaller than the vector length.
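As a rough illustration of the behaviour described, the following sketch models the result vector as a Python list. The way segments are specified here (a list of `(length, selected_index)` pairs), the zero padding of unused elements, and the name `replicate_elements` are assumptions made for the example; the abstract does not say how the instruction encodes them.

```python
def replicate_elements(src, segments, vector_length, use_index=False):
    """Build a result vector in which each segment repeats one value.

    segments: list of (length, selected_index) pairs (hypothetical encoding).
    use_index: repeat the element index instead of the data value.
    """
    result = []
    for length, selected in segments:
        if not 0 <= selected < len(src):
            raise ValueError("selected index outside the source vector")
        value = selected if use_index else src[selected]
        result.extend([value] * length)
    if len(result) > vector_length:
        raise ValueError("segments exceed the vector length")
    # Assumption: elements not covered by any segment are zeroed.
    result.extend([0] * (vector_length - len(result)))
    return result
```

With a source vector `[10, 20, 30, 40]` and segments of length 3, 2 and 1 selecting elements 0, 2 and 3, an 8-element result holds three copies of 10, two of 30, one of 40, and two padding zeros.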

    CERTAINTY-BASED CLASSIFICATION NETWORKS
    43.
    Published invention application

    Publication No.: US20230289654A1

    Publication date: 2023-09-14

    Application No.: US18016914

    Application date: 2021-07-19

    Applicant: Arm Limited

    CPC classification number: G06N20/00 G06N7/01

    Abstract: A certainty-based prediction apparatus and method are provided. A plurality of main classifier (MC) modules each predict an MC predicted class based on input data, and determine an MC certainty. Each MC module processes a pre-trained, machine learning main classifier having at least one expert class and a plurality of non-expert classes. An expert classifier (EC) module associated with each expert class predicts an EC predicted class based on the input data. Each EC module processes a pre-trained, machine learning expert classifier having two classes including an associated expert class and a residual class that includes any non-associated expert classes and the plurality of non-expert classes. A final predicted class decision module determines a final predicted class and a final certainty based on each MC predicted class, each MC certainty and each EC predicted class. The final predicted class and the final certainty are output.
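The data flow into the final decision module can be sketched as follows. The abstract does not state how the module combines its inputs, so the rule used here (certainty-weighted vote across main classifiers, then a penalty when the matching expert classifier predicts the residual class) is purely a hypothetical stand-in, as are the names `final_decision` and `penalty`.

```python
def final_decision(mc_outputs, ec_outputs, penalty=0.5):
    """Combine main-classifier (MC) and expert-classifier (EC) outputs.

    mc_outputs: list of (predicted_class, certainty) pairs, one per MC module.
    ec_outputs: dict mapping each expert class to that EC module's prediction,
                which is either the expert class itself or "residual".
    The combination rule below is an assumption, not the patented method.
    """
    totals = {}
    votes = {}
    for cls, certainty in mc_outputs:
        totals[cls] = totals.get(cls, 0.0) + certainty
        votes.setdefault(cls, []).append(certainty)

    # Hypothetical rule: the class with the largest summed certainty wins.
    final_class = max(totals, key=totals.get)
    final_certainty = sum(votes[final_class]) / len(votes[final_class])

    # If an expert classifier exists for the winner and disagrees,
    # scale the certainty down.
    ec_prediction = ec_outputs.get(final_class)
    if ec_prediction is not None and ec_prediction != final_class:
        final_certainty *= penalty
    return final_class, final_certainty
```

The point of the sketch is only the shape of the interface: every MC class, every MC certainty, and every EC prediction feed one function that returns a final class and a final certainty.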

    SYSTEM, CIRCUIT, DEVICE AND/OR PROCESSES FOR NEURAL NETWORK TRAINING

    Publication No.: US20230087612A1

    Publication date: 2023-03-23

    Application No.: US17481871

    Application date: 2021-09-22

    Applicant: Arm Limited

    Inventor: Mbou Eyole

    Abstract: Example methods, devices and/or circuits are provided, to be implemented in a processing device to perform operations based, at least in part, on machine-learning. According to an embodiment, one or more parameters of a neural network node may be altered based, at least in part, on one or more error signals that are based, at least in part, on one or more errors generated by a local operational circuit.

    Error detection using vector processing circuitry

    Publication No.: US11507475B2

    Publication date: 2022-11-22

    Application No.: US16475487

    Application date: 2017-12-12

    Applicant: Arm Limited

    Abstract: A data processing apparatus (2) has scalar processing circuitry (32-42) and vector processing circuitry (38, 40, 42). When executing main scalar processing on the scalar processing circuitry (32-42), or main vector processing using a subset of said plurality of lanes on the vector processing circuitry (38, 40, 42), checker processing is executed using at least one lane of the plurality of lanes on the vector processing circuitry (38, 40, 42), the checker processing comprising operations corresponding to at least part of the main scalar/vector processing. Errors can then be detected based on a comparison of an outcome of the main processing and an outcome of the checker processing. This provides a technique for achieving functional safety in a high end processor with better performance and reduced hardware cost compared to a dual/triple core lockstep approach.
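The main/checker comparison can be pictured with a minimal software analogy. This is an assumption-laden sketch: the function `run_with_checker`, the `corrupt_main` fault-injection flag, and the use of integer operations are invented for illustration and do not model the actual vector lanes.

```python
import operator

def run_with_checker(op, a, b, corrupt_main=False):
    """Run an operation twice and flag a mismatch between the two outcomes.

    corrupt_main is a hypothetical switch that flips the lowest bit of the
    main result to simulate a transient hardware fault.
    """
    main_result = op(a, b)          # main scalar processing
    if corrupt_main:
        main_result ^= 1            # simulated single-bit error
    checker_result = op(a, b)       # redundant copy, as if on a spare lane
    error_detected = main_result != checker_result
    return main_result, error_detected
```

Calling `run_with_checker(operator.add, 2, 3)` reports no error, while the same call with `corrupt_main=True` returns a mismatching result and sets the error flag.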

    Neural network architecture
    46.
    Granted invention patent

    Publication No.: US11501150B2

    Publication date: 2022-11-15

    Application No.: US16879587

    Application date: 2020-05-20

    Applicant: Arm Limited

    Abstract: Various implementations are related to an apparatus with memory cells arranged in columns and rows, and the memory cells are accessible with a column control voltage for accessing the memory cells via the columns and a row control voltage for accessing the memory cells via the rows. The apparatus may include neural network circuitry having neuronal junctions that are configured to receive, record, and provide information related to incoming voltage spikes associated with input signals based on resistance through the neuronal junctions. The apparatus may include stochastic re-programmer circuitry that receives the incoming voltage spikes, receives the information provided by the neuronal junctions, and reconfigures the information recorded in the neuronal junctions based on the incoming voltage spikes associated with the input signals along with a programming control signal provided by the memory circuitry.

    Element by vector operations in a data processing apparatus

    Publication No.: US11327752B2

    Publication date: 2022-05-10

    Application No.: US16487256

    Application date: 2018-02-02

    Applicant: ARM LIMITED

    Abstract: A data processing apparatus, a method of operating a data processing apparatus, a non-transitory computer readable storage medium, and an instruction are provided. The instruction specifies a first source register, a second source register, and an index. In response to the instruction, control signals are generated, causing processing circuitry to perform a data processing operation with respect to each data group in the first source register and the second source register to generate respective result data groups forming a result of the data processing operation. Each of the first source register and the second source register has a size which is an integer multiple at least twice a predefined size of the data group, and each data group comprises a plurality of data elements. The operands of the data processing operation for each data group are a selected data element identified in the data group of the first source register by the index and each data element in the data group of the second source register. This provides a technique for element-by-vector operation which is readily scalable as the register width grows.
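The per-group operand selection can be sketched with lists standing in for registers. Multiplication is chosen here only as an example of "a data processing operation", and the names `element_by_vector_mul` and `group_size` are assumptions for the illustration.

```python
def element_by_vector_mul(src1, src2, index, group_size):
    """Multiply each element of every src2 group by one element of src1.

    Within each data group, the element of src1 at position `index`
    is combined with every element of the matching src2 group.
    """
    if len(src1) != len(src2) or len(src1) % group_size != 0:
        raise ValueError("registers must hold a whole number of groups")
    if not 0 <= index < group_size:
        raise ValueError("index must fall inside a data group")
    result = []
    for base in range(0, len(src1), group_size):
        selected = src1[base + index]  # one element per group of src1
        result.extend(selected * x for x in src2[base:base + group_size])
    return result
```

Because the same index is applied independently inside each group, widening the registers just adds more groups, which matches the scalability point made in the abstract.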

    Instruction scheduling patterns on decoupled systems

    Publication No.: US11269646B2

    Publication date: 2022-03-08

    Application No.: US17215394

    Application date: 2021-03-29

    Applicant: Arm Limited

    Abstract: Apparatuses and methods for instruction scheduling in an out-of-order decoupled access-execute processor are disclosed. The instructions for the decoupled access-execute processor comprise access instructions and execute instructions, where access instructions comprise load instructions and instructions which provide operand values to load instructions. Schedule patterns of groups of linked execute instructions are monitored, where the execute instructions in a group of linked execute instructions are linked by data dependencies. On the basis of an identified repeating schedule pattern, configurable execution circuitry adopts a configuration to perform the operations defined by the group of linked execute instructions of the repeating schedule pattern.
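The monitoring step can be approximated in software by counting how often the same group of linked execute instructions recurs. This is a loose sketch: representing a group as a tuple of opcode names, the `threshold` parameter, and the function name `find_repeating_pattern` are all hypothetical.

```python
from collections import Counter

def find_repeating_pattern(groups, threshold=2):
    """Return the first group of linked instructions seen `threshold` times.

    groups: iterable of opcode sequences, one per group of execute
    instructions linked by data dependencies (hypothetical encoding).
    Returns None if no group repeats often enough.
    """
    counts = Counter()
    for group in groups:
        key = tuple(group)
        counts[key] += 1
        if counts[key] >= threshold:
            return key  # candidate for a dedicated execution configuration
    return None
```

In this model, the returned tuple is what configurable execution circuitry would be set up to perform as a unit.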

    Devices and headsets
    49.
    Granted invention patent

    Publication No.: US11222394B2

    Publication date: 2022-01-11

    Application No.: US16824040

    Application date: 2020-03-19

    Applicant: Arm Limited

    Abstract: A device has a content processing component operable in first and second content processing states, a display, at least one sensor operable to output sensor data indicative of at least one eye positional characteristic of a user, and a processor. The processor is configured to process the data, and in the first processing state, determine a region of the display corresponding to a foveal region of an eye of a user, and perform foveated processing of content to be displayed on the display such that a relatively high-quality video content is generated for display in the region and a relatively low-quality video content is generated for display outside the region. The second processing state is entered in response to a trigger. In the second processing state, the foveated processing used is overridden such that relatively low-quality video content is generated for display in at least a portion of the region.

    Register-based complex number processing

    Publication No.: US11210090B2

    Publication date: 2021-12-28

    Application No.: US16630614

    Application date: 2018-07-02

    Applicant: ARM LIMITED

    Abstract: Apparatuses, methods, programs, and complex number processing instructions are provided to support vector processing operations on input data vectors comprising a plurality of input data items at respective positions in the input data vectors. In response to the instructions, at least one first set of data items is extracted from alternating positions in a first source register and at least one second set of data items is extracted from alternating positions in a second source register, wherein consecutive data items in the first and second source registers comprise alternating real and imaginary components of respective sets of complex numbers. A result set of complex number components is generated using the two sets of data items as operands, and the result set of complex number components is one of a real part and an imaginary part of a complex number result of the complex number operation applied to the two sets of complex numbers. The result set of complex number components is applied to the destination register.
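The interleaved layout and the "real part or imaginary part" result can be sketched as below. Complex multiplication is assumed as the operation, and the function name `complex_mul_part` and the `part` argument are hypothetical; the abstract only says that a complex number operation is applied.

```python
def complex_mul_part(src1, src2, part):
    """Compute only the real or only the imaginary parts of complex products.

    src1, src2: lists laid out as [re0, im0, re1, im1, ...].
    part: "real" or "imag" (hypothetical selector).
    """
    # Extract components from alternating positions of each source.
    re1, im1 = src1[0::2], src1[1::2]
    re2, im2 = src2[0::2], src2[1::2]
    if part == "real":
        # (a + bi)(c + di) has real part ac - bd
        return [a * c - b * d for a, b, c, d in zip(re1, im1, re2, im2)]
    if part == "imag":
        # (a + bi)(c + di) has imaginary part ad + bc
        return [a * d + b * c for a, b, c, d in zip(re1, im1, re2, im2)]
    raise ValueError("part must be 'real' or 'imag'")
```

For `src1 = [1, 2, 3, 4]` and `src2 = [5, 6, 7, 8]`, the products are (1+2i)(5+6i) = -7+16i and (3+4i)(7+8i) = -11+52i, so the two calls yield the real set and the imaginary set separately.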
