Coprocessors with bypass optimization, variable grid architecture, and fused vector operations

    Publication No.: US12174785B2

    Publication Date: 2024-12-24

    Application No.: US17869620

    Filing Date: 2022-07-20

    Applicant: Apple Inc.

    Abstract: In an embodiment, a coprocessor may include a bypass indication which identifies execution circuitry that is not used by a given processor instruction, and thus may be bypassed. The corresponding circuitry may be disabled during execution, preventing evaluation when the output of the circuitry will not be used for the instruction. In another embodiment, the coprocessor may implement a grid of processing elements in rows and columns, where a given coprocessor instruction may specify an operation that causes up to all of the processing elements to operate on vectors of input operands to produce results. Implementations of the coprocessor may implement a portion of the processing elements. The coprocessor control circuitry may be designed to operate with the full grid or partial grid, reissuing instructions in the partial grid case to perform the requested operation. In still another embodiment, the coprocessor may be able to fuse vector mode operations.
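The bypass idea in the abstract above can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the stage names (`multiplier`, `adder`, `accumulator`), the `NEEDED` opcode table, and the `ProcessingElement.execute` method. The sketch models the bypass indication as the set of stages an instruction does not use, and skips evaluating those stages.

```python
from dataclasses import dataclass, field

# Hypothetical execution stages; the patent does not name specific circuits.
STAGES = ("multiplier", "adder", "accumulator")

# Hypothetical opcode table: the stages each operation actually needs.
NEEDED = {
    "mul": {"multiplier"},
    "add": {"adder"},
    "mac": {"multiplier", "adder", "accumulator"},
}


def bypass_indication(opcode):
    """Return the set of stages the instruction does not use."""
    return set(STAGES) - NEEDED[opcode]


@dataclass
class ProcessingElement:
    acc: float = 0.0
    evaluated: list = field(default_factory=list)  # log of stages that ran

    def execute(self, opcode, a, b):
        bypass = bypass_indication(opcode)
        value = a
        if "multiplier" not in bypass:
            self.evaluated.append("multiplier")
            value = a * b
        if "adder" not in bypass:
            self.evaluated.append("adder")
            # For "mac" the adder sums the product with the accumulator;
            # for "add" it sums the two operands.
            value = value + (self.acc if opcode == "mac" else b)
        if "accumulator" not in bypass:
            self.evaluated.append("accumulator")
            self.acc = value
        return value
```

In this toy model a `mul` touches only the multiplier, so the adder and accumulator logic is never evaluated for it; in hardware the analogous effect would be disabling those circuits for that instruction.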

    Coprocessors with bypass optimization, variable grid architecture, and fused vector operations

    Publication No.: US12135681B2

    Publication Date: 2024-11-05

    Application No.: US17869617

    Filing Date: 2022-07-20

    Applicant: Apple Inc.

    Abstract: In an embodiment, a coprocessor may include a bypass indication which identifies execution circuitry that is not used by a given processor instruction, and thus may be bypassed. The corresponding circuitry may be disabled during execution, preventing evaluation when the output of the circuitry will not be used for the instruction. In another embodiment, the coprocessor may implement a grid of processing elements in rows and columns, where a given coprocessor instruction may specify an operation that causes up to all of the processing elements to operate on vectors of input operands to produce results. Implementations of the coprocessor may implement a portion of the processing elements. The coprocessor control circuitry may be designed to operate with the full grid or partial grid, reissuing instructions in the partial grid case to perform the requested operation. In still another embodiment, the coprocessor may be able to fuse vector mode operations.

    Coprocessor register renaming using registers associated with an inactive context to store results from an active context

    Publication No.: US11775301B2

    Publication Date: 2023-10-03

    Application No.: US17644016

    Filing Date: 2021-12-13

    Applicant: Apple Inc.

    Abstract: A coprocessor with register renaming is disclosed. An apparatus includes a plurality of processors and a coprocessor respectively configured to execute processor instructions and coprocessor instructions. The coprocessor receives coprocessor instructions from ones of the processors. The coprocessor includes an array of processing elements and a result register set comprising storage elements respectively distributed within the array of processing elements. For a given member of the array of processing elements, a corresponding storage element is configured to store coprocessor instruction results generated by the given member. The result register set implements a plurality of contexts to store respective coprocessor states corresponding to coprocessor instructions received from different processors. Based on a determination that one of the contexts is inactive, the coprocessor is configured to store coprocessor instruction results corresponding to an active context within storage elements of the result register set corresponding to the inactive context.
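As a rough illustration of the renaming scheme in the abstract above, the following Python sketch models a result register set with one bank of storage per context. The class name `ResultRegisterSet`, its methods, and the slot-allocation policy are all hypothetical and not taken from the patent. The sketch also does not model how borrowed storage is reclaimed when an inactive context becomes active again.

```python
class ResultRegisterSet:
    """Toy model: one bank of result storage per context (hypothetical)."""

    def __init__(self, num_contexts, regs_per_context):
        self.banks = [[None] * regs_per_context for _ in range(num_contexts)]
        self.active = [False] * num_contexts
        # Rename map per context: architectural register -> (bank, slot).
        self.rename = [
            {r: (c, r) for r in range(regs_per_context)}
            for c in range(num_contexts)
        ]
        # Slots of each bank currently lent to another context.
        self.borrowed = [set() for _ in range(num_contexts)]

    def _borrow_slot(self, ctx):
        """Find a free slot in the bank of some inactive context, if any."""
        for bank, is_active in enumerate(self.active):
            if bank == ctx or is_active:
                continue
            for slot in range(len(self.banks[bank])):
                if slot not in self.borrowed[bank]:
                    self.borrowed[bank].add(slot)
                    return bank, slot
        return None

    def write(self, ctx, reg, value):
        """Store a result for an active context, renaming when possible."""
        old_bank, old_slot = self.rename[ctx][reg]
        # Prefer storage of an inactive context; else use the home slot.
        target = self._borrow_slot(ctx) or (ctx, reg)
        if old_bank != ctx:
            self.borrowed[old_bank].discard(old_slot)  # release old mapping
        self.rename[ctx][reg] = target
        bank, slot = target
        self.banks[bank][slot] = value
        return target

    def read(self, ctx, reg):
        bank, slot = self.rename[ctx][reg]
        return self.banks[bank][slot]
```

With two contexts and only context 0 active, a write from context 0 lands in context 1's bank and the rename map redirects later reads there. Once both contexts are active, writes fall back to the writer's own bank.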

    Coprocessors with Bypass Optimization, Variable Grid Architecture, and Fused Vector Operations

    Publication No.: US20220350776A1

    Publication Date: 2022-11-03

    Application No.: US17869617

    Filing Date: 2022-07-20

    Applicant: Apple Inc.

    Abstract: In an embodiment, a coprocessor may include a bypass indication which identifies execution circuitry that is not used by a given processor instruction, and thus may be bypassed. The corresponding circuitry may be disabled during execution, preventing evaluation when the output of the circuitry will not be used for the instruction. In another embodiment, the coprocessor may implement a grid of processing elements in rows and columns, where a given coprocessor instruction may specify an operation that causes up to all of the processing elements to operate on vectors of input operands to produce results. Implementations of the coprocessor may implement a portion of the processing elements. The coprocessor control circuitry may be designed to operate with the full grid or partial grid, reissuing instructions in the partial grid case to perform the requested operation. In still another embodiment, the coprocessor may be able to fuse vector mode operations.

    Coprocessor with Distributed Register
    Document Type: Patent Application

    Publication No.: US20200272467A1

    Publication Date: 2020-08-27

    Application No.: US16286213

    Filing Date: 2019-02-26

    Applicant: Apple Inc.

    Abstract: In an embodiment, a coprocessor includes multiple processing elements arranged in a grid of one or more rows and one or more columns. A given processing element includes an arithmetic/logic unit (ALU) circuit configured to perform an ALU operation specified by an instruction executable by the coprocessor, wherein the ALU circuit is configured to produce a result. The given processing element further comprises a first memory coupled to the execute circuit. The first memory is configured to store results generated by the given processing element. The first memory includes a portion of a result memory implemented by the coprocessor, wherein locations in the result memory are specifiable as destination operands of instructions executable by the coprocessor. The portion of the result memory implemented by the first memory is the portion of the result memory that the given processing element is capable of updating.

    Coprocessors with Bypass Optimization, Variable Grid Architecture, and Fused Vector Operations

    Publication No.: US20250094381A1

    Publication Date: 2025-03-20

    Application No.: US18959080

    Filing Date: 2024-11-25

    Applicant: Apple Inc.

    Abstract: In an embodiment, a coprocessor may include a plurality of processing element circuits arranged in a first grid, where a given coprocessor instruction of an instruction set for the coprocessor is defined to cause evaluation of a second plurality of processing element circuits arranged in a second grid, where the second grid includes more processing element circuits than the first grid. The coprocessor may further include a scheduler circuit configured to issue instruction operations to the plurality of processing element circuits, where the scheduler circuit is configured to issue a given instruction operation corresponding to the given coprocessor instruction a plurality of times to complete the given coprocessor instruction, wherein different issuances of the given instruction operation are configured to cause respective different portions of the evaluation defined by the given coprocessor instruction to be performed.
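The multiple-issuance scheme in the abstract above can be sketched in Python as follows. The abstract does not say which operation the instruction performs; the outer-product accumulate used here is an assumption chosen for illustration, and the function names `issue_passes` and `outer_product_accumulate` are hypothetical. The instruction is defined over a logical grid, while the implemented (physical) grid may be smaller, so the same operation is issued once per tile.

```python
def issue_passes(logical_shape, physical_shape):
    """Yield the (row_offset, col_offset) covered by each issuance."""
    logical_rows, logical_cols = logical_shape
    phys_rows, phys_cols = physical_shape
    for r0 in range(0, logical_rows, phys_rows):
        for c0 in range(0, logical_cols, phys_cols):
            yield r0, c0


def outer_product_accumulate(x, y, z, physical_shape):
    """Accumulate x (outer) y into z using a possibly smaller physical grid.

    Returns the number of times the instruction operation was issued.
    """
    phys_rows, phys_cols = physical_shape
    passes = 0
    for r0, c0 in issue_passes((len(x), len(y)), physical_shape):
        # One issuance: physical element (i, j) performs the work of
        # logical element (r0 + i, c0 + j).
        for i in range(min(phys_rows, len(x) - r0)):
            for j in range(min(phys_cols, len(y) - c0)):
                z[r0 + i][c0 + j] += x[r0 + i] * y[c0 + j]
        passes += 1
    return passes
```

For a 4x4 logical grid, a full 4x4 implementation completes in one issuance, while a 2x2 implementation issues the operation four times, each issuance performing a different portion of the evaluation and arriving at the same result.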

    Coprocessor Prefetcher
    Document Type: Patent Application

    Publication No.: US20250094174A1

    Publication Date: 2025-03-20

    Application No.: US18783937

    Filing Date: 2024-07-25

    Applicant: Apple Inc.

    Abstract: A prefetcher for a coprocessor is disclosed. An apparatus includes a processor and a coprocessor that are configured to execute processor and coprocessor instructions, respectively. The processor and coprocessor instructions appear together in code sequences fetched by the processor, with the coprocessor instructions being provided to the coprocessor by the processor. The apparatus further includes a coprocessor prefetcher configured to monitor a code sequence fetched by the processor and, in response to identifying a presence of coprocessor instructions in the code sequence, capture the memory addresses, generated by the processor, of operand data for coprocessor instructions. The coprocessor is further configured to issue, for a cache memory accessible to the coprocessor, prefetches for data associated with the memory addresses prior to execution of the coprocessor instructions by the coprocessor.
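A minimal Python sketch of the prefetch flow in the abstract above, under stated assumptions: the `Instr` record, the `CoprocessorPrefetcher` class, the 64-byte line size, and the modeling of the cache as a set of line addresses are all hypothetical and not from the patent. The sketch watches the fetched code sequence, captures operand addresses only for coprocessor instructions, and issues prefetches for the corresponding cache lines ahead of execution.

```python
from collections import deque
from dataclasses import dataclass
from typing import Optional


@dataclass
class Instr:
    is_coprocessor: bool
    # Operand address generated by the processor, if the instruction has one.
    operand_address: Optional[int] = None


class CoprocessorPrefetcher:
    def __init__(self, line_size=64):
        self.line_size = line_size  # assumed cache line size in bytes
        self.pending = deque()      # captured line addresses, in order

    def observe(self, instr):
        """Monitor one fetched instruction; capture coprocessor operands."""
        if instr.is_coprocessor and instr.operand_address is not None:
            line = instr.operand_address // self.line_size * self.line_size
            if line not in self.pending:
                self.pending.append(line)

    def issue(self, cache):
        """Issue prefetches for captured lines not already in the cache."""
        issued = []
        while self.pending:
            line = self.pending.popleft()
            if line not in cache:
                cache.add(line)  # model the prefetch filling the line
                issued.append(line)
        return issued
```

A driver would call `observe` for each instruction in fetch order and `issue` before the coprocessor executes the captured instructions; two operands in the same line produce a single prefetch.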

    Coprocessor Prefetcher
    Document Type: Patent Application

    Publication No.: US20230092898A1

    Publication Date: 2023-03-23

    Application No.: US17643765

    Filing Date: 2021-12-10

    Applicant: Apple Inc.

    Abstract: A prefetcher for a coprocessor is disclosed. An apparatus includes a processor and a coprocessor that are configured to execute processor and coprocessor instructions, respectively. The processor and coprocessor instructions appear together in code sequences fetched by the processor, with the coprocessor instructions being provided to the coprocessor by the processor. The apparatus further includes a coprocessor prefetcher configured to monitor a code sequence fetched by the processor and, in response to identifying a presence of coprocessor instructions in the code sequence, capture the memory addresses, generated by the processor, of operand data for coprocessor instructions. The coprocessor is further configured to issue, for a cache memory accessible to the coprocessor, prefetches for data associated with the memory addresses prior to execution of the coprocessor instructions by the coprocessor.
