Acceleration of In-Memory-Compute Arrays

    Publication number: US20230059200A1

    Publication date: 2023-02-23

    Application number: US17406817

    Application date: 2021-08-19

    Applicant: Apple Inc.

    Abstract: An apparatus includes an in-memory compute circuit that includes a memory circuit configured to generate a set of products by combining received input values with respective weight values stored in rows of the memory circuit, and to combine the set of products to generate an accumulated output value. The in-memory compute circuit may further include a control circuit and a plurality of routing circuits, including a first routing circuit coupled to a first set of rows of the memory circuit. The control circuit may be configured to cause the first routing circuit to route groups of input values to different ones of the first set of rows over a plurality of clock cycles, and the memory circuit to generate, on a clock cycle following the plurality of clock cycles, a particular accumulated output value that is computed based on the routed groups of input values.
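
    The routing behavior described in this abstract lends itself to a small software model. Below is a minimal Python sketch of that behavior only, not the patented circuit: the class name, the row grouping, and the three-cycle schedule are illustrative assumptions.

        # Behavioral model (illustrative, not the hardware): a routing step
        # steers one group of input values onto one row per cycle; on a later
        # cycle the array multiplies each routed group by the weights stored
        # in its row and combines all products into one accumulated output.
        class InMemoryComputeModel:
            def __init__(self, weight_rows):
                self.weight_rows = weight_rows   # one weight vector per memory row
                self.routed_inputs = {}          # row index -> routed input group

            def route(self, row, input_group):
                # Routing circuit: routes one group of inputs per clock cycle.
                self.routed_inputs[row] = input_group

            def accumulate(self):
                # Memory circuit: element-wise products per row, summed together.
                total = 0
                for row, inputs in self.routed_inputs.items():
                    weights = self.weight_rows[row]
                    total += sum(x * w for x, w in zip(inputs, weights))
                self.routed_inputs.clear()
                return total

        model = InMemoryComputeModel(weight_rows=[[1, 2, 3], [4, 5, 6]])
        model.route(row=0, input_group=[1, 1, 1])   # "cycle 1"
        model.route(row=1, input_group=[2, 0, 1])   # "cycle 2"
        print(model.accumulate())                   # "cycle 3": 6 + 14 = 20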

    MAPPABLE FILTER FOR NEURAL PROCESSOR CIRCUIT

    Publication number: US20220108155A1

    Publication date: 2022-04-07

    Application number: US17065428

    Application date: 2020-10-07

    Applicant: Apple Inc.

    Abstract: Embodiments relate to a neural processor circuit that may include a fetch circuit that fetches coefficient data of a machine learning model from a memory source. The neural processor circuit may also include one or more neural engine circuits that are coupled to the fetch circuit. A neural engine circuit may include a buffer circuit that stores the coefficient data. The neural engine circuit may also include a coefficient organizing circuit that generates at least a first mapping and a second mapping of the stored coefficient data according to one or more control signals. The neural engine may also include a computation circuit that receives and processes at least a portion of input data with the coefficient data as mapped according to the first mapping, or processes at least the portion of the input data with the coefficient data as mapped according to the second mapping.
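
    As a rough illustration of the mapping idea, the Python sketch below presents one stored coefficient buffer to the compute lanes under two different mappings selected by a control value; both mappings, the lane count, and all names are assumptions made for illustration, not taken from the patent.

        # Illustrative sketch: the same stored coefficients, handed to the
        # multiplier lanes under either of two mappings chosen by a control signal.
        def map_coefficients(coeffs, mapping, num_lanes):
            if mapping == "first":
                # Contiguous split: lane k gets a consecutive block of coefficients.
                block = len(coeffs) // num_lanes
                return [coeffs[k * block:(k + 1) * block] for k in range(num_lanes)]
            if mapping == "second":
                # Strided split: lane k gets every num_lanes-th coefficient.
                return [coeffs[k::num_lanes] for k in range(num_lanes)]
            raise ValueError(f"unknown mapping: {mapping}")

        def compute(inputs_per_lane, coeffs, mapping, num_lanes):
            # Computation stage: each lane forms a dot product of its inputs
            # with the coefficients it received under the selected mapping.
            mapped = map_coefficients(coeffs, mapping, num_lanes)
            return [sum(x * w for x, w in zip(xs, ws))
                    for xs, ws in zip(inputs_per_lane, mapped)]

        coeffs = [1, 2, 3, 4, 5, 6, 7, 8]
        inputs = [[1, 0, 1, 0], [0, 1, 0, 1]]
        print(compute(inputs, coeffs, "first", 2))   # lanes see [1,2,3,4] and [5,6,7,8]
        print(compute(inputs, coeffs, "second", 2))  # lanes see [1,3,5,7] and [2,4,6,8]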

    MAPPABLE FILTER FOR NEURAL PROCESSOR CIRCUIT

    Publication number: US20250021808A1

    Publication date: 2025-01-16

    Application number: US18903466

    Application date: 2024-10-01

    Applicant: Apple Inc.

    Abstract: Embodiments relate to a neural processor circuit that may include a fetch circuit that fetches coefficient data of a machine learning model from a memory source. The neural processor circuit may also include one or more neural engine circuits that are coupled to the fetch circuit. A neural engine circuit may include a buffer circuit that stores the coefficient data. The neural engine circuit may also include a coefficient organizing circuit that generates at least a first mapping and a second mapping of the stored coefficient data according to one or more control signals. The neural engine may also include a computation circuit that receives and processes at least a portion of input data with the coefficient data as mapped according to the first mapping, or processes at least the portion of the input data with the coefficient data as mapped according to the second mapping.

    DYNAMIC VARIABLE BIT WIDTH NEURAL PROCESSOR

    Publication type: Invention Publication

    Publication number: US20230206050A1

    Publication date: 2023-06-29

    Application number: US18114169

    Application date: 2023-02-24

    Applicant: Apple Inc.

    CPC classification number: G06N3/063 G06N3/08 G06N3/04

    Abstract: Embodiments relate to an electronic device that includes a neural processor having multiple neural engine circuits that operate in multiple modes of different bit width. A neural engine circuit may include a first multiply circuit and a second multiply circuit. The first and second multiply circuits may be combined to work as a part of a combined computation circuit. In a first mode, the first multiply circuit generates first output data of a first bit width by multiplying first input data with a first kernel coefficient. The second multiply circuit generates second output data of the first bit width by multiplying second input data with a second kernel coefficient. In a second mode, the combined computation circuit generates third output data of a second bit width by multiplying third input data with a third kernel coefficient.
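
    The two modes can be illustrated numerically. The Python sketch below assumes, purely for illustration, 8-bit multipliers combined into one 16-bit multiply; the actual bit widths, mode names, and the way cross terms are handled in hardware are assumptions, not details from the patent.

        MASK8 = 0xFF

        def first_mode(x1, k1, x2, k2):
            # First mode: two independent narrow products, one per multiply circuit.
            return (x1 & MASK8) * (k1 & MASK8), (x2 & MASK8) * (k2 & MASK8)

        def second_mode(x, k):
            # Second mode: split 16-bit operands into 8-bit halves and rebuild the
            # wide product from narrow partial products:
            #   x*k = (xh*kh << 16) + ((xh*kl + xl*kh) << 8) + xl*kl
            xl, xh = x & MASK8, (x >> 8) & MASK8
            kl, kh = k & MASK8, (k >> 8) & MASK8
            return (xh * kh << 16) + ((xh * kl + xl * kh) << 8) + xl * kl

        assert second_mode(0x1234, 0x0056) == 0x1234 * 0x0056
        print(first_mode(200, 3, 17, 5))   # two outputs of the first bit width
        print(second_mode(40000, 123))     # one output of the second bit width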

    Acceleration of in-memory-compute arrays

    Publication number: US12230361B2

    Publication date: 2025-02-18

    Application number: US18346565

    Application date: 2023-07-03

    Applicant: Apple Inc.

    Abstract: An apparatus includes an in-memory compute circuit that includes a memory circuit configured to generate a set of products by combining received input values with respective weight values stored in rows of the memory circuit, and to combine the set of products to generate an accumulated output value. The in-memory compute circuit may further include a control circuit and a plurality of routing circuits, including a first routing circuit coupled to a first set of rows of the memory circuit. The control circuit may be configured to cause the first routing circuit to route groups of input values to different ones of the first set of rows over a plurality of clock cycles, and the memory circuit to generate, on a clock cycle following the plurality of clock cycles, a particular accumulated output value that is computed based on the routed groups of input values.

    Dynamic variable bit width neural processor

    Publication number: US12050987B2

    Publication date: 2024-07-30

    Application number: US18114169

    Application date: 2023-02-24

    Applicant: Apple Inc.

    CPC classification number: G06N3/063 G06N3/04 G06N3/08

    Abstract: Embodiments relate to an electronic device that includes a neural processor having multiple neural engine circuits that operate in multiple modes of different bit width. A neural engine circuit may include a first multiply circuit and a second multiply circuit. The first and second multiply circuits may be combined to work as a part of a combined computation circuit. In a first mode, the first multiply circuit generates first output data of a first bit width by multiplying first input data with a first kernel coefficient. The second multiply circuit generates second output data of the first bit width by multiplying second input data with a second kernel coefficient. In a second mode, the combined computation circuit generates third output data of a second bit width by multiplying third input data with a third kernel coefficient.

    Display tracking systems and methods

    Publication number: US11954885B2

    Publication date: 2024-04-09

    Application number: US17476312

    Application date: 2021-09-15

    Applicant: Apple Inc.

    Abstract: A tracked device may be used in an extended reality system in coordination with a tracking device. The tracked device may ordinarily be difficult to track, for example due to changing appearances or relatively small surface areas of unchanging features, as may be the case with an electronic device with a relatively large display surrounded by a thin physical outer boundary. In these cases, the tracked device may periodically present an image to the tracking device that the tracking device stores as an indication to permit tracking of a known, unchanging feature despite the image not being presented continuously on the display of the tracked device. The image may include a static image, designated tracking data overlaid on an image frame otherwise scheduled for presentation, or extracted image features from the image frame otherwise scheduled for presentation. Additional power saving methods and known marker generation methods are also described.
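
    A simplified way to picture the periodic presentation logic is sketched below in Python; the frame interval, the marker format, and the function names are all assumptions made for illustration and are not taken from the patent.

        TRACKING_INTERVAL = 30   # assumed: one tracking-friendly frame every 30 frames

        def overlay_markers(frame, markers):
            # Hypothetical overlay: stamp designated tracking pixels onto a copy
            # of the frame that was already scheduled for presentation.
            frame = [row[:] for row in frame]
            for x, y, value in markers:
                frame[y][x] = value
            return frame

        def frame_to_present(index, scheduled_frame, static_image, markers):
            if index % TRACKING_INTERVAL != 0:
                return scheduled_frame                        # normal presentation
            if static_image is not None:
                return static_image                           # known static image
            return overlay_markers(scheduled_frame, markers)  # tracking data overlaid

        frame = [[0] * 4 for _ in range(4)]
        print(frame_to_present(30, frame, None, markers=[(1, 1, 255)]))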

    CROSSBAR CIRCUIT FOR UNALIGNED MEMORY ACCESS IN NEURAL NETWORK PROCESSOR

    Publication number: US20230135306A1

    Publication date: 2023-05-04

    Application number: US17518059

    Application date: 2021-11-03

    Applicant: Apple Inc.

    Abstract: Embodiments of the present disclosure relate to an unaligned memory access in a neural processor circuit. The neural processor circuit includes a crossbar circuit and a neural engine circuit coupled to the crossbar circuit. During each operating cycle of the neural processor circuit, the crossbar circuit receives a portion of input data, and re-aligns or bypasses the portion of input data. The neural engine circuit receives at least a portion of the re-aligned or bypassed input data, and performs a convolution operation on the received portion to generate output data.
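
    The per-cycle re-align-or-bypass choice can be modeled compactly. The Python sketch below uses byte rotation as the re-alignment and a 1-D convolution as a stand-in for the neural engine; the word width, offsets, and kernel are chosen only for illustration.

        def crossbar(word, offset, bypass):
            # Each cycle: pass the fetched word through unchanged, or rotate it so
            # that data starting at an unaligned offset lines up with lane 0.
            if bypass or offset == 0:
                return word
            return word[offset:] + word[:offset]

        def convolve1d(inputs, kernel):
            # Stand-in for the neural engine's convolution over the (re)aligned data.
            taps = len(kernel)
            return [sum(inputs[i + t] * kernel[t] for t in range(taps))
                    for i in range(len(inputs) - taps + 1)]

        word = [0, 0, 0, 10, 20, 30, 40, 50]               # wanted data starts at offset 3
        aligned = crossbar(word, offset=3, bypass=False)
        print(convolve1d(aligned[:5], kernel=[1, 2, 1]))   # [80, 120, 160]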
