Patent search ap:("Apple Inc.") AND inv:"Karl D. Mann" Page 2

11.

Invention application
On-demand Memory Allocation [In force]

Publication (announcement) number: US20210271606A1

Publication (announcement) date: 2021-09-02

Application number: US16804128

Application date: 2020-02-28

Applicant: Apple Inc.

Inventor: Justin A. Hensley, Karl D. Mann, Yoong Chert Foo, Terence M. Potter, Frank W. Liljeros, Ralph C. Taylor

IPC: G06F12/1018, G06F12/084

Abstract: Techniques are disclosed relating to dynamically allocating and mapping private memory for requesting circuitry. Disclosed circuitry may receive a private address and translate the private address to a virtual address (which an MMU may then translate to a physical address to actually access a storage element). In some embodiments, private memory allocation circuitry is configured to generate page table information and map private memory pages for requests if the page table information is not already set up. In various embodiments, this may advantageously allow dynamic private memory allocation, e.g., to efficiently allocate memory for graphics shaders with different types of workloads. Disclosed caching techniques for page table information may improve performance relative to traditional techniques. Further, disclosed embodiments may facilitate memory consolidation across a device such as a graphics processor.
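
Illustrative sketch (not taken from the patent): the abstract describes translating a private address to a virtual address and mapping a private page on demand when no page table information exists yet. The software model below is one possible reading of that idea. The class name, the 4096-byte page size, and the free-page pool are assumptions made for illustration, and the later MMU step from virtual to physical address is left out.

```python
PAGE_SIZE = 4096  # assumed page size; the abstract does not state one


class PrivateMemoryAllocator:
    """Toy model (hypothetical): maps private pages to virtual pages on first touch."""

    def __init__(self, free_virtual_pages):
        # Pool of unused virtual page numbers (an assumption of this sketch).
        self.free_virtual_pages = list(free_virtual_pages)
        # Page table: private page number -> virtual page number.
        self.page_table = {}

    def translate(self, private_addr):
        page, offset = divmod(private_addr, PAGE_SIZE)
        if page not in self.page_table:
            # Page table information is not set up yet: map a page on demand.
            if not self.free_virtual_pages:
                raise MemoryError("no free virtual pages")
            self.page_table[page] = self.free_virtual_pages.pop(0)
        return self.page_table[page] * PAGE_SIZE + offset


alloc = PrivateMemoryAllocator(free_virtual_pages=[100, 101])
first = alloc.translate(0x10)        # first touch of private page 0 maps virtual page 100
second = alloc.translate(0x20)       # same private page, so the mapping is reused
third = alloc.translate(PAGE_SIZE)   # private page 1 maps virtual page 101
```

Only pages that are actually touched consume a mapping, which is the property the abstract calls dynamic private memory allocation.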

12.

Invention application
Compression Techniques for Pixel Write Data [In force]

Publication (announcement) number: US20210134052A1

Publication (announcement) date: 2021-05-06

Application number: US16673883

Application date: 2019-11-04

Applicant: Apple Inc.

Inventor: Anthony P. DeLaurier, Karl D. Mann, Tyson J. Bergland, Winnie W. Yeung

IPC: G06T15/80, G06T15/00, G06T15/04, G09G5/36

Abstract: Techniques are disclosed relating to compression of data stored at different cache levels. In some embodiments, programmable shader circuitry is configured to execute program instructions of compute kernels that write pixel data. In some embodiments, a first cache is configured to store pixel write data from the programmable shader circuitry and first compression circuitry is configured to compress a first block of pixel write data in response to full accumulation of the first block in the first cache circuitry. In some embodiments, second cache circuitry is configured to store pixel write data from the programmable shader circuitry at a higher level in a storage hierarchy than the first cache circuitry and second compression circuitry is configured to compress a second block of pixel write data in response to full accumulation of the second block in the second cache circuitry. In some embodiments, write circuitry is configured to write the first and second compressed blocks of pixel data in a combined write to a higher level in the storage hierarchy.
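
Illustrative sketch (not taken from the patent): the abstract describes compressing a block once it has fully accumulated in a cache level, then writing compressed blocks from two levels together. The model below is a rough software analogy. The 16-byte block size, the class and function names, and the use of `zlib` as a stand-in compressor are all assumptions; the abstract does not name a compression scheme.

```python
import zlib

BLOCK_SIZE = 16  # assumed block size in bytes


class AccumulatingCache:
    """Toy cache level (hypothetical) that compresses a block once it is fully written."""

    def __init__(self):
        self.blocks = {}      # block index -> (bytearray data, set of written offsets)
        self.compressed = []  # (block index, compressed bytes) waiting to be written out

    def write(self, addr, value):
        index, offset = divmod(addr, BLOCK_SIZE)
        data, written = self.blocks.setdefault(index, (bytearray(BLOCK_SIZE), set()))
        data[offset] = value
        written.add(offset)
        if len(written) == BLOCK_SIZE:
            # Full accumulation of the block triggers compression.
            self.compressed.append((index, zlib.compress(bytes(data))))
            del self.blocks[index]


def combined_write(*caches):
    """Gather compressed blocks from several cache levels into one write batch."""
    batch = []
    for cache in caches:
        batch.extend(cache.compressed)
        cache.compressed = []
    return batch


l1, l2 = AccumulatingCache(), AccumulatingCache()
for a in range(BLOCK_SIZE):
    l1.write(a, 7)               # fills block 0 in the first cache
    l2.write(BLOCK_SIZE + a, 9)  # fills block 1 in the second cache
batch = combined_write(l1, l2)
```

A partially written block stays uncompressed in `blocks` until its last byte arrives, so compression only ever sees complete blocks.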

13.

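Invention application
Dependency Scheduling for Control Stream in Parallel Processor [Under examination, published]

Publication (announcement) number: US20200301753A1

Publication (announcement) date: 2020-09-24

Application number: US16361910

Application date: 2019-03-22

Applicant: Apple Inc.

Inventor: Andrew M. Havlir, Jason D. Carroll, Karl D. Mann

IPC: G06F9/52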

Abstract: Techniques are disclosed relating to processing a control stream such as a compute control stream. In some embodiments, the control stream includes kernels and commands for multiple substreams. In some embodiments, multiple substream processors are each configured to: fetch and parse portions of the control stream corresponding to an assigned substream and, in response to a neighbor barrier command in the assigned substream that identifies another substream, communicate the identified other substream to a barrier clearing circuitry. In some embodiments, the barrier clearing circuitry is configured to determine whether to allow the assigned substream to proceed past the neighbor barrier command based on communication of a most-recently-completed command from a substream processor to which the other substream is assigned (e.g., based on whether the most-recently-completed command meets a command identifier communicated in the neighbor barrier command). The disclosed techniques may facilitate parallel control stream parsing and substream synchronization.
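
Illustrative sketch (not taken from the patent): the abstract describes a neighbor barrier that lets a substream proceed once another substream's most recently completed command meets a command identifier carried in the barrier. The model below is one simple reading of that check. The class and method names are hypothetical, and it assumes command identifiers increase monotonically within a substream, which the abstract does not state.

```python
class BarrierClearing:
    """Toy model (hypothetical) of barrier clearing across substreams."""

    def __init__(self, num_substreams):
        # Most recently completed command id per substream; -1 means none yet.
        self.last_completed = [-1] * num_substreams

    def report_completion(self, substream, command_id):
        # Called by the substream processor when a command finishes.
        self.last_completed[substream] = command_id

    def may_proceed(self, neighbor_substream, required_command_id):
        # Assumption: ids grow monotonically, so "meets" is modeled as >=.
        return self.last_completed[neighbor_substream] >= required_command_id


clearing = BarrierClearing(num_substreams=2)
# Substream 0 hits a neighbor barrier naming substream 1 and command id 3.
blocked = clearing.may_proceed(neighbor_substream=1, required_command_id=3)
clearing.report_completion(substream=1, command_id=3)
released = clearing.may_proceed(neighbor_substream=1, required_command_id=3)
```

Because each substream only reports its latest completed command, the check needs no shared queue between the substream processors.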

14.

Granted invention
Punch-through techniques for graphics processing [In force]

Publication (announcement) number: US10074210B1

Publication (announcement) date: 2018-09-11

Application number: US15659188

Application date: 2017-07-25

Applicant: Apple Inc.

Inventor: Christopher L. Spencer, Karl D. Mann, Ralph C. Taylor, Dinesh D. Kuwar

IPC: G06T15/50, G06T15/60, G06T15/40, G06T15/00, G06T15/08, G06T15/80, G06T17/20, G09G5/00

CPC classification number: G06T15/40, G06T11/40, G06T15/005, G06T15/08, G06T15/80, G06T17/20, G09G5/00, G09G5/363

Abstract: Techniques are disclosed relating to rendering graphics objects that require shader operations to determine visibility. In some embodiments, a graphics unit is configured to process feedback objects, which may require shading to determine whether they are visible relative to previously-processed objects, out of draw order. For example, in embodiments where a buffer is used to store fragment data for deferred rendering, the graphics unit may bypass the buffer and shade feedback objects ahead of earlier non-feedback objects whose fragment data is stored in the buffer. This may allow a determination of whether to remove occluded non-feedback fragment data from the buffer, which may reduce graphics overdraw. In disclosed two-pass techniques, data for feedback objects is first allowed to bypass the buffer for visibility shading, but is then stored in the buffer for a second pass to perform fragment shading to actually determine pixel attributes, which may further reduce overdraw.

15.

Invention publication
Compute Kernel Parsing with Limits in one or more Dimensions [Under examination, published]

Publication (announcement) number: US20240345892A1

Publication (announcement) date: 2024-10-17

Application number: US18673959

Application date: 2024-05-24

Applicant: Apple Inc.

Inventor: Andrew M. Havlir, Ajay Simha Modugala, Karl D. Mann

IPC: G06F9/50, G06T1/20

CPC classification number: G06F9/505, G06T1/20

Abstract: Techniques are disclosed relating to dispatching compute work from a compute stream. In some embodiments, a graphics processor executes instructions of compute kernels. Workload parser circuitry may determine, for distribution to the graphics processor circuitry, a set of workgroups from a compute kernel that includes workgroups organized in multiple dimensions, including a first number of workgroups in a first dimension and a second number of workgroups in a second dimension. This may include determining multiple sub-kernels for the compute kernel, wherein a first sub-kernel includes, in the first dimension, a limited number of workgroups that is smaller than the first number of workgroups. The parser circuitry may iterate through workgroups in both the first and second dimensions to generate the set of workgroups, proceeding through the first sub-kernel before iterating through any of the other sub-kernels. Disclosed techniques may provide desirable shapes for batches of workgroups.
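
Illustrative sketch (not taken from the patent): the abstract describes splitting a two-dimensional kernel into sub-kernels that are limited in the first dimension, and finishing one sub-kernel before starting the next. The generator below shows one iteration order consistent with that description. The function name, parameter names, and the choice to iterate the first dimension innermost are assumptions.

```python
def iterate_workgroups(num_x, num_y, limit_x):
    """Yield (x, y) workgroup coordinates, one sub-kernel at a time.

    Each sub-kernel spans at most limit_x workgroups in the first (x)
    dimension and all num_y workgroups in the second (y) dimension.
    """
    for x_start in range(0, num_x, limit_x):   # one sub-kernel per x slice
        x_end = min(x_start + limit_x, num_x)
        for y in range(num_y):                 # walk the second dimension
            for x in range(x_start, x_end):    # walk the limited first dimension
                yield (x, y)


order = list(iterate_workgroups(num_x=4, num_y=2, limit_x=2))
# order == [(0, 0), (1, 0), (0, 1), (1, 1), (2, 0), (3, 0), (2, 1), (3, 1)]
```

With a limit of 2, consecutive workgroups form 2x2 tiles instead of full rows, which is one way to read the abstract's remark about desirable batch shapes.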

16.

Granted invention
Compute kernel parsing with limits in one or more dimensions with iterating through workgroups in the one or more dimensions for execution [In force]

Publication (announcement) number: US12020075B2

Publication (announcement) date: 2024-06-25

Application number: US17018913

Application date: 2020-09-11

Applicant: Apple Inc.

Inventor: Andrew M. Havlir, Ajay Simha Modugala, Karl D. Mann

IPC: G06F9/50, G06T1/20

CPC classification number: G06F9/505, G06T1/20

Abstract: Techniques are disclosed relating to dispatching compute work from a compute stream. In some embodiments, a graphics processor executes instructions of compute kernels. Workload parser circuitry may determine, for distribution to the graphics processor circuitry, a set of workgroups from a compute kernel that includes workgroups organized in multiple dimensions, including a first number of workgroups in a first dimension and a second number of workgroups in a second dimension. This may include determining multiple sub-kernels for the compute kernel, wherein a first sub-kernel includes, in the first dimension, a limited number of workgroups that is smaller than the first number of workgroups. The parser circuitry may iterate through workgroups in both the first and second dimensions to generate the set of workgroups, proceeding through the first sub-kernel before iterating through any of the other sub-kernels. Disclosed techniques may provide desirable shapes for batches of workgroups.

17.

Invention publication
On-demand Memory Allocation [Under examination, published]

Publication (announcement) number: US20240045808A1

Publication (announcement) date: 2024-02-08

Application number: US18490588

Application date: 2023-10-19

Applicant: Apple Inc.

Inventor: Justin A. Hensley, Karl D. Mann, Yoong Chert Foo, Terence M. Potter, Frank W. Liljeros, Ralph C. Taylor

IPC: G06F12/1018, G06F12/084

CPC classification number: G06F12/1018, G06F12/084, G06F30/392

Abstract: Techniques are disclosed relating to dynamically allocating and mapping private memory for requesting circuitry. Disclosed circuitry may receive a private address and translate the private address to a virtual address (which an MMU may then translate to a physical address to actually access a storage element). In some embodiments, private memory allocation circuitry is configured to generate page table information and map private memory pages for requests if the page table information is not already set up. In various embodiments, this may advantageously allow dynamic private memory allocation, e.g., to efficiently allocate memory for graphics shaders with different types of workloads. Disclosed caching techniques for page table information may improve performance relative to traditional techniques. Further, disclosed embodiments may facilitate memory consolidation across a device such as a graphics processor.

18.

Granted invention
Compression techniques for pixel write data [In force]

Publication (announcement) number: US11062507B2

Publication (announcement) date: 2021-07-13

Application number: US16673883

Application date: 2019-11-04

Applicant: Apple Inc.

Inventor: Anthony P. DeLaurier, Karl D. Mann, Tyson J. Bergland, Winnie W. Yeung

IPC: G06T15/80, G09G5/36, G06T15/04, G06T15/00

Abstract: Techniques are disclosed relating to compression of data stored at different cache levels. In some embodiments, programmable shader circuitry is configured to execute program instructions of compute kernels that write pixel data. In some embodiments, a first cache is configured to store pixel write data from the programmable shader circuitry and first compression circuitry is configured to compress a first block of pixel write data in response to full accumulation of the first block in the first cache circuitry. In some embodiments, second cache circuitry is configured to store pixel write data from the programmable shader circuitry at a higher level in a storage hierarchy than the first cache circuitry and second compression circuitry is configured to compress a second block of pixel write data in response to full accumulation of the second block in the second cache circuitry. In some embodiments, write circuitry is configured to write the first and second compressed blocks of pixel data in a combined write to a higher level in the storage hierarchy.
