-
公开(公告)号:US12190164B2
公开(公告)日:2025-01-07
申请号:US17399808
申请日:2021-08-11
Applicant: Apple Inc.
Inventor: Steven Fishwick , Fergus W. MacGarry , Jonathan M. Redshaw , David A. Gotwalt , Ali Rabbani Rankouhi , Benjamin Bowman
Abstract: Disclosed embodiments relate to controlling sets of graphics work (e.g., kicks) assigned to graphics processor circuitry. In some embodiments, tracking slot circuitry implements entries for multiple tracking slots. Slot manager circuitry may store, using an entry of the tracking slot circuitry, software-specified information for a set of graphics work, where the information includes: type of work, dependencies on other sets of graphics work, and location of data for the set of graphics work. The slot manager circuitry may prefetch, from the location and prior to allocating shader core resources for the set of graphics work, configuration register data for the set of graphics work. Control circuitry may program configuration registers for the set of graphics work using the prefetched data and initiate processing of the set of graphics work by the graphics processor circuitry according to the dependencies. Disclosed techniques may reduce kick-to-kick transition time, in some embodiments.
-
公开(公告)号:US20190266102A1
公开(公告)日:2019-08-29
申请号:US16410828
申请日:2019-05-13
Applicant: Apple Inc.
Inventor: Wolfgang H. Klingauf , Kenneth C. Dyke , Karthik Ramani , Winnie W. Yeung , Anthony P. DeLaurier , Luc R. Semeria , David A. Gotwalt , Srinivasa Rangan Sridharan , Muditha Kanchana
IPC: G06F12/123 , G06F12/0815 , G06F12/0804 , G06F12/0808
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.
-
公开(公告)号:US12086644B2
公开(公告)日:2024-09-10
申请号:US17399711
申请日:2021-08-11
Applicant: Apple Inc.
Inventor: Andrew M. Havlir , Steven Fishwick , David A. Gotwalt , Benjamin Bowman , Ralph C. Taylor , Melissa L. Velez , Mladen Wilder , Ali Rabbani Rankouhi , Fergus W. MacGarry
CPC classification number: G06F9/5044 , G06F9/4881 , G06F9/505 , G06T1/20 , G06T1/60
Abstract: Disclosed techniques relate to work distribution in graphics processors. In some embodiments, an apparatus includes circuitry that implements a plurality of logical slots and a set of graphics processor sub-units that each implement multiple distributed hardware slots. The circuitry may determine different distribution rules for first and second sets of graphics work and map logical slots to distributed hardware slots based on the distribution rules. In various embodiments, disclosed techniques may advantageously distribute work efficiently across distributed shader processors for graphics kicks of various sizes.
-
公开(公告)号:US12026098B1
公开(公告)日:2024-07-02
申请号:US17660148
申请日:2022-04-21
Applicant: Apple Inc.
Inventor: Arjun Thottappilly , David A. Gotwalt , Frank W. Liljeros
IPC: G06F12/08 , G06F12/02 , G06F12/0871 , G06F12/0882
CPC classification number: G06F12/0882 , G06F12/0246 , G06F12/0871
Abstract: Techniques are disclosed relating to updating page pools in the context of cached page pool descriptors. In some embodiments, a processor is configured to assign a set of processing work to a first page pool of memory pages. Page manager circuitry may cache page pool descriptor entries in cache circuitry, where a given page pool descriptor entry indicates a set of pages assigned to a page pool. In response to a determination to grow the first page pool, the processor may communicate a grow list to the page manager circuitry, that identifies a set of memory blocks from the memory to be added to the first page pool. The page manager circuitry may then update a cached page pool descriptor entry for the first page pool to indicate the added memory blocks and generate a signal to inform the processor that the cached page pool descriptor entry is updated.
-
公开(公告)号:US20230048951A1
公开(公告)日:2023-02-16
申请号:US17399808
申请日:2021-08-11
Applicant: Apple Inc.
Inventor: Steven Fishwick , Fergus W. MacGarry , Jonathan M. Redshaw , David A. Gotwalt , Ali Rabbani Rankouhi , Benjamin Bowman
Abstract: Disclosed embodiments relate to controlling sets of graphics work (e.g., kicks) assigned to graphics processor circuitry. In some embodiments, tracking slot circuitry implements entries for multiple tracking slots. Slot manager circuitry may store, using an entry of the tracking slot circuitry, software-specified information for a set of graphics work, where the information includes: type of work, dependencies on other sets of graphics work, and location of data for the set of graphics work. The slot manager circuitry may prefetch, from the location and prior to allocating shader core resources for the set of graphics work, configuration register data for the set of graphics work. Control circuitry may program configuration registers for the set of graphics work using the prefetched data and initiate processing of the set of graphics work by the graphics processor circuitry according to the dependencies. Disclosed techniques may reduce kick-to-kick transition time, in some embodiments.
-
公开(公告)号:US20180349291A1
公开(公告)日:2018-12-06
申请号:US15610008
申请日:2017-05-31
Applicant: Apple Inc.
Inventor: Wolfgang H. Klingauf , Kenneth C. Dyke , Karthik Ramani , Winnie W. Yeung , Anthony P. DeLaurier , Luc R. Semeria , David A. Gotwalt , Srinivasa Rangan Sridharan , Muditha Kanchana
IPC: G06F12/123 , G06F12/0808 , G06F12/0815 , G06F12/0804
CPC classification number: G06F12/123 , G06F12/0804 , G06F12/0808 , G06F12/0815 , G06F2212/608 , G06F2212/621 , G06F2212/69
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.
-
公开(公告)号:US20240273666A1
公开(公告)日:2024-08-15
申请号:US18450910
申请日:2023-08-16
Applicant: Apple Inc.
Inventor: Steven Fishwick , David A. Gotwalt , Pratik Chandresh Shah , Jackson Dsouza , Subodh Asthana , Jairaj Dave , Piotr A. Dittrich , David E. Roberts
CPC classification number: G06T1/20 , G06F9/485 , G06F9/4881
Abstract: Disclosed techniques relate to scheduling sets of graphics work using queues. In some embodiments, tracking circuitry implements entries for multiple tracking slots for a graphics processor. Queue access circuitry may access a data structure in memory that specifies multiple queues, where each queue enqueues control information for multiple sets of graphics work. Queue select circuitry may select sets of graphics work from the data structure based on one or more selection parameters and store control information for selected sets of graphics work in tracking slots of the tracking slot circuitry. Distribution circuitry may assign portions of respective sets of graphics work from the tracking slots to graphics processor circuitry for execution.
-
公开(公告)号:US10970223B2
公开(公告)日:2021-04-06
申请号:US16410828
申请日:2019-05-13
Applicant: Apple Inc.
Inventor: Wolfgang H. Klingauf , Kenneth C. Dyke , Karthik Ramani , Winnie W. Yeung , Anthony P. DeLaurier , Luc R. Semeria , David A. Gotwalt , Srinivasa Rangan Sridharan , Muditha Kanchana
IPC: G06F12/08 , G06F12/0891 , G06F12/0804 , G06F12/0808 , G06F12/0815 , G06F12/123 , G06F12/0895 , G06F12/126 , G06F12/12
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.
-
公开(公告)号:US20230050061A1
公开(公告)日:2023-02-16
申请号:US17399711
申请日:2021-08-11
Applicant: Apple Inc.
Inventor: Andrew M. Havlir , Steven Fishwick , David A. Gotwalt , Benjamin Bowman , Ralph C. Taylor , Melissa L. Velez , Mladen Wilder , Ali Rabbani Rankouhi , Fergus W. MacGarry
Abstract: Disclosed techniques relate to work distribution in graphics processors. In some embodiments, an apparatus includes circuitry that implements a plurality of logical slots and a set of graphics processor sub-units that each implement multiple distributed hardware slots. The circuitry may determine different distribution rules for first and second sets of graphics work and map logical slots to distributed hardware slots based on the distribution rules. In various embodiments, disclosed techniques may advantageously distribute work efficiently across distributed shader processors for graphics kicks of various sizes.
-
公开(公告)号:US10289565B2
公开(公告)日:2019-05-14
申请号:US15610008
申请日:2017-05-31
Applicant: Apple Inc.
Inventor: Wolfgang H. Klingauf , Kenneth C. Dyke , Karthik Ramani , Winnie W. Yeung , Anthony P. DeLaurier , Luc R. Semeria , David A. Gotwalt , Srinivasa Rangan Sridharan , Muditha Kanchana
IPC: G06F12/08 , G06F12/12 , G06F12/123 , G06F12/0808 , G06F12/0815 , G06F12/0804
Abstract: Systems, apparatuses, and methods for efficiently allocating data in a cache are described. In various embodiments, a processor decodes an indication in a software application identifying a temporal data set. The data set is flagged with a data set identifier (DSID) indicating temporal data to drop after consumption. When the data set is allocated in a cache, the data set is stored with a non-replaceable attribute to prevent a cache replacement policy from evicting the data set before it is dropped. A drop command with an indication of the DSID of the data set is later issued after the data set is read (consumed). A copy of the data set is not written back to the lower-level memory although the data set is removed from the cache. An interrupt is generated to notify firmware or other software of the completion of the drop command.
-
-
-
-
-
-
-
-
-