-
Publication No.: US20220269620A1
Publication date: 2022-08-25
Application No.: US17666974
Filing date: 2022-02-08
IPC classification: G06F12/1027, G06F12/0893
Abstract: A processor maintains an access log indicating a stream of cache misses at a cache of the processor. In response to each of at least a subset of cache misses at the cache, the processor records a corresponding entry in the access log, indicating a physical memory address of the memory access request that resulted in the corresponding miss. In addition, the processor maintains an address translation log that indicates a mapping of physical memory addresses to virtual memory addresses. In response to an address translation (e.g., a page walk) that translates a virtual address to a physical address, the processor stores a mapping of the physical address to the corresponding virtual address at an entry of the address translation log. Software executing at the processor can use the two logs for memory management.
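The interplay of the two logs can be illustrated with a small software model. This is only a sketch of the idea in the abstract, not the patented hardware: the class name `MissLogs`, the 4 KiB page size, the bounded log capacity, and the `hot_virtual_pages` query are all assumptions made for illustration.

```python
from collections import deque

PAGE_SIZE = 4096  # assumed page size; the abstract does not specify one


class MissLogs:
    """Toy model of the access log and the address translation log."""

    def __init__(self, capacity=1024):
        # Access log: physical addresses of recorded cache misses (bounded, assumed).
        self.access_log = deque(maxlen=capacity)
        # Translation log: physical page number -> virtual page number.
        self.translation_log = {}

    def record_miss(self, phys_addr):
        """Called for each logged cache miss."""
        self.access_log.append(phys_addr)

    def record_translation(self, virt_addr, phys_addr):
        """Called when a page walk translates virt_addr to phys_addr."""
        self.translation_log[phys_addr // PAGE_SIZE] = virt_addr // PAGE_SIZE

    def hot_virtual_pages(self):
        """Count misses per virtual page, skipping misses with no known mapping."""
        counts = {}
        for phys_addr in self.access_log:
            vpn = self.translation_log.get(phys_addr // PAGE_SIZE)
            if vpn is not None:
                counts[vpn] = counts.get(vpn, 0) + 1
        return counts
```

Memory-management software could, for example, use such per-page miss counts to decide which virtual pages to migrate; that use is one plausible reading of "memory management" and is not stated in the abstract.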
-
Publication No.: US11335052B2
Publication date: 2022-05-17
Application No.: US16179376
Filing date: 2018-11-02
Inventors: Michael Mantor, Laurent Lefebvre, Mark Fowler, Timothy Kelley, Mikko Alho, Mika Tuomi, Kiia Kallio, Patrick Klas Rudolf Buss, Jari Antero Komppa, Kaj Tuomi
IPC classification: G06T15/00
Abstract: A system, method and a non-transitory computer readable storage medium are provided for hybrid rendering with deferred primitive batch binning. A primitive batch is generated from one or more primitives. A bin is identified for processing the primitive batch. At least a portion of each primitive intersecting the identified bin is processed and a next bin for processing the primitive batch is identified based on an intercept walk order. The processing is iteratively repeated for the one or more primitives in the primitive batch for successive bins until all primitives of the primitive batch are completely processed. Then, the one or more primitives in the primitive batch are further processed.
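The iterative bin loop in this abstract can be sketched as follows. The sketch makes several assumptions that the abstract does not state: primitives are approximated by inclusive pixel bounding boxes, bins are square, and the "intercept walk order" is taken to be row-major. The function names are hypothetical.

```python
def bins_for_box(box, bin_size):
    """Return the set of (col, row) bins touched by an inclusive pixel box."""
    x0, y0, x1, y1 = box
    return {(bx, by)
            for by in range(y0 // bin_size, y1 // bin_size + 1)
            for bx in range(x0 // bin_size, x1 // bin_size + 1)}


def process_batch(batch, bin_size):
    """Walk bins until every primitive in the batch is completely processed.

    batch maps a primitive id to its bounding box.
    Returns a list of (bin, [primitive ids processed in that bin]).
    """
    remaining = {pid: bins_for_box(box, bin_size) for pid, box in batch.items()}
    schedule = []
    while any(remaining.values()):
        # Next bin: earliest unprocessed intercept in row-major order (assumed).
        current = min((b for bins in remaining.values() for b in bins),
                      key=lambda b: (b[1], b[0]))
        hit = sorted(pid for pid, bins in remaining.items() if current in bins)
        for pid in hit:
            remaining[pid].discard(current)  # this portion is now processed
        schedule.append((current, hit))
    return schedule
```

With two overlapping boxes and 4-pixel bins, `process_batch({"a": (0, 0, 5, 5), "b": (4, 0, 9, 3)}, 4)` visits bin `(1, 0)` once and handles both primitives there, which is the point of batching by bin.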
-
Publication No.: US11263044B2
Publication date: 2022-03-01
Application No.: US16692856
Filing date: 2019-11-22
Abstract: A graphics processing unit (GPU) adjusts a clock frequency based on identifying a program thread executing at the processing unit, wherein the program thread is detected based on a workload to be executed. By adjusting the clock frequency based on the identified program thread, the processing unit adapts to different processing demands of different program threads. Further, by identifying the program thread based on workload, the processing unit adapts the clock frequency based on processing demands, thereby conserving processing resources.
-
Publication No.: US10546365B2
Publication date: 2020-01-28
Application No.: US15843968
Filing date: 2017-12-15
Inventors: Michael Mantor, Laurent Lefebvre, Mika Tuomi, Kiia Kallio
Abstract: An apparatus, such as a head mounted device (HMD), includes one or more processors configured to implement a graphics pipeline that renders pixels in window space with a non-uniform pixel spacing. The apparatus also includes a first distortion function that maps the non-uniformly spaced pixels in window space to uniformly spaced pixels in raster space. The apparatus further includes a scan converter configured to sample the pixels in window space through the first distortion function. The scan converter is configured to render display pixels used to generate an image for display to a user based on the uniformly spaced pixels in raster space. In some cases, the pixels in the window space are rendered such that a pixel density per subtended area is constant across the user's field of view.
-
Publication No.: US10360177B2
Publication date: 2019-07-23
Application No.: US15189054
Filing date: 2016-06-22
Inventors: Syed Zohaib M. Gilani, Jiasheng Chen, QingCheng Wang, YunXiao Zou, Michael Mantor, Bin He, Timour T. Paltashev
IPC classification: G06F15/80, G06F1/3234, G06T15/00
Abstract: Described is a method and processing apparatus to improve power efficiency by gating redundant thread processing. In particular, the method for gating redundant threads in a graphics processor includes determining if data for a thread and data for at least another thread are within a predetermined similarity threshold, gating execution of the at least another thread if the data for the thread and the data for the at least another thread are within the predetermined similarity threshold, and using output data from the thread as output data for the at least another thread.
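The gating decision can be illustrated in software. The sketch below assumes that each thread's data is a single number and that similarity means an absolute difference no larger than the threshold; the abstract does not define the similarity measure, and `run_with_gating` is a hypothetical name.

```python
def run_with_gating(inputs, kernel, threshold):
    """Run kernel once per group of similar inputs and reuse its output.

    Returns (outputs, number of threads actually executed).
    """
    outputs = [None] * len(inputs)
    executed = []  # (input value, index) of threads that really ran
    for i, x in enumerate(inputs):
        for ref_x, ref_i in executed:
            if abs(x - ref_x) <= threshold:
                # Redundant thread: gate it and reuse the earlier output.
                outputs[i] = outputs[ref_i]
                break
        else:
            # No similar thread has run yet, so execute this one.
            outputs[i] = kernel(x)
            executed.append((x, i))
    return outputs, len(executed)
```

For example, with inputs `[1.0, 1.001, 5.0, 1.0005]` and a threshold of `0.01`, only two of the four threads execute. The reused outputs are approximations, which is the accuracy-for-power trade the abstract describes.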
-
Publication No.: US20190122417A1
Publication date: 2019-04-25
Application No.: US16179376
Filing date: 2018-11-02
Inventors: Michael Mantor, Laurent Lefebvre, Mark Fowler, Timothy Kelley, Mikko Alho, Mika Tuomi, Kiia Kallio, Patrick Klas Rudolf Buss, Jari Antero Komppa, Kaj Tuomi
IPC classification: G06T15/00
Abstract: A system, method and a non-transitory computer readable storage medium are provided for hybrid rendering with deferred primitive batch binning. A primitive batch is generated from one or more primitives. A bin is identified for processing the primitive batch. At least a portion of each primitive intersecting the identified bin is processed and a next bin for processing the primitive batch is identified based on an intercept walk order. The processing is iteratively repeated for the one or more primitives in the primitive batch for successive bins until all primitives of the primitive batch are completely processed. Then, the one or more primitives in the primitive batch are further processed.
-
Publication No.: US20210209831A1
Publication date: 2021-07-08
Application No.: US17208730
Filing date: 2021-03-22
Inventors: Michael Mantor, Laurent Lefebvre, Mikko Alho, Mika Tuomi, Kiia Kallio
Abstract: A method, system, and non-transitory computer readable storage medium for rasterizing primitives are disclosed. The method, system, and non-transitory computer readable storage medium include: generating a primitive batch from a sequence of one or more primitives, wherein the primitive batch includes primitives sorted into one or more row groups based on which row of a plurality of rows each primitive intersects; and processing each row group, the processing for each row group including: identifying one or more primitive column intercepts for each of the one or more primitives in the row group, wherein each combination of primitive column intercept and row identifies a bin; and rasterizing the one or more primitives that intersect the bin.
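The two-level sort described here (rows first, then column intercepts within each row) might look like the following. As assumptions for illustration, primitives are reduced to inclusive pixel bounding boxes, rows and columns are fixed-size strips of the screen, and "rasterizing" is reduced to listing which primitives fall in each bin. Both function names are hypothetical.

```python
def row_groups(batch, bin_h):
    """Sort primitives into row groups keyed by the rows they intersect."""
    groups = {}
    for pid, (_, y0, _, y1) in batch.items():
        for row in range(y0 // bin_h, y1 // bin_h + 1):
            groups.setdefault(row, []).append(pid)
    return groups


def rasterize_by_row(batch, bin_w, bin_h):
    """Return [((col, row), [primitive ids])] for every bin that is hit."""
    work = []
    groups = row_groups(batch, bin_h)
    for row in sorted(groups):
        # Column intercepts are computed only for primitives in this row group.
        cols = {}
        for pid in groups[row]:
            x0, _, x1, _ = batch[pid]
            for col in range(x0 // bin_w, x1 // bin_w + 1):
                cols.setdefault(col, []).append(pid)
        # Each (column intercept, row) pair identifies one bin.
        for col in sorted(cols):
            work.append(((col, row), cols[col]))
    return work
```

Grouping by row first means column intercepts are only ever computed against the primitives already known to touch that row.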
-
Publication No.: US10957094B2
Publication date: 2021-03-23
Application No.: US15250357
Filing date: 2016-08-29
Inventors: Michael Mantor, Laurent Lefebvre, Mikko Alho, Mika Tuomi, Kiia Kallio
Abstract: A system, method and a computer program product are provided for hybrid rendering with deferred primitive batch binning. A primitive batch is generated from a sequence of primitives. Initial bin intercepts are identified for primitives in the primitive batch. A bin for processing is identified. The bin corresponds to a region of a screen space. Pixels of the primitives intercepting the identified bin are processed. Next bin intercepts are identified while the primitives intercepting the identified bin are processed.
-
Publication No.: US20180165872A1
Publication date: 2018-06-14
Application No.: US15374752
Filing date: 2016-12-09
Inventors: Laurent Lefebvre, Michael Mantor, Mark Fowler, Mikko Alho, Mika Tuomi, Kiia Kallio, Patrick Klas Rudolf Buss, Jari Antero Komppa, Kaj Tuomi, Christopher J. Brennan
CPC classification: G06T15/405, G06T11/40, G06T15/005, G06T15/80
Abstract: Techniques for removing or identifying overlapping fragments in a fragment stream after z-culling are disclosed. The techniques include maintaining a first-in-first-out buffer that stores post-z-cull fragments. Each time a new fragment is received at the buffer, the screen position of the fragment is checked against all other fragments in the buffer. If the screen position of the fragment matches the screen position of a fragment in the buffer, then the fragment in the buffer is removed or marked as overlapping. If the screen position of the fragment does not match the screen position of any fragment in the buffer, then no modification is performed to fragments already in the buffer. In either case, the fragment is added to the buffer. The contents of the buffer are transmitted to the pixel shader for pixel shading at a later time.
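A minimal software model of this buffer is sketched below. It implements the "removed" variant rather than the "marked as overlapping" variant, and it assumes a fixed capacity with the oldest fragments released to shading on overflow; the abstract does not say when the contents are transmitted. `FragmentFifo` and its methods are hypothetical names.

```python
from collections import deque


class FragmentFifo:
    """FIFO of post-z-cull fragments stored as (x, y, payload) tuples."""

    def __init__(self, capacity):
        self.capacity = capacity  # assumed fixed size
        self.buffer = deque()

    def push(self, x, y, payload):
        """Add a fragment; return any fragments released for pixel shading."""
        # Remove an older fragment at the same screen position, if present.
        self.buffer = deque(f for f in self.buffer if (f[0], f[1]) != (x, y))
        # In either case the new fragment is added to the buffer.
        self.buffer.append((x, y, payload))
        released = []
        while len(self.buffer) > self.capacity:
            released.append(self.buffer.popleft())
        return released

    def flush(self):
        """Release everything still buffered, oldest first."""
        released = list(self.buffer)
        self.buffer.clear()
        return released
```

Pushing fragments at positions (1, 1), (2, 2), and (1, 1) again leaves only the (2, 2) fragment and the later (1, 1) fragment, so the occluded one never reaches the pixel shader.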
-
Publication No.: US11379941B2
Publication date: 2022-07-05
Application No.: US15415823
Filing date: 2017-01-25
Abstract: Improvements in the graphics processing pipeline are disclosed. More specifically, a new primitive shader stage performs tasks of the vertex shader stage or a domain shader stage if tessellation is enabled, a geometry shader if enabled, and a fixed function primitive assembler. The primitive shader stage is compiled by a driver from user-provided vertex or domain shader code, geometry shader code, and from code that performs functions of the primitive assembler. Moving tasks of the fixed function primitive assembler to a primitive shader that executes in programmable hardware provides many benefits, such as removal of a fixed function crossbar, removal of dedicated parameter and position buffers that are unusable in general compute mode, and other benefits.