-
Publication No.: US12153958B2
Publication date: 2024-11-26
Application No.: US18045128
Filing date: 2022-10-07
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Anirudh R. Acharya , Michael J. Mantor , Rex Eldon McCrary , Anthony Asaro , Jeffrey Gongxian Cheng , Mark Fowler
Abstract: Systems, apparatuses, and methods for abstracting tasks in virtual memory identifier (VMID) containers are disclosed. A processor coupled to a memory executes a plurality of concurrent tasks including a first task. Responsive to detecting one or more instructions of the first task which correspond to a first operation, the processor retrieves a first identifier (ID) which is used to uniquely identify the first task, wherein the first ID is transparent to the first task. Then, the processor maps the first ID to a second ID and/or a third ID. The processor completes the first operation by using the second ID and/or the third ID to identify the first task to at least a first data structure. In one implementation, the first operation is a memory access operation and the first data structure is a set of page tables. Also, in one implementation, the second ID identifies a first application of the first task and the third ID identifies a first operating system (OS) of the first task.
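The ID mapping described in this abstract can be illustrated with a minimal sketch. It assumes a simple lookup table keyed by the first (container) ID, and a set of page tables keyed by the (OS ID, application ID) pair. The names `MappingTable`, `bind`, `resolve`, and `translate` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ContainerEntry:
    app_id: int  # "second ID": identifies the application (hypothetical field name)
    os_id: int   # "third ID": identifies the operating system (hypothetical field name)


class MappingTable:
    """Maps a container ID, which the task itself never sees, to app and OS IDs."""

    def __init__(self):
        self._entries = {}

    def bind(self, container_id, app_id, os_id):
        self._entries[container_id] = ContainerEntry(app_id, os_id)

    def resolve(self, container_id):
        return self._entries[container_id]


def translate(mapping, page_tables, container_id, virtual_page):
    """Complete a memory access by identifying the task to the page tables."""
    entry = mapping.resolve(container_id)
    # Assumption: page tables are selected by the (OS ID, application ID) pair.
    return page_tables[(entry.os_id, entry.app_id)][virtual_page]
```

In this sketch the task only ever triggers `translate`; the container ID and its mapping stay on the hardware side of the call.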
-
Publication No.: US20220277508A1
Publication date: 2022-09-01
Application No.: US17745410
Filing date: 2022-05-16
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Michael Mantor , Laurent Lefebvre , Mark Fowler , Timothy Kelley , Mikko Alho , Mika Tuomi , Kiia Kallio , Patrick Klas Rudolf Buss , Jari Antero Komppa , Kaj Tuomi
IPC: G06T15/00
Abstract: A method, computer system, and a non-transitory computer-readable storage medium for performing primitive batch binning are disclosed. The method, computer system, and non-transitory computer-readable storage medium include techniques for generating a primitive batch from a plurality of primitives, computing respective bin intercepts for each of the plurality of primitives in the primitive batch, and shading the primitive batch by iteratively processing each of the respective bin intercepts computed until all of the respective bin intercepts are processed.
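The three steps in this abstract (form a batch, compute bin intercepts, process bin by bin) can be sketched as follows. This is only an illustration: it assumes each primitive is represented by an inclusive integer bounding box `(x0, y0, x1, y1)` and that bins are square with side `bin_size`. The function names are hypothetical.

```python
def bin_intercepts(bbox, bin_size):
    """Return the set of (bin_x, bin_y) bins that a bounding box touches."""
    x0, y0, x1, y1 = bbox
    return {
        (bx, by)
        for bx in range(x0 // bin_size, x1 // bin_size + 1)
        for by in range(y0 // bin_size, y1 // bin_size + 1)
    }


def shade_batch(batch, bin_size):
    """Visit every intercepted bin once and list the primitives shaded in it.

    `batch` maps a primitive name to its bounding box. The return value is the
    visiting order; a real pipeline would rasterize and shade at this point.
    """
    intercepts = {name: bin_intercepts(bbox, bin_size) for name, bbox in batch.items()}
    pending = set()
    for bins in intercepts.values():
        pending |= bins
    order = []
    for current_bin in sorted(pending):
        prims = [name for name in batch if current_bin in intercepts[name]]
        order.append((current_bin, prims))
    return order
```

Processing one bin at a time keeps the working set of pixels small, which is the usual motivation for binning.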
-
Publication No.: US11100004B2
Publication date: 2021-08-24
Application No.: US14747944
Filing date: 2015-06-23
Applicant: ATI Technologies ULC , Advanced Micro Devices, Inc.
Inventor: Gongxian Jeffrey Cheng , Mark Fowler , Philip J. Rogers , Benjamin T. Sander , Anthony Asaro , Mike Mantor , Raja Koduri
IPC: G06F12/1009
Abstract: A processor uses the same virtual address space for heterogeneous processing units of the processor. The processor employs different sets of page tables for different types of processing units, such as a CPU and a GPU, wherein a memory management unit uses each set of page tables to translate virtual addresses of the virtual address space to corresponding physical addresses of memory modules associated with the processor. As data is migrated between memory modules, the physical addresses in the page tables can be updated to reflect the physical location of the data for each processing unit.
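A minimal sketch of the idea, assuming one flat page table per processing unit and a fixed page size; real page tables are multi-level and may differ in format per unit. The class and method names are hypothetical.

```python
class SharedAddressSpace:
    """One virtual address space, one page table per processing unit."""

    def __init__(self, units):
        self.page_tables = {unit: {} for unit in units}

    def map_page(self, vpage, ppage):
        # The same virtual page is mapped in every unit's table.
        for table in self.page_tables.values():
            table[vpage] = ppage

    def migrate(self, vpage, new_ppage):
        # After data moves between memory modules, update each table so that
        # every unit sees the new physical location.
        for table in self.page_tables.values():
            if vpage in table:
                table[vpage] = new_ppage

    def translate(self, unit, vaddr, page_size=4096):
        vpage, offset = divmod(vaddr, page_size)
        return self.page_tables[unit][vpage] * page_size + offset
```

The virtual address used by the CPU and the GPU stays the same across a migration; only the physical page recorded in the tables changes.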
-
Publication No.: US11074075B2
Publication date: 2021-07-27
Application No.: US15442412
Filing date: 2017-02-24
Applicant: Advanced Micro Devices, Inc.
Inventor: Mark Fowler , Brian D. Emberling
Abstract: Systems, apparatuses, and methods for maintaining separate pending load and store counters are disclosed herein. In one embodiment, a system includes at least one execution unit, a memory subsystem, and a pair of counters for each thread of execution. In one embodiment, the system implements a software based approach for managing dependencies between instructions. In one embodiment, the execution unit(s) maintains counters to support the software-based approach for managing dependencies between instructions. The execution unit(s) are configured to execute instructions that are used to manage the dependencies during run-time. In one embodiment, the execution unit(s) execute wait instructions to wait until a given counter is equal to a specified value before continuing to execute the instruction sequence.
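The counter pair and the wait instruction can be modeled with a small sketch. The abstract states only that a wait holds execution until a given counter equals a specified value; incrementing on issue and decrementing on completion is an assumption made here, and all names are hypothetical.

```python
class ThreadCounters:
    """Separate pending-load and pending-store counters for one thread."""

    def __init__(self):
        self.pending = {"load": 0, "store": 0}

    def issue(self, kind):
        # Assumption: issuing a memory operation increments its counter.
        self.pending[kind] += 1

    def complete(self, kind):
        # Assumption: the memory subsystem decrements on completion.
        if self.pending[kind] == 0:
            raise ValueError(f"no pending {kind} to complete")
        self.pending[kind] -= 1

    def wait_satisfied(self, kind, target):
        """Model a wait instruction: proceed only when the counter equals target."""
        return self.pending[kind] == target
```

Keeping the two counters separate lets software wait on outstanding loads without also stalling on stores that are still in flight.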
-
Publication No.: US10943389B2
Publication date: 2021-03-09
Application No.: US15374752
Filing date: 2016-12-09
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Laurent Lefebvre , Michael Mantor , Mark Fowler , Mikko Alho , Mika Tuomi , Kiia Kallio , Patrick Klas Rudolf Buss , Jari Antero Komppa , Kaj Tuomi , Christopher J. Brennan
Abstract: Techniques for removing or identifying overlapping fragments in a fragment stream after z-culling are disclosed. The techniques include maintaining a first-in-first-out buffer that stores post-z-cull fragments. Each time a new fragment is received at the buffer, the screen position of the fragment is checked against all other fragments in the buffer. If the screen position of the fragment matches the screen position of a fragment in the buffer, then the fragment in the buffer is removed or marked as overlapping. If the screen position of the fragment does not match the screen position of any fragment in the buffer, then no modification is performed to fragments already in the buffer. In either case, the fragment is added to the buffer. The contents of the buffer are transmitted to the pixel shader for pixel shading at a later time.
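The buffer update rule in this abstract can be sketched in a few lines. The sketch assumes fragments are dictionaries with a `pos` key holding the screen position, and uses a `mark_only` flag to choose between the two behaviors the abstract allows (remove or mark). These names are hypothetical.

```python
from collections import deque


def push_fragment(fifo, fragment, mark_only=False):
    """Insert a post-z-cull fragment, handling older fragments at the same position."""
    pos = fragment["pos"]
    if mark_only:
        # Keep older fragments but flag them as overlapped.
        for older in fifo:
            if older["pos"] == pos:
                older["overlapped"] = True
    else:
        # Drop older fragments that the new one covers.
        survivors = [older for older in fifo if older["pos"] != pos]
        fifo.clear()
        fifo.extend(survivors)
    # In either case the new fragment is added to the buffer.
    fifo.append(fragment)
```

A linear scan is used here for clarity; hardware would compare against all buffer entries in parallel.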
-
Publication No.: US10540280B2
Publication date: 2020-01-21
Application No.: US15390080
Filing date: 2016-12-23
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Mark Fowler , Jimshed Mirza , Anthony Asaro
IPC: G06F12/1009 , G06T1/20 , G06F12/0804 , G06F12/0891
Abstract: Techniques for performing cache invalidates and write-backs in an accelerated processing device (e.g., a graphics processing device that renders three-dimensional graphics) are disclosed. The techniques involve receiving requests from a “master” (e.g., the central processing unit). The techniques involve invalidating virtual-to-physical address translations in an address translation request. The techniques include splitting up the requests based on whether the requests target virtually or physically tagged caches. Addresses for the portions of a request that target physically tagged caches are translated using invalidated virtual-to-physical address translations for speed. The split up request is processed to generate micro-transactions for individual caches targeted by the request. Micro-transactions for physically and virtually tagged caches are processed in parallel. Once all micro-transactions for a request have been processed, the unit that made the request is notified.
-
Publication No.: US09996478B1
Publication date: 2018-06-12
Application No.: US15374788
Filing date: 2016-12-09
Applicant: Advanced Micro Devices, Inc.
Inventor: Mark Fowler
IPC: G06F12/08 , G06F12/128 , G06F12/122
CPC classification number: G06F12/128 , G06F12/0888 , G06F12/122 , G06F12/126 , G06F2212/1024 , G06F2212/455 , G06F2212/621 , G06F2212/69 , G06F2212/70
Abstract: A system and method for efficiently performing data allocation in a cache memory are described. A lookup is performed in a cache responsive to detecting an access request. If the targeted data is found in the cache and the targeted data is of a no allocate data type indicating the targeted data is not expected to be reused, then the targeted data is read from the cache without updating cache replacement policy information for the targeted data responsive to the access. If the lookup results in a miss, the targeted data is prevented from being allocated in the cache.
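The hit and miss behavior for no-allocate data can be sketched as below. The abstract refers only to "cache replacement policy information"; the sketch assumes an LRU policy and a dictionary as backing memory, and the names `Cache` and `read` are hypothetical.

```python
from collections import OrderedDict


class Cache:
    """Tiny LRU cache with a no-allocate access type (LRU is an assumption)."""

    def __init__(self, capacity, backing):
        self.capacity = capacity
        self.backing = backing      # stands in for the next level of memory
        self.lines = OrderedDict()  # least recently used entry first

    def read(self, addr, no_allocate=False):
        if addr in self.lines:
            if not no_allocate:
                # Normal hit: refresh the line's recency.
                self.lines.move_to_end(addr)
            # No-allocate hit: return the data, leave recency untouched.
            return self.lines[addr]
        value = self.backing[addr]
        if not no_allocate:
            if len(self.lines) >= self.capacity:
                self.lines.popitem(last=False)  # evict the LRU line
            self.lines[addr] = value
        # No-allocate miss: the data bypasses the cache entirely.
        return value
```

Data that is not expected to be reused therefore neither displaces useful lines nor extends its own lifetime in the cache.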
-
Publication No.: US20180019006A1
Publication date: 2018-01-18
Application No.: US15211887
Filing date: 2016-07-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Kevin M. Brandl , Thomas Hamilton , Hideki Kanayama , Kedarnath Balakrishnan , James R. Magro , Guanhao Shen , Mark Fowler
IPC: G11C7/10 , G11C11/408
CPC classification number: G11C7/1063 , G06F12/1018 , G06F2212/1041 , G11C7/10 , G11C7/1072 , G11C11/408
Abstract: A memory controller includes a host interface for receiving memory access requests including access addresses, a memory interface for providing memory accesses to a memory system, and an address decoder coupled to the host interface for programmably mapping the access addresses to selected ones of a plurality of regions. The address decoder is programmable to map the access addresses to a first region having a non-power-of-two size using a primary decoder and a secondary decoder each having power-of-two sizes, and providing a first region mapping signal in response. A command queue stores the memory access requests and region mapping signals. An arbiter picks the memory access requests from the command queue based on a plurality of criteria, which are evaluated based in part on the region mapping signals, and provides corresponding memory accesses to the memory interface in response.
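One way to picture a non-power-of-two region built from two power-of-two decoders is the sketch below. The abstract does not say how the decoder outputs are combined; this example assumes a simple OR of two mask-and-compare matches, with each decoder described by a hypothetical `(base, size)` pair.

```python
def pow2_match(addr, base, size):
    """Mask-and-compare decode for a naturally aligned power-of-two window."""
    if size <= 0 or size & (size - 1) != 0:
        raise ValueError("size must be a power of two")
    if base % size != 0:
        raise ValueError("base must be aligned to size")
    return (addr & ~(size - 1)) == base


def in_region(addr, primary, secondary):
    """Region hit if either decoder matches (the OR combination is an assumption)."""
    return pow2_match(addr, *primary) or pow2_match(addr, *secondary)
```

For example, a 6 GiB region can be covered by a 4 GiB primary window at address 0 and a 2 GiB secondary window starting at 4 GiB.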
-
Publication No.: US20170018053A1
Publication date: 2017-01-19
Application No.: US15282336
Filing date: 2016-09-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Mark Fowler
CPC classification number: G06T1/60 , G06T1/20 , G06T11/40 , G06T2200/12 , G06T2200/28
Abstract: Embodiments of the present invention are directed to improving the performance of anti-aliased image rendering. One embodiment is a method of rendering a pixel from an anti-aliased image. The method includes: storing a first set and a second set of samples from a plurality of anti-aliased samples of the pixel respectively in a first memory and a second memory; and rendering a determined number of said samples from one of only the first set or the first and second sets. Corresponding system and computer program product embodiments are also disclosed.
-
Publication No.: US20250117330A1
Publication date: 2025-04-10
Application No.: US18617092
Filing date: 2024-03-26
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Dana Schaa , Mark Fowler , Saurabh Sharma , Noah Fredriks
IPC: G06F12/0815
Abstract: As part of rendering a scene including at least one graphics object in a display space, the display space is divided into a plurality of tiles. A determination is made that contents of at least two of the plurality of tiles are no longer used after a current render pass. A write back memory address associated with a second tile is changed to match a write back memory address associated with a first tile. As a result, data is overwritten on a same physical page.
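The write-back aliasing step can be sketched as follows, assuming a dictionary that maps each tile index to its write-back address and a list of tiles already determined to be dead after the current render pass. The abstract describes two tiles; extending the rule to any number of dead tiles is an assumption, and the function name is hypothetical.

```python
def alias_writebacks(writeback_addr, dead_tiles):
    """Return a new map where later dead tiles reuse the first dead tile's address."""
    result = dict(writeback_addr)
    if len(dead_tiles) >= 2:
        target = result[dead_tiles[0]]
        for tile in dead_tiles[1:]:
            # Data from this tile will overwrite the same physical page.
            result[tile] = target
    return result
```

Because the contents of these tiles are never read again, letting their write-backs land on the same page loses nothing and reduces the number of distinct pages touched.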