Patent search ap:("ADVANCED MICRO DEVICES INC." OR "ATI TECHNOLOGIES ULC") AND inv:"Jimshed Mirza"

11.

Granted invention patent

High-speed selective cache invalidates and write-backs on GPUs (status: in force)

Publication number: US10540280B2

Publication date: 2020-01-21

Application number: US15390080

Application date: 2016-12-23

Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC

Inventor: Mark Fowler , Jimshed Mirza , Anthony Asaro

IPC: G06F12/1009 , G06T1/20 , G06F12/0804 , G06F12/0891

Abstract: Techniques for performing cache invalidates and write-backs in an accelerated processing device (e.g., a graphics processing device that renders three-dimensional graphics) are disclosed. The techniques involve receiving requests from a “master” (e.g., the central processing unit). The techniques involve invalidating virtual-to-physical address translations in an address translation request. The techniques include splitting up the requests based on whether the requests target virtually or physically tagged caches. Addresses for the portions of a request that target physically tagged caches are translated using invalidated virtual-to-physical address translations for speed. The split up request is processed to generate micro-transactions for individual caches targeted by the request. Micro-transactions for physically and virtually tagged caches are processed in parallel. Once all micro-transactions for a request have been processed, the unit that made the request is notified.

12.

Invention application

SHADER WRITES TO COMPRESSED RESOURCES (status: under examination, published)

Publication number: US20180182155A1

Publication date: 2018-06-28

Application number: US15389075

Application date: 2016-12-22

Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC

Inventor: Jimshed Mirza , Christopher J. Brennan , Anthony Chan , Leon Lai

IPC: G06T15/00 , G06T15/04 , G06T15/80 , G06F12/0875

Abstract: Systems, apparatuses, and methods for performing shader writes to compressed surfaces are disclosed. In one embodiment, a processor includes at least a memory and one or more shader units. In one embodiment, a shader unit of the processor is configured to receive a write request targeted to a compressed surface. The shader unit is configured to identify a first block of the compressed surface targeted by the write request. Responsive to determining the data of the write request targets less than the entirety of the first block, the first shader unit reads the first block from the cache and decompresses the first block. Next, the first shader unit merges the data of the write request with the decompressed first block. Then, the shader unit compresses the merged data and writes the merged data to the cache.
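The read, merge, and recompress sequence described in this abstract can be sketched as follows. This is only an illustrative sketch, not the patented implementation: `zlib` stands in for whatever block compression the hardware uses, a Python dict stands in for the cache, and `BLOCK_SIZE`, `write_to_compressed_block`, and the full-block shortcut are assumptions introduced here.

```python
import zlib

BLOCK_SIZE = 256  # hypothetical uncompressed block size in bytes


def write_to_compressed_block(cache, block_id, offset, data):
    """Write `data` at `offset` into a block that is stored compressed in `cache`."""
    if offset < 0 or offset + len(data) > BLOCK_SIZE:
        raise ValueError("write falls outside the block")
    if len(data) == BLOCK_SIZE:
        # Assumption: a write covering the whole block needs no read or merge.
        merged = bytes(data)
    else:
        # Partial write: read the block from the cache and decompress it,
        block = bytearray(zlib.decompress(cache[block_id]))
        # then merge the new data into the decompressed block.
        block[offset:offset + len(data)] = data
        merged = bytes(block)
    # Compress the merged data and write it back to the cache.
    cache[block_id] = zlib.compress(merged)


# Usage: a zero-filled block receives a 4-byte partial write at offset 16.
cache = {0: zlib.compress(bytes(BLOCK_SIZE))}
write_to_compressed_block(cache, 0, 16, b"\xff" * 4)
result = zlib.decompress(cache[0])
```

After the call, `result` holds the original zero bytes everywhere except the four overwritten bytes.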

13.

Granted invention patent

Graphics discard engine (status: in force)

Publication number: US12236529B2

Publication date: 2025-02-25

Application number: US17562653

Application date: 2021-12-27

Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC

Inventor: Christopher J. Brennan , Randy Wayne Ramsey , Nishank Pathak , Ricky Wai Yeung Iu , Jimshed Mirza , Anthony Chan

IPC: G06T17/20 , G06T1/60 , G06T15/00 , G06T17/10

Abstract: Systems, apparatuses, and methods for implementing a discard engine in a graphics pipeline are disclosed. A system includes a graphics pipeline with a geometry engine launching shaders that generate attribute data for vertices of each primitive of a set of primitives. The attribute data is consumed by pixel shaders, with each pixel shader generating a deallocation message when the pixel shader no longer needs the attribute data. A discard engine gathers deallocations from multiple pixel shaders and determines when the attribute data is no longer needed. Once a block of attributes has been consumed by all potential pixel shader consumers, the discard engine deallocates the given block of attributes. The discard engine sends a discard command to the caches so that the attribute data can be invalidated and not written back to memory.
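The bookkeeping described in this abstract resembles per-block reference counting, which can be sketched as below. This is a simplified model under stated assumptions, not the patented design: the class `DiscardEngine`, its method names, and the idea that the consumer count is known when a block is registered are all hypothetical, and appending to a list stands in for sending a discard command to the caches.

```python
class DiscardEngine:
    """Toy model: tracks how many pixel-shader consumers still need each block."""

    def __init__(self):
        self.pending = {}    # block_id -> consumers that have not yet released it
        self.discarded = []  # stand-in for discard commands sent to the caches

    def register_block(self, block_id, consumer_count):
        # Assumption: the number of potential consumers is known up front.
        self.pending[block_id] = consumer_count

    def on_dealloc(self, block_id):
        """Handle one deallocation message; return True if the block was discarded."""
        self.pending[block_id] -= 1
        if self.pending[block_id] == 0:
            # Every consumer is done: deallocate and tell the caches the data
            # can be invalidated instead of written back to memory.
            del self.pending[block_id]
            self.discarded.append(block_id)
            return True
        return False


# Usage: a block with two consumers is discarded only after the second message.
engine = DiscardEngine()
engine.register_block("attrs0", consumer_count=2)
first = engine.on_dealloc("attrs0")
second = engine.on_dealloc("attrs0")
```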

14.

Granted invention patent

Data driven scheduler on multiple computing cores (status: in force)

Publication number: US10649810B2

Publication date: 2020-05-12

Application number: US14981257

Application date: 2015-12-28

Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC

Inventor: Jimshed Mirza , YunPeng Zhu

IPC: G06F9/50

Abstract: Methods, devices, and systems for data driven scheduling of a plurality of computing cores of a processor. A plurality of threads may be executed on the plurality of computing cores, according to a default schedule. The plurality of threads may be analyzed, based on the execution, to determine correlations among the plurality of threads. A data driven schedule may be generated based on the correlations. The plurality of threads may be executed on the plurality of computing cores according to the data driven schedule.

15.

Granted invention patent

Register allocation modes in a GPU based on total, maximum concurrent, and minimum number of registers needed by complex shaders (status: in force)

Publication number: US10353859B2

Publication date: 2019-07-16

Application number: US15432173

Application date: 2017-02-14

Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC

Inventor: YunPeng Zhu , Jimshed Mirza

IPC: G06F9/38 , G06F15/78 , G06F9/50 , G06F9/46

Abstract: A method for allocating registers in a compute unit of a vector processor includes determining a maximum number of registers that are to be used concurrently by a plurality of threads of a kernel at the compute unit. The method further includes setting a mode of register allocation at the compute unit based on a comparison of the determined maximum number of registers and a total number of physical registers implemented at the compute unit.

16.

Granted invention patent

Input/output memory map unit and northbridge (status: in force)

Publication number: US10223280B2

Publication date: 2019-03-05

Application number: US16025449

Application date: 2018-07-02

Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC

Inventor: Vydhyanathan Kalyanasundharam , Yaniv Adiri , Philip Ng , Maggie Chan , Vincent Cueva , Anthony Asaro , Jimshed Mirza , Greggory D. Donley , Bryan Broussard , Benjamin Tsien

IPC: G06F3/14 , G06F13/38 , G06F12/1009 , G06F12/12 , G06F12/1045

Abstract: A system including a gasket communicatively coupled between a unified northbridge (UNB) having a cache coherent interconnect (CCI) interface and a processor having an Advanced eXtensible Interface (AXI) coherency extension (ACE). The gasket is configured to translate requests from the processor that include ACE commands into equivalent CCI commands, wherein each request from the processor maps onto a specific CCI request type. The gasket is further configured to translate ACE tags into CCI tags. The gasket is further configured to translate CCI encoded probes from a system resource interface (SRI) into equivalent ACE snoop transactions. The gasket is further configured to translate the memory map to inter-operate with a UNB/coherent HyperTransport (cHT) environment. The gasket is further configured to receive a barrier transaction that is used to provide ordering for transactions.

17.

Invention application

FLEXIBLE SHADER EXPORT DESIGN IN MULTIPLE COMPUTING CORES (status: under examination, published)

Publication number: US20180314528A1

Publication date: 2018-11-01

Application number: US15607118

Application date: 2017-05-26

Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC

Inventor: Yunpeng Zhu , Jimshed Mirza

IPC: G06F9/38 , G06F9/48

Abstract: Systems, apparatuses, and methods for generating flexibly addressed memory requests are disclosed. In one embodiment, a system includes a processor, control unit, and memory subsystem. The processor launches a plurality of threads on a plurality of compute units, wherein each thread generates memory requests without specifying target memory addresses. The threads executing on the plurality of compute units convey a plurality of memory requests to the control unit. The control unit generates target memory addresses for the plurality of received memory requests. In one embodiment, the memory requests are write requests, and the control unit interleaves write requests from the plurality of threads into a single output buffer stored in the memory subsystem. The control unit can be located in a cache, in a memory controller, or in another location within the system.
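One way to picture the control unit in this abstract is as an append-style allocator that assigns addresses on behalf of the threads. The sketch below is an assumption-laden illustration only: the `ControlUnit` class, its `write` method, the byte-granular packing, and the dict used as the output buffer are all invented here and are not taken from the application.

```python
class ControlUnit:
    """Toy model: assigns target addresses to write requests that carry none."""

    def __init__(self, base_address):
        self.next_address = base_address  # next free location in the output buffer
        self.buffer = {}                  # address -> (thread_id, payload)

    def write(self, thread_id, payload):
        # The requesting thread supplies only data; the address is chosen here.
        address = self.next_address
        self.buffer[address] = (thread_id, payload)
        # Assumption: requests are packed back to back in arrival order.
        self.next_address += len(payload)
        return address


# Usage: writes from two threads are interleaved into one output buffer.
cu = ControlUnit(base_address=0x1000)
a0 = cu.write(thread_id=0, payload=b"abcd")
a1 = cu.write(thread_id=1, payload=b"ef")
a2 = cu.write(thread_id=0, payload=b"gh")
```

Because the addresses are generated centrally, the output stays densely packed regardless of which thread issues a request first.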

18.

Invention application

INPUT/OUTPUT MEMORY MAP UNIT AND NORTHBRIDGE (status: under examination, published)

Publication number: US20180307619A1

Publication date: 2018-10-25

Application number: US16025449

Application date: 2018-07-02

Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC

Inventor: Vydhyanathan Kalyanasundharam , Philip Ng , Maggie Chan , Vincent Cueva , Anthony Asaro , Jimshed Mirza , Greggory D. Donley , Bryan Broussard , Benjamin Tsien , Yaniv Adiri

IPC: G06F12/1009 , G06F12/1045 , G06F12/12

CPC classification number: G06F12/1009 , G06F12/1045 , G06F12/12 , G06F2212/684

Abstract: A system including a gasket communicatively coupled between a unified northbridge (UNB) having a cache coherent interconnect (CCI) interface and a processor having an Advanced eXtensible Interface (AXI) coherency extension (ACE). The gasket is configured to translate requests from the processor that include ACE commands into equivalent CCI commands, wherein each request from the processor maps onto a specific CCI request type. The gasket is further configured to translate ACE tags into CCI tags. The gasket is further configured to translate CCI encoded probes from a system resource interface (SRI) into equivalent ACE snoop transactions. The gasket is further configured to translate the memory map to inter-operate with a UNB/coherent HyperTransport (cHT) environment. The gasket is further configured to receive a barrier transaction that is used to provide ordering for transactions.

19.

Granted invention patent

Input/output memory map unit and northbridge (status: in force)

Publication number: US10025721B2

Publication date: 2018-07-17

Application number: US14523705

Application date: 2014-10-24

Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC

Inventor: Vydhyanathan Kalyanasundharam , Philip Ng , Maggie Chan , Vincent Cueva , Anthony Asaro , Jimshed Mirza , Greggory D. Donley , Bryan Broussard , Benjamin Tsien , Yaniv Adiri

IPC: G06F12/10 , G06F12/12 , G06F12/1009 , G06F12/1045 , G06F13/38

Abstract: The present invention provides for page table access and dirty bit management in hardware via a new atomic test[0] and OR and Mask. The present invention also provides for a gasket that enables ACE to CCI translations. This gasket further provides request translation between ACE and CCI, deadlock avoidance for victim and probe collision, ARM barrier handling, and power management interactions. The present invention also provides a solution for ARM victim/probe collision handling which deadlocks the unified northbridge. These solutions include a dedicated writeback virtual channel, probes for IO requests using 4-hop protocol, and a WrBack Reorder Ability in MCT where victims update older requests with data as they pass the requests.

20.

Invention application

METHOD AND APPARATUS FOR TRANSLATION LOOKASIDE BUFFER WITH MULTIPLE COMPRESSED ENCODINGS (status: under examination, published)

Publication number: US20170315927A1

Publication date: 2017-11-02

Application number: US15139902

Application date: 2016-04-27

Applicant: ATI Technologies ULC , Advanced Micro Devices, Inc.

Inventor: Gabriel H. Loh , Jimshed Mirza

IPC: G06F12/1027 , G06F12/1009

CPC classification number: G06F12/1027 , G06F12/1009 , G06F2212/1021 , G06F2212/401 , G06F2212/502 , G06F2212/656 , G06F2212/684 , Y02D10/13

Abstract: Methods and apparatus obtain one or more system page table entries that represent virtual system (e.g., memory) page to physical system page translations. A number of the obtained system page table entries that can be encoded in each of a plurality of translation lookaside buffer (TLB) entry encoding formats are determined. The method and apparatus may select one of the TLB entry encoding formats that encode a number of the obtained system page table entries. The method and apparatus may encode a number of obtained system page table entries in the TLB entry encoding format selected into a compressed encoding format TLB entry. The method and apparatus may associate the compressed encoding format TLB entry with an encoding format indication of the encoding format selected. The method and apparatus may decode a compressed encoding format TLB entry based on a determined TLB entry encoding format.
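The select, encode, and decode steps in this abstract can be illustrated with a toy example. Everything concrete below is an assumption made for illustration: the two formats (`"single"` and `"contiguous"`), the rule of picking the format that covers the most page table entries, and the dict layout of a TLB entry are not taken from the application.

```python
def count_contiguous(entries):
    """Count leading entries whose virtual and physical pages both advance by one."""
    base_vpn, base_ppn = entries[0]
    count = 0
    for i, (vpn, ppn) in enumerate(entries):
        if vpn != base_vpn + i or ppn != base_ppn + i:
            break
        count += 1
    return count


def encode(entries):
    """Pick the hypothetical format covering the most entries and encode with it."""
    if not entries:
        raise ValueError("need at least one page table entry")
    # How many of the obtained entries each format could encode.
    coverage = {"single": 1, "contiguous": count_contiguous(entries)}
    fmt = max(coverage, key=coverage.get)  # ties keep the first format listed
    vpn, ppn = entries[0]
    # The stored format tag plays the role of the encoding format indication.
    return {"format": fmt, "vpn": vpn, "ppn": ppn, "count": coverage[fmt]}


def decode(tlb_entry):
    """Expand a compressed TLB entry back into (virtual page, physical page) pairs."""
    return [(tlb_entry["vpn"] + i, tlb_entry["ppn"] + i)
            for i in range(tlb_entry["count"])]


# Usage: three contiguous translations followed by one that breaks the run.
ptes = [(0x10, 0x80), (0x11, 0x81), (0x12, 0x82), (0x13, 0x90)]
tlb_entry = encode(ptes)
translations = decode(tlb_entry)
```

In this sketch the fourth translation does not fit the run, so it would need a TLB entry of its own.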
