Patent search aee:"Advanced Micro Devices Inc." Page 29

281.

Invention patent application (published)
STOCHASTIC OPTIMIZATION OF SURFACE CACHEABILITY IN PARALLEL PROCESSING UNITS (under examination, published)

Publication (announcement) number: US20230195639A1

Publication (announcement) date: 2023-06-22

Application number: US17557475

Application date: 2021-12-21

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Saurabh Sharma, Jeremy Lukacs, Hashem Hashemi, Gianpaolo Tommasi, Christopher J. Brennan

IPC: G06F12/0893

CPC classification number: G06F12/0893, G06F2212/6042

Abstract: A processing system selectively allocates storage at a local cache of a parallel processing unit for cache lines of a repeating pattern of data that exceeds the storage capacity of the cache. The processing system identifies repeating patterns of data having cache lines that have a reuse distance that exceeds the storage capacity of the cache. A cache controller allocates storage for only a subset of cache lines of the repeating pattern of data at the cache and excludes the remainder of cache lines of the repeating pattern of data from the cache. By restricting the cache to store only a subset of cache lines of the repeating pattern of data, the cache controller increases the hit rate at the cache for the subset of cache lines.
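The effect described in this abstract can be illustrated with a small simulation. The following is a minimal sketch, not the patented mechanism: it assumes an LRU replacement policy, a pattern of 8 cache lines cycling through a 4-line cache, and a fixed subset of lines allowed to allocate. The function name `hit_rate` and all parameters are hypothetical, and the abstract does not say how the subset is chosen.

```python
from collections import OrderedDict

def hit_rate(pattern_len, capacity, passes, cached_lines=None):
    """Replay a cyclic access pattern through an LRU cache; return the hit rate.

    cached_lines: set of line ids allowed to allocate in the cache.
    None means every line allocates on a miss (the baseline).
    """
    cache = OrderedDict()  # keys are line ids, ordered oldest to newest
    hits = 0
    total = 0
    for _ in range(passes):
        for line in range(pattern_len):
            total += 1
            if line in cache:
                hits += 1
                cache.move_to_end(line)  # refresh LRU position
                continue
            if cached_lines is not None and line not in cached_lines:
                continue  # miss, but this line bypasses the cache
            cache[line] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict least recently used line
    return hits / total

# Assumed sizes: the pattern (8 lines) is twice the cache capacity (4 lines).
baseline = hit_rate(pattern_len=8, capacity=4, passes=10)
subset = hit_rate(pattern_len=8, capacity=4, passes=10, cached_lines={0, 1, 2, 3})
print(baseline, subset)
```

With every line allocating, each line is evicted before it is reused, so the baseline never hits. Restricting allocation to four lines lets those four hit on every pass after the first.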

282.

Invention patent application (published)
VARIABLE DISPATCH WALK FOR SUCCESSIVE CACHE ACCESSES (under examination, published)

Publication (announcement) number: US20230195626A1

Publication (announcement) date: 2023-06-22

Application number: US17558008

Application date: 2021-12-21

Applicant: ADVANCED MICRO DEVICES, INC., ATI TECHNOLOGIES ULC

Inventor: Saurabh Sharma, Jeremy Lukacs, Hashem Hashemi, Gianpaolo Tommasi, Guennadi Riguer, Mark Fowler, Randy Ramsey

IPC: G06F12/0806, G06F12/10

CPC classification number: G06F12/0806, G06F12/10, G06F2212/1016

Abstract: A processing system is configured to translate a first cache access pattern of a dispatch of work items to a cache access pattern that facilitates consumption of data stored at a cache of a parallel processing unit by a subsequent access before the data is evicted to a more remote level of the memory hierarchy. For consecutive cache accesses having read-after-read data locality, in some embodiments the processing system translates the first cache access pattern to a space-filling curve. In some embodiments, for consecutive accesses having read-after-write data locality, the processing system translates a first typewriter cache access pattern that proceeds in ascending order for a first access to a reverse typewriter cache access pattern that proceeds in descending order for a subsequent cache access. By translating the cache access pattern based on data locality, the processing system increases the hit rate of the cache.
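The read-after-write case in this abstract can be sketched with a toy model. The code below is an illustration under stated assumptions rather than the claimed implementation: it assumes an LRU cache, 8 tiles written in ascending (typewriter) order, and a cache that holds 4 tiles. The name `read_hits_after_write` is hypothetical.

```python
from collections import OrderedDict

def read_hits_after_write(num_tiles, capacity, read_order):
    """Count read hits when a write pass is followed by a read pass."""
    cache = OrderedDict()  # keys are tile ids, ordered oldest to newest

    def touch(tile):
        """Access a tile; return True on a hit, allocate on a miss."""
        if tile in cache:
            cache.move_to_end(tile)
            return True
        cache[tile] = True
        if len(cache) > capacity:
            cache.popitem(last=False)  # evict least recently used tile
        return False

    # First access: the producer writes tiles in ascending order.
    for tile in range(num_tiles):
        touch(tile)

    # Subsequent access: the consumer reads tiles in the given order.
    return sum(touch(tile) for tile in read_order)

forward = read_hits_after_write(8, 4, range(8))             # typewriter order
reverse = read_hits_after_write(8, 4, reversed(range(8)))   # reverse typewriter
print(forward, reverse)
```

In this model the ascending read finds nothing it needs in the cache, because the tiles written last are evicted just before they would be read. The descending read starts with the most recently written tiles and hits on all four that are still resident.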

283.

Invention patent application (published)
APPROACH FOR PERFORMING EFFICIENT MEMORY OPERATIONS USING NEAR-MEMORY COMPUTE ELEMENTS (under examination, published)

Publication (announcement) number: US20230195618A1

Publication (announcement) date: 2023-06-22

Application number: US17557568

Application date: 2021-12-21

Applicant: Advanced Micro Devices, Inc.

Inventor: Shaizeen Aga, Johnathan Alsop, Nuwan Jayasena

IPC: G06F12/06

CPC classification number: G06F12/06

Abstract: Near-memory compute elements perform memory operations and temporarily store at least a portion of address information for the memory operations in local storage. A broadcast memory command is then issued to the near-memory compute elements that causes the near-memory compute elements to perform a subsequent memory operation using their respective address information stored in the local storage. This allows a single broadcast memory command to be used to perform memory operations across multiple memory elements, such as DRAM banks, using bank-specific address information. In one implementation, the approach is used to process workloads with irregular updates to memory while consuming less command bus bandwidth than conventional approaches. Implementations include using conditional flags to selectively designate address information in local storage that is to be processed with the broadcast memory command.
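The broadcast idea can be modelled in software as follows. This is a rough sketch that assumes each bank remembers only the row of its last read and sets a flag marking that row as pending. The `Bank` class, `broadcast_add` function, and the add operation itself are hypothetical stand-ins; the abstract does not specify the operations or the flag semantics.

```python
class Bank:
    """Toy model of one memory bank with a near-memory compute element."""

    def __init__(self, size):
        self.cells = [0] * size
        self.saved_row = None  # address information kept in local storage
        self.flag = False      # conditional flag: saved address is pending

    def read(self, row):
        """Bank-specific command: read a row and remember its address locally."""
        self.saved_row = row
        self.flag = True
        return self.cells[row]

def broadcast_add(banks, value):
    """One broadcast command; each bank applies it at its own saved address."""
    for bank in banks:
        if bank.flag and bank.saved_row is not None:
            bank.cells[bank.saved_row] += value
            bank.flag = False  # address consumed

banks = [Bank(4) for _ in range(3)]
banks[0].read(1)          # bank 0 saves row 1
banks[2].read(3)          # bank 2 saves row 3; bank 1 saves nothing
broadcast_add(banks, 5)   # a single command updates a different row per bank
print([bank.cells for bank in banks])
```

The update phase issues one command regardless of how many banks participate, which is the source of the command-bus saving the abstract describes. Banks without a flagged address ignore the broadcast.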

284.

Invention patent (granted)
Memory context restore, reduction of boot time of a system on a chip by reducing double data rate memory training (in force)

Publication (announcement) number: US11682445B2

Publication (announcement) date: 2023-06-20

Application number: US17526429

Application date: 2021-11-15

Applicant: Advanced Micro Devices, Inc.

Inventor: Kevin M. Brandl, Naveen Davanam, Oswin E. Housty

IPC: G11C11/406, G06F12/02, G06F13/40, G06F13/16, G06F1/3234

CPC classification number: G11C11/40622, G06F1/3275, G06F12/0238, G06F13/1689, G06F13/4072, G06F13/4086

Abstract: A system and method for use in dynamic random-access memory (DRAM) comprising entering into a self-refresh mode of operation, exiting the self-refresh mode of operation in response to commands from a self-refresh state machine memory operation (MOP) array, and updating a device state of the DRAM for a target power management state in response to commands from the MOP array.

285.

Invention patent application (published)
DISTRIBUTION OF DATA AND MEMORY TIMING PARAMETERS ACROSS MEMORY MODULES BASED ON MEMORY ACCESS PATTERNS (under examination, published)

Publication (announcement) number: US20230185742A1

Publication (announcement) date: 2023-06-15

Application number: US18103240

Application date: 2023-01-30

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Max RUTTENBERG, Vendula Venkata Srikant BHARADWAJ, Yasuko ECKERT, Anthony GUTIERREZ, Mark H. OSKIN

IPC: G06F13/16, G11C11/4076

CPC classification number: G06F13/1668, G11C11/4076

Abstract: A processor distributes memory timing parameters and data among different memory modules based upon memory access patterns. The memory access patterns indicate different types, or classes, of data for an executing workload, with each class associated with different memory access characteristics, such as different row buffer hit rate levels, different frequencies of access, different criticalities, and the like. The processor assigns each memory module to a data class and sets the memory timing parameters for each memory module according to the module’s assigned data class, thereby tailoring the memory timing parameters for efficient access of the corresponding data.

286.

Invention patent application (published)
HARDWARE ACCELERATED DYNAMIC WORK CREATION ON A GRAPHICS PROCESSING UNIT (under examination, published)

Publication (announcement) number: US20230185607A1

Publication (announcement) date: 2023-06-15

Application number: US17993490

Application date: 2022-11-23

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Anthony GUTIERREZ, Sooraj PUTHOOR

IPC: G06F9/48, G06F9/54, G06F9/38

CPC classification number: G06F9/4881, G06F9/542, G06F9/545, G06F9/546, G06F9/3877

Abstract: A processor core is configured to execute a parent task that is described by a data structure stored in a memory. A coprocessor is configured to dispatch a child task to the at least one processor core in response to the coprocessor receiving a request from the parent task concurrently with the parent task executing on the at least one processor core. In some cases, the parent task registers the child task in a task pool and the child task is a future task that is configured to monitor a completion object and enqueue another task associated with the future task in response to detecting the completion object. The future task is configured to self-enqueue by adding a continuation future task to a continuation queue for subsequent execution in response to the future task failing to detect the completion object.
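The self-enqueueing future task can be sketched as a simple polling loop over a continuation queue. This is a software analogy under assumptions, not the hardware design: completion objects are modelled as names in a set, each future is a hypothetical `(completion_object, dependent_task)` pair, and running a dependent task is modelled as immediately marking its name complete.

```python
from collections import deque

def run_futures(continuation_queue, completed, max_polls=100):
    """Poll future tasks until the queue drains or the poll budget runs out.

    Each queue entry is (completion_object, dependent_task). A future whose
    completion object is not yet signalled re-enqueues itself as a continuation.
    """
    executed = []
    polls = 0
    while continuation_queue and polls < max_polls:
        polls += 1
        completion_obj, dependent = continuation_queue.popleft()
        if completion_obj in completed:
            executed.append(dependent)  # enqueue (here: run) the dependent task
            completed.add(dependent)    # assumption: the task completes at once
        else:
            # Completion object not detected: self-enqueue a continuation.
            continuation_queue.append((completion_obj, dependent))
    return executed

# Task "c" waits on "b", and task "b" waits on "a"; only "a" is done initially.
queue = deque([("b", "c"), ("a", "b")])
order = run_futures(queue, completed={"a"})
print(order)
```

The first future fails to detect its completion object and moves to the back of the queue, so the dependent tasks end up running in dependency order without the parent task blocking.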

287.

Invention patent application (published)
Alleviating Interconnect Traffic in a Disaggregated Memory System (under examination, published)

Publication (announcement) number: US20230185478A1

Publication (announcement) date: 2023-06-15

Application number: US17552015

Application date: 2021-12-15

Applicant: Advanced Micro Devices, Inc.

Inventor: Vamsee Reddy Kommareddy, SeyedMohammad SeyedzadehDelcheh, Sergey Blagodurov

IPC: G06F3/06, G06F12/0802, G06F9/50

CPC classification number: G06F3/0655, G06F3/0653, G06F3/0602, G06F12/0802, G06F3/065, G06F9/5027, G06F2212/60

Abstract: One or both of read and write accesses to a fabric-attached memory module via a fabric interconnect are monitored. In one or more implementations, offloading of one or more tasks accessing the fabric-attached memory module to a processor of a routing system associated with the fabric-attached memory module is initiated based on the read and write accesses to the fabric-attached memory module. Additionally or alternatively, replicating memory of the fabric-attached memory module to a cache memory of a computing node in the disaggregated memory system executing one or more tasks of a host application is initiated based on the write accesses to the fabric-attached memory module.

288.

Invention patent (granted)
Hybrid bonded interconnect bridging (in force)

Publication (announcement) number: US11676940B2

Publication (announcement) date: 2023-06-13

Application number: US17003113

Application date: 2020-08-26

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Lei Fu, Brett P. Wilkerson, Rahul Agarwal

IPC: H01L25/065, H01L23/538, H01L23/00

CPC classification number: H01L25/0655, H01L23/5381, H01L23/5389, H01L24/13, H01L2225/06541

Abstract: A chip for hybrid bonded interconnect bridging for chiplet integration, the chip comprising: a first chiplet; a second chiplet; an interconnecting die coupled to the first chiplet and the second chiplet through a hybrid bond.

289.

Invention patent (granted)
Dual vector arithmetic logic unit (in force)

Publication (announcement) number: US11675568B2

Publication (announcement) date: 2023-06-13

Application number: US17121354

Application date: 2020-12-14

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Bin He, Brian Emberling, Mark Leather, Michael Mantor

IPC: G06F7/57, G06F17/16, G06T1/20, G06F9/38, G06F15/80

CPC classification number: G06F7/57, G06F9/3867, G06F17/16, G06T1/20, G06F15/8015

Abstract: A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands that are collected from vector general process register (VGPR) banks at a cache and output results of the instructions executed on the wavefronts at a buffer. By storing wavefronts supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.

290.

Invention patent application (published)
READ CLOCK START AND STOP FOR SYNCHRONOUS MEMORIES (under examination, published)

Publication (announcement) number: US20230176608A1

Publication (announcement) date: 2023-06-08

Application number: US17850299

Application date: 2022-06-27

Applicant: Advanced Micro Devices, Inc.

Inventor: Aaron John Nygren, Karthik Gopalakrishnan, Tsun Ho Liu

IPC: G06F1/08, G06F1/10

CPC classification number: G06F1/08, G06F1/10

Abstract: A memory includes a read clock state machine and a read clock driver circuit. The read clock state machine has a first input for receiving a read command signal, a second input for receiving a read clock mode signal, and an output for providing a drive enable signal. The read clock driver circuit has an output for providing a read clock signal in response to a clock signal when the drive enable signal is active. When the read clock mode signal indicates a read-only mode, the read clock state machine starts toggling the read clock signal during a read preamble period before a data transmission of a first read command, and continues toggling the read clock signal for at least a read postamble period following the data transmission of the first read command.
