-
Publication number: US20230195626A1
Publication date: 2023-06-22
Application number: US17558008
Application date: 2021-12-21
Applicant: ADVANCED MICRO DEVICES, INC. , ATI TECHNOLOGIES ULC
Inventor: Saurabh Sharma , Jeremy Lukacs , Hashem Hashemi , Gianpaolo Tommasi , Guennadi Riguer , Mark Fowler , Randy Ramsey
IPC: G06F12/0806 , G06F12/10
CPC classification number: G06F12/0806 , G06F12/10 , G06F2212/1016
Abstract: A processing system is configured to translate a first cache access pattern of a dispatch of work items to a cache access pattern that facilitates consumption of data stored at a cache of a parallel processing unit by a subsequent access before the data is evicted to a more remote level of the memory hierarchy. For consecutive cache accesses having read-after-read data locality, in some embodiments the processing system translates the first cache access pattern to a space-filling curve. In some embodiments, for consecutive accesses having read-after-write data locality, the processing system translates a first typewriter cache access pattern that proceeds in ascending order for a first access to a reverse typewriter cache access pattern that proceeds in descending order for a subsequent cache access. By translating the cache access pattern based on data locality, the processing system increases the hit rate of the cache.
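The two translations described in this abstract can be illustrated with a small Python sketch. The abstract does not name a particular space-filling curve, so Z-order (Morton order) is used here purely as an assumption. The function names, the 2D grid of tiles, and the toy LRU cache model are all hypothetical and are not taken from the application.

```python
from collections import OrderedDict

def typewriter(width, height):
    """Row-major ("typewriter") order: left to right, top to bottom."""
    return [(x, y) for y in range(height) for x in range(width)]

def reverse_typewriter(width, height):
    """The same traversal in descending order, so the tiles touched
    last by the previous pass are touched first by this one."""
    return typewriter(width, height)[::-1]

def morton_key(x, y, bits=16):
    """Interleave the bits of x (even positions) and y (odd positions)."""
    key = 0
    for i in range(bits):
        key |= ((x >> i) & 1) << (2 * i)
        key |= ((y >> i) & 1) << (2 * i + 1)
    return key

def z_order(width, height):
    """Z-order curve over the grid (assumed example of a space-filling curve)."""
    return sorted(typewriter(width, height), key=lambda p: morton_key(*p))

def hits_on_second_pass(first, second, capacity):
    """Toy LRU model: count hits of the second pass on data left by the first."""
    cache = OrderedDict()
    for tile in first:                      # first pass fills the cache
        cache[tile] = True
        cache.move_to_end(tile)
        if len(cache) > capacity:
            cache.popitem(last=False)       # evict least recently used
    hits = 0
    for tile in second:                     # second pass consumes the data
        if tile in cache:
            hits += 1
            cache.move_to_end(tile)
        else:
            cache[tile] = True
            if len(cache) > capacity:
                cache.popitem(last=False)
    return hits
```

In this toy model, a 4x4 grid with a four-tile cache gets no hits when the second pass repeats the ascending order, and four hits when it runs in descending order, because the last tiles written are still resident.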
-
282.
Publication number: US20230195618A1
Publication date: 2023-06-22
Application number: US17557568
Application date: 2021-12-21
Applicant: Advanced Micro Devices, Inc.
Inventor: Shaizeen Aga , Johnathan Alsop , Nuwan Jayasena
IPC: G06F12/06
CPC classification number: G06F12/06
Abstract: Near-memory compute elements perform memory operations and temporarily store at least a portion of address information for the memory operations in local storage. A broadcast memory command is then issued to the near-memory compute elements that causes the near-memory compute elements to perform a subsequent memory operation using their respective address information stored in the local storage. This allows a single broadcast memory command to be used to perform memory operations across multiple memory elements, such as DRAM banks, using bank-specific address information. In one implementation, the approach is used to process workloads with irregular updates to memory while consuming less command bus bandwidth than conventional approaches. Implementations include using conditional flags to selectively designate address information in local storage that is to be processed with the broadcast memory command.
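As a rough illustration of the broadcast mechanism in this abstract, the Python sketch below models each near-memory compute element as an object that remembers the address of its last bank-specific operation. The class, method names, and the choice of an "add" update are hypothetical; the abstract only states that stored address information and conditional flags are used.

```python
class BankComputeElement:
    """Hypothetical near-memory compute element attached to one DRAM bank."""

    def __init__(self, size):
        self.bank = [0] * size
        self.saved_addr = None   # local storage for address information
        self.flag = False        # conditional flag: saved_addr is pending

    def read_and_save(self, addr):
        """Bank-specific operation that also records its address locally."""
        self.saved_addr = addr
        self.flag = True
        return self.bank[addr]

    def on_broadcast_add(self, value):
        """React to a broadcast command using the locally saved address.
        Elements whose flag is clear ignore the command."""
        if self.flag:
            self.bank[self.saved_addr] += value
            self.flag = False

def broadcast_add(elements, value):
    """One command on the shared bus; each element applies it at its own address."""
    for element in elements:
        element.on_broadcast_add(value)
```

A single call to `broadcast_add` here updates a different address in each flagged bank, which is the bandwidth saving the abstract points to for irregular updates.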
-
283.
Publication number: US11682445B2
Publication date: 2023-06-20
Application number: US17526429
Application date: 2021-11-15
Applicant: Advanced Micro Devices, Inc.
Inventor: Kevin M. Brandl , Naveen Davanam , Oswin E. Housty
IPC: G11C11/406 , G06F12/02 , G06F13/40 , G06F13/16 , G06F1/3234
CPC classification number: G11C11/40622 , G06F1/3275 , G06F12/0238 , G06F13/1689 , G06F13/4072 , G06F13/4086
Abstract: A system and method for use in dynamic random-access memory (DRAM) comprising entering into a self-refresh mode of operation, exiting the self-refresh mode of operation in response to commands from a self-refresh state machine memory operation (MOP) array, and updating a device state of the DRAM for a target power management state in response to commands from the MOP array.
-
284.
Publication number: US20230185742A1
Publication date: 2023-06-15
Application number: US18103240
Application date: 2023-01-30
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Max RUTTENBERG , Vendula Venkata Srikant BHARADWAJ , Yasuko ECKERT , Anthony GUTIERREZ , Mark H. OSKIN
IPC: G06F13/16 , G11C11/4076
CPC classification number: G06F13/1668 , G11C11/4076
Abstract: A processor distributes memory timing parameters and data among different memory modules based upon memory access patterns. The memory access patterns indicate different types, or classes, of data for an executing workload, with each class associated with different memory access characteristics, such as different row buffer hit rate levels, different frequencies of access, different criticalities, and the like. The processor assigns each memory module to a data class and sets the memory timing parameters for each memory module according to the module’s assigned data class, thereby tailoring the memory timing parameters for efficient access of the corresponding data.
-
Publication number: US20230185607A1
Publication date: 2023-06-15
Application number: US17993490
Application date: 2022-11-23
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Anthony GUTIERREZ , Sooraj PUTHOOR
CPC classification number: G06F9/4881 , G06F9/542 , G06F9/545 , G06F9/546 , G06F9/3877
Abstract: A processor core is configured to execute a parent task that is described by a data structure stored in a memory. A coprocessor is configured to dispatch a child task to the at least one processor core in response to the coprocessor receiving a request from the parent task concurrently with the parent task executing on the at least one processor core. In some cases, the parent task registers the child task in a task pool and the child task is a future task that is configured to monitor a completion object and enqueue another task associated with the future task in response to detecting the completion object. The future task is configured to self-enqueue by adding a continuation future task to a continuation queue for subsequent execution in response to the future task failing to detect the completion object.
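The self-enqueueing future task in this abstract can be sketched in Python as follows. The `CompletionObject` class, the queue names, and the helper functions are hypothetical stand-ins for illustration only; the abstract does not specify data layouts or a scheduler interface.

```python
from collections import deque

class CompletionObject:
    """Hypothetical completion signal that a future task monitors."""
    def __init__(self):
        self.signaled = False

def run_future_task(completion, dependent, ready_queue, continuation_queue):
    """One execution of a future task.
    If the completion object is detected, enqueue the dependent task.
    Otherwise self-enqueue a continuation for a later attempt."""
    if completion.signaled:
        ready_queue.append(dependent)
        return True
    continuation_queue.append((completion, dependent))
    return False

def drain_continuations(ready_queue, continuation_queue):
    """Run each queued continuation once; unfinished ones re-enqueue themselves."""
    for _ in range(len(continuation_queue)):
        completion, dependent = continuation_queue.popleft()
        run_future_task(completion, dependent, ready_queue, continuation_queue)
```

The future task never blocks a processor core while waiting: each attempt either releases the dependent task or parks a continuation and returns.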
-
Publication number: US20230185478A1
Publication date: 2023-06-15
Application number: US17552015
Application date: 2021-12-15
Applicant: Advanced Micro Devices, Inc.
IPC: G06F3/06 , G06F12/0802 , G06F9/50
CPC classification number: G06F3/0655 , G06F3/0653 , G06F3/0602 , G06F12/0802 , G06F3/065 , G06F9/5027 , G06F2212/60
Abstract: One or both of read and write accesses to a fabric-attached memory module via a fabric interconnect are monitored. In one or more implementations, offloading of one or more tasks accessing the fabric-attached memory module to a processor of a routing system associated with the fabric-attached memory module is initiated based on the read and write accesses to the fabric-attached memory module. Additionally or alternatively, replicating memory of the fabric-attached memory module to a cache memory of a computing node in the disaggregated memory system executing one or more tasks of a host application is initiated based on the write accesses to the fabric-attached memory module.
-
Publication number: US11676940B2
Publication date: 2023-06-13
Application number: US17003113
Application date: 2020-08-26
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Lei Fu , Brett P. Wilkerson , Rahul Agarwal
IPC: H01L25/065 , H01L23/538 , H01L23/00
CPC classification number: H01L25/0655 , H01L23/5381 , H01L23/5389 , H01L24/13 , H01L2225/06541
Abstract: A chip for hybrid bonded interconnect bridging for chiplet integration, the chip comprising: a first chiplet; a second chiplet; an interconnecting die coupled to the first chiplet and the second chiplet through a hybrid bond.
-
Publication number: US11675568B2
Publication date: 2023-06-13
Application number: US17121354
Application date: 2020-12-14
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Bin He , Brian Emberling , Mark Leather , Michael Mantor
CPC classification number: G06F7/57 , G06F9/3867 , G06F17/16 , G06T1/20 , G06F15/8015
Abstract: A processing system executes wavefronts at multiple arithmetic logic unit (ALU) pipelines of a single instruction multiple data (SIMD) unit in a single execution cycle. The ALU pipelines each include a number of ALUs that execute instructions on wavefront operands that are collected from vector general process register (VGPR) banks at a cache and output results of the instructions executed on the wavefronts at a buffer. By storing wavefronts supplied by the VGPR banks at the cache, a greater number of wavefronts can be made available to the SIMD unit without increasing the VGPR bandwidth, enabling multiple ALU pipelines to execute instructions during a single execution cycle.
-
Publication number: US20230176608A1
Publication date: 2023-06-08
Application number: US17850299
Application date: 2022-06-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Aaron John Nygren , Karthik Gopalakrishnan , Tsun Ho Liu
Abstract: A memory includes a read clock state machine and a read clock driver circuit. The read clock state machine has a first input for receiving a read command signal, a second input for receiving a read clock mode signal, and an output for providing a drive enable signal. The read clock driver circuit has an output for providing a read clock signal in response to a clock signal when the drive enable signal is active. When the read clock mode signal indicates a read-only mode, the read clock state machine starts toggling the read clock signal during a read preamble period before a data transmission of a first read command, and continues toggling the read clock signal for at least a read postamble period following the data transmission of the first read command.
-
Publication number: US11669274B2
Publication date: 2023-06-06
Application number: US17218676
Application date: 2021-03-31
Applicant: Advanced Micro Devices, Inc.
Inventor: Kedarnath Balakrishnan
IPC: G06F3/06
CPC classification number: G06F3/0659 , G06F3/0604 , G06F3/0614 , G06F3/0653 , G06F3/0679
Abstract: A memory controller includes an arbiter for selecting memory requests from a command queue for transmission to a dynamic random access memory (DRAM) memory. The arbiter includes a bank group tracking circuit that tracks bank group numbers of three or more prior write requests selected by the arbiter. The arbiter also includes a selection circuit that selects requests to be issued from the command queue, and prevents selection of write requests and associated activate commands to the tracked bank group numbers unless no other write request is eligible in the command queue. The bank group tracking circuit indicates that a prior write request and the associated activate commands are eligible to be issued after a number of clock cycles has passed corresponding to a minimum write-to-write timing period for a bank group of the prior write request.
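The selection rule in this abstract can be sketched as a small Python model. The class name, the default of three tracked writes, the four-cycle write-to-write window, and the representation of the command queue as a list of bank-group numbers are all assumptions made for illustration; real DRAM timing enforcement is outside the scope of this sketch.

```python
from collections import deque

class WriteArbiter:
    """Hypothetical arbiter that avoids recently written bank groups."""

    def __init__(self, tracked=3, min_write_to_write=4):
        # (bank_group, issue_cycle) for the most recent writes
        self.recent = deque(maxlen=tracked)
        self.min_w2w = min_write_to_write

    def _busy_groups(self, now):
        """Tracked bank groups whose write-to-write window has not elapsed."""
        return {bg for bg, t in self.recent if now - t < self.min_w2w}

    def select(self, queue, now):
        """Pick a pending write from `queue` (bank-group numbers, oldest first).
        Prefer the oldest write to an untracked bank group; fall back to the
        oldest write overall when no other write is eligible."""
        if not queue:
            return None
        busy = self._busy_groups(now)
        choice = next((bg for bg in queue if bg not in busy), queue[0])
        queue.remove(choice)
        self.recent.append((choice, now))
        return choice
```

With pending writes to bank groups `[0, 0, 1, 2]`, this model issues 0, 1, 2, and only then the second write to bank group 0, spreading consecutive writes across bank groups.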
-