-
381.
Publication No.: US20220317926A1
Publication Date: 2022-10-06
Application No.: US17219446
Filing Date: 2021-03-31
Applicant: Advanced Micro Devices, Inc.
Inventor: Shaizeen Aga, Nuwan Jayasena, Johnathan Alsop
IPC: G06F3/06
Abstract: Ordering between memory-centric memory operations, referred to hereinafter as “MC-Mem-Ops,” and core-centric memory operations, referred to hereinafter as “CC-Mem-Ops,” is enforced using inter-centric fences, referred to hereinafter as “IC-fences.” IC-fences are implemented by an ordering primitive or ordering instruction that causes a memory controller, a cache controller, etc., to enforce ordering of MC-Mem-Ops and CC-Mem-Ops throughout the memory pipeline and at the memory controller by not reordering MC-Mem-Ops (or sometimes CC-Mem-Ops) that arrive before the IC-fence to after the IC-fence. Processing of an IC-fence also causes the memory controller to issue an ordering acknowledgment to the thread that issued the IC-fence instruction. IC-fences are tracked at the core and designated as complete when the ordering acknowledgment is received. Embodiments include a completion level-specific cache flush operation which, when used with an IC-fence, provides proper ordering between cached CC-Mem-Ops and MC-Mem-Ops with reduced data transfer and completion times.
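The fence behavior described in this abstract can be illustrated with a toy model. This is only a sketch under stated assumptions, not the patented design: the class name `ToyMemoryController`, the operation labels, and the "MC ops first" reordering policy are all hypothetical. The one property it demonstrates is that operations may be reordered within a fence-delimited window but never across the fence, and that processing a fence produces an acknowledgment for the issuing thread.

```python
class ToyMemoryController:
    """Toy model: ops may be reordered freely, except across an IC-fence."""

    def __init__(self):
        self.pending = []  # ops that arrived since the last fence
        self.issued = []   # order in which ops were actually performed
        self.acks = []     # (thread, fence_name) acknowledgments sent back

    def receive(self, kind, name, thread=None):
        if kind == "IC_FENCE":
            # Everything that arrived before the fence is performed first.
            self._drain()
            self.acks.append((thread, name))
        else:
            self.pending.append((kind, name))

    def _drain(self):
        # Hypothetical reordering policy: MC ops ahead of CC ops.
        # Any policy is acceptable here because the window is fence-bounded.
        self.pending.sort(key=lambda op: op[0] != "MC")
        self.issued.extend(self.pending)
        self.pending.clear()

    def finish(self):
        self._drain()
        return self.issued


mc = ToyMemoryController()
mc.receive("CC", "store A")
mc.receive("MC", "pim_add B")
mc.receive("IC_FENCE", "f0", thread=7)
mc.receive("MC", "pim_mul A")
order = mc.finish()
```

In this run the core-centric `store A` is guaranteed to be performed before the memory-centric `pim_mul A`, even though the toy policy would otherwise prefer MC ops, because the two sit on opposite sides of the fence.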
-
Publication No.: US20220317876A1
Publication Date: 2022-10-06
Application No.: US17218700
Filing Date: 2021-03-31
Applicant: Advanced Micro Devices, Inc.
Inventor: Johnathan Alsop, Nuwan Jayasena, Shaizeen Aga, Andrew McCrabb
IPC: G06F3/06
Abstract: Methods and apparatuses to control digital data transfer via a memory channel between a memory module and a processor are disclosed. At least one of the memory module or the processor coalesces a plurality of short data words into multicast coalesced block data comprising a single data block for transfer via the memory channel. Each of the plurality of short data words pertains to one of at least two partitioned memory submodules in the memory module. The multicast coalesced block data is communicated over the memory channel.
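The coalescing idea can be sketched as packing several short words, each tagged with the submodule it belongs to, into one fixed-size block for a single channel transfer. The sizes (8-byte words, 64-byte block) and the function names `coalesce` and `scatter` are assumptions chosen for illustration; the abstract does not specify them.

```python
def coalesce(words, word_bytes=8, block_bytes=64):
    """Pack (submodule_id, payload) pairs into one block plus a slot layout."""
    if len(words) * word_bytes > block_bytes:
        raise ValueError("too many short words for one block")
    block = bytearray(block_bytes)
    layout = []  # layout[slot] = submodule that owns that slot
    for slot, (submodule, payload) in enumerate(words):
        if len(payload) != word_bytes:
            raise ValueError("payload must be exactly word_bytes long")
        block[slot * word_bytes:(slot + 1) * word_bytes] = payload
        layout.append(submodule)
    return bytes(block), layout


def scatter(block, layout, word_bytes=8):
    """Split a coalesced block back into (submodule_id, payload) pairs."""
    return [
        (submodule, block[slot * word_bytes:(slot + 1) * word_bytes])
        for slot, submodule in enumerate(layout)
    ]


words = [(0, b"AAAAAAAA"), (1, b"BBBBBBBB"), (3, b"CCCCCCCC")]
block, layout = coalesce(words)
```

Calling `scatter(block, layout)` on the receiving side recovers the original pairs, so three short transfers become one block transfer.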
-
Publication No.: US20220317755A1
Publication Date: 2022-10-06
Application No.: US17219407
Filing Date: 2021-03-31
Applicant: Advanced Micro Devices, Inc.
Inventor: James R. Magro, Christopher Weaver, Abhishek Kumar Verma
IPC: G06F1/3234, G06F1/3225, G06F1/3287, G06F1/08, G06F3/06
Abstract: A memory controller couples to a data fabric clock domain, and to a physical layer interface circuit (PHY) clock domain. A first interface circuit adapts transfers between the data fabric clock domain (FCLK) and the memory controller's clock domain, and a second interface circuit couples the memory controller to the PHY clock domain. A power controller responds to a power state change request by sending commands to the second interface circuit to change parameters of a memory system and to update a set of timing parameters of the memory controller according to a selected power state of a plurality of power states. The power controller further responds to a request to synchronize with a new frequency on the FCLK domain by changing a set of timing parameters of the clock interface circuit without changing the set of timing parameters of the memory system or the selected power state.
-
Publication No.: US11461137B2
Publication Date: 2022-10-04
Application No.: US16721456
Filing Date: 2019-12-19
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Rex Eldon McCrary
IPC: G06F9/54, G06F9/48, G06F9/38, G06F12/0831, G06F9/30
Abstract: A first processing unit such as a graphics processing unit (GPU) includes pipelines that execute commands and a scheduler to schedule one or more first commands for execution by one or more of the pipelines. The one or more first commands are received from a user mode driver in a second processing unit such as a central processing unit (CPU). The scheduler schedules one or more second commands for execution in response to completing execution of the one or more first commands and without notifying the second processing unit. In some cases, the first processing unit includes a direct memory access (DMA) engine that writes blocks of information from the first processing unit to a memory. The one or more second commands program the DMA engine to write a block of information including results generated by executing the one or more first commands.
-
Publication No.: US20220309729A1
Publication Date: 2022-09-29
Application No.: US17565394
Filing Date: 2021-12-29
Applicant: Advanced Micro Devices, Inc.
Inventor: Ruijin Wu, Mika Tuomi, Paavo Sampo Ilmari Pessi, Anirudh R. Acharya
Abstract: A method of tiled rendering is provided which comprises dividing a frame to be rendered into a plurality of tiles, receiving commands to execute a plurality of subpasses of the tiles, and interleaving execution of same subpasses of multiple tiles of the frame. Interleaving execution of same subpasses of multiple tiles comprises executing a previously ordered first subpass of a second tile between execution of the previously ordered first subpass of a first tile and execution of a subsequently ordered second subpass of the first tile. The interleaving is performed, for example, by executing the plurality of subpasses in an order different from the order in which the commands to execute the plurality of subpasses are stored and issued. Alternatively, interleaving is performed by executing one or more subpasses as skip operations such that the plurality of subpasses are executed in the same order.
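The difference between the stored command order and the interleaved execution order can be shown with two small generators. This is a minimal sketch: the function names and the `(tile, subpass)` tuple representation are assumptions for illustration, and the skip-operation alternative mentioned in the abstract is not modeled.

```python
def stored_order(num_tiles, num_subpasses):
    """Tile-major order: all subpasses of tile 0, then all of tile 1, ..."""
    return [(tile, sp) for tile in range(num_tiles) for sp in range(num_subpasses)]


def interleaved_order(num_tiles, num_subpasses):
    """Subpass-major order: subpass 0 of every tile, then subpass 1, ..."""
    return [(tile, sp) for sp in range(num_subpasses) for tile in range(num_tiles)]


order = interleaved_order(num_tiles=2, num_subpasses=2)
# order == [(0, 0), (1, 0), (0, 1), (1, 1)]
```

With two tiles and two subpasses, subpass 0 of tile 1 runs between subpass 0 and subpass 1 of tile 0, which matches the ordering relationship the abstract describes.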
-
Publication No.: US20220309606A1
Publication Date: 2022-09-29
Application No.: US17214762
Filing Date: 2021-03-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Pramod Vasant Argade, Martin G. Sarov, Milind N. Nemlekar
Abstract: Techniques for managing register allocation are provided. The techniques include detecting a first request to allocate first registers for a first wavefront; first determining, based on allocation information, that allocating the first registers to the first wavefront would result in a condition in which a deadlock is possible; in response to the first determining, refraining from allocating the first registers to the first wavefront; detecting a second request to allocate second registers for a second wavefront; second determining, based on the allocation information, that allocating the second registers to the second wavefront would result in a condition in which deadlock is not possible; and in response to the second determining, allocating the second registers to the second wavefront.
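The abstract does not say how the "deadlock is possible" condition is evaluated. As a stand-in, the sketch below uses a single-resource banker's-algorithm safety check, which is a well-known textbook technique and not necessarily what the patent claims. The `(held, max_need)` bookkeeping per wavefront and the function names are assumptions for illustration.

```python
def is_safe(free, wavefronts):
    """Return True if every wavefront can still run to completion.

    wavefronts is a list of (held, max_need) register counts. With a single
    resource type it is enough to serve wavefronts in order of smallest
    remaining need, releasing their registers as each one finishes.
    """
    for held, max_need in sorted(wavefronts, key=lambda w: w[1] - w[0]):
        if max_need - held > free:
            return False  # nobody can make progress: deadlock is possible
        free += held      # this wavefront finishes and frees its registers
    return True


def try_allocate(free, wavefronts, idx, request):
    """Grant `request` registers to wavefront `idx` only if the result is safe."""
    held, max_need = wavefronts[idx]
    if request > free or held + request > max_need:
        return False
    trial = list(wavefronts)
    trial[idx] = (held + request, max_need)
    return is_safe(free - request, trial)


wavefronts = [(2, 8), (2, 6)]                    # (held, max_need), hypothetical
first = try_allocate(4, wavefronts, 0, 3)        # refused: could deadlock
second = try_allocate(4, wavefronts, 1, 3)       # granted: a safe order exists
```

This mirrors the two cases in the abstract: the first request is refused because granting it would leave no wavefront able to finish, while the second is granted.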
-
Publication No.: US20220309013A1
Publication Date: 2022-09-29
Application No.: US17214771
Filing Date: 2021-03-26
Applicant: Advanced Micro Devices, Inc.
Inventor: Eric Christopher Morton, Pravesh Gupta, Bryan P Broussard, Li Ou
Abstract: A computing system may implement a low priority arbitration interrupt method that includes receiving a message signaled interrupt (MSI) message from an input output hub (I/O hub) transmitted over an interconnect fabric, selecting a processor to interrupt from a cluster of processors based on arbitration parameters, and communicating an interrupt service routine to the selected processor, wherein the I/O hub and the cluster of processors are located within a common domain.
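The selection step can be sketched as choosing one processor from a cluster by comparing arbitration parameters. The abstract does not list the parameters, so the fields below (`task_priority`, `accepts_lowest_priority`) and the tie-break on `cpu_id` are hypothetical choices made only to give the example something concrete to compare.

```python
from dataclasses import dataclass


@dataclass
class Processor:
    cpu_id: int
    task_priority: int                    # hypothetical: lower means less busy
    accepts_lowest_priority: bool = True  # hypothetical eligibility flag


def select_target(cluster):
    """Pick the eligible processor with the lowest task priority."""
    candidates = [p for p in cluster if p.accepts_lowest_priority]
    if not candidates:
        raise LookupError("no processor in the cluster can take the interrupt")
    # Ties are broken by cpu_id so the choice is deterministic.
    return min(candidates, key=lambda p: (p.task_priority, p.cpu_id))


cluster = [
    Processor(0, 5),
    Processor(1, 2),
    Processor(2, 1, accepts_lowest_priority=False),
    Processor(3, 2),
]
target = select_target(cluster)  # processor 1 under these assumptions
```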
-
388.
Publication No.: US11455252B2
Publication Date: 2022-09-27
Application No.: US16454027
Filing Date: 2019-06-26
Applicant: Advanced Micro Devices, Inc.
Inventor: John Kalamatianos, Paul S. Keltcher, Mayank Chhablani, Alok Garg, Furkan Eris
IPC: G06F12/0862, G06F16/22, G06N20/20
Abstract: Techniques for generating a model for predicting when different hybrid prefetcher configurations should be used are disclosed. Techniques for using the model to predict when different hybrid prefetcher configurations should be used are also disclosed. The techniques for generating the model include obtaining a set of input data, and generating trees based on the training data. Each tree is associated with a different hybrid prefetcher configuration and the trees output certainty scores for the associated hybrid prefetcher configuration based on hardware feature measurements. To decide on a hybrid prefetcher configuration to use, a prefetcher traverses multiple trees to obtain certainty scores for different hybrid prefetcher configurations and identifies a hybrid prefetcher configuration to use based on a comparison of the certainty scores.
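The inference side of this scheme, one tree per configuration with the highest certainty score winning, can be sketched as follows. The tree encoding, the feature names (`l1_miss_rate`, `ipc`), the configuration names, and all thresholds and scores are made up for illustration; how the trees are trained is not modeled.

```python
def score(tree, features):
    """Walk one decision tree and return its certainty score.

    An internal node is (feature_name, threshold, left, right);
    a leaf is a float certainty score.
    """
    while not isinstance(tree, float):
        name, threshold, left, right = tree
        tree = left if features[name] <= threshold else right
    return tree


def choose_config(trees, features):
    """Score every configuration's tree and return the best one."""
    scores = {cfg: score(tree, features) for cfg, tree in trees.items()}
    return max(scores, key=scores.get), scores


# Hypothetical trees, one per hybrid prefetcher configuration.
trees = {
    "stride+stream": ("l1_miss_rate", 0.1, 0.2, 0.9),
    "stride_only": ("ipc", 1.0, 0.7, 0.4),
}
features = {"l1_miss_rate": 0.3, "ipc": 1.5}  # hypothetical measurements
best, scores = choose_config(trees, features)
```

With these made-up measurements the "stride+stream" tree returns the higher score, so that configuration is selected.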
-
Publication No.: US11455153B2
Publication Date: 2022-09-27
Application No.: US16544796
Filing Date: 2019-08-19
Applicant: Advanced Micro Devices, Inc.
Inventor: Nicolai Haehnle
IPC: G06F8/41
Abstract: A computing system includes a processor and a memory storing instructions for a compiler that, when executed by the processor, cause the processor to generate a control flow graph of program source code by: receiving the program source code in the compiler; generating, in the compiler, a structure point representation based on the received program source code by inserting into the program source code a set of structure points including an anchor structure point and a join structure point associated with the anchor structure point; and, based on the structure point representation, generating the control flow graph including a plurality of blocks each representing a portion of the program source code. In the control flow graph, a block A between the anchor structure point and the join structure point post-dominates each of the one or more divergent branches between the anchor structure point and the join structure point.
-
Publication No.: US11443051B2
Publication Date: 2022-09-13
Application No.: US16228349
Filing Date: 2018-12-20
Applicant: ATI TECHNOLOGIES ULC, ADVANCED MICRO DEVICES, INC.
Inventor: Benjamin Koon Pan Chan, William Lloyd Atkinson, Tung Chuen Kwong, Guhan Krishnan
Abstract: A computer vision processor in an image cluster defines a fenced memory region (FMR) that controls access to image data stored in a first portion of a trusted memory region (TMR). The computer vision processor receives FMR requests from an application implemented in a processing cluster. The FMR requests are to access the image data in the first portion of the TMR. The computer vision processor selectively allows the requesting application to access the image data. In some cases, the computer vision processor acquires the image data and stores the image data in the first portion of the TMR, such as buffers in the TMR. A data fabric selectively permits the image processing application to access the data stored in the TMR based on whether the image cluster has opened or closed the FMR for the portion of the TMR.