-
Publication No.: US20210272229A1
Publication Date: 2021-09-02
Application No.: US16804345
Application Date: 2020-02-28
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: Rex Eldon MCCRARY
Abstract: An apparatus such as a graphics processing unit (GPU) includes shader engines and front end (FE) circuits. Subsets of the FE circuits are configured to schedule commands for execution on corresponding subsets of the shader engines. The apparatus also includes a set of physical paths configured to convey information from the FE circuits to a memory via the shader engines. Subsets of the physical paths are allocated to the subsets of the FE circuits and the corresponding subsets of the shader engines. The apparatus further includes a scheduler configured to receive a reconfiguration request and modify the set of physical paths based on the reconfiguration request. In some cases, the reconfiguration request is provided by a central processing unit (CPU) that requests the modification based on characteristics of applications generating the commands.
-
Publication No.: US11106596B2
Publication Date: 2021-08-31
Application No.: US15389955
Application Date: 2016-12-23
Applicant: Advanced Micro Devices, Inc.
Inventor: John M. King , Michael T. Clark
IPC: G06F12/1027 , G06F12/123 , G06F12/127
Abstract: Methods, devices, and systems for determining an address in a physical memory which corresponds to a virtual address using a skewed-associative translation lookaside buffer (TLB) are described. A virtual address and a configuration indication are received using receiver circuitry. A physical address corresponding to the virtual address is output if a TLB hit occurs. A first subset of a plurality of ways of the TLB is configured to hold a first page size. The first subset includes a number of the ways based on the configuration indication. A physical address corresponding to the virtual address is retrieved from a page table if a TLB miss occurs, and at least a portion of the physical address is installed in a least recently used way of a subset of a plurality of ways of the TLB, determined according to a replacement policy based on the configuration indication.
-
Publication No.: US11100604B2
Publication Date: 2021-08-24
Application No.: US16263709
Application Date: 2019-01-31
Applicant: Advanced Micro Devices, Inc. , ATI Technologies ULC
Inventor: Jeffrey Gongxian Cheng , Ahmed M. Abdelkhalek , Yinan Jiang , Xingsheng Wan , Anthony Asaro , David Martinez Nieto
Abstract: Systems, apparatuses, and methods for scheduling jobs for multiple frame-based applications are disclosed. A computing system executes a plurality of frame-based applications for generating pixels for display. The applications convey signals to a scheduler to notify the scheduler of various events within a given frame being rendered. The scheduler adjusts the priorities of applications based on the signals received from the applications. The scheduler attempts to adjust priorities of applications and schedule jobs from these applications so as to minimize the perceived latency of each application. When an application has enqueued the last job for the current frame, the scheduler raises the priority of the application to high. This results in the scheduler attempting to schedule all remaining jobs for the application back-to-back. Once all jobs of the application have been completed, the priority of the application is reduced, permitting jobs of other applications to be executed.
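The priority rule in this abstract can be illustrated with a minimal sketch. It assumes only two priority levels and a single execution queue; the names `FrameScheduler`, `enqueue`, and `run` are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass, field

HIGH, NORMAL = 0, 1  # assumption: lower value is scheduled first


@dataclass
class App:
    name: str
    priority: int = NORMAL
    pending: list = field(default_factory=list)


class FrameScheduler:
    """Hypothetical scheduler: boosts an app once its last job of the frame is enqueued."""

    def __init__(self):
        self.apps = {}

    def enqueue(self, app_name, job, last_in_frame=False):
        app = self.apps.setdefault(app_name, App(app_name))
        app.pending.append(job)
        if last_in_frame:
            # Signal from the application: the frame is fully enqueued.
            app.priority = HIGH

    def run(self):
        """Drain all jobs and return the execution order."""
        order = []
        while True:
            ready = [a for a in self.apps.values() if a.pending]
            if not ready:
                return order
            app = min(ready, key=lambda a: a.priority)
            order.append((app.name, app.pending.pop(0)))
            if not app.pending:
                # All jobs of the frame are done: drop back to normal priority.
                app.priority = NORMAL
```

In this sketch, an application that has signalled its last job runs its remaining jobs back-to-back before any normal-priority application is served.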
-
Publication No.: US11068458B2
Publication Date: 2021-07-20
Application No.: US16202082
Application Date: 2018-11-27
Applicant: Advanced Micro Devices, Inc.
Inventor: Mohamed Assem Ibrahim , Onur Kayiran , Yasuko Eckert
IPC: G06F16/22 , G06F16/901
Abstract: A portion of a graph dataset is generated for each computing node in a distributed computing system by, for each subject vertex in a graph, recording for the computing node an offset for the subject vertex, where the offset references a first position in an edge array for the computing node, and for each edge of a set of edges coupled with the subject vertex in the graph, calculating an edge value for the edge based on a connected vertex identifier identifying a vertex coupled with the subject vertex via the edge. When the edge value is assigned to the first position, the edge value is determined by a first calculation, and when the edge value is assigned to a position subsequent to the first position, the edge value is determined by a second calculation. In the computing node, the edge value is recorded in the edge array.
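The offset and edge-array layout described here resembles a compressed sparse row structure. The abstract does not say what the two calculations are, so the sketch below assumes, purely for illustration, that the first edge value is the absolute neighbor identifier and later values are deltas from the previous neighbor. The function names are hypothetical.

```python
def build_partition(adjacency):
    """Build (offsets, edge_array) for one node's vertices.

    adjacency: dict mapping a subject vertex to its connected vertex IDs.
    """
    offsets = {}
    edge_array = []
    for vertex in sorted(adjacency):
        # The offset references the first edge-array position for this vertex.
        offsets[vertex] = len(edge_array)
        prev = None
        for neighbor in sorted(adjacency[vertex]):
            if prev is None:
                edge_array.append(neighbor)         # assumed first calculation: absolute ID
            else:
                edge_array.append(neighbor - prev)  # assumed second calculation: delta
            prev = neighbor
    return offsets, edge_array


def decode_neighbors(offsets, edge_array, vertex, end):
    """Recover neighbor IDs stored in edge_array[offsets[vertex]:end]."""
    start = offsets[vertex]
    out = []
    for i in range(start, end):
        out.append(edge_array[i] if i == start else out[-1] + edge_array[i])
    return out
```

Under this assumption, sorted neighbor lists produce small delta values, which is one plausible reason to treat the first position differently from the rest.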
-
Publication No.: US11064019B2
Publication Date: 2021-07-13
Application No.: US15265402
Application Date: 2016-09-14
Applicant: Advanced Micro Devices, Inc.
Inventor: Sergey Blagodurov
IPC: H04L29/08 , H04L12/24 , H04L12/725 , H04L12/927
Abstract: A server includes a plurality of nodes that are connected by a network that includes an on-chip network or an inter-chip network that connects the nodes. The server also includes a controller to configure the network based on relative priorities of workloads that are executing on the nodes. Configuring the network can include allocating buffers to virtual channels supported by the network based on the relative priorities of the workloads associated with the virtual channels, configuring routing tables that route the packets over the network based on the relative priorities of the workloads that generate the packets, or modifying arbitration weights to favor granting access to the virtual channels to packets generated by higher priority workloads.
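Of the three configuration options listed, the arbitration-weight option is the easiest to sketch. The example below assumes a simple weighted round-robin arbiter in which each virtual channel receives a number of grants per round equal to its weight; the function name and the weight encoding are hypothetical, not taken from the patent.

```python
from collections import deque


def arbitrate(channels, weights, rounds):
    """Grant packets from virtual channels by weighted round-robin.

    channels: dict mapping channel name to a deque of queued packets.
    weights:  dict mapping channel name to grants per round (higher priority
              workloads are assumed to get a larger weight).
    """
    granted = []
    for _ in range(rounds):
        # Visit channels from highest to lowest weight within each round.
        for name in sorted(channels, key=lambda n: -weights[n]):
            for _ in range(weights[name]):
                if channels[name]:
                    granted.append(channels[name].popleft())
    return granted
```

Raising the weight of a channel increases its share of grants without starving lower-priority channels, since every channel with a nonzero weight is still visited each round.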
-
Publication No.: US11062680B2
Publication Date: 2021-07-13
Application No.: US16227588
Application Date: 2018-12-20
Applicant: Advanced Micro Devices, Inc.
Inventor: Pazhani Pillai , Christopher J. Brennan
Abstract: Systems, apparatuses, and methods for implementing raster order view enforcement techniques are disclosed. A processor includes a plurality of compute units coupled to one or more memories. A plurality of waves are launched in parallel for execution on the plurality of compute units, where each wave comprises a plurality of threads. A dependency chain is generated for each wave of the plurality of waves. The compute units wait for all older waves to complete dependency chain generation prior to executing any threads with dependencies. Responsive to all older waves completing dependency chain generation, a given thread with a dependency is executed only if all other threads upon which the given thread is dependent have become inactive. When executed, the plurality of waves generate a plurality of pixels to be driven to a display.
-
Publication No.: US11061583B2
Publication Date: 2021-07-13
Application No.: US16525971
Application Date: 2019-07-30
Applicant: Advanced Micro Devices, Inc.
Inventor: Andrew G. Kegel , Steven E. Raasch
IPC: G06F3/06 , G06F16/907 , G06F12/0891
Abstract: An electronic device includes a non-volatile memory and a controller. The controller receives data to be written to the non-volatile memory and determines a type of the data. Based on the type of the data, the controller selects a given duration of the data from among multiple durations of the data in the non-volatile memory. The controller sets values of one or more parameters for writing the data to the non-volatile memory based on the given duration. The controller writes the data to the non-volatile memory using the values of the one or more write parameters.
-
Publication No.: US11061572B2
Publication Date: 2021-07-13
Application No.: US15136851
Application Date: 2016-04-22
Applicant: Advanced Micro Devices, Inc.
Inventor: David A. Roberts , Michael Ignatowski
IPC: G06F3/06 , G06F9/50 , G06F12/1027 , G06F12/1045
Abstract: Described are a method and processing apparatus to tag and track objects related to memory allocation calls. An application or software adds a tag to a memory allocation call to enable object level tracking. An entry is made into an object tracking table, which stores the tag and a variety of statistics related to the object and associated memory devices. The object statistics may be queried by the application to tune power/performance characteristics either by the application making runtime placement decisions, or by off-line code tuning based on a previous run. The application may add a tag to a memory allocation call to specify the type of memory characteristics requested based on the object statistics.
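A software-only sketch of the tagging idea follows. It assumes the object tracking table is a dictionary keyed by tag and that the tracked statistics are byte count, allocation count, and access count; the class, its methods, and the `hot_threshold` parameter are hypothetical illustrations rather than the patented interface.

```python
class ObjectTracker:
    """Hypothetical object tracking table keyed by allocation tag."""

    def __init__(self):
        self.table = {}

    def tagged_alloc(self, tag, size):
        """Allocate a buffer and record it under the given tag."""
        entry = self.table.setdefault(tag, {"bytes": 0, "allocs": 0, "accesses": 0})
        entry["bytes"] += size
        entry["allocs"] += 1
        return bytearray(size)

    def record_access(self, tag, count=1):
        """Stand-in for statistics that hardware would gather per object."""
        self.table[tag]["accesses"] += count

    def query(self, tag):
        """Return a copy of the statistics so callers cannot mutate the table."""
        return dict(self.table[tag])

    def suggest_memory(self, tag, hot_threshold=1000):
        """Assumed placement policy: frequently accessed objects go to fast memory."""
        hot = self.table[tag]["accesses"] >= hot_threshold
        return "fast" if hot else "capacity"
```

An application could call `query` after a profiling run and pass the result of `suggest_memory` along with later allocation calls, which mirrors the runtime placement and offline tuning uses the abstract mentions.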
-
Publication No.: US11055895B2
Publication Date: 2021-07-06
Application No.: US16554793
Application Date: 2019-08-29
Applicant: Advanced Micro Devices, Inc.
Inventor: David Ronald Oldcorn
Abstract: Described herein are techniques for reducing control flow divergence. The method includes identifying two or more shader programs having commonalities, generating a merged shader program that implements functionality of the identified two or more shader programs, wherein the functionality implemented includes a first execution option for a first shader program of the two or more shader programs and a second execution option for a second shader program of the two or more shader programs, modifying shader programs that call the first shader program to instead call the merged shader program and select the first execution option, modifying shader programs that call the second shader program to instead call the merged shader program and select the second execution option.
-
Publication No.: US20210200694A1
Publication Date: 2021-07-01
Application No.: US16728152
Application Date: 2019-12-27
Applicant: ADVANCED MICRO DEVICES, INC.
Inventor: JAMES R. MAGRO , KEDARNATH BALAKRISHNAN , RAVINDRA N. BHARGAVA , GUANHAO SHEN
IPC: G06F13/16 , G06F12/0882 , G06F12/0879 , G06F9/54
Abstract: Staging buffer arbitration includes: storing a plurality of memory access requests in a staging buffer; selecting a memory access request of the plurality of memory access requests from the staging buffer based on one or more arbitration rules; and moving the memory access request from the staging buffer to a command queue.
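The three steps in this abstract (store, select, move) can be sketched as follows. The abstract does not state the arbitration rules, so the example rule used here (reads before writes, then oldest first) is an assumption, as are the request fields and the function name.

```python
def arbitrate_once(staging, command_queue, rule):
    """Move one request from the staging buffer to the command queue.

    staging:       list of pending memory access requests.
    command_queue: list that receives the selected request.
    rule:          key function; the request with the smallest key is selected.
    """
    if not staging:
        return None
    chosen = min(staging, key=rule)
    staging.remove(chosen)
    command_queue.append(chosen)
    return chosen


def reads_then_oldest(request):
    """Assumed example rule: prefer reads, then the request that has waited longest."""
    return (request["op"] != "read", -request["age"])
```

Calling `arbitrate_once(staging, command_queue, reads_then_oldest)` repeatedly drains the staging buffer in the order the rule dictates; a different rule can be swapped in without changing the move step.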
-