Patent search ap:"Advanced Micro Devices Page Inc."

581.

Granted invention patent
Multiple-table branch target buffer (in force)

Publication No.: US10713054B2

Publication date: 2020-07-14

Application No.: US16030031

Application date: 2018-07-09

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Thomas Cloqueur, Anthony Jarvis

IPC: G06F9/30, G06F9/38

Abstract: A processor includes two or more branch target buffer (BTB) tables for branch prediction, each BTB table storing entries of a different target size or width or storing entries of a different branch type. Each BTB entry includes at least a tag and a target address. For certain branch types that only require a few target address bits, the respective BTB tables are narrower, thereby allowing for more BTB entries in the processor, separated into respective BTB tables by branch instruction type. An increased number of available BTB entries are stored in the same or less space in the processor, thereby increasing the speed of instruction processing. BTB tables can be defined that do not store any target address and rely on a decode unit to provide it. High value BTB entries have dedicated storage and are therefore less likely to be evicted than low value BTB entries.

582.

Granted invention patent
Region based split-directory scheme to adapt to large cache sizes (in force)

Publication No.: US10705959B2

Publication date: 2020-07-07

Application No.: US16119438

Application date: 2018-08-31

Applicant: Advanced Micro Devices, Inc.

Inventor: Vydhyanathan Kalyanasundharam, Kevin M. Lepak, Amit P. Apte, Ganesh Balakrishnan

IPC: G06F12/0817

Abstract: Systems, apparatuses, and methods for maintaining region-based cache directories split between node and memory are disclosed. The system with multiple processing nodes includes cache directories split between the nodes and memory to help manage cache coherency among the nodes' cache subsystems. In order to reduce the number of entries in the cache directories, the cache directories track coherency on a region basis rather than on a cache line basis, wherein a region includes multiple cache lines. Each processing node includes a node-based cache directory to track regions which have at least one cache line cached in any cache subsystem in the node. The node-based cache directory includes a reference count field in each entry to track the aggregate number of cache lines that are cached per region. The memory-based cache directory includes entries for regions which have an entry stored in any node-based cache directory of the system.
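The per-region reference count described in this abstract can be illustrated with a minimal Python sketch. The 64-byte line size, 4 KiB region size, and the names `NodeDirectory`, `line_cached`, and `line_evicted` are assumptions made for illustration and do not come from the patent. The return values indicate when a region entry is created or freed, which is the point at which a memory-based directory would plausibly need to be updated.

```python
LINE_BYTES = 64       # assumed cache line size
REGION_BYTES = 4096   # assumed region size (64 lines per region)


class NodeDirectory:
    """Hypothetical node-based directory tracking cached lines per region."""

    def __init__(self):
        # region index -> number of lines of that region cached in this node
        self.ref_count = {}

    @staticmethod
    def region_of(addr):
        return addr // REGION_BYTES

    def line_cached(self, addr):
        """Record a newly cached line; return True if a region entry was allocated."""
        region = self.region_of(addr)
        is_new = region not in self.ref_count
        self.ref_count[region] = self.ref_count.get(region, 0) + 1
        return is_new

    def line_evicted(self, addr):
        """Record an evicted line; return True if the region entry was freed."""
        region = self.region_of(addr)
        if region not in self.ref_count:
            return False  # untracked region: nothing to do
        self.ref_count[region] -= 1
        if self.ref_count[region] == 0:
            del self.ref_count[region]
            return True
        return False
```

Tracking one counter per region rather than one entry per line is what keeps the number of directory entries small in this sketch.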

583.

Granted invention patent
Method and device for determining branch prediction history for branch prediction by partially combining shifted branch prediction history with branch signature (in force)

Publication No.: US10698691B2

Publication date: 2020-06-30

Application No.: US15252168

Application date: 2016-08-30

Applicant: Advanced Micro Devices, Inc.

Inventor: Steven R. Havlir

IPC: G06F9/38, G06F9/30

Abstract: Disclosed are a method and a processing device directed to determining global branch history for branch prediction. The method includes shifting first bits of a branch signature into a current global branch history and performing a bitwise exclusive-or (XOR) function on second bits of the branch signature and shifted bits of the current global branch history. In this way, the current global branch history is updated. The processing device implements the method using a shift logic configured to store and shift bits representing a current global branch history, a register configured to store the current global branch history, decision circuitry configured to determine whether or not a branch is taken, and XOR gates.
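The shift-then-XOR update described in this abstract can be sketched in Python as follows. The bit widths, the choice of which signature bits count as "first" and "second", and the position where the XOR is applied are all assumptions for illustration; the abstract does not specify them.

```python
def update_history(history, signature, hist_bits=16, shift_bits=2, xor_bits=4):
    """Hypothetical global-history update: shift in some signature bits, XOR in others.

    Assumes the low `shift_bits` of the signature are the "first bits" and the
    next `xor_bits` are the "second bits".
    """
    mask = (1 << hist_bits) - 1
    first = signature & ((1 << shift_bits) - 1)
    second = (signature >> shift_bits) & ((1 << xor_bits) - 1)

    # Shift the history left and insert the first bits at the bottom.
    shifted = ((history << shift_bits) | first) & mask

    # XOR the second bits into the history bits just above the inserted ones.
    return (shifted ^ (second << shift_bits)) & mask
```

For example, with an 8-bit history of `0b1` and a signature of `0b110110`, the low two bits `0b10` are shifted in and the next four bits `0b1101` are XORed into positions 2 through 5, giving `0b110010`.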

584.

Invention application
VIRTUAL SPACE MEMORY BANDWIDTH REDUCTION (under examination, published)

Publication No.: US20200183833A1

Publication date: 2020-06-11

Application No.: US16215298

Application date: 2018-12-10

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Swapnil SAKHARSHETE, Samuel Lawrence WASMUNDT

IPC: G06F12/06, G06F12/109, G06F17/16, G06T1/20

Abstract: A processing system includes a central processing unit (CPU) and a graphics processing unit (GPU) that has a plurality of compute units. The GPU receives an image from the CPU and determines a total result area in a virtual-matrix-multiplication space of a virtual matrix-multiplication output matrix based on convolutional parameters associated with the image in an image space. The GPU partitions the total result area of the virtual matrix-multiplication output matrix into a plurality of virtual segments. The GPU allocates convolution operations to the plurality of compute units based on each virtual segment of the plurality of virtual segments.

585.

Invention application
PIPELINED MATRIX MULTIPLICATION AT A GRAPHICS PROCESSING UNIT (under examination, published)

Publication No.: US20200183734A1

Publication date: 2020-06-11

Application No.: US16211954

Application date: 2018-12-06

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Milind N. NEMLEKAR

IPC: G06F9/48, G06F17/16, G06N3/04, G06T1/20

Abstract: A graphics processing unit (GPU) schedules recurrent matrix multiplication operations at different subsets of CUs of the GPU. The GPU includes a scheduler that receives sets of recurrent matrix multiplication operations, such as multiplication operations associated with a recurrent neural network (RNN). The multiple operations associated with, for example, an RNN layer are fused into a single kernel, which is scheduled by the scheduler such that one work group is assigned per compute unit, thus assigning different ones of the recurrent matrix multiplication operations to different subsets of the CUs of the GPU. In addition, via software synchronization of the different workgroups, the GPU pipelines the assigned matrix multiplication operations so that each subset of CUs provides corresponding multiplication results to a different subset, and so that each subset of CUs executes at least a portion of the multiplication operations concurrently.

586.

Invention application
DYNAMIC VOLTAGE AND FREQUENCY SCALING BASED ON MEMORY CHANNEL SLACK (under examination, published)

Publication No.: US20200183597A1

Publication date: 2020-06-11

Application No.: US16212388

Application date: 2018-12-06

Applicant: ADVANCED MICRO DEVICES, INC.

Inventor: Shomit N. DAS, Kishore PUNNIYAMURTHY

IPC: G06F3/06

Abstract: A processing system scales power to memory and memory channels based on identifying causes of stalls of threads of a wavefront. If the cause is other than an outstanding memory request, the processing system throttles power to the memory to save power. If the stall is due to memory stalls for a subset of the memory channels servicing memory access requests for threads of a wavefront, the processing system adjusts power of the memory channels servicing memory access requests for the wavefront based on the subset. By boosting power to the subset of channels, the processing system enables the wavefront to complete processing more quickly, resulting in increased processing speed. Conversely, by throttling power to the remainder of channels, the processing system saves power without affecting processing speed.
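The decision logic in this abstract can be summarized with a small Python sketch. The function name `plan_channel_power`, the string labels, and the representation of channels as integers are hypothetical; the abstract does not describe how stall causes are detected or how power is actually adjusted.

```python
def plan_channel_power(stall_cause, all_channels, stalled_channels):
    """Hypothetical per-channel power plan derived from a wavefront's stall cause.

    stall_cause: "memory" if the wavefront waits on outstanding memory requests,
                 anything else otherwise (assumed encoding).
    stalled_channels: channels servicing the stalled memory requests.
    """
    if stall_cause != "memory":
        # Memory is not the bottleneck, so throttle every channel to save power.
        return {ch: "throttle" for ch in all_channels}

    # Boost only the channels the wavefront is waiting on; throttle the rest.
    return {
        ch: ("boost" if ch in stalled_channels else "throttle")
        for ch in all_channels
    }
```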

587.

Granted invention patent
Single pass prefix sum in a vertex shader (in force)

Publication No.: US10679316B2

Publication date: 2020-06-09

Application No.: US16007893

Application date: 2018-06-13

Applicant: Advanced Micro Devices, Inc.

Inventor: Sean M. O'Connell

IPC: G06T1/20, G06T15/00, G06T11/00, G06T11/20

Abstract: Systems, apparatuses, and methods for implementing a single pass stipple pattern generation process are disclosed. A processor initiates parallel execution of a first and second plurality of wavefronts. A first wavefront of the first plurality of wavefronts converts a first local coordinate into a first global coordinate, wherein the first local coordinate corresponds to a first portion of a primitive. Also, a first wavefront of the second plurality of wavefronts applies a first attribute to the first global coordinate prior to a second wavefront, of the first plurality of wavefronts, converting a second local coordinate of a second portion of the primitive into a second global coordinate. The second plurality of wavefronts generate image data based on applying the first attribute to global coordinates generated by the first plurality of wavefronts, and the image data is conveyed for display on a display device.

588.

Invention application
DELIBERATE CONDITIONAL POISON TRAINING FOR GENERATIVE MODELS (under examination, published)

Publication No.: US20200175329A1

Publication date: 2020-06-04

Application No.: US16208384

Application date: 2018-12-03

Applicant: Advanced Micro Devices, Inc.

Inventor: Nicholas Malaya

IPC: G06K9/62, G06N3/08, G06N3/04

Abstract: A generator for generating artificial data, and training for the same. Data corresponding to a first label is altered within a reference labeled data set. A discriminator is trained based on the reference labeled data set to create a selectively poisoned discriminator. A generator is trained based on the selectively poisoned discriminator to create a selectively poisoned generator. The selectively poisoned generator is tested for the first label and tested for the second label to determine whether the generator is sufficiently poisoned for the first label and sufficiently accurate for the second label. If it is not, the generator is retrained based on the data set including the further altered data. The generator includes a first ANN to input first information and output a set of artificial data that is classifiable using a first label and not classifiable using a second label of the set of labeled data.

589.

Invention application
METHOD AND APPARATUS FOR PHYSICAL LAYER BYPASS (under examination, published)

Publication No.: US20200174962A1

Publication date: 2020-06-04

Application No.: US16204751

Application date: 2018-11-29

Applicant: Advanced Micro Devices, Inc., ATI Technologies ULC

Inventor: Michael J. Tresidder, Yanfeng Wang, Shiqi Sun

IPC: G06F13/42, G06F13/20

Abstract: A method and apparatus for physical layer bypass data transmission between physical coding sub-layers (PCS) includes encoding the data for transmission over a serial low-speed link. The data is transmitted from a first PCS via a serial connection over a serializer/deserializer (SERDES) transmission bypass path. The data is received by a second PCS via a SERDES receive bypass path.

590.

Granted invention patent
Credit based flow control mechanism for use in multiple link width interconnect systems (in force)

Publication No.: US10671554B1

Publication date: 2020-06-02

Application No.: US16271371

Application date: 2019-02-08

Applicant: Advanced Micro Devices, Inc.

Inventor: Srikant Bharadwaj

IPC: G06F13/20, G06F13/40

Abstract: Flow control credit management is provided when converting traffic from a first parallel link width on a first link to a second parallel link width on a second link. A current value is calculated for a variable flow control credit exchange rate (R) associated with the first and second links. A first flow control credit indicator is received on the second link, and a credit amount is calculated based on the first flow control credit indicator and R. A second flow control credit indicator for the credit amount is then transmitted on the first link.
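A minimal Python sketch of credit conversion between links of different widths follows. It assumes, purely for illustration, that the exchange rate R equals the ratio of the second link width to the first and that fractional credits are carried over between conversions; the abstract states only that R is variable and does not say how it is calculated. The class name `CreditConverter` and its methods are hypothetical.

```python
from fractions import Fraction


class CreditConverter:
    """Hypothetical converter from second-link credits to first-link credits."""

    def __init__(self, first_width, second_width):
        self.first_width = first_width    # bits per flit on the first link
        self.second_width = second_width  # bits per flit on the second link
        self.remainder = 0                # leftover buffer space, in bits

    def rate(self):
        # Assumed definition of R: second-link width over first-link width.
        return Fraction(self.second_width, self.first_width)

    def convert(self, credits_in):
        """Turn credits received on the second link into credits for the first link."""
        total_bits = credits_in * self.second_width + self.remainder
        credits_out, self.remainder = divmod(total_bits, self.first_width)
        return credits_out
```

With a 32-bit first link and a 64-bit second link, each received credit becomes two credits upstream; with a 16-bit second link, two received credits are needed before one is forwarded.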
