STACK ACCESS TRACKING
    Invention application (patent in force)

    Publication No.: US20140379986A1

    Publication Date: 2014-12-25

    Application No.: US13922296

    Filing Date: 2013-06-20

    CPC classification number: G06F9/38 G06F9/30043 G06F9/3826 G06F9/3838

    Abstract: A processor employs a prediction table at a front end of its instruction pipeline, whereby the prediction table stores address register and offset information for store instructions, and stack offset information for stack access instructions. The stack offset information for a corresponding instruction indicates the entry of the stack accessed by the instruction relative to a base entry. The processor uses pattern matching to identify predicted dependencies between load/store instructions and predicted dependencies between stack access instructions. A scheduler unit of the instruction pipeline uses the predicted dependencies to perform store-to-load forwarding or other operations that increase efficiency and reduce power consumption in the processing system.
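
    To make the mechanism concrete, here is a minimal Python sketch of such a prediction table, assuming a simple keying scheme of (address register, offset) for stores and relative stack offset for stack accesses; all names (PredictionTable, predict_load_dependency, and so on) are illustrative, not taken from the patent.

    class PredictionTable:
        """Front-end table that remembers each store's address register and
        offset, and each stack access's offset from the base entry; a later
        access with a matching pattern is predicted to depend on it."""

        def __init__(self):
            self.store_entries = {}   # (addr_reg, offset) -> producing store's tag
            self.stack_entries = {}   # offset relative to the stack base -> tag

        def record_store(self, tag, addr_reg, offset):
            self.store_entries[(addr_reg, offset)] = tag

        def record_stack_write(self, tag, stack_offset):
            self.stack_entries[stack_offset] = tag

        def predict_load_dependency(self, addr_reg, offset):
            # A load whose (register, offset) pattern matches a recent store is
            # predicted dependent on it, letting the scheduler arrange
            # store-to-load forwarding instead of waiting on the cache.
            return self.store_entries.get((addr_reg, offset))

        def predict_stack_dependency(self, stack_offset):
            return self.stack_entries.get(stack_offset)

    table = PredictionTable()
    table.record_store(tag=7, addr_reg="rbp", offset=-8)
    assert table.predict_load_dependency("rbp", -8) == 7   # predicted dependency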

    Soft watermarking in thread shared resources implemented through thread mediation

    Publication No.: US11144353B2

    Publication Date: 2021-10-12

    Application No.: US16585586

    Filing Date: 2019-09-27

    Inventor: Kai Troester

    Abstract: Techniques for use in a microprocessor core for soft watermarking in thread shared resources implemented through thread mediation. A thread is removed from a thread mediation decision, in which multiple threads compete or request to use a shared resource at the current clock cycle, based on the number of entries in the shared resource that the thread is estimated to have allocated to it at that cycle. Removing the thread from the thread mediation decision stalls it from allocating additional entries in the shared resource.
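
    As a rough illustration of the mediation step, the following Python sketch assumes the core keeps a per-thread estimate of allocated entries and compares it against a soft watermark; the function and parameter names are hypothetical.

    def eligible_threads(requesting, estimated_entries, watermark):
        """Return the threads allowed to compete for allocation this cycle.

        A thread whose estimated number of allocated entries has reached the
        watermark is removed from the mediation decision, which stalls it
        from allocating additional entries in the shared resource."""
        return [t for t in requesting if estimated_entries[t] < watermark]

    # Thread 1 has reached the watermark, so only threads 0 and 2 may allocate.
    print(eligible_threads([0, 1, 2], {0: 3, 1: 16, 2: 9}, watermark=16))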

    RETIRE QUEUE COMPRESSION
    Invention application

    Publication No.: US20210096874A1

    Publication Date: 2021-04-01

    Application No.: US16586642

    Filing Date: 2019-09-27

    Abstract: Systems, apparatuses, and methods for compressing multiple instruction operations together into a single retire queue entry are disclosed. A processor includes at least a scheduler, a retire queue, one or more execution units, and control logic. When the control logic detects a given instruction operation being dispatched by the scheduler to an execution unit, it determines whether the operation meets one or more conditions for being compressed with other instruction operations into a single retire queue entry. If the conditions are met, two or more instruction operations are stored together in a single retire queue entry. By compressing multiple instruction operations into an individual retire queue entry, the retire queue can be used more efficiently, and the processor can speculatively execute more instructions without the retire queue exhausting its supply of available entries.
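
    A compression check of this kind might look like the Python sketch below; the specific conditions tested (same thread, no possible fault, room left in the entry) are illustrative stand-ins, since the abstract does not enumerate them.

    from dataclasses import dataclass

    @dataclass
    class Uop:
        thread: int
        can_fault: bool

    def dispatch_to_retire_queue(retire_queue, op, max_ops_per_entry=2):
        """Place op in the newest retire-queue entry when the compression
        conditions hold; otherwise allocate a fresh entry."""
        if retire_queue:
            newest = retire_queue[-1]
            if (len(newest) < max_ops_per_entry
                    and all(o.thread == op.thread for o in newest)
                    and not op.can_fault):
                newest.append(op)        # compressed: shares an existing entry
                return
        retire_queue.append([op])        # uncompressed: consumes a new entry

    queue = []
    for uop in [Uop(0, False), Uop(0, False), Uop(0, True)]:
        dispatch_to_retire_queue(queue, uop)
    print(len(queue))   # 2 entries hold 3 operations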

    Stack access tracking
    Invention grant (patent in force)

    Publication No.: US09292292B2

    Publication Date: 2016-03-22

    Application No.: US13922296

    Filing Date: 2013-06-20

    CPC classification number: G06F9/38 G06F9/30043 G06F9/3826 G06F9/3838

    Abstract: A processor employs a prediction table at a front end of its instruction pipeline, whereby the prediction table stores address register and offset information for store instructions, and stack offset information for stack access instructions. The stack offset information for a corresponding instruction indicates the entry of the stack accessed by the instruction relative to a base entry. The processor uses pattern matching to identify predicted dependencies between load/store instructions and predicted dependencies between stack access instructions. A scheduler unit of the instruction pipeline uses the predicted dependencies to perform store-to-load forwarding or other operations that increase efficiency and reduce power consumption in the processing system.

    Throttling while managing upstream resources

    Publication No.: US12032965B2

    Publication Date: 2024-07-09

    Application No.: US17519902

    Filing Date: 2021-11-05

    CPC classification number: G06F9/3856 G06F9/384 G06F9/3885

    Abstract: Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If a thread's number of cache misses exceeds this threshold, the throttling unit notifies a particular upstream computation unit to throttle the processing of instructions for that thread. After a time period elapses, if the number of misses still exceeds the threshold, the throttling unit notifies the upstream computation unit to throttle the thread more restrictively by reducing the selection rate, increasing the time period, or both. Otherwise, the unit notifies the upstream computation unit to throttle the thread less restrictively.
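
    The escalation policy reads naturally as a small control loop. The Python sketch below assumes halving and doubling steps and the bounds shown, none of which are specified in the abstract.

    class ThrottleController:
        """Per-thread throttle driven by cache-miss counts, evaluated once
        per time period."""

        def __init__(self, miss_threshold, selection_rate=1.0, period=1000):
            self.miss_threshold = miss_threshold
            self.selection_rate = selection_rate   # fraction of cycles the upstream
                                                   # unit may select this thread
            self.period = period                   # cycles between evaluations

        def evaluate(self, misses_in_period):
            if misses_in_period > self.miss_threshold:
                # Contention persists: throttle more restrictively by reducing
                # the selection rate and lengthening the evaluation period.
                self.selection_rate = max(self.selection_rate / 2, 1 / 16)
                self.period *= 2
            else:
                # Contention eased: throttle less restrictively.
                self.selection_rate = min(self.selection_rate * 2, 1.0)
                self.period = max(self.period // 2, 1000)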

    Fastpath microcode sequencer
    Invention grant

    Publication No.: US11467838B2

    Publication Date: 2022-10-11

    Application No.: US15986626

    Filing Date: 2018-05-22

    Abstract: Systems, apparatuses, and methods for implementing a fastpath microcode sequencer are disclosed. A processor includes at least an instruction decode unit and first and second microcode units. For each received instruction, the instruction decode unit forwards the instruction to the first microcode unit if the instruction satisfies at least a first condition. In one implementation, the first condition is the instruction being classified as a frequently executed instruction. If a received instruction satisfies at least a second condition, the instruction decode unit forwards the received instruction to a second microcode unit. In one implementation, the first microcode unit is a smaller, faster structure than the second microcode unit. In one implementation, the second condition is the instruction being classified as an infrequently executed instruction. In other implementations, the instruction decode unit forwards the instruction to another microcode unit responsive to determining the instruction satisfies one or more other conditions.
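
    The routing decision amounts to a small dispatch at decode time. In the Python sketch below, the is_frequent flag and MicrocodeUnit class are hypothetical stand-ins for however the classification and lookup are actually implemented.

    from dataclasses import dataclass

    @dataclass
    class Instr:
        opcode: str
        is_frequent: bool

    class MicrocodeUnit:
        def __init__(self, rom):
            self.rom = rom               # opcode -> micro-op sequence
        def lookup(self, opcode):
            return self.rom[opcode]

    def route(instr, fastpath_unit, full_sequencer):
        """Send frequently executed instructions to the small, fast microcode
        unit; everything else goes to the full sequencer."""
        unit = fastpath_unit if instr.is_frequent else full_sequencer
        return unit.lookup(instr.opcode)

    fast = MicrocodeUnit({"ADD": ["uop_add"]})
    full = MicrocodeUnit({"ADD": ["uop_add"], "CPUID": ["uop_a", "uop_b"]})
    print(route(Instr("ADD", True), fast, full))      # served by the fastpath
    print(route(Instr("CPUID", False), fast, full))   # served by the full sequencer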

    Shared resource allocation in a multi-threaded microprocessor

    Publication No.: US11294724B2

    Publication Date: 2022-04-05

    Application No.: US16585424

    Filing Date: 2019-09-27

    Abstract: An approach is provided for allocating a shared resource to threads in a multi-threaded microprocessor based upon the usefulness of the shared resource to each of the threads. The usefulness of the shared resource to a thread is determined from the number of entries in the shared resource allocated to the thread and the number of those entries that are active. A thread that is allocated a large number of entries but has only a small number of active ones, indicative of a low level of parallelism, can operate efficiently with fewer entries and has its allocation limit in the shared resource reduced.
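
    One way to picture the usefulness test is as a periodic pass over the threads, as in this Python sketch; the 25% utilization cutoff, the halving step, and the field names are assumptions, not figures from the patent.

    from dataclasses import dataclass

    @dataclass
    class ThreadState:
        allocated_entries: int   # entries currently allocated to the thread
        active_entries: int      # allocated entries doing useful work
        limit: int               # current allocation cap for the thread

    def adjust_limits(threads, min_limit=8):
        """Shrink the cap of any thread that holds many entries but keeps few
        of them active, a sign of low parallelism."""
        for t in threads:
            utilization = t.active_entries / max(t.allocated_entries, 1)
            if t.allocated_entries >= t.limit // 2 and utilization < 0.25:
                t.limit = max(t.limit // 2, min_limit)

    t = ThreadState(allocated_entries=40, active_entries=4, limit=64)
    adjust_limits([t])
    print(t.limit)   # 32: low parallelism, so the cap was halved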

    THROTTLING WHILE MANAGING UPSTREAM RESOURCES

    Publication No.: US20210096873A1

    Publication Date: 2021-04-01

    Application No.: US16584701

    Filing Date: 2019-09-26

    Abstract: Systems, apparatuses, and methods for arbitrating threads in a computing system are disclosed. A computing system includes a processor with multiple cores, each capable of simultaneously processing instructions of multiple threads. When a thread throttling unit receives an indication that a shared cache has resource contention, the throttling unit sets a threshold number of cache misses for the cache. If a thread's number of cache misses exceeds this threshold, the throttling unit notifies a particular upstream computation unit to throttle the processing of instructions for that thread. After a time period elapses, if the number of misses still exceeds the threshold, the throttling unit notifies the upstream computation unit to throttle the thread more restrictively by reducing the selection rate, increasing the time period, or both. Otherwise, the unit notifies the upstream computation unit to throttle the thread less restrictively.
