Patent search ap:("QUALCOMM Incorporated") AND inv:"Brian Michael Stempel" Page 1

1.

Patent application
AGGREGATING CACHE MAINTENANCE INSTRUCTIONS IN PROCESSOR-BASED DEVICES (Status: under examination, published)

Publication No.: US20180285269A1

Publication date: 2018-10-04

Application No.: US15943130

Filing date: 2018-04-02

Applicant: QUALCOMM Incorporated

Inventor: William James McAvoy , Thomas Philip Speier , Brian Michael Stempel

IPC: G06F12/0815 , G06F12/0811 , H04L29/08

Abstract: Aggregating cache maintenance instructions in processor-based devices is disclosed. In this regard, a processor-based device comprises one or more processing elements (PEs), each providing an aggregation circuit configured to detect a first cache maintenance instruction in an instruction stream. The aggregation circuit then aggregates one or more subsequent, consecutive cache maintenance instructions in the instruction stream with the first cache maintenance instruction until an end condition is detected (e.g., detection of a data synchronization barrier instruction or a cache maintenance instruction targeting a non-consecutive memory address or a different memory page than a previous cache maintenance instruction, and/or detection that an aggregation limit has been exceeded). After detecting the end condition, the aggregation circuit generates a single cache maintenance request representing the aggregated cache maintenance instructions. In this manner, multiple cache maintenance instructions may be represented by and processed as a single request, thus minimizing the impact on system performance.
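The aggregation rule in this abstract can be illustrated with a small software model. This is only a sketch of the end conditions the abstract lists, not the claimed circuit: the constants `LINE_SIZE`, `PAGE_SIZE`, and `AGG_LIMIT`, the `Request` type, and the tuple encoding of the instruction stream are all assumptions made for illustration.

```python
from dataclasses import dataclass

LINE_SIZE = 64      # assumed cache line size in bytes
PAGE_SIZE = 4096    # assumed memory page size in bytes
AGG_LIMIT = 8       # assumed maximum number of lines per aggregated request


@dataclass
class Request:
    start: int   # address of the first cache line covered
    count: int   # number of consecutive cache lines covered


def aggregate(stream):
    """Collapse runs of consecutive cache maintenance ops into single requests.

    `stream` is a list of (op, address) tuples, where op is "cmo" for a cache
    maintenance instruction or anything else (e.g. "dsb") for other instructions.
    """
    requests, current = [], None
    for op, addr in stream:
        if op == "cmo" and current is not None:
            next_addr = current.start + current.count * LINE_SIZE
            same_page = addr // PAGE_SIZE == current.start // PAGE_SIZE
            if addr == next_addr and same_page and current.count < AGG_LIMIT:
                current.count += 1   # extend the open request
                continue
        # End condition reached (or nothing open yet): emit the pending request.
        if current is not None:
            requests.append(current)
            current = None
        if op == "cmo":
            current = Request(addr, 1)
    if current is not None:
        requests.append(current)
    return requests
```

In this model a barrier, a non-consecutive address, a page change, or hitting the limit each close the open request, so three back-to-back line operations followed by a barrier become one request covering three lines.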

2.

Patent application
Eliminating Redundant Masking Operations Instruction Processing Circuits, And Related Processor Systems, Methods, And Computer-Readable Media (Status: in force)

Publication No.: US20130290683A1

Publication date: 2013-10-31

Application No.: US13655622

Filing date: 2012-10-19

Applicant: QUALCOMM INCORPORATED

Inventor: Melinda J. Brown , Michael William Morrow , James Norris Dieffenderfer , Brian Michael Stempel , Michael Scott McIlvaine

IPC: G06F9/30

CPC classification number: G06F9/3017 , G06F9/30018 , G06F9/3838

Abstract: Eliminating redundant masking operations in instruction processing circuits and related processor systems, methods, and computer-readable media are disclosed. In one embodiment, a first instruction in an instruction stream indicating an operation writing a value to a first register is detected by an instruction processing circuit, the value having a value size less than a size of the first register. The circuit also detects a second instruction in the instruction stream indicating a masking operation on the first register. The masking operation is eliminated upon a determination that the masking operation indicates a read operation and a write operation on the first register and has an identity mask size equal to or greater than the value size. In this manner, the elimination of the masking operation avoids potential read-after-write hazards and improves performance of a CPU by removing redundant operations from an execution pipeline.
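As a rough illustration of the elimination test described in this abstract, the sketch below models instructions as dictionaries. The instruction encoding, the `load`/`and` opcode names, and the assumption that a narrow write zero-extends into the register are all hypothetical and are not taken from the patent.

```python
def is_identity_mask(mask):
    """True when mask is a contiguous run of ones starting at bit 0, e.g. 0xFF."""
    return mask != 0 and (mask & (mask + 1)) == 0


def eliminate_redundant_masks(instrs):
    """Drop AND-masks that cannot change a just-written narrow value.

    A narrow write looks like {"op": "load", "dst": "r0", "bits": 8} and is
    assumed to zero-extend. A mask looks like
    {"op": "and", "dst": "r0", "src": "r0", "mask": 0xFF}.
    """
    out = []
    written_bits = {}   # register -> size in bits of the last narrow value written
    for ins in instrs:
        if ins["op"] == "and" and ins["dst"] == ins["src"]:
            bits = written_bits.get(ins["dst"])
            mask = ins["mask"]
            if bits is not None and is_identity_mask(mask) and mask.bit_length() >= bits:
                continue  # redundant: the mask keeps every bit the write could set
        if ins["op"] == "load":
            written_bits[ins["dst"]] = ins["bits"]
        else:
            # Any other write makes the tracked size unknown (conservative).
            written_bits.pop(ins.get("dst"), None)
        out.append(ins)
    return out
```

For example, an 8-bit load into `r0` followed by `and r0, r0, 0xFF` leaves the register unchanged, so the model drops the mask; a following mask of `0x0F` is narrower than the value and is kept.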


3.

Granted patent
Method, apparatus, and system for memory bandwidth aware data prefetching (Status: in force)

Publication No.: US11550723B2

Publication date: 2023-01-10

Application No.: US16113185

Filing date: 2018-08-27

Applicant: QUALCOMM Incorporated

Inventor: Niket Choudhary , David Scott Ray , Thomas Philip Speier , Eric Robinson , Harold Wade Cain, III , Nikhil Narendradev Sharma , Joseph Gerald McDonald , Brian Michael Stempel , Garrett Michael Drapala

IPC: G06F12/0862 , G06F12/0811

Abstract: An apparatus, method, and system for memory bandwidth aware data prefetching is presented. The method may comprise monitoring a number of request responses received in an interval at a current prefetch request generation rate, comparing the number of request responses received in the interval to at least a first threshold, and adjusting the current prefetch request generation rate to an updated prefetch request generation rate by selecting the updated prefetch request generation rate from a plurality of prefetch request generation rates, based on the comparison. The request responses may be NACK or RETRY responses. The method may further comprise either retaining a current prefetch request generation rate or selecting a maximum prefetch request generation rate as the updated prefetch request generation rate in response to an indication that prefetching is accurate.
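The rate-selection step in this abstract can be sketched as a small function. The abstract does not state the thresholds, the set of rates, or the direction of adjustment, so the `RATES` table, the `low`/`high` thresholds, and the choice to slow down when NACK/RETRY responses are frequent are assumptions. When prefetching is flagged as accurate, this sketch retains the current rate, which is one of the two options the abstract mentions.

```python
RATES = [1, 2, 4, 8]   # assumed menu of prefetch requests allowed per interval


def next_rate_index(index, nacks, low=2, high=8, accurate=False):
    """Pick the index into RATES to use for the next interval.

    index    -- index of the current prefetch request generation rate
    nacks    -- NACK/RETRY responses counted in the last interval
    low/high -- assumed thresholds for speeding up / slowing down
    accurate -- indication that prefetching is accurate
    """
    if accurate:
        return index                           # retain the current rate
    if nacks > high:
        return max(index - 1, 0)               # memory is congested: back off
    if nacks < low:
        return min(index + 1, len(RATES) - 1)  # headroom available: speed up
    return index
```

A caller would count responses over each interval and then look up `RATES[next_rate_index(...)]` to obtain the updated rate.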

4.

Granted patent
Optimizing performance for context-dependent instructions (Status: in force)

Publication No.: US09823929B2

Publication date: 2017-11-21

Application No.: US13841576

Filing date: 2013-03-15

Applicant: QUALCOMM Incorporated

Inventor: Daren Eugene Streett , Brian Michael Stempel , Thomas Philip Speier , Rodney Wayne Smith , Michael Scott McIlvaine , Kenneth Alan Dockser , James Norris Dieffenderfer

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/30098 , G06F9/30189 , G06F9/3842 , G06F9/3863

Abstract: A processor includes a queue for storing instructions processed within the context of a current value of a register field, where for some embodiments the instruction is undefined or defined, depending upon the register field at time of processing. After a write instruction (an instruction that writes to the register field) executes, the queue is searched for any entries that contain instructions that depend upon the executed write instruction. Each such entry stores the value of the register field at the time the instruction in the entry was processed. If such an entry is found in the queue and its stored value of the register field does not match the value that the write instruction wrote to the register field, then the processor flushes the pipeline and restarts at a state so as to correctly execute the instruction.
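The queue search described in this abstract reduces to a simple predicate. The sketch below is a software model under assumed data shapes: the dictionary keys `depends_on_write` and `seen_value` are hypothetical names, not terms from the patent.

```python
def must_flush(queue, written_value):
    """Return True if any queued dependent instruction saw a stale field value.

    Each queue entry is a dict with:
      "depends_on_write" -- True if the instruction depends on the executed write
      "seen_value"       -- register-field value when the instruction was processed
    """
    return any(
        entry["depends_on_write"] and entry["seen_value"] != written_value
        for entry in queue
    )
```

If the function returns True, the modeled processor would flush the pipeline and restart so that the dependent instruction is processed under the newly written value.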

5.

Granted patent
Eliminating redundant masking operations instruction processing circuits, and related processor systems, methods, and computer-readable media (Status: in force)

Publication No.: US09146741B2

Publication date: 2015-09-29

Application No.: US13655622

Filing date: 2012-10-19

Applicant: QUALCOMM Incorporated

Inventor: Melinda J. Brown , Michael William Morrow , James Norris Dieffenderfer , Brian Michael Stempel , Michael Scott McIlvaine

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/3017 , G06F9/30018 , G06F9/3838

Abstract: Eliminating redundant masking operations in instruction processing circuits and related processor systems, methods, and computer-readable media are disclosed. In one embodiment, a first instruction in an instruction stream indicating an operation writing a value to a first register is detected by an instruction processing circuit, the value having a value size less than a size of the first register. The circuit also detects a second instruction in the instruction stream indicating a masking operation on the first register. The masking operation is eliminated upon a determination that the masking operation indicates a read operation and a write operation on the first register and has an identity mask size equal to or greater than the value size. In this manner, the elimination of the masking operation avoids potential read-after-write hazards and improves performance of a CPU by removing redundant operations from an execution pipeline.


6.

Patent application
ELIMINATING REDUNDANT SYNCHRONIZATION BARRIERS IN INSTRUCTION PROCESSING CIRCUITS, AND RELATED PROCESSOR SYSTEMS, METHODS, AND COMPUTER-READABLE MEDIA (Status: under examination, published)

Publication No.: US20140281429A1

Publication date: 2014-09-18

Application No.: US13829315

Filing date: 2013-03-14

Applicant: QUALCOMM INCORPORATED

Inventor: Melinda J. Brown , James Norris Dieffenderfer , Michael Scott McIlvaine , Brian Michael Stempel , Daren Eugene Streett

IPC: G06F9/30

CPC classification number: G06F9/30087

Abstract: Embodiments disclosed herein include eliminating redundant synchronization barriers from execution pipelines in instruction processing circuits. Related processor systems, methods, and computer-readable media are also disclosed. By tracking the occurrence of synchronization events, unnecessary software synchronization operations may be identified and eliminated, thus improving performance of a central processing unit (CPU). In one embodiment, a method for eliminating redundant synchronization barriers in an instruction stream is provided. The method comprises determining whether a next instruction comprises a synchronization barrier of a type corresponding to a first synchronization event. The method also comprises eliminating the next instruction from the instruction stream, responsive to determining that the next instruction comprises a synchronization barrier of a type corresponding to the first synchronization event. In this manner, the average number of instructions executed during each CPU clock cycle may be increased by avoiding unnecessary synchronization operations.
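The method in this abstract (track a synchronization event, then drop a following barrier of the corresponding type) can be modeled in a few lines. The opcode names and the `event_to_barrier` mapping used here are hypothetical; the abstract does not say which events correspond to which barrier types.

```python
def eliminate_barriers(stream, event_to_barrier):
    """Drop barriers made redundant by the synchronization event just before them.

    stream           -- list of opcode strings in program order
    event_to_barrier -- assumed mapping from an opcode that causes a
                        synchronization event to the barrier type it makes redundant
    """
    out = []
    pending = None   # barrier type covered by the most recent retained instruction
    for op in stream:
        if op == pending:
            continue  # the next instruction is a barrier of the matching type: drop it
        out.append(op)
        pending = event_to_barrier.get(op)
    return out
```

For instance, with the assumed mapping `{"eret": "isb"}`, the stream `["eret", "isb", "add", "isb"]` loses only the first `isb`, because the second one no longer directly follows a tracked event.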


7.

Granted patent
Methods and apparatus for managing page crossing instructions with different cacheability (Status: in force)

Publication No.: US08819342B2

Publication date: 2014-08-26

Application No.: US13626916

Filing date: 2012-09-26

Applicant: QUALCOMM Incorporated

Inventor: Leslie Mark DeBruyne , James Norris Dieffenderfer , Michael Scott Mcilvaine , Brian Michael Stempel

IPC: G06F12/08 , G06F9/38

CPC classification number: G06F12/0886 , G06F9/30149 , G06F9/3816 , G06F12/0888 , G06F12/10 , G06F2212/452

Abstract: An instruction in an instruction cache line having a first portion that is cacheable, a second portion that is from a page that is non-cacheable, and crosses a cache line is prevented from executing from the instruction cache. An attribute associated with the non-cacheable second portion is tracked separately from the attributes of the rest of the instructions in the cache line. If the page crossing instruction is reached for execution, the page crossing instruction and instructions following are flushed and a non-cacheable request is made to memory for at least the second portion. Once the second portion is received, the whole page crossing instruction is reconstructed from the first portion saved in the previous fetch group. The page crossing instruction or portion thereof is returned with the proper attribute for a non-cached fetched instruction and the reconstructed instruction can be executed without being cached.


8.

Patent application
ESTABLISHING A BRANCH TARGET INSTRUCTION CACHE (BTIC) ENTRY FOR SUBROUTINE RETURNS TO REDUCE EXECUTION PIPELINE BUBBLES, AND RELATED SYSTEMS, METHODS, AND COMPUTER-READABLE MEDIA (Status: in force)

Publication No.: US20140149726A1

Publication date: 2014-05-29

Application No.: US13792335

Filing date: 2013-03-11

Applicant: QUALCOMM INCORPORATED

Inventor: James Norris Dieffenderfer , Michael William Morrow , Michael Scott McIlvaine , Daren Eugene Streett , Vimal K. Reddy , Brian Michael Stempel

IPC: G06F9/38

CPC classification number: G06F9/3808 , G06F9/30054

Abstract: Establishing a branch target instruction cache (BTIC) entry for subroutine returns to reduce pipeline bubbles, and related systems, methods, and computer-readable media are disclosed. In one embodiment, a method of establishing a BTIC entry includes detecting a subroutine call in an execution pipeline. In response, at least one instruction fetched sequential to the subroutine call is written as a branch target instruction in a BTIC entry for a subroutine return. A next instruction fetch address is calculated, and is written into a next instruction fetch address field in the BTIC entry. In this manner, the BTIC may provide correct branch target instruction and next instruction fetch address data for the subroutine return, even if the subroutine return is encountered for the first time or the subroutine is called from different calling locations.


9.

Patent application
Fusing Immediate Value, Write-Based Instructions in Instruction Processing Circuits, and Related Processor Systems, Methods, and Computer-Readable Media (Status: in force)

Publication No.: US20140149722A1

Publication date: 2014-05-29

Application No.: US13686229

Filing date: 2012-11-27

Applicant: QUALCOMM INCORPORATED

Inventor: Melinda J. Brown , Michael William Morrow , James Norris Dieffenderfer , Brian Michael Stempel , Michael Scott McIlvaine , Rodney Wayne Smith , Jeffrey M. Schottmiller , Andrew S. Irwin

IPC: G06F9/30

CPC classification number: G06F9/3017 , G06F9/30167

Abstract: Fusing immediate value, write-based instructions in instruction processing circuits, and related processor systems, methods, and computer-readable media are disclosed. In one embodiment, a first instruction indicating an operation writing an immediate value to a register is detected by an instruction processing circuit. The circuit also detects at least one subsequent instruction indicating an operation that overwrites at least one first portion of the register while maintaining a value of a second portion of the register. The at least one subsequent instruction is converted (or replaced) with a fused instruction(s), which indicates an operation writing the at least one first portion and the second portion of the register. In this manner, conversion of multiple instructions for generating a constant into the fused instruction(s) removes the potential for a read-after-write hazard and associated consequences caused by dependencies between certain instructions, while reducing a number of clock cycles required to process the instructions.
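One way to picture the fusion in this abstract is a two-instruction constant build: a write of a 16-bit immediate to the low half of a register, followed by a write of the upper half that preserves the low half. The sketch below is a toy model only. The `movw`/`movt` opcode names are borrowed as a familiar example of that pattern, `mov32` is a hypothetical fused operation, and the tuple encoding is an assumption.

```python
def fuse_immediates(instrs):
    """Replace an upper-half write that follows a low-half immediate write.

    Instructions are (op, reg, imm) tuples, each writing `reg`:
      "movw"  -- writes a 16-bit immediate to the register (upper half cleared)
      "movt"  -- overwrites the upper 16 bits, keeping the lower 16 bits
      "mov32" -- hypothetical fused op writing the full 32-bit constant
    """
    out = []
    low = {}   # register -> known low 16-bit immediate from a prior movw
    for op, reg, imm in instrs:
        if op == "movt" and reg in low:
            # The fused op no longer reads the register, so the dependency
            # on the earlier movw disappears.
            out.append(("mov32", reg, (imm << 16) | low.pop(reg)))
            continue
        if op == "movw":
            low[reg] = imm & 0xFFFF
        else:
            low.pop(reg, None)   # some other write: low half no longer known
        out.append((op, reg, imm))
    return out
```

In this model the first instruction stays in the stream and only the subsequent one is converted, which matches the abstract's wording that the subsequent instruction is converted or replaced with a fused instruction.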


10.

Patent application
PREVENTING EXECUTION OF PARITY-ERROR-INDUCED UNPREDICTABLE INSTRUCTIONS, AND RELATED PROCESSOR SYSTEMS, METHODS, AND COMPUTER-READABLE MEDIA (Status: under examination, published)

Publication No.: US20130326195A1

Publication date: 2013-12-05

Application No.: US13787907

Filing date: 2013-03-07

Applicant: QUALCOMM INCORPORATED

Inventor: Michael Scott McIlvaine , James Norris Dieffenderfer , Brian Michael Stempel , Leslie Mark DeBruyne , Melinda J. Brown

IPC: G06F9/30

CPC classification number: G06F9/30196 , G06F9/30145 , G06F9/3017 , G06F11/1064

Abstract: Preventing execution of parity-error-induced unpredictable instructions, and related processor systems, methods, and computer-readable media are disclosed. In this regard, a method for processing instructions in a central processing unit (CPU) is provided. The method comprises decoding an instruction comprising a plurality of bits, and generating a parity error indicator indicating whether a parity error exists in the plurality of bits prior to execution of the instruction. If the parity error indicator indicates that the parity error exists in the plurality of bits, one or more of the plurality of bits are modified to indicate a no execution operation (NOP), without effecting a roll back of a program counter of the CPU and without re-decoding the instruction. In this manner, the possibility of the parity error causing an inadvertent execution of an unpredictable instruction is reduced.
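The parity check and NOP substitution in this abstract can be sketched as follows. The `NOP` encoding, the use of even parity over the whole word, and the separately stored parity bit are assumptions for a toy instruction format; the abstract does not specify them.

```python
NOP = 0x00000000   # assumed NOP encoding for this toy instruction format


def parity(word):
    """Return 1 if the word has an odd number of set bits, else 0."""
    return bin(word).count("1") & 1


def decode_guard(word, stored_parity):
    """Return the instruction bits to hand to execution.

    If the recomputed parity disagrees with the stored parity bit, the
    instruction bits are replaced with NOP in place; the model performs no
    program counter rollback and no second decode.
    """
    parity_error = parity(word) != stored_parity
    return NOP if parity_error else word
```

The point of substituting in place is that a corrupted word never reaches execution as some other, unpredictable instruction.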

