INSTRUCTION SEQUENCE BUFFER TO ENHANCE BRANCH PREDICTION EFFICIENCY
    61.
    Type: Patent application
    Status: In force

    Publication number: US20130311759A1

    Publication date: 2013-11-21

    Application number: US13879365

    Filing date: 2011-10-12

    CPC classification number: G06F9/3861 G06F9/30058 G06F9/3808 G06F9/3844

    Abstract: A method for outputting alternative instruction sequences. The method includes tracking repetitive hits to determine a set of frequently hit instruction sequences for a microprocessor. A frequently mispredicted branch instruction is identified, wherein the predicted outcome of the branch instruction is frequently wrong. An alternative instruction sequence for the branch instruction target is stored into a buffer. On a subsequent hit to the branch instruction where the predicted outcome of the branch instruction was wrong, the alternative instruction sequence is output from the buffer.
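
The buffering idea in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the class name, the `miss_threshold` parameter, and the use of a plain dictionary as the buffer are hypothetical and do not come from the patent, and the tracking of frequently hit sequences is left out.

```python
from collections import defaultdict


class InstructionSequenceBuffer:
    """Toy model: keeps the alternative path for branches that mispredict often."""

    def __init__(self, miss_threshold=2):
        # Hypothetical tuning knob: mispredictions needed before buffering.
        self.miss_threshold = miss_threshold
        self.miss_counts = defaultdict(int)  # branch address -> misprediction count
        self.buffer = {}                     # branch address -> alternative sequence

    def record_outcome(self, branch_addr, predicted_taken, actual_taken, alt_sequence):
        """Track a resolved branch and buffer its alternative sequence once
        the branch has been mispredicted often enough."""
        if predicted_taken != actual_taken:
            self.miss_counts[branch_addr] += 1
            if self.miss_counts[branch_addr] >= self.miss_threshold:
                self.buffer[branch_addr] = list(alt_sequence)

    def on_mispredict(self, branch_addr):
        """Return the buffered alternative sequence, or None if nothing is buffered."""
        return self.buffer.get(branch_addr)
```

In this model, a `None` result from `on_mispredict` stands for the ordinary recovery path, where the correct instructions would have to be fetched again.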

    EXECUTING INSTRUCTION SEQUENCE CODE BLOCKS BY USING VIRTUAL CORES INSTANTIATED BY PARTITIONABLE ENGINES
    62.
    Type: Patent application
    Status: In force

    Publication number: US20120246657A1

    Publication date: 2012-09-27

    Application number: US13428440

    Filing date: 2012-03-23

    Abstract: A method for executing instructions using a plurality of virtual cores for a processor. The method includes receiving an incoming instruction sequence using a global front end scheduler, and partitioning the incoming instruction sequence into a plurality of code blocks of instructions. The method further includes generating a plurality of inheritance vectors describing interdependencies between instructions of the code blocks, and allocating the code blocks to a plurality of virtual cores of the processor, wherein each virtual core comprises a respective subset of resources of a plurality of partitionable engines. The code blocks are executed by using the partitionable engines in accordance with a virtual core mode and in accordance with the respective inheritance vectors.

    GUEST INSTRUCTION BLOCK WITH NEAR BRANCHING AND FAR BRANCHING SEQUENCE CONSTRUCTION TO NATIVE INSTRUCTION BLOCK
    63.
    Type: Patent application
    Status: In force

    Publication number: US20120198209A1

    Publication date: 2012-08-02

    Application number: US13359817

    Filing date: 2012-01-27

    Abstract: A method for translating instructions for a processor. The method includes accessing a plurality of guest instructions that comprise multiple guest branch instructions comprising at least one guest far branch, and building an instruction sequence from the plurality of guest instructions by using branch prediction on the at least one guest far branch. The method further includes assembling a guest instruction block from the instruction sequence. The guest instruction block is translated to a corresponding native conversion block, wherein at least one native far branch corresponds to the at least one guest far branch and includes an opposite guest address for an opposing branch path of the at least one guest far branch. Upon encountering a misprediction, a correct instruction sequence is obtained by accessing the opposite guest address.

    GUEST TO NATIVE BLOCK ADDRESS MAPPINGS AND MANAGEMENT OF NATIVE CODE STORAGE
    64.
    Type: Patent application
    Status: In force

    Publication number: US20120198122A1

    Publication date: 2012-08-02

    Application number: US13359832

    Filing date: 2012-01-27

    Abstract: A method for managing mappings of storage on a code cache for a processor. The method includes storing a plurality of guest address to native address mappings as entries in a conversion look-aside buffer, wherein the entries indicate guest addresses that have corresponding converted native addresses stored within a code cache memory, and receiving a subsequent request for a guest address at the conversion look-aside buffer. The conversion look-aside buffer is indexed to determine whether there exists an entry that corresponds to the index, wherein the index comprises a tag and an offset that is used to identify the entry that corresponds to the index. Upon a hit on the tag, the corresponding entry is accessed to retrieve a pointer to the corresponding block of converted native instructions in the code cache memory. The corresponding block of converted native instructions is fetched from the code cache memory for execution.
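
The lookup flow described here (split the guest address into a tag and an offset, check for a tag hit, then follow a pointer into the code cache) can be sketched as follows. The direct-mapped layout, the `offset_bits` parameter, and all names are assumptions made for illustration; the abstract does not specify the buffer's organization.

```python
class ConversionLookasideBuffer:
    """Toy direct-mapped buffer mapping guest addresses to code cache pointers."""

    def __init__(self, offset_bits=4):
        self.offset_bits = offset_bits               # assumed width of the offset field
        self.entries = [None] * (1 << offset_bits)   # each slot holds (tag, pointer)

    def _split(self, guest_addr):
        # Low bits select the slot (offset); the remaining high bits form the tag.
        offset = guest_addr & ((1 << self.offset_bits) - 1)
        tag = guest_addr >> self.offset_bits
        return tag, offset

    def insert(self, guest_addr, code_cache_ptr):
        tag, offset = self._split(guest_addr)
        self.entries[offset] = (tag, code_cache_ptr)

    def lookup(self, guest_addr):
        """Return the code cache pointer on a tag hit, otherwise None."""
        tag, offset = self._split(guest_addr)
        entry = self.entries[offset]
        if entry is not None and entry[0] == tag:
            return entry[1]
        return None


def fetch_native_block(clb, code_cache, guest_addr):
    """Fetch the converted native block for a guest address, or None on a miss."""
    ptr = clb.lookup(guest_addr)
    return None if ptr is None else code_cache[ptr]
```

A miss (`None`) in this model would correspond to a guest block that has not yet been converted or is no longer held in the code cache.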

    Executing partial-width packed data instructions
    65.
    Type: Granted patent
    Status: In force

    Publication number: US07467286B2

    Publication date: 2008-12-16

    Application number: US11126049

    Filing date: 2005-05-09

    Abstract: A method and apparatus are provided for executing packed data instructions. According to one aspect of the invention, a processor includes registers, a register renaming unit coupled to the registers, a decoder coupled to the register renaming unit, and a partial-width execution unit coupled to the decoder. The register renaming unit provides an architectural register file to store packed data operands that include data elements. The decoder is to decode a first and second set of instructions that each specify one or more registers in the architectural register file. Each of the instructions in the first set specifies operations to be performed on all of the data elements. In contrast, each of the instructions in the second set specifies operations to be performed on only a subset of the data elements. The partial-width execution unit is to execute operations specified by either the first or second set of instructions.
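
The difference between the two instruction sets can be shown with plain lists standing in for packed registers. This is a sketch only: the function names are hypothetical, and passing the untouched elements through from the first operand is an assumption, since the abstract does not say what happens to elements outside the subset.

```python
def packed_add(a, b):
    """First-set style: operate on every data element of the packed operands."""
    return [x + y for x, y in zip(a, b)]


def partial_add(a, b, count=1):
    """Second-set style: operate on only the first `count` elements.

    Assumption: the remaining elements are copied unchanged from `a`.
    """
    touched = [x + y for x, y in zip(a[:count], b[:count])]
    return touched + list(a[count:])
```

For example, with `a = [1, 2, 3, 4]` and `b = [10, 20, 30, 40]`, `packed_add` updates all four elements while `partial_add` updates only the first.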

    Execution unit for performing shuffle and other operations
    66.
    Type: Patent application
    Status: In force

    Publication number: US20080215855A1

    Publication date: 2008-09-04

    Application number: US11478884

    Filing date: 2006-06-30

    CPC classification number: G06F9/30032 G06F9/30036

    Abstract: In one embodiment, the present invention includes a method for receiving first and second data operands in a common execution unit and manipulating the operands responsive to an instruction to generate an output according to local control signals of a local controller of the execution unit. Various instruction types such as shuffle and shift operations may be performed in the common execution unit in a single cycle. Other embodiments are described and claimed.
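
One way to picture a common execution unit that handles both shuffles and shifts is to let a local controller turn each instruction into per-position select signals. The sketch below is an interpretation, not the patented design: the operation names, the byte-granular shift, and the encoding of a control signal as a source index (or `None` for a zero fill) are all assumptions.

```python
def local_controller(op, width, operand_b, imm):
    """Produce one control signal per output position.

    Each signal is a source index into operand A, or None to emit zero.
    """
    if op == "shuffle":
        # Operand B supplies the source index for each output position.
        return [idx % width for idx in operand_b]
    if op == "shift_left_bytes":
        # Position i takes element i - imm; vacated positions are zero-filled.
        return [i - imm if i >= imm else None for i in range(width)]
    raise ValueError(f"unsupported op: {op}")


def common_execution_unit(op, operand_a, operand_b=None, imm=0):
    """Apply the control signals to operand A using a single datapath."""
    controls = local_controller(op, len(operand_a), operand_b, imm)
    return [0 if c is None else operand_a[c] for c in controls]
```

Both operations go through the same selection step in `common_execution_unit`; only the control signals differ.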

    Integer rounding operation
    67.
    Type: Patent application
    Status: In force

    Publication number: US20070282938A1

    Publication date: 2007-12-06

    Application number: US11447344

    Filing date: 2006-06-06

    Abstract: Systems, methods, processors, media, and other embodiments associated with integer rounding a floating point number in one micro-operation (uop) are described. One system embodiment includes a memory to store an integer rounding floating point instruction and a processor to perform the integer rounding floating point instruction. The processor may include a floating point unit that includes circuits and/or logic that integer-round the floating point number.
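
Integer rounding a floating point number means producing an integral value that is still represented as a floating point number. A minimal software sketch is shown below; the selectable rounding modes and the function name are assumptions for illustration and are not described in the abstract.

```python
import math


def integer_round(x, mode="nearest"):
    """Round a float to an integral value, returning the result as a float."""
    if math.isnan(x) or math.isinf(x):
        return x  # NaN and infinities pass through unchanged
    if mode == "nearest":
        return float(round(x))        # Python rounds ties to the even neighbor
    if mode == "down":
        return float(math.floor(x))
    if mode == "up":
        return float(math.ceil(x))
    if mode == "truncate":
        return float(math.trunc(x))
    raise ValueError(f"unknown rounding mode: {mode}")
```

The software version takes several steps; the abstract's point is that hardware performs the equivalent work in a single micro-operation.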

    Optimization for 3-D graphic transformation using SIMD computations
    68.
    Type: Granted patent
    Status: Lapsed

    Publication number: US06426746B2

    Publication date: 2002-07-30

    Application number: US09053390

    Filing date: 1998-03-31

    CPC classification number: G06T15/10

    Abstract: The present invention discloses a method and apparatus for optimizing three-dimensional (3-D) transformation on N vertices of a data object based on a transformation matrix of size K×K. The method comprises: storing coordinates of the N vertices in K data items, each of the K data items having N elements; and scheduling a sequence of M operations with a set of P storage elements, the sequence of M operations performing a matrix multiplication of the transformation matrix with the K data items to produce transformed K data items, the set of P storage elements storing a plurality of intermediate results produced by the sequence of M operations.
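
The data layout in this abstract (K data items, each holding one coordinate for all N vertices) lets every multiply-add run across all vertices at once. The sketch below models that with Python lists in place of SIMD registers. The function name and the accumulation order are assumptions, and the scheduling of the M operations over P storage elements is not modeled.

```python
def transform_soa(matrix, items):
    """Multiply a K x K matrix by K data items of N elements each.

    items[j][n] is coordinate j of vertex n; the result uses the same layout.
    """
    k = len(matrix)
    n = len(items[0])
    out = []
    for i in range(k):
        acc = [0.0] * n  # intermediate result, one lane per vertex
        for j in range(k):
            m = matrix[i][j]
            # One multiply-add applied across all N vertices.
            acc = [a + m * v for a, v in zip(acc, items[j])]
        out.append(acc)
    return out
```

With K = 4 and homogeneous coordinates, `items` would hold the x, y, z, and w values of all vertices as four separate lists.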

    Staggering execution of an instruction by dividing a full-width macro instruction into at least two partial-width micro instructions
    69.
    Type: Granted patent
    Status: Lapsed

    Publication number: US06233671B1

    Publication date: 2001-05-15

    Application number: US09052825

    Filing date: 1998-03-31

    Abstract: A method and apparatus are disclosed for staggering execution of an instruction. According to one embodiment of the invention, a single macro instruction is received wherein the single macro instruction specifies at least two logical registers and wherein the two logical registers respectively store first and second packed data operands having corresponding data elements. An operation specified by the single macro instruction is then performed independently on a first and second plurality of the corresponding data elements from said first and second packed data operands at different times using the same circuit to independently generate a first and second plurality of resulting data elements. The first and second plurality of resulting data elements are stored in a single logical register as a third packed data operand.
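
Staggered execution can be modeled as one half-width circuit used twice in sequence, with the two partial results combined into a single packed result. The sketch is illustrative only: the function names are hypothetical and the split into exactly two equal halves is an assumption.

```python
def staggered_packed_op(op, a, b):
    """Run a full-width packed operation as two half-width passes."""
    half = len(a) // 2

    def half_width_circuit(xs, ys):
        # The single shared circuit, reused for both passes.
        return [op(x, y) for x, y in zip(xs, ys)]

    low = half_width_circuit(a[:half], b[:half])    # first pass
    high = half_width_circuit(a[half:], b[half:])   # second pass, at a later time
    return low + high                               # stored as one packed result
```

Calling `staggered_packed_op(lambda x, y: x * y, a, b)` multiplies corresponding elements, with the low half computed before the high half.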
