SYSTEMS AND METHODS FOR SUPPORTING A PLURALITY OF LOAD ACCESSES OF A CACHE IN A SINGLE CYCLE
    41.
    Type: Patent application
    Status: In force

    Publication number: US20140032845A1

    Publication date: 2014-01-30

    Application number: US13561528

    Filing date: 2012-07-30

    Abstract: A method for supporting a plurality of load accesses is disclosed. A plurality of requests to access a data cache is accessed, and in response, a tag memory is accessed that maintains a plurality of copies of tags for each entry in the data cache. Tags are identified that correspond to individual requests. The data cache is accessed based on the tags that correspond to the individual requests. A plurality of requests to access the same block of the plurality of blocks causes an access arbitration that is executed in the same clock cycle as is the access of the tag memory.
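
The flow in this abstract can be sketched as a small behavioral model. This is a minimal illustration, not the patented design: the constants, the `split` helper, the fixed-priority arbitration (lowest port wins), and the "retry" outcome for the losing request are all assumptions made for the example.

```python
NUM_SETS = 8      # hypothetical number of cache sets
NUM_BLOCKS = 4    # hypothetical number of data-cache blocks (banks)
LINE_BYTES = 16   # hypothetical cache line size


def split(addr):
    """Return (tag, set index, block index) for a byte address."""
    line = addr // LINE_BYTES
    index = line % NUM_SETS
    return line // NUM_SETS, index, index % NUM_BLOCKS


class MultiLoadCache:
    def __init__(self, ports):
        # One private copy of the tag array per load port, so every
        # port can look up its tag in the same cycle.
        self.tag_copies = [[None] * NUM_SETS for _ in range(ports)]
        self.data = [None] * NUM_SETS

    def fill(self, addr, value):
        tag, index, _ = split(addr)
        for copy in self.tag_copies:   # keep all tag copies identical
            copy[index] = tag
        self.data[index] = value

    def cycle(self, addrs):
        """Serve one cycle of load requests, one address per port."""
        results = []
        granted = set()                # blocks already claimed this cycle
        for port, addr in enumerate(addrs):
            tag, index, block = split(addr)
            if self.tag_copies[port][index] != tag:
                results.append(("miss", None))
            elif block in granted:
                # Assumed policy: the lower-numbered port wins arbitration.
                results.append(("retry", None))
            else:
                granted.add(block)
                results.append(("hit", self.data[index]))
        return results
```

In this model the tag lookup and the arbitration both happen inside a single `cycle` call, which stands in for the single clock cycle the abstract describes.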

    MULTILEVEL CONVERSION TABLE CACHE FOR TRANSLATING GUEST INSTRUCTIONS TO NATIVE INSTRUCTIONS
    43.
    Type: Patent application
    Status: In force

    Publication number: US20130024619A1

    Publication date: 2013-01-24

    Application number: US13359961

    Filing date: 2012-01-27

    Abstract: A method for translating instructions for a processor. The method includes accessing a guest instruction and performing a first level translation of the guest instruction using a first level conversion table. The method further includes outputting a resulting native instruction when the first level translation proceeds to completion. A second level translation of the guest instruction is performed using a second level conversion table when the first level translation does not proceed to completion, wherein the second level translation further processes the guest instruction based upon a partial translation from the first level conversion table. The resulting native instruction is output when the second level translation proceeds to completion.
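
A rough sketch of the two-level lookup follows. The table contents, the byte-oriented guest encoding, and the names `LEVEL1`, `LEVEL2`, and `translate` are hypothetical; the abstract does not specify how the tables are keyed.

```python
# Hypothetical first-level table, keyed on the first guest byte.
# An entry is either a finished native instruction or a partial translation.
LEVEL1 = {
    0x01: ("native", "add"),
    0x0F: ("partial", "ext"),
}

# Hypothetical second-level table, keyed on the partial translation
# plus the next guest byte.
LEVEL2 = {
    ("ext", 0xA0): "ext.mul",
}


def translate(guest: bytes):
    """Return the native instruction for `guest`, or None if untranslatable."""
    entry = LEVEL1.get(guest[0])
    if entry is None:
        return None
    kind, payload = entry
    if kind == "native":
        return payload                 # first level ran to completion
    if len(guest) < 2:
        return None                    # nothing left for the second level
    # The second level continues from the first level's partial result.
    return LEVEL2.get((payload, guest[1]))
```

For example, `translate(bytes([0x01]))` completes at the first level, while `translate(bytes([0x0F, 0xA0]))` needs the second table.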

    REGISTER FILE SEGMENTS FOR SUPPORTING CODE BLOCK EXECUTION BY USING VIRTUAL CORES INSTANTIATED BY PARTITIONABLE ENGINES
    44.
    Type: Patent application
    Status: In force

    Publication number: US20120246450A1

    Publication date: 2012-09-27

    Application number: US13428438

    Filing date: 2012-03-23

    Abstract: A system for executing instructions using a plurality of register file segments for a processor. The system includes a global front end scheduler for receiving an incoming instruction sequence, wherein the global front end scheduler partitions the incoming instruction sequence into a plurality of code blocks of instructions and generates a plurality of inheritance vectors describing interdependencies between instructions of the code blocks. The system further includes a plurality of virtual cores of the processor coupled to receive code blocks allocated by the global front end scheduler, wherein each virtual core comprises a respective subset of resources of a plurality of partitionable engines, wherein the code blocks are executed by using the partitionable engines in accordance with a virtual core mode and in accordance with the respective inheritance vectors. A plurality of register file segments are coupled to the partitionable engines for providing data storage.
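
The front-end step can be illustrated with a toy model. The abstract does not say what an inheritance vector contains, so this sketch assumes it maps each register a code block reads to the earlier block that last wrote it. The fixed block size and the `(dest, sources)` instruction format are also assumptions.

```python
def partition(instrs, block_size):
    """Split an instruction sequence into fixed-size code blocks."""
    return [instrs[i:i + block_size] for i in range(0, len(instrs), block_size)]


def inheritance_vectors(blocks):
    """For each block, map every inherited register to its producer block.

    Each instruction is a (dest, sources) pair of register names.
    """
    last_writer = {}                   # register -> index of last writing block
    vectors = []
    for index, block in enumerate(blocks):
        vector = {}
        written_here = set()
        for dest, sources in block:
            for src in sources:
                # Only values produced by an earlier block are inherited.
                if src not in written_here and src in last_writer:
                    vector[src] = last_writer[src]
            written_here.add(dest)
        vectors.append(vector)
        for dest, _ in block:
            last_writer[dest] = index
    return vectors
```

A scheduler could then hand each block, together with its vector, to whichever virtual core is free; that allocation policy is outside this sketch.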

    VARIABLE CACHING STRUCTURE FOR MANAGING PHYSICAL STORAGE
    45.
    Type: Patent application
    Status: In force

    Publication number: US20120198168A1

    Publication date: 2012-08-02

    Application number: US13359939

    Filing date: 2012-01-27

    Abstract: A method for managing a variable caching structure for managing storage for a processor. The method includes using a multi-way tag array to store a plurality of pointers for a corresponding plurality of different size groups of physical storage of a storage stack, wherein the pointers indicate guest addresses that have corresponding converted native addresses stored within the storage stack, and allocating a group of storage blocks of the storage stack, wherein the size of the allocation is in accordance with a corresponding size of one of the plurality of different size groups. Upon a hit on the tag, a corresponding entry is accessed to retrieve a pointer that indicates where in the storage stack a corresponding group of storage blocks of converted native instructions resides. The converted native instructions are then fetched from the storage stack for execution.
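
A minimal sketch of such a structure is shown below, assuming one tag-array way per size group and a simple append-only storage stack. The group sizes, the direct-mapped indexing inside each way, and the absence of eviction are simplifications chosen for the example.

```python
GROUP_SIZES = [1, 2, 4]   # hypothetical: storage blocks per allocation, one way each


class VariableCache:
    def __init__(self, num_sets=4):
        self.num_sets = num_sets
        # ways[w][s] holds (guest_addr, pointer) or None; way w serves group w.
        self.ways = [[None] * num_sets for _ in GROUP_SIZES]
        self.stack = []   # storage stack holding converted native blocks

    def insert(self, guest_addr, native_blocks):
        """Allocate the smallest fitting group and record a pointer to it."""
        way = next(w for w, size in enumerate(GROUP_SIZES)
                   if size >= len(native_blocks))
        size = GROUP_SIZES[way]
        pointer = len(self.stack)
        padding = [None] * (size - len(native_blocks))
        self.stack.extend(list(native_blocks) + padding)
        self.ways[way][guest_addr % self.num_sets] = (guest_addr, pointer)

    def fetch(self, guest_addr):
        """On a tag hit, follow the pointer and return the native blocks."""
        index = guest_addr % self.num_sets
        for way, size in enumerate(GROUP_SIZES):
            entry = self.ways[way][index]
            if entry is not None and entry[0] == guest_addr:
                pointer = entry[1]
                group = self.stack[pointer:pointer + size]
                return [block for block in group if block is not None]
        return None       # miss in every way
```

A three-block translation lands in the four-block group, so its allocation occupies four stack slots with one left unused.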

    Executing partial-width packed data instructions
    46.
    Type: Patent application
    Status: In force

    Publication number: US20050216706A1

    Publication date: 2005-09-29

    Application number: US11126049

    Filing date: 2005-05-09

    Abstract: A method and apparatus are provided for executing packed data instructions. According to one aspect of the invention, a processor includes registers, a register renaming unit coupled to the registers, a decoder coupled to the register renaming unit, and a partial-width execution unit coupled to the decoder. The register renaming unit provides an architectural register file to store packed data operands that include data elements. The decoder is to decode a first and second set of instructions that each specify one or more registers in the architectural register file. Each of the instructions in the first set specifies operations to be performed on all of the data elements. In contrast, each of the instructions in the second set specifies operations to be performed on only a subset of the data elements. The partial-width execution unit is to execute operations specified by either the first or second set of instructions.
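
The difference between the two instruction sets can be sketched as follows. The unit width, the choice of addition as the operation, and the rule that untouched elements are copied from the first operand are assumptions for illustration; the abstract does not state them.

```python
UNIT_WIDTH = 2  # hypothetical: the execution unit handles 2 elements per pass


def unit_add(a_part, b_part):
    """One pass through the partial-width execution unit."""
    assert len(a_part) == len(b_part) <= UNIT_WIDTH
    return [x + y for x, y in zip(a_part, b_part)]


def execute_full(a, b):
    """First instruction set: operate on every data element, in several passes."""
    result = []
    for i in range(0, len(a), UNIT_WIDTH):
        result += unit_add(a[i:i + UNIT_WIDTH], b[i:i + UNIT_WIDTH])
    return result


def execute_subset(a, b, count=1):
    """Second instruction set: operate on the lowest `count` elements only.

    Assumption: the remaining elements pass through from the first operand.
    """
    low = unit_add(a[:count], b[:count])
    return low + list(a[count:])
```

With four-element operands, `execute_full` takes two passes through the unit, while `execute_subset` needs only one.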

    Method and apparatus for efficient vertical SIMD computations
    47.
    Type: Granted patent
    Status: Lapsed

    Publication number: US6115812A

    Publication date: 2000-09-05

    Application number: US53308

    Filing date: 1998-04-01

    CPC classification number: G06F9/30014 G06F9/30025 G06F9/30032 G06F9/30036

    Abstract: An apparatus and method for performing vertical parallel operations on packed data is described. A first set of data operands and a second set of data operands are accessed. Each of these sets of data represents graphical data stored in a first format. The first set of data operands is converted into a converted set and the second set of data operands is replicated to generate a replicated set. A vertical matrix multiplication is performed on the converted set and the replicated set to generate transformed graphical data.
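
One common way to realize this pattern is sketched below for a matrix-vector product. It assumes that "converting" the first set means transposing the matrix so each packed operand holds one column, and that "replicating" means broadcasting each vector element across a packed operand. The abstract confirms neither detail, and plain Python lists stand in for packed registers.

```python
def transpose(matrix):
    """Convert row-major rows into packed columns (assumed conversion step)."""
    return [list(column) for column in zip(*matrix)]


def replicate(vector, width):
    """Broadcast each vector element across a packed operand."""
    return [[element] * width for element in vector]


def vmul(a, b):
    """Vertical (element-wise) multiply of two packed operands."""
    return [x * y for x, y in zip(a, b)]


def vadd(a, b):
    """Vertical (element-wise) add of two packed operands."""
    return [x + y for x, y in zip(a, b)]


def vertical_matvec(matrix, vector):
    """Compute matrix @ vector using only vertical operations."""
    columns = transpose(matrix)
    broadcasts = replicate(vector, len(matrix))
    acc = [0] * len(matrix)
    for column, broadcast in zip(columns, broadcasts):
        acc = vadd(acc, vmul(column, broadcast))
    return acc
```

Because every step is element-wise, no horizontal add across the lanes of a single operand is needed, which is the usual motivation for the vertical layout.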

    Method and apparatus for handling imprecise exceptions
    48.
    Type: Granted patent
    Status: Lapsed

    Publication number: US6085312A

    Publication date: 2000-07-04

    Application number: US052994

    Filing date: 1998-03-31

    Abstract: A method and apparatus for updating the architectural state in a system implementing staggered execution with multiple micro-instructions. According to one aspect of the invention, a method is provided in which a macro-instruction is decoded into first and second micro-instructions. The macro-instruction designates an operation on a piece of data, and execution of the first and second micro-instructions separately causes the operation to be performed on different parts of the piece of data. The method also requires that the first micro-instruction is executed irrespective of the second micro-instruction (e.g., at a different time), and that it is detected that said second micro-instruction will not cause any non-recoverable exceptions. The results of the first micro-instruction are then used to update the architectural state in an earlier clock cycle than said second micro-instruction.
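
The commit ordering can be illustrated with a toy model in which a packed integer division is split into a low-half and a high-half micro-instruction. Using division by zero as the non-recoverable exception, and the `staggered_div` name and trace strings, are assumptions for the example.

```python
def staggered_div(state, dest, a, b):
    """Execute a packed floor division as two half-width micro-instructions.

    `state` maps register names to lists and models the architectural state.
    Returns a trace of what happened, in order.
    """
    half = len(a) // 2
    trace = []
    if 0 in b[:half]:
        return trace + ["fault in first micro-instruction"]
    low = [x // y for x, y in zip(a[:half], b[:half])]
    trace.append("first micro-instruction executed")
    # Before committing the first half, check that the second half cannot fault.
    if 0 in b[half:]:
        return trace + ["fault in second micro-instruction, nothing committed"]
    state[dest][:half] = low           # early commit of the first result
    trace.append("first result committed")
    high = [x // y for x, y in zip(a[half:], b[half:])]
    state[dest][half:] = high          # second result commits later
    trace.append("second result committed")
    return trace
```

If the second half would fault, the destination register keeps its old contents, so the architectural state never shows a half-updated value.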

    Booth multiplier for handling variable width operands
    49.
    Type: Granted patent
    Status: Lapsed

    Publication number: US06035318A

    Publication date: 2000-03-07

    Application number: US50993

    Filing date: 1998-03-31

    CPC classification number: G06F7/5338 G06F2207/382 G06F2207/3828

    Abstract: A circuit for generating partial products for variable width multiplication operations is provided. According to an embodiment of the present invention, the circuit includes a plurality of partial product selector groups, each partial product selector group includes a plurality of partial product selector circuits. Each partial product selector circuit receives a portion of a multiplicand as an input and outputs a partial product. The circuit also includes a plurality of Booth encoders. At least one of the Booth encoders is coupled to each partial product selector group. Each Booth encoder receives as an input a portion of a wide multiplier and outputs a Booth encoded value to at least a portion of a partial product selector group. The circuit further includes an override circuit coupled to one or more of the partial product selector circuits. The override circuit is operable to control one or more of the partial product selector circuits to output a zero to thereby avoid unwanted cross-products when performing multiple smaller multiplications using the same circuit.
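
The role of the override can be shown with a loose arithmetic model. It uses radix-4 Booth encoding, which is a standard choice but an assumption here, and represents lanes as a Python list of signed integers. With the override off, cross-products are added at the same bit position as the wanted products, which is a simplification of where they would fall in real hardware.

```python
def booth_digits(value, bits):
    """Radix-4 Booth digits (least significant first) of a signed value.

    `bits` must be even; each digit lies in {-2, -1, 0, 1, 2}.
    """
    value &= (1 << bits) - 1
    digits = []
    prev = 0                           # bit to the right of the current pair
    for i in range(0, bits, 2):
        b0 = (value >> i) & 1
        b1 = (value >> (i + 1)) & 1
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def lane_products(a_lanes, b_lanes, bits, override=True):
    """Multiply lane k of `a_lanes` by lane k of `b_lanes`.

    A single lane models one wide multiplication; several lanes model
    multiple smaller multiplications sharing the same circuit.
    """
    results = [0] * len(b_lanes)
    for j, b in enumerate(b_lanes):                  # Booth encoder group j
        for i, digit in enumerate(booth_digits(b, bits)):
            for k, a in enumerate(a_lanes):          # selector for multiplicand lane k
                if override and j != k:
                    partial = 0                      # override forces a zero output
                else:
                    partial = digit * a
                results[j] += partial << (2 * i)
    return results
```

With the override enabled, `lane_products([3, -5], [7, -2], 8)` yields the two independent products; with it disabled, each result also picks up the other lane's multiplicand.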
