AN INSTRUCTION DEFINITION TO IMPLEMENT LOAD STORE REORDERING AND OPTIMIZATION
    1.
    Invention application
    Status: under examination (published)

    Publication No.: WO2013188696A2

    Publication date: 2013-12-19

    Application No.: PCT/US2013045722

    Filing date: 2013-06-13

    Abstract: A method for forwarding data from store instructions to a corresponding load instruction in an out of order processor. The method includes accessing an incoming sequence of instructions, and of said sequence of instructions, splitting store instructions into a store address instruction and a store data instruction, wherein the store address performs address calculation and fetch, and wherein the store data performs a store of register contents to a memory address. The method further includes, of said sequence of instructions, splitting load instructions into a load address instruction and a load data instruction, wherein the load address performs address calculation and fetch, and wherein the load data performs a load of memory address contents into a register, and reordering the store address and load address instructions earlier and further away from the LD/SD in the instruction sequence to enable earlier dispatch and execution of the loads and the stores.
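
The splitting and reordering described in this abstract can be illustrated with a small software model. This is only a sketch under assumptions: the `Insn` record, the opcode names `LA`/`SA`/`SD`, and the hoisting rule (move an address instruction up until it meets the instruction that writes its address register, or another address instruction) are hypothetical and are not taken from the patent.

```python
from typing import NamedTuple, Optional, Tuple


class Insn(NamedTuple):
    op: str                  # "LD", "ST", "LA", "SA", "SD", or an ALU op such as "ADD"
    dest: Optional[str]      # register written, or None
    srcs: Tuple[str, ...]    # registers read


def split(program):
    """Split each LD into LA + LD and each ST into SA + SD (hypothetical encoding)."""
    out = []
    for insn in program:
        if insn.op == "LD":
            addr_reg = insn.srcs[0]
            out.append(Insn("LA", None, (addr_reg,)))        # address calculation / fetch
            out.append(Insn("LD", insn.dest, (addr_reg,)))   # memory contents -> register
        elif insn.op == "ST":
            data_reg, addr_reg = insn.srcs
            out.append(Insn("SA", None, (addr_reg,)))            # address calculation / fetch
            out.append(Insn("SD", None, (data_reg, addr_reg)))   # register contents -> memory
        else:
            out.append(insn)
    return out


def hoist_addresses(program):
    """Move LA/SA as early as possible without crossing the writer of their address register."""
    result = []
    for insn in program:
        if insn.op in ("LA", "SA"):
            pos = len(result)
            while (pos > 0
                   and result[pos - 1].dest not in insn.srcs
                   and result[pos - 1].op not in ("LA", "SA")):
                pos -= 1
            result.insert(pos, insn)
        else:
            result.append(insn)
    return result


program = [
    Insn("ADD", "r3", ("r4", "r5")),
    Insn("LD", "r1", ("r2",)),
    Insn("ST", None, ("r1", "r6")),
]
reordered = hoist_addresses(split(program))
print([i.op for i in reordered])
```

In this model the address instructions end up ahead of the unrelated `ADD`, while the data halves keep their original order.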


    MULTILEVEL CONVERSION TABLE CACHE FOR TRANSLATING GUEST INSTRUCTIONS TO NATIVE INSTRUCTIONS
    2.
    Invention application
    Status: under examination (published)

    Publication No.: WO2012103253A2

    Publication date: 2012-08-02

    Application No.: PCT/US2012022598

    Filing date: 2012-01-25

    Abstract: A method for translating instructions for a processor. The method includes accessing a guest instruction and performing a first level translation of the guest instruction using a first level conversion table. The method further includes outputting a resulting native instruction when the first level translation proceeds to completion. A second level translation of the guest instruction is performed using a second level conversion table when the first level translation does not proceed to completion, wherein the second level translation further processes the guest instruction based upon a partial translation from the first level conversion table. The resulting native instruction is output when the second level translation proceeds to completion.
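
A minimal sketch of the two-level lookup described in this abstract follows. The table contents, the byte-keyed layout, and the mnemonic strings are invented for illustration; the patent does not specify them here.

```python
# Hypothetical first-level table: keyed by the first guest byte.
# An entry either completes the translation or hands a partial result
# to a named second-level table.
FIRST_LEVEL = {
    0x01: {"complete": True, "native": "ADD"},
    0x0F: {"complete": False, "partial": "EXT", "table": "ext"},
}

# Hypothetical second-level tables: keyed by the second guest byte.
SECOND_LEVEL = {
    "ext": {0x10: "MOVX", 0x20: "CMPX"},
}


def translate(guest_bytes):
    """Return a native mnemonic, or None if neither level can translate."""
    entry = FIRST_LEVEL.get(guest_bytes[0])
    if entry is None:
        return None
    if entry["complete"]:
        # First-level translation proceeded to completion.
        return entry["native"]
    # Second level continues from the partial first-level translation.
    if len(guest_bytes) < 2:
        return None
    suffix = SECOND_LEVEL[entry["table"]].get(guest_bytes[1])
    if suffix is None:
        return None
    return entry["partial"] + "." + suffix


print(translate(bytes([0x01])))
print(translate(bytes([0x0F, 0x10])))
```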


    SINGLE CYCLE MULTI-BRANCH PREDICTION INCLUDING SHADOW CACHE FOR EARLY FAR BRANCH PREDICTION
    3.
    Invention application
    Status: under examination (published)

    Publication No.: WO2012037491A3

    Publication date: 2012-05-24

    Application No.: PCT/US2011051992

    Filing date: 2011-09-16

    Abstract: A method of identifying instructions including accessing a plurality of instructions that comprise multiple branch instructions. For each branch instruction of the multiple branch instructions, a respective first mask is generated representing instructions that are executed if a branch is taken. A respective second mask is generated representing instructions that are executed if the branch is not taken. A prediction output is received that comprises a respective branch prediction for each branch instruction. For each branch instruction, the prediction output is used to select a respective resultant mask from among the respective first and second masks. For each branch instruction, a resultant mask of a subsequent branch is invalidated if a previous branch is predicted to branch over said subsequent branch. A logical operation is performed on all resultant masks to produce a final mask. The final mask is used to select a subset of instructions for execution.
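
The mask selection in this abstract can be sketched as follows. The sketch assumes a fetched block of `n` instructions, forward branches only, bit `i` of a mask standing for instruction `i`, and AND as the combining logical operation; these choices and all names are illustrative assumptions, not details from the patent.

```python
def final_mask(n, branches, predictions):
    """Combine per-branch masks into one mask of instructions to execute.

    branches:    list of (position, target) pairs sorted by position,
                 forward targets only (assumption of this sketch).
    predictions: list of booleans, True meaning "predicted taken".
    """
    all_ones = (1 << n) - 1
    final = all_ones
    skip_until = 0  # instructions below this index were branched over
    for (pos, target), taken in zip(branches, predictions):
        if pos < skip_until:
            # A previous branch is predicted to branch over this one:
            # its resultant mask is invalidated and does not contribute.
            continue
        if taken:
            keep_low = (1 << (pos + 1)) - 1                       # 0 .. pos
            keep_high = all_ones & ~((1 << min(target, n)) - 1)   # target .. n-1
            mask = keep_low | keep_high
            skip_until = target
        else:
            mask = all_ones  # fall through: nothing is skipped
        final &= mask
    return final


def selected(mask, n):
    """List the instruction indices whose bit is set in the mask."""
    return [i for i in range(n) if mask >> i & 1]


block = [(1, 4), (2, 6), (5, 7)]
print(selected(final_mask(8, block, [True, True, False]), 8))
```

In the usage line, the branch at position 1 is predicted taken to position 4, so the branch at position 2 is skipped and its mask is ignored.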


    DECENTRALIZED ALLOCATION OF RESOURCES AND INTERCONNECT STRUCTURES TO SUPPORT THE EXECUTION OF INSTRUCTION SEQUENCES BY A PLURALITY OF ENGINES
    4.
    Invention application
    Status: under examination (published)

    Publication No.: WO2012162188A2

    Publication date: 2012-11-29

    Application No.: PCT/US2012038711

    Filing date: 2012-05-18

    Abstract: A method for decentralized resource allocation in an integrated circuit. The method includes receiving a plurality of requests from a plurality of resource consumers of a plurality of partitionable engines to access a plurality of resources, wherein the resources are spread across the plurality of engines and are accessed via a global interconnect structure. At each resource, a number of requests for access to said each resource are added. At said each resource, the number of requests is compared against a threshold limiter. At said each resource, a subsequent request that is received that exceeds the threshold limiter is canceled. Subsequently, requests that are not canceled within a current clock cycle are implemented.
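
The per-resource counting and cancellation can be modeled in a few lines. This is a sketch assuming one call represents one clock cycle and that requests arrive in list order; the function name, the tuple layout, and the example resource names are hypothetical.

```python
def arbitrate(requests, thresholds):
    """Model one clock cycle of per-resource request limiting.

    requests:   list of (consumer, resource) pairs in arrival order.
    thresholds: dict mapping each resource to its threshold limiter.
    Returns (granted, canceled) lists of requests.
    """
    counts = {}
    granted, canceled = [], []
    for consumer, resource in requests:
        # Each resource adds up the requests addressed to it.
        counts[resource] = counts.get(resource, 0) + 1
        if counts[resource] > thresholds[resource]:
            canceled.append((consumer, resource))   # exceeds the limiter
        else:
            granted.append((consumer, resource))    # implemented this cycle
    return granted, canceled


cycle = [("e0", "port_a"), ("e1", "port_a"), ("e2", "port_a"), ("e1", "port_b")]
granted, canceled = arbitrate(cycle, {"port_a": 2, "port_b": 1})
print(granted, canceled)
```

Each resource decides using only its own counter, which is the decentralized aspect the abstract emphasizes.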


    A VIRTUAL LOAD STORE QUEUE HAVING A DYNAMIC DISPATCH WINDOW WITH A UNIFIED STRUCTURE
    6.
    Invention application
    Status: under examination (published)

    Publication No.: WO2013188705A2

    Publication date: 2013-12-19

    Application No.: PCT/US2013045734

    Filing date: 2013-06-13

    Abstract: An out of order processor. The processor includes a virtual load store queue for allocating a plurality of loads and a plurality of stores, wherein more loads and more stores can be accommodated beyond an actual physical size of the load store queue of the processor; wherein the processor allocates other instructions besides loads and stores beyond the actual physical size limitation of the load/store queue; and wherein the other instructions can be dispatched and executed even though intervening loads or stores do not have spaces in the load store queue.


    HARDWARE ACCELERATION COMPONENTS FOR TRANSLATING GUEST INSTRUCTIONS TO NATIVE INSTRUCTIONS
    7.
    Invention application
    Status: under examination (published)

    Publication No.: WO2012103359A3

    Publication date: 2012-09-20

    Application No.: PCT/US2012022760

    Filing date: 2012-01-26

    Abstract: A hardware based translation accelerator. The hardware includes a guest fetch logic component for accessing guest instructions; a guest fetch buffer coupled to the guest fetch logic component and a branch prediction component for assembling guest instructions into a guest instruction block; and conversion tables coupled to the guest fetch buffer for translating the guest instruction block into a corresponding native conversion block. The hardware further includes a native cache coupled to the conversion tables for storing the corresponding native conversion block, and a conversion look aside buffer coupled to the native cache for storing a mapping of the guest instruction block to corresponding native conversion block, wherein upon a subsequent request for a guest instruction, the conversion look aside buffer is indexed to determine whether a hit occurred, wherein the mapping indicates the guest instruction has a corresponding converted native instruction in the native cache.


    GUEST INSTRUCTION TO NATIVE INSTRUCTION RANGE BASED MAPPING USING A CONVERSION LOOK ASIDE BUFFER OF A PROCESSOR
    8.
    Invention application
    Status: under examination (published)

    Publication No.: WO2012103209A2

    Publication date: 2012-08-02

    Application No.: PCT/US2012022538

    Filing date: 2012-01-25

    Abstract: A method for translating instructions for a processor. The method includes accessing a plurality of guest instructions that comprise multiple guest branch instructions, and assembling the plurality of guest instructions into a guest instruction block. The guest instruction block is converted into a corresponding native conversion block. The native conversion block is stored into a native cache. A mapping of the guest instruction block to corresponding native conversion block is stored in a conversion look aside buffer. Upon a subsequent request for a guest instruction, the conversion look aside buffer is indexed to determine whether a hit occurred, wherein the mapping indicates whether the guest instruction has a corresponding converted native instruction in the native cache. The converted native instruction is forwarded for execution in response to the hit.
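
The hit and miss flow through the conversion look aside buffer can be sketched in software. The class name, the dictionary-based mapping keyed by guest block address, and the stand-in `convert` callback are all assumptions made for illustration; the abstract describes a hardware structure and does not give this interface.

```python
class ConversionLookAsideBuffer:
    """Toy model: maps guest block addresses to converted native blocks."""

    def __init__(self, convert):
        self.convert = convert    # hypothetical guest-block -> native-block converter
        self.mapping = {}         # guest block address -> index in native_cache
        self.native_cache = []    # stored native conversion blocks

    def fetch(self, guest_addr, guest_block):
        """Return (native_block, hit) for a requested guest block."""
        index = self.mapping.get(guest_addr)
        if index is not None:
            # Hit: the mapping says a converted block already exists.
            return self.native_cache[index], True
        # Miss: convert, store in the native cache, and record the mapping.
        native_block = self.convert(guest_block)
        self.native_cache.append(native_block)
        self.mapping[guest_addr] = len(self.native_cache) - 1
        return native_block, False


clb = ConversionLookAsideBuffer(lambda block: ["native_" + g for g in block])
first = clb.fetch(0x1000, ["mov", "jmp"])
second = clb.fetch(0x1000, ["mov", "jmp"])
print(first, second)
```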


    AN INSTRUCTION SEQUENCE BUFFER TO STORE BRANCHES HAVING RELIABLY PREDICTABLE INSTRUCTION SEQUENCES
    9.
    Invention application
    Status: under examination (published)

    Publication No.: WO2012051281A3

    Publication date: 2012-07-19

    Application No.: PCT/US2011055943

    Filing date: 2011-10-12

    Abstract: A method for outputting reliably predictable instruction sequences. The method includes tracking repetitive hits to determine a set of frequently hit instruction sequences for a microprocessor, and out of that set, identifying a branch instruction having a series of subsequent frequently executed branch instructions that form a reliably predictable instruction sequence. The reliably predictable instruction sequence is stored into a buffer. On a subsequent hit to the branch instruction, the reliably predictable instruction sequence is output from the buffer.
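
One way to picture the hit tracking and buffering is the toy model below. It assumes a simple repeat-count threshold as the test for "reliably predictable"; the class, the threshold value, and the address and sequence values are hypothetical and not drawn from the patent.

```python
class SequenceBuffer:
    """Toy model: buffer a branch's follow-on sequence once it repeats often enough."""

    def __init__(self, threshold=3):
        self.threshold = threshold   # assumed criterion for "reliably predictable"
        self.hits = {}               # (branch address, sequence) -> repeat count
        self.buffer = {}             # branch address -> stored sequence

    def lookup(self, branch_addr, observed_sequence):
        """Return the buffered sequence on a hit, else record the observation."""
        if branch_addr in self.buffer:
            return self.buffer[branch_addr]
        key = (branch_addr, tuple(observed_sequence))
        self.hits[key] = self.hits.get(key, 0) + 1
        if self.hits[key] >= self.threshold:
            # The sequence has repeated enough times: store it for later hits.
            self.buffer[branch_addr] = list(observed_sequence)
        return None


isb = SequenceBuffer(threshold=2)
results = [isb.lookup(0x40, [0x48, 0x60, 0x7C]) for _ in range(3)]
print(results)
```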


    A VIRTUAL LOAD STORE QUEUE HAVING A DYNAMIC DISPATCH WINDOW WITH A DISTRIBUTED STRUCTURE
    10.
    Invention application
    Status: under examination (published)

    Publication No.: WO2013188460A3

    Publication date: 2014-03-27

    Application No.: PCT/US2013045261

    Filing date: 2013-06-11

    Abstract: An out of order processor. The processor includes a distributed load queue and a distributed store queue that maintain single program sequential semantics while allowing an out of order dispatch of loads and stores across a plurality of cores and memory fragments; wherein the processor allocates other instructions besides loads and stores beyond the actual physical size limitation of the load/store queue; and wherein the other instructions can be dispatched and executed even though intervening loads or stores do not have spaces in the load store queue.

