Methods And Apparatuses For Efficient Load Processing Using Buffers
    1.
    Invention patent application
    Methods And Apparatuses For Efficient Load Processing Using Buffers (status: in force)

    Publication No.: US20110154002A1

    Publication date: 2011-06-23

    Application No.: US12640707

    Filing date: 2009-12-17

    IPC classification: G06F9/38

    Abstract: Various embodiments of the invention concern methods and apparatuses for power and time efficient load handling. A compiler may identify producer loads, consumer reuse loads, consumer forwarded loads, and producer/consumer hybrid loads. Based on this identification, performance of the load may be efficiently directed to a load value buffer, store buffer, data cache, or elsewhere. Consequently, accesses to cache are reduced, through direct loading from load value buffers and store buffers, thereby efficiently processing the loads.
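
The routing idea in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions: the abstract names the four load classes and the three destinations but does not spell out which class goes where, so the mapping below (producer loads fill the load value buffer, consumer reuse loads read it, consumer forwarded loads read the store buffer) is assumed, and `LoadKind`, `LoadUnit`, and `cache_accesses` are hypothetical names. Store-buffer draining and coherence are omitted.

```python
from enum import Enum, auto


class LoadKind(Enum):
    """Load classes a compiler might attach to each load (names assumed)."""
    PRODUCER = auto()
    CONSUMER_REUSE = auto()
    CONSUMER_FORWARDED = auto()
    HYBRID = auto()


class LoadUnit:
    """Toy model: route a load to a buffer when possible, else to the cache."""

    def __init__(self, memory):
        self.data_cache = dict(memory)
        self.load_value_buffer = {}
        self.store_buffer = {}
        self.cache_accesses = 0  # counts how often the data cache is touched

    def store(self, addr, value):
        # Pending stores sit in the store buffer (draining is not modeled).
        self.store_buffer[addr] = value

    def load(self, addr, kind):
        # Consumer forwarded loads take their value from a pending store.
        if kind is LoadKind.CONSUMER_FORWARDED and addr in self.store_buffer:
            return self.store_buffer[addr]
        # Consumer reuse and hybrid loads reuse a previously loaded value.
        if kind in (LoadKind.CONSUMER_REUSE, LoadKind.HYBRID):
            if addr in self.load_value_buffer:
                return self.load_value_buffer[addr]
        # Everything else falls back to the data cache.
        self.cache_accesses += 1
        value = self.data_cache[addr]
        # Producer and hybrid loads leave the value behind for later consumers.
        if kind in (LoadKind.PRODUCER, LoadKind.HYBRID):
            self.load_value_buffer[addr] = value
        return value
```

In this model, a producer load followed by a consumer reuse load of the same address touches the data cache once rather than twice, which is the kind of saving the abstract describes.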

    Methods and apparatuses for efficient load processing using buffers
    2.
    Granted invention patent
    Methods and apparatuses for efficient load processing using buffers (status: in force)

    Publication No.: US08452946B2

    Publication date: 2013-05-28

    Application No.: US12640707

    Filing date: 2009-12-17

    IPC classification: G06F9/30 G06F9/40 G06F15/00

    Abstract: Various embodiments of the invention concern methods and apparatuses for power and time efficient load handling. A compiler may identify producer loads, consumer reuse loads, consumer forwarded loads, and producer/consumer hybrid loads. Based on this identification, performance of the load may be efficiently directed to a load value buffer, store buffer, data cache, or elsewhere. Consequently, accesses to cache are reduced, through direct loading from load value buffers and store buffers, thereby efficiently processing the loads.

Software constructed strands for execution on a multi-core architecture
    3.
    Invention patent application
    Software constructed strands for execution on a multi-core architecture (status: in force)

    Publication No.: US20090077360A1

    Publication date: 2009-03-19

    Application No.: US11901644

    Filing date: 2007-09-18

    IPC classification: G06F9/44 G06F9/38

    CPC classification: G06F8/433

    Abstract: In one embodiment, the present invention includes a software-controlled method of forming instruction strands. The software may include instructions to obtain code of a superblock including a plurality of basic blocks, build a dependency directed acyclic graph (DAG) for the code, sort nodes coupled by edges of the dependency DAG into a topological order, form strands from the nodes based on hardware constraints, rule constraints, and scheduling constraints, and generate executable code for the strands and store the executable code in a storage. Other embodiments are described and claimed.
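
The pipeline named in this abstract (dependency DAG, topological order, strand formation) can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not define the hardware, rule, or scheduling constraints, so a single hypothetical `max_len` limit stands in for all of them, and the greedy chaining rule in `form_strands` is an assumption. It uses the standard-library `graphlib` module (Python 3.9+).

```python
from graphlib import TopologicalSorter


def form_strands(deps, max_len):
    """Group instructions into dependent chains ("strands").

    deps maps each instruction to a tuple of the instructions it depends on.
    max_len is a stand-in for the (unspecified) hardware constraints.
    Returns the topological order and the list of strands.
    """
    # Sort the DAG so every instruction appears after its predecessors.
    order = list(TopologicalSorter(deps).static_order())

    strands = []
    tail_of = {}  # maps the last instruction of a strand to that strand's index

    for node in order:
        placed = False
        for pred in deps.get(node, ()):
            idx = tail_of.get(pred)
            # Extend a strand only at its tail and only while it has room.
            if idx is not None and len(strands[idx]) < max_len:
                strands[idx].append(node)
                del tail_of[pred]
                tail_of[node] = idx
                placed = True
                break
        if not placed:
            # No extendable predecessor strand: start a new one.
            strands.append([node])
            tail_of[node] = len(strands) - 1
    return order, strands


if __name__ == "__main__":
    example = {
        "load_a": (),
        "load_b": (),
        "add": ("load_a", "load_b"),
        "mul": ("add",),
        "store": ("mul",),
    }
    order, strands = form_strands(example, max_len=3)
    print(order)
    print(strands)
```

Each resulting strand is a chain in which every instruction depends on the one before it, so a strand can be issued in order while independent strands are scheduled separately.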

    Software constructed strands for execution on a multi-core architecture
    4.
    Granted invention patent
    Software constructed strands for execution on a multi-core architecture (status: in force)

    Publication No.: US08789031B2

    Publication date: 2014-07-22

    Application No.: US11901644

    Filing date: 2007-09-18

    IPC classification: G06F9/45

    CPC classification: G06F8/433

    Abstract: In one embodiment, the present invention includes a software-controlled method of forming instruction strands. The software may include instructions to obtain code of a superblock including a plurality of basic blocks, build a dependency directed acyclic graph (DAG) for the code, sort nodes coupled by edges of the dependency DAG into a topological order, form strands from the nodes based on hardware constraints, rule constraints, and scheduling constraints, and generate executable code for the strands and store the executable code in a storage. Other embodiments are described and claimed.

    DYNAMIC DATA SYNCHRONIZATION IN THREAD-LEVEL SPECULATION
    5.
    Invention patent application
    DYNAMIC DATA SYNCHRONIZATION IN THREAD-LEVEL SPECULATION (status: under examination, published)

    Publication No.: US20110320781A1

    Publication date: 2011-12-29

    Application No.: US12826287

    Filing date: 2010-06-29

    Applicant: Wei Liu, Youfeng Wu

    Inventor: Wei Liu, Youfeng Wu

    IPC classification: G06F9/312 G06F12/02 G06F12/08

    Abstract: In one embodiment, the present invention introduces a speculation engine to parallelize serial instructions by creating separate threads from the serial instructions and inserting processor instructions to set a synchronization bit before a dependence source and to clear the synchronization bit after a dependence source, where the synchronization bit is designed to stall a dependence sink from a thread running on a separate core. Other embodiments are described and claimed.
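
A rough software analogy of the synchronization bit is sketched below, assuming Python threads in place of cores and a `threading.Event` in place of a hardware bit. `SyncBit` and `run_demo` are hypothetical names, and the bit is set before the threads start to keep the example deterministic; the abstract describes inserted processor instructions, not a software lock.

```python
import threading


class SyncBit:
    """Software stand-in for a hardware synchronization bit (assumed design)."""

    def __init__(self):
        self._cleared = threading.Event()
        self._cleared.set()  # bit starts cleared, so nobody stalls

    def set(self):
        self._cleared.clear()  # bit set: dependence sinks must stall

    def clear(self):
        self._cleared.set()  # bit cleared: stalled sinks may proceed

    def wait_until_clear(self):
        self._cleared.wait()


def run_demo():
    shared = {"x": 0}
    result = []
    bit = SyncBit()

    # Inserted before the dependence source: set the bit.
    bit.set()

    def source_thread():
        shared["x"] = 41  # dependence source (the write)
        bit.clear()       # inserted after the dependence source

    def sink_thread():
        bit.wait_until_clear()         # dependence sink stalls while bit is set
        result.append(shared["x"] + 1)  # reads the value the source produced

    sink = threading.Thread(target=sink_thread)
    source = threading.Thread(target=source_thread)
    sink.start()
    source.start()
    sink.join()
    source.join()
    return result[0]


if __name__ == "__main__":
    print(run_demo())
```

Even though the sink thread is started first, it cannot read `shared["x"]` until the source thread has written it and cleared the bit.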

    DYNAMIC OPTIMIZATION FOR CONDITIONAL COMMIT
    6.
    Invention patent application
    DYNAMIC OPTIMIZATION FOR CONDITIONAL COMMIT (status: under examination, published)

    Publication No.: US20120079245A1

    Publication date: 2012-03-29

    Application No.: US12890638

    Filing date: 2010-09-25

    IPC classification: G06F9/312 G06F9/38 G06F9/30

    Abstract: An apparatus and method are described herein for conditionally committing and/or speculatively checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enables a dynamic optimizer to more aggressively optimize code. The conditional commit enables efficient execution of the dynamically optimized code, while attempting to prevent transactions from running out of hardware resources. The speculative checkpoints, in turn, enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, such as by including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. The processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.
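
The conditional commit idea can be modeled in a few lines. This is a sketch under assumptions: the abstract does not say how hardware decides whether resources are running low, so the check below (commit early when the free write-buffer slots cannot hold the next region's writes) is assumed, and `Transaction`, `conditional_commit`, and `run_regions` are hypothetical names. Speculative checkpoints are not modeled.

```python
class Transaction:
    """Toy transaction with a bounded write buffer (capacity is assumed)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.pending = []    # speculative writes not yet committed
        self.committed = []  # writes made globally visible
        self.commits = 0

    def write(self, value):
        if len(self.pending) >= self.capacity:
            raise RuntimeError("abort: transaction ran out of buffer space")
        self.pending.append(value)

    def commit(self):
        self.committed.extend(self.pending)
        self.pending.clear()
        self.commits += 1

    def conditional_commit(self, expected_writes):
        """Commit early only if the upcoming writes would not fit."""
        free = self.capacity - len(self.pending)
        if free < expected_writes:
            self.commit()
            return True
        return False


def run_regions(regions, capacity):
    """Run code regions, with a conditional commit point before each one."""
    tx = Transaction(capacity)
    for region in regions:
        tx.conditional_commit(len(region))
        for value in region:
            tx.write(value)
    tx.commit()  # final, unconditional commit
    return tx
```

The transaction is effectively resized at run time: it spans as many regions as the buffer can hold and is split only where it would otherwise overflow and abort.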

    APPARATUS, METHOD, AND SYSTEM FOR IMPROVING POWER, PERFORMANCE EFFICIENCY BY COUPLING A FIRST CORE TYPE WITH A SECOND CORE TYPE
    7.
    Invention patent application
    APPARATUS, METHOD, AND SYSTEM FOR IMPROVING POWER, PERFORMANCE EFFICIENCY BY COUPLING A FIRST CORE TYPE WITH A SECOND CORE TYPE (status: under examination, published)

    Publication No.: US20110320766A1

    Publication date: 2011-12-29

    Application No.: US12826107

    Filing date: 2010-06-29

    IPC classification: G06F9/30 G06F15/76

    Abstract: An apparatus and method are described herein for coupling a processor core of a first type with a co-designed core of a second type. Execution of program code on the first core is monitored and hot sections of the program code are identified. Those hot sections are optimized for execution on the co-designed core, such that upon subsequently encountering those hot sections, the optimized hot sections are executed on the co-designed core. When the co-designed core is executing optimized hot code, the first processor core may be in a low-power state to save power or may execute other code in parallel. Furthermore, multiple threads of cold code may be pipelined on the first core, while multiple threads of hot code are pipelined on the co-designed core to achieve maximum performance.
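
The monitor-then-redirect flow in this abstract can be sketched with a simple execution counter. The abstract does not say how hot sections are detected, so the fixed `hot_threshold` and the names `HotSectionMonitor` and `execute` are assumptions made for illustration; the optimization step itself is reduced to marking a section as optimized.

```python
from collections import Counter

FIRST_CORE = "first core"
CO_DESIGNED_CORE = "co-designed core"


class HotSectionMonitor:
    """Counts section executions and redirects hot sections (threshold assumed)."""

    def __init__(self, hot_threshold):
        self.hot_threshold = hot_threshold
        self.counts = Counter()
        self.optimized = set()  # sections already optimized for the second core

    def execute(self, section_id):
        """Return the core that runs this execution of the section."""
        if section_id in self.optimized:
            # Hot section encountered again: run the optimized version.
            return CO_DESIGNED_CORE
        self.counts[section_id] += 1
        if self.counts[section_id] >= self.hot_threshold:
            # Section just became hot: optimize it for later encounters.
            self.optimized.add(section_id)
        return FIRST_CORE
```

With a threshold of 3, the first three executions of a section run on the first core and every later execution runs on the co-designed core.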

    INSTRUCTION AND LOGIC TO EFFICIENTLY MONITOR LOOP TRIP COUNT
    10.
    Invention patent application
    INSTRUCTION AND LOGIC TO EFFICIENTLY MONITOR LOOP TRIP COUNT (status: in force)

    Publication No.: US20140208085A1

    Publication date: 2014-07-24

    Application No.: US13996861

    Filing date: 2012-03-30

    IPC classification: G06F9/32

    Abstract: Logic and instruction to efficiently monitor loop trip count. Loop trip count information of a loop may be stored in a dedicated hardware buffer. Average loop trip count of the loop may be calculated based on the stored loop trip count information. Based on the average trip count, loop optimizations may be applied to or removed from the loop. The stored loop trip count information may include an identifier identifying the loop, a total loop trip count of the loop, and an exit count of the loop.
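
The bookkeeping described here can be sketched in software. The three stored fields (loop identifier, total trip count, exit count) come from the abstract; computing the average as total trip count divided by exit count is the natural reading but is an assumption, as are the names `TripCountBuffer` and `should_optimize` and the threshold value of 8.

```python
from dataclasses import dataclass


@dataclass
class TripCountEntry:
    loop_id: int
    total_trip_count: int = 0  # iterations summed over all runs of the loop
    exit_count: int = 0        # number of times the loop has exited


class TripCountBuffer:
    """Software model of the dedicated buffer (one entry per loop)."""

    def __init__(self):
        self.entries = {}

    def _entry(self, loop_id):
        return self.entries.setdefault(loop_id, TripCountEntry(loop_id))

    def record_iteration(self, loop_id):
        self._entry(loop_id).total_trip_count += 1

    def record_exit(self, loop_id):
        self._entry(loop_id).exit_count += 1

    def average_trip_count(self, loop_id):
        entry = self.entries.get(loop_id)
        if entry is None or entry.exit_count == 0:
            return None  # no completed run yet, so no average
        return entry.total_trip_count / entry.exit_count


def should_optimize(average, threshold=8):
    """Hypothetical policy: optimize loops whose average trip count is high."""
    return average is not None and average >= threshold
```

A loop that runs 10 iterations on one entry and 6 on the next has a total trip count of 16 and an exit count of 2, giving an average of 8.0.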
