Methods And Apparatuses For Efficient Load Processing Using Buffers
    131.
    Patent application
    Methods And Apparatuses For Efficient Load Processing Using Buffers (status: pending, published)

    Publication number: US20130246712A1

    Publication date: 2013-09-19

    Application number: US13886467

    Filing date: 2013-05-03

    Abstract: Various embodiments of the invention concern methods and apparatuses for power and time efficient load handling. A compiler may identify producer loads, consumer reuse loads, consumer forwarded loads, and producer/consumer hybrid loads. Based on this identification, performance of the load may be efficiently directed to a load value buffer, store buffer, data cache, or elsewhere. Consequently, accesses to cache are reduced, through direct loading from load value buffers and store buffers, thereby efficiently processing the loads.

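The routing idea in the abstract above can be illustrated with a toy model. This is a minimal Python sketch, not the patented design: the names `LoadKind`, `Core`, `load`, and `store` are hypothetical, and the per-class behavior (producer loads fill the load value buffer, reuse and hybrid loads read from it, forwarded loads read from the store buffer) is an assumption made for illustration. The abstract itself only states that loads may be directed to a load value buffer, store buffer, or data cache.

```python
from enum import Enum, auto


class LoadKind(Enum):
    """Hypothetical compiler-assigned load classes (names are assumptions)."""
    PRODUCER = auto()            # fetches a value that later loads will reuse
    CONSUMER_REUSE = auto()      # re-reads a value a producer load already fetched
    CONSUMER_FORWARDED = auto()  # reads a value that a recent store wrote
    HYBRID = auto()              # reuses a buffered value and keeps it for later loads
    PLAIN = auto()               # no hint: ordinary data cache access


class Core:
    """Toy memory model: a data cache plus two small buffers, all keyed by address."""

    def __init__(self, memory):
        self.cache = dict(memory)
        self.load_value_buffer = {}
        self.store_buffer = {}
        self.cache_accesses = 0

    def store(self, addr, value):
        # The store is held in the store buffer and, in this sketch, written through.
        self.store_buffer[addr] = value
        self.cache[addr] = value
        # Drop any stale copy so reuse loads never see an outdated value.
        self.load_value_buffer.pop(addr, None)

    def load(self, addr, kind):
        # Forwarded loads are served from the store buffer when possible.
        if kind is LoadKind.CONSUMER_FORWARDED and addr in self.store_buffer:
            return self.store_buffer[addr]
        # Reuse and hybrid loads are served from the load value buffer when possible.
        if kind in (LoadKind.CONSUMER_REUSE, LoadKind.HYBRID) and addr in self.load_value_buffer:
            return self.load_value_buffer[addr]
        # Everything else falls back to the data cache.
        self.cache_accesses += 1
        value = self.cache[addr]
        # Producer and hybrid loads leave the value behind for later consumers.
        if kind in (LoadKind.PRODUCER, LoadKind.HYBRID):
            self.load_value_buffer[addr] = value
        return value
```

In this model, a producer load followed by a consumer reuse load of the same address costs one cache access instead of two, which is the kind of saving the abstract refers to.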

    Methods and apparatuses for efficient load processing using buffers
    132.
    Granted patent
    Methods and apparatuses for efficient load processing using buffers (status: in force)

    Publication number: US08452946B2

    Publication date: 2013-05-28

    Application number: US12640707

    Filing date: 2009-12-17

    Abstract: Various embodiments of the invention concern methods and apparatuses for power and time efficient load handling. A compiler may identify producer loads, consumer reuse loads, consumer forwarded loads, and producer/consumer hybrid loads. Based on this identification, performance of the load may be efficiently directed to a load value buffer, store buffer, data cache, or elsewhere. Consequently, accesses to cache are reduced, through direct loading from load value buffers and store buffers, thereby efficiently processing the loads.


    Methods and apparatus to optimize the parallel execution of software processes
    133.
    Granted patent
    Methods and apparatus to optimize the parallel execution of software processes (status: in force)

    Publication number: US08316360B2

    Publication date: 2012-11-20

    Application number: US11537585

    Filing date: 2006-09-29

    CPC classification number: G06F8/451

    Abstract: Methods and apparatus to optimize the parallel execution of software processes are disclosed. An example method includes receiving a first software process that processes a set of data, locating a first primitive in the first software process, and decomposing the first primitive into a first set of one or more sub-primitives. The example methods and apparatus additionally perform static fusion and dynamic fusion to optimize software processes for execution in parallel processing systems.


    DYNAMIC CORE SELECTION FOR HETEROGENEOUS MULTI-CORE SYSTEMS
    134.
    Patent application
    DYNAMIC CORE SELECTION FOR HETEROGENEOUS MULTI-CORE SYSTEMS (status: in force)

    Publication number: US20120233477A1

    Publication date: 2012-09-13

    Application number: US13046031

    Filing date: 2011-03-11

    Abstract: Dynamically switching cores on a heterogeneous multi-core processing system may be performed by executing program code on a first processing core. Power up of a second processing core may be signaled. A first performance metric of the first processing core executing the program code may be collected. When the first performance metric is better than a previously determined core performance metric, power down of the second processing core may be signaled and execution of the program code may be continued on the first processing core. When the first performance metric is not better than the previously determined core performance metric, execution of the program code may be switched from the first processing core to the second processing core.

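The switching decision described in the abstract above can be written down as a short function. This is a minimal Python sketch under stated assumptions: the function name `choose_core` and the action strings are hypothetical, and a larger metric is assumed to be better (for example, instructions per cycle). The abstract does not say which metric is used.

```python
def choose_core(first_metric, previous_metric):
    """Decide where the program code keeps running.

    Returns (core, actions). Assumes a larger metric means better performance.
    """
    # The second core is signaled to power up before the comparison is made.
    actions = ["power_up_second"]
    if first_metric > previous_metric:
        # The first core is doing well enough: release the second core again.
        actions.append("power_down_second")
        return "first", actions
    # Otherwise execution moves over to the second core.
    actions.append("switch_to_second")
    return "second", actions
```

Note that an equal metric counts as "not better" in this sketch, so execution switches to the second core, matching the wording of the abstract.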

    DYNAMIC OPTIMIZATION FOR CONDITIONAL COMMIT
    136.
    Patent application
    DYNAMIC OPTIMIZATION FOR CONDITIONAL COMMIT (status: pending, published)

    Publication number: US20120079245A1

    Publication date: 2012-03-29

    Application number: US12890638

    Filing date: 2010-09-25

    Abstract: An apparatus and method are described herein for conditionally committing and/or speculatively checkpointing transactions, which potentially results in dynamic resizing of transactions. During dynamic optimization of binary code, transactions are inserted to provide memory ordering safeguards, which enable a dynamic optimizer to optimize code more aggressively. The conditional commit enables efficient execution of the dynamic optimization code while attempting to prevent transactions from running out of hardware resources, and the speculative checkpoints enable quick and efficient recovery upon abort of a transaction. Processor hardware is adapted to support dynamic resizing of the transactions, for example by including decoders that recognize a conditional commit instruction, a speculative checkpoint instruction, or both. Processor hardware is further adapted to perform operations to support conditional commit or speculative checkpointing in response to decoding such instructions.


    Compressing and accessing a microcode ROM
    137.
    Granted patent
    Compressing and accessing a microcode ROM (status: in force)

    Publication number: US08099587B2

    Publication date: 2012-01-17

    Application number: US11186240

    Filing date: 2005-07-20

    CPC classification number: G06F12/06 G06F8/4436 G06F9/30178 G06F2212/401

    Abstract: An arrangement is provided for compressing microcode ROM (“uROM”) in a processor and for efficiently accessing a compressed “uROM”. A clustering-based approach may be used to effectively compress a uROM. The approach groups similar columns of microcode into different clusters and identifies unique patterns within each cluster. Only unique patterns identified in each cluster are stored in a pattern storage. Indices, which help map an address of a microcode word (“uOP”) to be fetched from a uROM to unique patterns required for the uOP, may be stored in an index storage. Typically it takes a longer time to fetch a uOP from a compressed uROM than from an uncompressed uROM. The compressed uROM may be so designed that the process of fetching a uOP (or uOPs) from a compressed uROM may be fully-pipelined to reduce the access latency.

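The pattern-and-index scheme in the abstract above can be illustrated on a tiny example. This is a minimal Python sketch, not the patented implementation: the function names `compress` and `fetch` are hypothetical, microcode words are modeled as equal-length bit strings, and the column clusters are supplied by the caller, because the abstract does not say how column similarity is measured.

```python
def compress(words, clusters):
    """Split each word by column cluster and store only unique patterns.

    words:    list of equal-length bit strings, one per uROM address
    clusters: list of column-index lists; every column appears exactly once
    Returns (pattern_storage, index_storage).
    """
    pattern_storage = []                 # one list of unique patterns per cluster
    index_storage = [[] for _ in words]  # per address: one pattern index per cluster
    for cols in clusters:
        unique = {}                      # pattern -> index, in first-seen order
        for addr, word in enumerate(words):
            pattern = "".join(word[c] for c in cols)
            idx = unique.setdefault(pattern, len(unique))
            index_storage[addr].append(idx)
        pattern_storage.append(list(unique))
    return pattern_storage, index_storage


def fetch(addr, clusters, pattern_storage, index_storage):
    """Rebuild the microcode word at addr from its per-cluster patterns."""
    width = sum(len(cols) for cols in clusters)
    bits = [""] * width
    for cluster_id, cols in enumerate(clusters):
        pattern = pattern_storage[cluster_id][index_storage[addr][cluster_id]]
        for col, bit in zip(cols, pattern):
            bits[col] = bit
    return "".join(bits)
```

A fetch here takes two lookups (index, then pattern) instead of one, which mirrors the abstract's remark that reading a compressed uROM typically takes longer and motivates pipelining the access.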

    APPARATUS, METHOD, AND SYSTEM FOR IMPROVING POWER, PERFORMANCE EFFICIENCY BY COUPLING A FIRST CORE TYPE WITH A SECOND CORE TYPE
    138.
    Patent application
    APPARATUS, METHOD, AND SYSTEM FOR IMPROVING POWER, PERFORMANCE EFFICIENCY BY COUPLING A FIRST CORE TYPE WITH A SECOND CORE TYPE (status: pending, published)

    Publication number: US20110320766A1

    Publication date: 2011-12-29

    Application number: US12826107

    Filing date: 2010-06-29

    Abstract: An apparatus and method are described herein for coupling a processor core of a first type with a co-designed core of a second type. Execution of program code on the first core is monitored and hot sections of the program code are identified. Those hot sections are optimized for execution on the co-designed core, such that upon subsequently encountering those hot sections, the optimized hot sections are executed on the co-designed core. When the co-designed core is executing optimized hot code, the first processor core may be in a low-power state to save power or may execute other code in parallel. Furthermore, multiple threads of cold code may be pipelined on the first core, while multiple threads of hot code are pipelined on the co-designed core to achieve maximum performance.

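The monitor-then-redirect flow in the abstract above can be sketched with a simple execution counter. This is a minimal Python sketch under stated assumptions: the class name `HotSectionMonitor`, its `dispatch` method, and the count threshold are all hypothetical. The abstract does not say how hot sections are identified, so counting executions is only one plausible choice.

```python
from collections import Counter


class HotSectionMonitor:
    """Counts section executions on the first core and redirects hot sections."""

    def __init__(self, threshold):
        self.threshold = threshold   # executions before a section counts as hot (assumption)
        self.counts = Counter()
        self.optimized = set()       # sections already optimized for the co-designed core

    def dispatch(self, section):
        """Return the core that executes this encounter of the section."""
        if section in self.optimized:
            return "co-designed core"
        # Still cold: run on the first core and keep monitoring.
        self.counts[section] += 1
        if self.counts[section] >= self.threshold:
            # The section just became hot: optimize it for later encounters.
            self.optimized.add(section)
        return "first core"
```

With a threshold of 3, the first three encounters of a section run on the first core and every later encounter runs on the co-designed core.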

    Compiler technique for efficient register checkpointing to support transaction roll-back
    139.
    Granted patent
    Compiler technique for efficient register checkpointing to support transaction roll-back (status: in force)

    Publication number: US08001421B2

    Publication date: 2011-08-16

    Application number: US12856505

    Filing date: 2010-08-13

    CPC classification number: G06F9/3863 G06F9/3004 G06F9/3834 G06F11/1407

    Abstract: A method and apparatus for efficient register checkpointing is herein described. A transaction is detected in program code. A recovery block is inserted in the program code to perform recovery operations in response to an abort of the first transaction. A roll-back edge is potentially inserted from an abort point to the recovery block. A control flow edge is inserted from the recovery block to an entry point of the transaction. Checkpoint code is inserted before the entry point to back up live-in registers in backup storage elements, and recovery code is inserted in the recovery block to restore the live-in registers from the backup storage elements in response to an abort of the transaction.

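The checkpoint and recovery structure in the abstract above can be mimicked at a high level. This is a minimal Python sketch, not compiler output: registers are modeled as a dictionary, an abort as an exception, and the loop stands in for the control flow edge from the recovery block back to the transaction entry point. The names `Abort`, `run_transaction`, and `max_retries` are hypothetical.

```python
class Abort(Exception):
    """Raised by a transaction body to signal an abort."""


def run_transaction(registers, live_in, body, max_retries=3):
    """Run body(registers) as a transaction with register roll-back.

    live_in lists the registers the transaction reads before writing them.
    Returns True on commit, False if every attempt aborted.
    """
    # Checkpoint code: executed once, before the transaction entry point.
    backup = {name: registers[name] for name in live_in}
    for _ in range(max_retries + 1):
        try:
            body(registers)          # transaction entry point
            return True              # committed
        except Abort:
            # Recovery block: restore the live-in registers, then loop back
            # to the entry point (the control flow edge in the abstract).
            registers.update(backup)
    return False
```

Only the live-in registers are saved, which is what keeps the checkpoint small: registers that the transaction overwrites before reading do not need to be restored.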

    Transient fault detection by integrating an SRMT code and a non SRMT code in a single application
    140.
    Granted patent
    Transient fault detection by integrating an SRMT code and a non SRMT code in a single application (status: in force)

    Publication number: US07937621B2

    Publication date: 2011-05-03

    Application number: US11770095

    Filing date: 2007-06-28

    CPC classification number: G06F11/1487 G06F8/458 G06F9/4484

    Abstract: Disclosed is a method for running a first code generated by a Software-based Redundant Multi-Threading (SRMT) compiler along with a second code generated by a normal compiler at runtime, the first code including a first function and a second function, the second code including a third function. The method comprises running the first function in a leading thread and a tailing thread (104); running the third function in a single thread (106), where the leading thread calls the third function; and running the second function in the leading thread and the tailing thread (108), where the third function calls the second function. The present disclosure provides a mechanism for handling function calls wherein SRMT functions and binary functions can call each other irrespective of whether the callee function is an SRMT function or a binary function, and thereby dynamically adjusts the reliability and performance tradeoff based on run-time information and user-selectable policies.

