Tracing mechanism for recording shared memory interleavings on multi-core processors
    11.
    Invention patent (granted)
    Legal status: in force

    Publication No.: US09558118B2

    Publication date: 2017-01-31

    Application No.: US13997747

    Filing date: 2012-03-30

    Abstract: A memory race recorder (MRR) is provided. The MRR includes a multi-core processor having a relaxed memory consistency model, an extension to the multi-core processor, the extension to store chunks, each chunk having a chunk size (CS) and an instruction count (IC), and a plurality of cores to execute instructions. The plurality of cores executes load/store instructions to/from a store buffer (STB) and a simulated memory to store the value when the value is not in the STB. The oldest value in the STB is transferred to the simulated memory when the IC is equal to zero and the CS is greater than zero. The MRR logs a trace entry comprising the CS, the IC, and a global timestamp, the global timestamp providing a total order across all logged chunks.
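As a rough illustration of the replay side of this abstract, the Python sketch below models each core with a FIFO store buffer in front of a shared simulated memory. Loads check the STB first and fall back to the simulated memory; once a chunk's instruction count has reached zero while its chunk size is still positive, the oldest STB entry is drained. The names `TraceEntry`, `Core`, and `replay`, the instruction tuple format, and the reading of CS as the number of drains still owed are all assumptions made for illustration and are not taken from the patent.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TraceEntry:
    cs: int         # chunk size (assumed here: STB drains still owed by the chunk)
    ic: int         # instruction count of the chunk
    timestamp: int  # global timestamp giving a total order over all chunks


class Core:
    def __init__(self, memory):
        self.memory = memory  # shared simulated memory: address -> value
        self.stb = deque()    # store buffer, oldest entry on the left
        self.regs = {}

    def load(self, addr):
        # The youngest matching store in the STB wins; otherwise read memory.
        for a, v in reversed(self.stb):
            if a == addr:
                return v
        return self.memory.get(addr, 0)

    def store(self, addr, value):
        self.stb.append((addr, value))

    def drain_oldest(self):
        if self.stb:
            addr, value = self.stb.popleft()
            self.memory[addr] = value

    def replay_chunk(self, entry, program):
        ic, cs = entry.ic, entry.cs
        while ic > 0:
            op, addr, arg = next(program)
            if op == "store":
                self.store(addr, arg)
            else:  # "load": arg names the destination register
                self.regs[arg] = self.load(addr)
            ic -= 1
        # IC is zero here: transfer the oldest STB values while CS is positive.
        while cs > 0:
            self.drain_oldest()
            cs -= 1


def replay(trace, cores, programs):
    # trace: list of (core_id, TraceEntry); chunks replay in timestamp order.
    for core_id, entry in sorted(trace, key=lambda t: t[1].timestamp):
        cores[core_id].replay_chunk(entry, programs[core_id])
```

With two cores, a chunk that buffers a store without draining it lets the other core still observe the old value, which is the kind of relaxed-consistency interleaving the logged chunks are meant to reproduce.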


    APPARATUS AND METHOD FOR A PROFILER FOR HARDWARE TRANSACTIONAL MEMORY PROGRAMS
    12.
    Invention patent application
    Legal status: in force

    Publication No.: US20160179569A1

    Publication date: 2016-06-23

    Application No.: US14581772

    Filing date: 2014-12-23

    Abstract: An apparatus and method are described for a hardware transactional memory (HTM) profiler. For example, one embodiment of an apparatus comprises a transactional debugger (TDB) recording module to record data related to the execution of transactional memory program code, including data related to the execution of branches and transactional events in the transactional memory program code; and a profiler to analyze portions of the recorded data using trace-based replay techniques to responsively generate profile data comprising transaction-level events and function-level conflict data usable to optimize the transactional memory program code.


    UNBOUNDED TRANSACTIONAL MEMORY WITH FORWARD PROGRESS GUARANTEES USING A HARDWARE GLOBAL LOCK
    15.
    Invention patent application
    Legal status: in force

    Publication No.: US20150169362A1

    Publication date: 2015-06-18

    Application No.: US14108892

    Filing date: 2013-12-17

    IPC classification: G06F9/46 G06F12/14

    CPC classification: G06F9/467 G06F9/52 G06F9/528

    Abstract: A processing device implementing unbounded transactional memory with forward progress guarantees using a hardware global lock is disclosed. A processing device of the disclosure includes a hardware transactional memory (HTM) hardware contention manager to cause a bounded transaction to be translated to an unbounded transaction, the unbounded transaction to acquire a global hardware lock for the unbounded transaction, the global hardware lock read by bounded transactions that abort when the global hardware lock is taken. The processing device further includes an execution unit communicably coupled to the HTM hardware contention manager to execute instructions of the unbounded transaction without speculation, the unbounded transaction to release the global hardware lock upon completion of execution of the instructions.
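The fallback path in this abstract can be sketched in software, with the caveat that the patent describes hardware. In the Python model below, a bounded transaction runs speculatively on a private copy with a limited write set and aborts if the global lock is taken; after a fixed number of aborts, the transaction is rerun as an unbounded one that takes the lock, executes directly on the shared state, and releases the lock. `GlobalLockTM`, `SpeculativeView`, the `capacity` and `max_attempts` parameters, and the use of `threading.Lock` as a stand-in for the hardware lock are all hypothetical choices for illustration.

```python
import threading


class TransactionAborted(Exception):
    pass


class SpeculativeView(dict):
    """Private copy with a limited write set, standing in for HTM capacity."""

    def __init__(self, base, capacity):
        super().__init__(base)
        self._capacity = capacity
        self._written = set()

    def __setitem__(self, key, value):
        self._written.add(key)
        if len(self._written) > self._capacity:
            raise TransactionAborted("write set exceeds capacity")
        super().__setitem__(key, value)


class GlobalLockTM:
    def __init__(self, capacity=2, max_attempts=3):
        self._lock = threading.Lock()  # stand-in for the hardware global lock
        self.lock_taken = False        # the flag that bounded transactions read
        self.capacity = capacity
        self.max_attempts = max_attempts

    def _run_bounded(self, body, state):
        if self.lock_taken:
            raise TransactionAborted("global lock is taken")
        view = SpeculativeView(state, self.capacity)
        body(view)                     # may abort on capacity overflow
        if self.lock_taken:
            raise TransactionAborted("global lock is taken")
        state.update(view)             # commit the speculative writes

    def _run_unbounded(self, body, state):
        with self._lock:               # acquire the global lock
            self.lock_taken = True
            try:
                body(state)            # non-speculative execution
            finally:
                self.lock_taken = False  # released on completion

    def execute(self, body, state):
        for _ in range(self.max_attempts):
            try:
                self._run_bounded(body, state)
                return "bounded"
            except TransactionAborted:
                continue
        self._run_unbounded(body, state)
        return "unbounded"
```

A body that writes more keys than `capacity` can never commit speculatively, so `execute` falls through to the lock-protected path, which is what gives the forward progress guarantee in this model.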


    Instruction and Logic for Processor Trace Information for Control Flow Integrity

Publication No.: US20170090926A1

Publication date: 2017-03-30

Application No.: US14866254

Filing date: 2015-09-25

IPC classification: G06F9/30

Abstract: A processor includes a front end to decode an instruction and pass the instruction to execution units with branch suffix information. The processor further includes execution units to execute the instruction and a retirement unit to retire the instruction. The instruction is to specify an operation to be conditionally executed based upon a branch suffix to identify previous execution. The processor further includes logic to, upon retirement of the instruction, determine the result of a series of branch operations preceding execution of the instruction, compare the result to the branch suffix information, allow execution and retirement of the instruction based on a determination that the result matches the branch suffix information, and generate a fault based on a determination that the result does not match the branch suffix information.
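The retirement-time check described in this abstract can be modeled in a few lines. The Python sketch below keeps a short history of taken/not-taken branch outcomes and, when an instruction retires, compares the most recent outcomes with the branch suffix the instruction carries; a mismatch raises a fault. `BranchHistory`, `ControlFlowFault`, `retire`, the history depth, and the encoding of a suffix as a tuple of booleans are assumptions for illustration only, not the patent's encoding.

```python
from collections import deque


class ControlFlowFault(Exception):
    pass


class BranchHistory:
    def __init__(self, depth=8):
        # Most recent branch outcomes, newest on the right.
        self.outcomes = deque(maxlen=depth)

    def record(self, taken):
        self.outcomes.append(bool(taken))

    def matches(self, suffix):
        n = len(suffix)
        if n == 0:
            return True   # an empty suffix constrains nothing
        if n > len(self.outcomes):
            return False  # not enough history to satisfy the suffix
        return list(self.outcomes)[-n:] == [bool(b) for b in suffix]


def retire(operation, suffix, history):
    """Run `operation` only if the preceding branch outcomes match `suffix`."""
    if not history.matches(suffix):
        raise ControlFlowFault("branch history does not match branch suffix")
    return operation()
```

For example, after recording the outcomes taken, not taken, taken, an instruction whose suffix is `(False, True)` retires normally, while one whose suffix is `(True, True)` faults.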

    TECHNOLOGIES FOR OPTIMIZING SPARSE MATRIX CODE WITH FIELD-PROGRAMMABLE GATE ARRAYS

Publication No.: US20180004496A1

Publication date: 2018-01-04

Application No.: US15200053

Filing date: 2016-07-01

IPC classification: G06F9/45

Abstract: Technologies for optimizing sparse matrix code include a target computing device having a processor and a field-programmable gate array (FPGA). A compiler identifies a performance-critical loop in a sparse matrix source code and generates optimized executable code, including processor code and FPGA code. The target computing device executes the optimized executable code, using the processor for the processor code and the FPGA for the FPGA code. The processor executes a first iteration of the loop, generates reusable optimization data in response to executing the first iteration, and stores the reusable optimization data in a shared memory. The FPGA accesses the optimization data in the shared memory, executes additional iterations of the loop, and optimizes the additional iterations of the loop based on the optimization data. The optimization data may include, for example, loop-invariant data, reordered data, or alternate data storage representations. Other embodiments are described and claimed.
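The division of work in this abstract can be illustrated with a repeated sparse matrix-vector product. In the Python sketch below, the "processor" function runs the first iteration on coordinate-format (COO) data and, as its reusable optimization data, stores a compressed sparse row (CSR) copy of the matrix in a shared dictionary; the "accelerator" function then runs the remaining iterations from that CSR copy. Both roles are plain Python functions here, and the function names, the dictionary standing in for shared memory, and the choice of CSR as the alternate storage representation are assumptions for illustration.

```python
def coo_to_csr(n_rows, entries):
    """Convert (row, col, value) triples to CSR arrays (row_ptr, cols, vals)."""
    row_ptr = [0] * (n_rows + 1)
    for r, _, _ in entries:
        row_ptr[r + 1] += 1
    for i in range(n_rows):
        row_ptr[i + 1] += row_ptr[i]
    cols = [0] * len(entries)
    vals = [0.0] * len(entries)
    next_slot = row_ptr[:-1]  # slicing copies, so row_ptr is left untouched
    for r, c, v in entries:
        k = next_slot[r]
        cols[k], vals[k] = c, v
        next_slot[r] += 1
    return row_ptr, cols, vals


def spmv_coo(n_rows, entries, x):
    y = [0.0] * n_rows
    for r, c, v in entries:
        y[r] += v * x[c]
    return y


def spmv_csr(csr, x):
    row_ptr, cols, vals = csr
    return [
        sum(vals[k] * x[cols[k]] for k in range(row_ptr[i], row_ptr[i + 1]))
        for i in range(len(row_ptr) - 1)
    ]


def processor_first_iteration(n_rows, entries, x, shared):
    # First iteration on the original layout, plus reusable optimization data.
    y = spmv_coo(n_rows, entries, x)
    shared["csr"] = coo_to_csr(n_rows, entries)
    return y


def accelerator_iterations(x, shared, count):
    # Remaining iterations reuse the representation left in shared memory.
    csr = shared["csr"]
    for _ in range(count):
        x = spmv_csr(csr, x)
    return x


def run_loop(n_rows, entries, x, iterations):
    shared = {}  # stand-in for the shared memory
    x = processor_first_iteration(n_rows, entries, x, shared)
    return accelerator_iterations(x, shared, iterations - 1)
```

The conversion cost is paid once, in the first iteration, and every later iteration benefits from the row-ordered layout, which mirrors the reuse pattern the abstract describes.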