COUNTER TO MONITOR ADDRESS CONFLICTS
    1.
    Invention application
    Status: under examination (published)

    Publication No.: WO2017117392A1

    Publication date: 2017-07-06

    Application No.: PCT/US2016/069214

    Filing date: 2016-12-29

    CPC classification number: G06F9/3838 G06F9/30 G06F9/30021

    Abstract: Embodiments of systems, methods, and apparatuses for monitoring address conflicts are described. In some embodiments, an apparatus includes execution circuitry to execute instructions; a plurality of registers to store data coupled to the execution circuitry; and performance monitoring circuitry to perform address conflict counting by at least determining address conflicts between an executing instruction and previously executed instructions and counting each instance of a conflict.
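
The counting idea in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the circuitry claimed in the application: the class name `ConflictCounter`, the bounded history `window`, and the definition of a conflict (a shared address where at least one access is a write) are all hypothetical choices made for illustration.

```python
from collections import deque
from dataclasses import dataclass, field


@dataclass
class ConflictCounter:
    """Toy model of a counter that tallies address conflicts (hypothetical)."""
    window: int = 4          # assumed number of earlier instructions remembered
    count: int = 0           # running total of detected conflicts
    history: deque = field(default_factory=deque)

    def execute(self, reads, writes):
        """Record one instruction's addresses and count conflicts with history."""
        reads, writes = set(reads), set(writes)
        for prev_reads, prev_writes in self.history:
            # Assumed conflict rule: same address, at least one side writes it.
            if (writes & (prev_reads | prev_writes)) or (reads & prev_writes):
                self.count += 1
        self.history.append((reads, writes))
        if len(self.history) > self.window:
            self.history.popleft()   # forget the oldest instruction
        return self.count
```

Each earlier instruction in the window contributes at most one count per executing instruction, which is one of several reasonable readings of "counting each instance of a conflict".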

    RUN-TIME CODE PARALLELIZATION USING OUT-OF-ORDER RENAMING WITH PRE-ALLOCATION OF PHYSICAL REGISTERS
    2.
    Invention application
    Status: under examination (published)

    Publication No.: WO2017072600A1

    Publication date: 2017-05-04

    Application No.: PCT/IB2016/054706

    Filing date: 2016-08-04

    CPC classification number: G06F9/384 G06F9/3838

    Abstract: A method includes processing a sequence of instructions of program code that are specified using one or more architectural registers, by a hardware-implemented pipeline that renames the architectural registers in the instructions so as to produce operations specified using one or more physical registers (50). At least first and second segments of the sequence of instructions are selected, wherein the second segment occurs later in the sequence than the first segment. One or more of the architectural registers in the instructions of the second segment are renamed, before completing renaming the architectural registers in the instructions of the first segment, by pre-allocating one or more of the physical registers to one or more of the architectural registers.

    METHOD AND APPARATUS FOR DYNAMICALLY TUNING SPECULATIVE OPTIMIZATIONS BASED ON PREDICTOR EFFECTIVENESS
    3.
    Invention application
    Status: under examination (published)

    Publication No.: WO2017053111A1

    Publication date: 2017-03-30

    Application No.: PCT/US2016/051253

    Filing date: 2016-09-12

    Abstract: A method for instruction signature based (ISB) speculative optimization includes storing a plurality of entries. Each entry of the plurality of entries includes an instruction signature tag and an ISB predictor effectiveness measurement. The instruction signature tag corresponds to an instruction signature and the ISB predictor effectiveness measurement is based, at least in part, on an effectiveness of a predictor when applied to the instruction signature. The method also includes detecting a to-be-executed instruction signature and determining if the plurality of entries includes a matching entry. The matching entry has an instruction signature tag corresponding to the to-be-executed instruction signature. Upon determining that the plurality of entries includes the matching entry, the method includes controlling an application of the predictor to the to-be-executed instruction signature, based at least in part on the ISB predictor effectiveness measurement in the matching entry.
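
The lookup-and-gate flow described in this abstract can be sketched as a small table keyed by signature tag. The sketch below is an illustration under assumptions, not the claimed design: the class `EffectivenessTable`, the saturating counter used as the effectiveness measurement, the `threshold`, and the default of applying the predictor when no entry matches are all hypothetical.

```python
class EffectivenessTable:
    """Toy table of (signature tag -> effectiveness measurement) entries."""

    def __init__(self, threshold=0, lo=-4, hi=3):
        self.entries = {}            # signature tag -> saturating counter
        self.threshold = threshold   # assumed cut-off for using the predictor
        self.lo, self.hi = lo, hi    # assumed saturation bounds

    def should_apply(self, signature):
        """Decide whether to apply the predictor to this signature."""
        measurement = self.entries.get(signature)
        if measurement is None:
            return True              # no matching entry: assumed default policy
        return measurement >= self.threshold

    def update(self, signature, prediction_was_correct):
        """Adjust the measurement after the predictor's outcome is known."""
        m = self.entries.get(signature, 0)
        m += 1 if prediction_was_correct else -1
        self.entries[signature] = max(self.lo, min(self.hi, m))
```

A saturating counter is only one possible effectiveness measurement; the abstract does not say how the measurement is computed.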

    DEVICE AND PROCESSING ARCHITECTURE FOR INSTRUCTION MEMORY EFFICIENCY
    6.
    Invention application
    Status: under examination (published)

    Publication No.: WO2017032022A1

    Publication date: 2017-03-02

    Application No.: PCT/CN2016/080512

    Filing date: 2016-04-28

    Abstract: Different processor architectures are described to evaluate and track dependencies required by instructions. The processors may hold or queue instructions that require output of other instructions until the required data and resources are available, which may remove the need for NOPs in the instruction memory to resolve dependencies and pipeline hazards. The processor may divide instruction data into bundles for parallel execution and provide speculative execution. The processor may include various components to implement an evaluation unit, execution unit and termination unit.

    STORING NARROW PRODUCED VALUES FOR INSTRUCTION OPERANDS DIRECTLY IN A REGISTER MAP IN AN OUT-OF-ORDER PROCESSOR
    7.
    Invention application
    Status: under examination (published)

    Publication No.: WO2017030692A1

    Publication date: 2017-02-23

    Application No.: PCT/US2016/042240

    Filing date: 2016-07-14

    CPC classification number: G06F9/30112 G06F9/3838 G06F9/384 G06F9/3857

    Abstract: Storing narrow produced values for instruction operands directly in a register map in an out-of-order processor (OoP) is provided. An OoP is provided that includes an instruction processing system. The instruction processing system includes a number of instruction processing stages configured to pipeline the processing and execution of instructions according to a dataflow execution. The instruction processing system also includes a register map table (RMT) configured to store address pointers mapping logical registers to physical registers in a physical register file (PRF) for storing produced data for use by consumer instructions without overwriting logical registers for later executed, out-of-order instructions. In certain aspects, the instruction processing system is configured to write back (i.e., store) narrow values produced by executed instructions directly into the RMT, as opposed to writing the narrow produced values into the PRF in a write back stage.

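METHOD AND APPARATUS FOR ALLOCATING HARDWARE ACCELERATION INSTRUCTIONS TO MEMORY CONTROLLER

    Publication No.: WO2016134656A1

    Publication date: 2016-09-01

    Application No.: PCT/CN2016/074450

    Filing date: 2016-02-24

    Abstract: Embodiments of the present invention provide a method and apparatus for allocating hardware acceleration instructions to memory controllers. The method includes: dividing a plurality of hardware acceleration instructions into different instruction sets according to the dependencies among the instructions; obtaining a first mapping between the instruction sets and the memory controllers in a computer system, on the principle that instruction sets whose hardware acceleration instructions have no dependencies on one another are allocated to different memory controllers; adjusting the first mapping according to load information of each memory controller in a first memory controller set, to obtain a second mapping between the instruction sets and the memory controllers of the computer system; and allocating the hardware acceleration instructions in each instruction set to memory controllers in a second memory controller set according to the second mapping. This balances the load across memory controllers when multiple memory controllers in the computer system execute hardware acceleration instructions.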
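
The three steps in this abstract (partition by dependency, first mapping, load-based second mapping) can be sketched as follows. This is a simplified illustration, not the method claimed in the application: the function names, the union-find partitioning, the round-robin first mapping, and the greedy least-loaded adjustment are all assumptions, and the sketch does not model the separate first and second memory controller sets.

```python
def partition_by_dependency(n, deps):
    """Group instruction ids 0..n-1 so that dependent instructions share a set."""
    parent = list(range(n))

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]   # path compression
            x = parent[x]
        return x

    for a, b in deps:                       # (a, b): a and b are dependent
        parent[find(a)] = find(b)
    groups = {}
    for i in range(n):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


def first_mapping(groups, controllers):
    """Spread independent sets over the controllers (assumed: round robin)."""
    return {i: controllers[i % len(controllers)] for i in range(len(groups))}


def second_mapping(groups, mapping, load):
    """Adjust the first mapping using load info (assumed: greedy rebalancing)."""
    load = dict(load)                       # controller -> current load
    adjusted = {}
    for i in sorted(mapping, key=lambda i: -len(groups[i])):
        # Pick the least-loaded controller; on ties keep the first mapping.
        target = min(load, key=lambda c: (load[c], c != mapping[i]))
        adjusted[i] = target
        load[target] += len(groups[i])
    return adjusted
```

Here load is measured in number of instructions, which is a stand-in for whatever load information a real system would report.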

    METHOD AND APPARATUS FOR REALIZING SELF-TIMED PARALLELIZED MANY-CORE PROCESSOR
    10.
    Invention application
    Status: under examination (published)

    Publication No.: WO2016119546A1

    Publication date: 2016-08-04

    Application No.: PCT/CN2015/098786

    Filing date: 2015-12-24

    Abstract: A self-timed parallelized multi-core processor and method for operating the processor are provided. The processor has an instruction decoder unit to receive a program code instruction, determine an operating code and latency for the program code instructions, and assign a loop index to the program code instruction. The processor further includes an instruction decomposer unit coupled to the instruction decoder unit, the instruction decomposer configured to create a primitive by decomposing the instruction, replace the loop index with a core index, and broadcast the primitive. The processor further has a plurality of self-timed processing cores coupled to the instruction decomposer unit, each core having a unique core index and having a dispatch unit for comparing the core index in the primitive with the core index of its processing core, each core acting on the primitive when the index of the processing core is within a threshold of the core index.
