System and method for run-time value tracking during execution
    11.
    Published patent application
    Status: pending (published)

    Publication No.: EP1632846A2

    Publication date: 2006-03-08

    Application No.: EP05108039.8

    Filing date: 2005-09-02

    IPC classification: G06F9/38

    Abstract: A technique for run-time tracking of changes to variables and memory locations during code execution, to increase the efficiency of execution of the code and to facilitate debugging the code. In one example embodiment, this is achieved by determining whether a received instruction is a trackable instruction during code execution. The trackable instructions can include one or more trackable variables. The trackable instruction is then decoded, and a track instruction cache and a track variable cache are then updated with the associated decoded trackable instruction and the one or more trackable variables, respectively.

    VALUE PREDICTION IN A PROCESSOR FOR PROVIDING SPECULATIVE EXECUTION
    12.
    Published patent application
    Status: pending (published)

    Publication No.: EP1421477A1

    Publication date: 2004-05-26

    Application No.: EP02712589.7

    Filing date: 2002-02-21

    IPC classification: G06F9/38

    Abstract: The present invention relates to a processing unit (1) for executing instructions in a computer system and to a method in such a processing unit. According to the present invention a decision is made whether or not to base execution on a value prediction (P), wherein the decision is based on information associated with the estimated time gain of execution based on a correct prediction. According to an embodiment of the present invention the decision regarding whether or not to execute speculatively is based on information (14) regarding whether a cache hit or a cache miss is detected in connection with a load instruction. In an alternative embodiment of the present invention the decision is based on information regarding the dependency depth of the load instruction, i.e. the number of instructions that are dependent on the load.
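
The two decision criteria in this abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: `LoadInfo`, both function names, and the `min_depth` threshold are hypothetical, and the sketch assumes that speculation is chosen on a cache miss because that is where a correct prediction saves the most time.

```python
from dataclasses import dataclass


@dataclass
class LoadInfo:
    """Hypothetical summary of what the processor knows about one load."""
    cache_hit: bool        # True if the load hit in the cache
    dependency_depth: int  # number of instructions that depend on the load


def speculate_on_miss(load: LoadInfo) -> bool:
    """First embodiment (assumed direction): speculate only on a cache miss,
    where waiting for the real value would be slow."""
    return not load.cache_hit


def speculate_on_depth(load: LoadInfo, min_depth: int = 2) -> bool:
    """Alternative embodiment: speculate only if enough instructions depend
    on the load. min_depth is an assumed, illustrative threshold."""
    return load.dependency_depth >= min_depth
```

Under these assumptions, a load that misses the cache is executed speculatively by the first policy, while the second policy ignores the cache outcome and looks only at how many instructions are waiting on the load.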

    A DATA CACHE CAPABLE OF PERFORMING STORE ACCESSES IN A SINGLE CLOCK CYCLE
    13.
    Granted patent
    Status: lapsed

    Publication No.: EP1015980B1

    Publication date: 2002-04-24

    Application No.: EP96943477.8

    Filing date: 1996-11-04

    IPC classification: G06F12/08

    Abstract: A data cache configured to perform store accesses in a single clock cycle is provided. The data cache speculatively stores data within a predicted way of the cache after capturing the data currently being stored in that predicted way. During a subsequent clock cycle, the cache hit information for the store access validates the way prediction. If the way prediction is correct, then the store is complete. If the way prediction is incorrect, then the captured data is restored to the predicted way. If the store access hits in an unpredicted way, the store data is transferred into the correct storage location within the data cache concurrently with the restoration of data in the predicted storage location. Each store for which the way prediction is correct utilizes a single clock cycle of data cache bandwidth. Additionally, the way prediction structure implemented within the data cache bypasses the tag comparisons of the data cache to select data bytes for the output. Therefore, the access time of the associative data cache may be substantially similar to a direct-mapped cache access time. The present data cache is therefore suitable for high frequency superscalar microprocessors.
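
The capture, speculative write, validate, and restore sequence described in this abstract can be sketched in Python. This is a software model under stated assumptions, not the hardware design: the class `WayPredictedSet`, its `store` method, and the returned status strings are all hypothetical, and the two clock cycles are modelled as two sequential steps.

```python
class WayPredictedSet:
    """Hypothetical model of one set of an associative data cache.

    Each way is a [tag, data] pair.
    """

    def __init__(self, ways):
        self.ways = [list(way) for way in ways]

    def store(self, tag, data, predicted_way):
        # First cycle: capture the current contents of the predicted way,
        # then speculatively write the store data into it.
        captured = self.ways[predicted_way][1]
        self.ways[predicted_way][1] = data

        # Subsequent cycle: the tag comparison validates the way prediction.
        hit_way = next(
            (i for i, (t, _) in enumerate(self.ways) if t == tag), None
        )
        if hit_way == predicted_way:
            return "hit-predicted"  # store already complete

        # Wrong prediction: restore the captured data to the predicted way.
        self.ways[predicted_way][1] = captured
        if hit_way is not None:
            # Hit in an unpredicted way: move the store data to the right way.
            self.ways[hit_way][1] = data
            return "hit-other-way"
        return "miss"
```

In this model only the `"hit-predicted"` path finishes after the first step, which mirrors the claim that a correctly predicted store uses a single cycle of cache bandwidth.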

    Method for prefetching structured data
    14.
    Published patent application
    Status: in force

    Publication No.: EP1031919A2

    Publication date: 2000-08-30

    Application No.: EP00103800.9

    Filing date: 2000-02-23

    Applicant: NEC CORPORATION

    IPC classification: G06F9/38

    Abstract: A method for prefetching structured data, and more particularly a mechanism for observing address references made by a processor, and learning from those references the patterns of accesses made to structured data. Structured data means aggregates of related data such as arrays, records, and data containing links and pointers. When subsequent accesses are made to data structured in the same way, the mechanism generates in advance the sequence of addresses that will be needed for the new accesses. This sequence is utilized by the memory to obtain the data somewhat earlier than the instructions would normally request it, and thereby eliminate idle time due to memory latency while awaiting the arrival of the data.

    Information processing device and method for performing parallel processing
    15.
    Published patent application
    Status: pending (published)

    Publication No.: EP0969358A2

    Publication date: 2000-01-05

    Application No.: EP99302565.9

    Filing date: 1999-03-31

    Applicant: FUJITSU LIMITED

    IPC classification: G06F9/38

    Abstract: An information processing device (1) detects a register interference state, in which a register that is updated by a preceding instruction is used by a succeeding instruction, for example for the generation of an operand address. When such a state is detected, the execution of a subsequently fetched instruction is started by using an estimated address. To this end, the operand address generated when an instruction is executed is stored in association with the address of that instruction, and the stored operand address that corresponds to the address of the subsequently fetched instruction is retrieved from the stored contents and used as the estimated address.

    A DATA ADDRESS PREDICTION STRUCTURE UTILIZING A STRIDE PREDICTION METHOD
    16.
    Published patent application
    Status: lapsed

    Publication No.: EP0912928A1

    Publication date: 1999-05-06

    Application No.: EP96925349.0

    Filing date: 1996-07-16

    IPC classification: G06F9

    Abstract: A data prediction structure is provided. The data prediction structure stores base addresses and stride values in a prediction array. The base address and the stride value are added to form a data prediction address which is then used to fetch data bytes into a relatively small, relatively fast buffer which may be accessed by the decode stage(s) of the instruction processing pipeline. If the data associated with an operand address calculated by a decode stage resides in the buffer, the clock cycles used to perform the load operation occur before the instruction reaches the execution stage of the instruction processing pipeline. The execution stage clock cycles that are saved may be used to execute other instructions. Additionally, the base address is updated to the address generated by a decode unit each time a basic block is executed, and the stride value is updated when the data prediction address is found to be incorrect. In this way, the data prediction address may be more accurate than a static data prediction address.
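
The base-plus-stride prediction and its update rules can be sketched as follows. This is a simplified Python model, not the structure claimed in the patent: the class `StridePredictor` and its methods are hypothetical, the prediction array is indexed by an assumed instruction address `pc` rather than by basic block, and the prefetch buffer is omitted.

```python
class StridePredictor:
    """Hypothetical prediction array: pc -> [base address, stride]."""

    def __init__(self):
        self.table = {}

    def predict(self, pc):
        """Return base + stride as the data prediction address, or None."""
        entry = self.table.get(pc)
        if entry is None:
            return None
        base, stride = entry
        return base + stride

    def update(self, pc, actual_address):
        """Train the entry with the address the decode stage generated."""
        entry = self.table.get(pc)
        if entry is None:
            # First sighting: no stride is known yet (assumed to start at 0).
            self.table[pc] = [actual_address, 0]
            return
        base, stride = entry
        if base + stride != actual_address:
            # The stride changes only when the prediction was incorrect.
            entry[1] = actual_address - base
        # The base always moves to the most recently generated address.
        entry[0] = actual_address
```

With accesses at 100, 108, and 116 from the same instruction, the stride settles at 8 after the second access and the next predicted address is 124.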

Combined queue for invalidates and return data in multi-processor system
    17.
    Published patent application
    Status: lapsed

    Publication No.: EP0465320A3

    Publication date: 1995-03-22

    Application No.: EP91401767.8

    Filing date: 1991-06-27

    IPC classification: G06F9/38 G06F12/08

    Abstract: A pipelined CPU executing instructions of variable length, and referencing memory using various data widths. Macroinstruction pipelining is employed (instead of microinstruction pipelining), with queueing between units of the CPU to allow flexibility in instruction execution times. A wide bandwidth is available for memory access, fetching 64-bit data blocks on each cycle. A hierarchical cache arrangement has an improved method of cache set selection, increasing the likelihood of a cache hit. A writeback cache is used (instead of writethrough) and writeback is allowed to proceed even though other accesses are suppressed due to queues being full. A branch prediction method employs a branch history table which records the taken vs. not-taken history of branch opcodes recently used, and uses an empirical algorithm to predict which way the next occurrence of this branch will go, based upon the history table. A floating point processor function is integrated on-chip, with enhanced speed due to a bypass technique; a trial mini-rounding is done on low-order bits of the result, and if correct, the last stage of the floating point processor can be bypassed, saving one cycle of latency. For CALL type instructions, a method for determining which registers need to be saved is executed in a minimum number of cycles, examining groups of register mask bits at one time. Internal processor registers are accessed with short (byte width) addresses instead of full physical addresses as used for memory and I/O references, but off-chip processor registers are memory-mapped and accessed by the same busses using the same controls as the memory and I/O. If a non-recoverable error is detected by ECC circuits in the cache, an error transition mode is entered wherein the cache operates under limited access rules, allowing a maximum of access by the system for data blocks owned by the cache, but yet minimizing changes to the cache data so that diagnostics may be run. Separate queues are provided for the return data from memory and cache invalidates, yet the order of bus transactions is maintained by a pointer arrangement. The bus protocol used by the CPU to communicate with the system bus is of the pended type, with transactions on the bus identified by an ID field specifying the originator, and arbitration for bus grant goes on simultaneously with address/data transactions on the bus.

    PROVIDING LOOP-INVARIANT VALUE PREDICTION USING A PREDICTED VALUES TABLE, AND RELATED APPARATUSES, METHODS, AND COMPUTER-READABLE MEDIA
    18.
    Published patent application
    Status: pending (published)

    Publication No.: EP3221784A1

    Publication date: 2017-09-27

    Application No.: EP15793971.1

    Filing date: 2015-10-27

    IPC classification: G06F9/38

    Abstract: Providing loop-invariant value prediction using a predicted values table, and related apparatuses, methods, and computer-readable media are disclosed. In one aspect, an apparatus comprising an instruction processing circuit is provided. The instruction processing circuit is configured to detect a loop body in an instruction stream, and to detect a value-generating instruction within the loop body. The instruction processing circuit determines whether an attribute of the value-generating instruction matches an entry of a predicted values table. If the attribute of the value-generating instruction is determined to be present in the entry of the predicted values table, the instruction processing circuit further determines whether a counter of the entry exceeds an iteration threshold. Responsive to determining that the counter of the entry exceeds the iteration threshold, the instruction processing circuit provides a predicted value in the entry of the predicted values table for execution of at least one dependent instruction.
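
The lookup path described in this abstract (match an attribute, check the counter against the iteration threshold, then supply the predicted value) can be sketched in Python. The abstract does not say how entries are created or how the counter is advanced, so the `train` method below is purely an assumption for illustration, as are the class name and the use of an instruction address as the matched attribute.

```python
class PredictedValuesTable:
    """Hypothetical table: attribute -> [predicted value, counter]."""

    def __init__(self, iteration_threshold=2):
        self.iteration_threshold = iteration_threshold
        self.entries = {}

    def lookup(self, attribute):
        """Return the predicted value only if the attribute matches an entry
        whose counter exceeds the iteration threshold; otherwise None."""
        entry = self.entries.get(attribute)
        if entry is not None and entry[1] > self.iteration_threshold:
            return entry[0]
        return None

    def train(self, attribute, actual_value):
        """Assumed policy: count repeated identical values, reset on change."""
        entry = self.entries.get(attribute)
        if entry is not None and entry[0] == actual_value:
            entry[1] += 1
        else:
            self.entries[attribute] = [actual_value, 0]
```

With the default threshold of 2 and this training policy, a value must be produced four times in a row by the same instruction before `lookup` starts returning it, and a single differing value resets the entry.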

    OPERAND CONFLICT RESOLUTION FOR REDUCED PORT GENERAL PURPOSE REGISTER
    19.
    Published patent application
    Status: pending (published)

    Publication No.: EP3201760A1

    Publication date: 2017-08-09

    Application No.: EP15782160.4

    Filing date: 2015-10-02

    IPC classification: G06F9/30 G06F9/38

    Abstract: Techniques are described for determining whether execution of an instruction would require reading more values from a memory cell of a general purpose register (GPR) than a read port of the memory cell would allow. In such a case, the techniques may store, prior to execution of the instruction, one or more values from the memory cell in a separate conflict queue. During execution of the instruction to implement an operation defined by the instruction, one value that is an operand of the operation would be read from the memory cell and another value that is an operand of the operation would be read from the conflict queue.
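
The conflict check in this abstract can be sketched in Python as a planning step that decides, per operand, whether it is read from its memory cell or from the conflict queue. The function name, the representation of cells as strings, and the default of one read port per cell are assumptions made for illustration only.

```python
from collections import Counter


def plan_operand_reads(source_cells, read_ports_per_cell=1):
    """Split operand reads between GPR memory cells and a conflict queue.

    source_cells lists, in operand order, the memory cell that holds each
    source operand. Reads beyond the port limit of a cell are assigned to
    the conflict queue, which is assumed to be filled before execution.
    """
    reads_from_cell = []
    reads_from_conflict_queue = []
    used_ports = Counter()
    for cell in source_cells:
        if used_ports[cell] < read_ports_per_cell:
            used_ports[cell] += 1
            reads_from_cell.append(cell)
        else:
            reads_from_conflict_queue.append(cell)
    return reads_from_cell, reads_from_conflict_queue
```

For an instruction whose operands live in cells `c0`, `c0`, and `c1`, the second `c0` operand exceeds the single read port and is therefore served from the conflict queue.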

    PREDICTING LITERAL LOAD VALUES USING A LITERAL LOAD PREDICTION TABLE, AND RELATED CIRCUITS, METHODS, AND COMPUTER-READABLE MEDIA
    20.
    Published patent application
    Status: pending (published)

    Publication No.: EP3191938A1

    Publication date: 2017-07-19

    Application No.: EP15760558.5

    Filing date: 2015-08-24

    IPC classification: G06F9/38

    Abstract: Predicting literal load values using a literal load prediction table, and related circuits, methods, and computer-readable media are disclosed. In one aspect, an instruction processing circuit provides a literal load prediction table containing one or more entries, each comprising an address and a literal load value. Upon detecting a literal load instruction in an instruction stream, the instruction processing circuit determines whether the literal load prediction table contains an entry having an address of the literal load instruction. If so, the instruction processing circuit provides the predicted literal load value stored in the entry to at least one dependent instruction. The instruction processing circuit subsequently determines whether the predicted literal load value matches the actual literal load value loaded by the literal load instruction. If a mismatch exists, the instruction processing circuit initiates a misprediction recovery. The at least one dependent instruction is re-executed using the actual literal load value.
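
The predict, verify, and recover flow described in this abstract can be sketched in Python. The class and function names are hypothetical, the dependent instruction is modelled as a plain callable, and the policy of writing the actual value back into the table after every load is an assumption that the abstract does not state.

```python
class LiteralLoadPredictionTable:
    """Hypothetical table: literal load instruction address -> literal value."""

    def __init__(self):
        self.entries = {}

    def predict(self, load_address):
        """Return the stored literal load value, or None if there is no entry."""
        return self.entries.get(load_address)


def run_literal_load(table, load_address, actual_value, dependent):
    """Execute one literal load and its dependent instruction.

    Returns (result of the dependent instruction, whether recovery occurred).
    """
    predicted = table.predict(load_address)
    # Speculatively run the dependent instruction if a prediction exists.
    result = dependent(predicted) if predicted is not None else None
    recovered = False
    if predicted != actual_value:
        # A wrong prediction triggers recovery; a missing prediction simply
        # waits for the real value. Either way, run with the actual value.
        recovered = predicted is not None
        result = dependent(actual_value)
    # Assumed training policy: remember the value that was actually loaded.
    table.entries[load_address] = actual_value
    return result, recovered
```

In this model the first execution of a load has no prediction, the second reuses the stored value without recovery, and a later change in the loaded literal forces one re-execution of the dependent instruction.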
