Techniques for Utilizing Transaction Lookaside Buffer Entry Numbers to Improve Processor Performance
    111.
    Invention Application
    Status: In force

    Publication No.: US20140095784A1

    Publication Date: 2014-04-03

    Application No.: US13630346

    Filing Date: 2012-09-28

    CPC classification number: G06F12/1027 Y02D10/13

    Abstract: A technique for operating a processor includes translating, using an associated transaction lookaside buffer, a first virtual address into a first physical address through a first entry number in the transaction lookaside buffer. The technique also includes translating, using the transaction lookaside buffer, a second virtual address into a second physical address through a second entry number in the translation lookaside buffer. The technique further includes, in response to the first entry number being the same as the second entry number, determining that the first and second virtual addresses point to the same physical address in memory and reference the same data.

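    The entry-number comparison the abstract describes can be pictured with a small software model. The sketch below is only illustrative and assumes a toy TLB keyed by virtual page number (the class SimpleTLB, its install and lookup methods, and the 4 KiB page size are hypothetical, not taken from the patent): two translations that resolve through the same entry number must map to the same physical page, so comparing entry numbers alone shows that the two virtual addresses reference the same data.

```python
PAGE_SIZE = 4096

class SimpleTLB:
    """Toy TLB: each entry maps one virtual page number to one physical page number."""

    def __init__(self):
        self.entries = []  # the entry number is the position in this list

    def install(self, virtual_page, physical_page):
        self.entries.append((virtual_page, physical_page))

    def lookup(self, virtual_addr):
        """Return (entry_number, physical_addr) on a hit, or None on a miss."""
        vpn, offset = divmod(virtual_addr, PAGE_SIZE)
        for entry_number, (v, p) in enumerate(self.entries):
            if v == vpn:
                return entry_number, p * PAGE_SIZE + offset
        return None

tlb = SimpleTLB()
tlb.install(virtual_page=0x10, physical_page=0x2A)

# Two accesses (say, an earlier load and a later store) translated through the TLB:
entry1, pa1 = tlb.lookup(0x10 * PAGE_SIZE + 0x18)
entry2, pa2 = tlb.lookup(0x10 * PAGE_SIZE + 0x18)

# Matching entry numbers stand in for a full physical-address comparison:
# the accesses hit the same TLB entry, so they reference the same data.
assert entry1 == entry2 and pa1 == pa2
```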

    APPARATUS AND METHOD FOR MEMORY COPY AT A PROCESSOR
    112.
    Invention Application
    Status: In force

    Publication No.: US20130290639A1

    Publication Date: 2013-10-31

    Application No.: US13455800

    Filing Date: 2012-04-25

    Abstract: A processor uses a dedicated buffer to reduce the amount of time needed to execute memory copy operations. For each load instruction associated with the memory copy operation, the processor copies the load data from memory to the dedicated buffer. For each store operation associated with the memory copy operation, the processor retrieves the store data from the dedicated buffer and transfers it to memory. The dedicated buffer is separate from a register file and caches of the processor, so that each load operation associated with a memory copy operation does not have to wait for data to be loaded from memory to the register file. Similarly, each store operation associated with a memory copy operation does not have to wait for data to be transferred from the register file to memory.

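    As a rough software analogy (not the patented hardware), the staging role of the dedicated buffer can be sketched as follows; memory_copy, the chunk size, and the deque standing in for the buffer are assumptions made for illustration. Loads deposit data straight into the dedicated buffer and stores drain it, so the copy never passes through the "register file".

```python
from collections import deque

def memory_copy(memory, src, dst, length, chunk=8):
    copy_buffer = deque()  # the dedicated buffer, separate from registers and caches
    # Load phase: each load places its data directly into the dedicated buffer.
    for offset in range(0, length, chunk):
        copy_buffer.append(bytes(memory[src + offset:src + offset + chunk]))
    # Store phase: each store takes its data from the dedicated buffer.
    offset = 0
    while copy_buffer:
        data = copy_buffer.popleft()
        memory[dst + offset:dst + offset + len(data)] = data
        offset += len(data)

mem = bytearray(64)
mem[0:8] = b"copy me!"
memory_copy(mem, src=0, dst=32, length=8)
assert mem[32:40] == b"copy me!"
```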

    DATA PROCESSING SYSTEM WITH LATENCY TOLERANCE EXECUTION
    113.
    Invention Application
    Status: In force

    Publication No.: US20130212358A1

    Publication Date: 2013-08-15

    Application No.: US13397452

    Filing Date: 2012-02-15

    Abstract: A data processing system comprises a processor unit that includes an instruction decode/issue unit with a re-order buffer. Each re-order buffer entry includes an execution queue tag that indicates the execution queue location of the instruction to which the entry is assigned, a result valid indicator that indicates the corresponding instruction has executed and produced a valid status bit, and a forward indicator that indicates the status bit can be forwarded to the execution queue of a pointed-to instruction that is waiting to receive it.

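    To make the three entry fields concrete, here is a deliberately simplified software model; RobEntry, ExecQueue, and forward_status are invented names, and the single status bit per entry is an assumption. It shows only how the forward indicator lets a completed entry hand its status bit to the execution queue identified by the entry's tag.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class RobEntry:
    exec_queue_tag: int          # execution queue location of the assigned instruction
    result_valid: bool = False   # instruction executed with a valid status bit
    forward: bool = False        # status bit may be forwarded to a waiting queue
    status_bit: int = 0

@dataclass
class ExecQueue:
    waiting_for_status: bool = False
    status_bit: Optional[int] = None

def forward_status(rob, queues):
    # Forward the status bit of any completed entry whose forward flag is set
    # to the execution queue named by its tag, if that queue is still waiting.
    for entry in rob:
        if entry.result_valid and entry.forward:
            queue = queues[entry.exec_queue_tag]
            if queue.waiting_for_status:
                queue.status_bit = entry.status_bit
                queue.waiting_for_status = False

queues = [ExecQueue(), ExecQueue(waiting_for_status=True)]
rob = [RobEntry(exec_queue_tag=1, result_valid=True, forward=True, status_bit=1)]
forward_status(rob, queues)
assert queues[1].status_bit == 1 and not queues[1].waiting_for_status
```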

    Methods and apparatus for instruction alignment including current instruction pointer logic responsive to instruction length information
    116.
    Granted Invention Patent
    Status: In force

    Publication No.: US07134000B2

    Publication Date: 2006-11-07

    Application No.: US10442329

    Filing Date: 2003-05-21

    CPC classification number: G06F9/30152 G06F9/3816 G06F9/382

    Abstract: An instruction alignment unit for aligning instructions in a digital processor having a pipelined architecture includes an instruction queue, a current instruction buffer and a next instruction buffer in a pipeline stage n, an aligned instruction buffer in a pipeline stage n+1, instruction fetch logic for loading instructions into the current instruction buffer from an instruction cache or from the next instruction buffer and for loading instructions into the next instruction buffer from the instruction cache or from the instruction queue, and alignment control logic responsive to instruction length information contained in the instructions for controlling transfer of instructions from the current instruction buffer and the next instruction buffer to the aligned instruction buffer. The alignment control logic includes predecoders for predecoding the instructions to provide instruction length information and pointer generation logic responsive to the instruction length information for generating a current instruction pointer for controlling transfer of instructions to the aligned instruction buffer.

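    The length-driven pointer generation is easiest to see in a small model. The sketch below is hypothetical (the 2-bit length encoding, predecode_length, and align_instructions are assumptions, not the patented logic): predecoded lengths advance a current-instruction pointer that carves whole instructions out of the current and next fetch buffers into an aligned buffer.

```python
def predecode_length(first_byte):
    # Stand-in predecoder: pretend the low two bits of an instruction's first
    # byte encode its length (2, 4, 6, or 8 bytes).
    return 2 * ((first_byte & 0b11) + 1)

def align_instructions(current_buf, next_buf):
    stream = current_buf + next_buf    # an instruction may straddle the two buffers
    aligned = []                       # the aligned instruction buffer (stage n+1)
    pointer = 0                        # current instruction pointer
    while pointer < len(current_buf):  # consume instructions starting in current_buf
        length = predecode_length(stream[pointer])
        if pointer + length > len(stream):
            break                      # incomplete instruction: wait for the next fetch
        aligned.append(stream[pointer:pointer + length])
        pointer += length
    return aligned

# A 4-byte instruction followed by a 2-byte instruction in the current buffer:
insts = align_instructions(bytes([0b01, 0, 0, 0, 0b00, 0]), bytes([0b10, 0, 0, 0, 0, 0]))
assert [len(i) for i in insts] == [4, 2]
```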

    Memory system for supporting multiple parallel accesses at very high frequencies
    117.
    Granted Invention Patent
    Status: In force

    Publication No.: US06963962B2

    Publication Date: 2005-11-08

    Application No.: US10120686

    Filing Date: 2002-04-11

    CPC classification number: G11C7/10 G06F13/1615 G06F13/1647

    Abstract: A memory system for operation with a processor, such as a digital signal processor, includes a high speed pipelined memory, a store buffer for holding store access requests from the processor, a load buffer for holding load access requests from the processor, and a memory control unit for processing access requests from the processor, from the store buffer and from the load buffer. The memory control unit may include prioritization logic for selecting access requests in accordance with a priority scheme and bank conflict logic for detecting and handling conflicts between access requests. The pipelined memory may be configured to output two load results per clock cycle at very high speed.

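    A minimal sketch of the selection step, under stated assumptions: bank_of, select_requests, the 8-byte bank interleave, and the priority order (processor requests, then buffered loads, then buffered stores) are all illustrative guesses rather than the patented scheme. It shows prioritization combined with bank-conflict avoidance when choosing which pending accesses to issue in a cycle.

```python
from collections import deque

NUM_BANKS = 4

def bank_of(addr):
    return (addr >> 3) % NUM_BANKS   # assumed 8-byte interleave across banks

def select_requests(processor_reqs, load_buffer, store_buffer, max_issue=2):
    # Assumed priority scheme: direct processor requests, then buffered loads,
    # then buffered stores; never issue two requests to the same bank per cycle.
    issued, busy_banks = [], set()
    for queue in (processor_reqs, load_buffer, store_buffer):
        while queue and len(issued) < max_issue:
            bank = bank_of(queue[0]["addr"])
            if bank in busy_banks:
                break                # bank conflict: defer to a later cycle
            busy_banks.add(bank)
            issued.append(queue.popleft())
    return issued

proc = deque([{"op": "load", "addr": 0x00}])
loads = deque([{"op": "load", "addr": 0x08}, {"op": "load", "addr": 0x40}])
stores = deque([{"op": "store", "addr": 0x10}])
assert [r["addr"] for r in select_requests(proc, loads, stores)] == [0x00, 0x08]
```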

    Line-oriented reorder buffer configured to selectively store a memory operation result in one of the plurality of reorder buffer storage locations corresponding to the executed instruction
    118.
    Granted Invention Patent
    Status: In force

    Publication No.: US06381689B2

    Publication Date: 2002-04-30

    Application No.: US09804768

    Filing Date: 2001-03-13

    Abstract: A reorder buffer is configured into multiple lines of storage, wherein a line of storage includes sufficient storage for instruction results regarding a predefined maximum number of concurrently dispatchable instructions. A line of storage is allocated whenever one or more instructions are dispatched. A microprocessor employing the reorder buffer is also configured with fixed, symmetrical issue positions. The symmetrical nature of the issue positions may increase the average number of instructions to be concurrently dispatched and executed by the microprocessor. The average number of unused locations within the line decreases as the average number of concurrently dispatched instructions increases. One particular implementation of the reorder buffer includes a future file. The future file comprises a storage location corresponding to each register within the microprocessor. The reorder buffer tag (or instruction result, if the instruction has executed) of the last instruction in program order to update the register is stored in the future file. The reorder buffer provides the value (either reorder buffer tag or instruction result) stored in the storage location corresponding to a register when the register is used as a source operand for another instruction. Another advantage of the future file for microprocessors which allow access and update to portions of registers is that narrow-to-wide dependencies are resolved upon completion of the instruction which updates the narrower register.

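    The future-file behavior can be sketched in a few lines of software; FutureFile and its methods are hypothetical names, and the model omits the line-oriented storage allocation and the narrow-to-wide register handling. Each register slot holds either the reorder buffer tag of the youngest in-flight writer or, once that writer executes, its result value, which is what a dependent instruction receives for its source operand.

```python
class FutureFile:
    """One slot per architectural register: either a value or a reorder buffer tag."""

    def __init__(self, num_regs=8):
        self.slots = [("value", 0)] * num_regs

    def dispatch_writer(self, reg, rob_tag):
        # The newest in-flight instruction to update `reg` owns the slot.
        self.slots[reg] = ("tag", rob_tag)

    def complete(self, reg, rob_tag, value):
        # Only capture the result if this writer is still the youngest updater.
        if self.slots[reg] == ("tag", rob_tag):
            self.slots[reg] = ("value", value)

    def read_source(self, reg):
        # A dependent instruction gets either the value or a tag to wait on.
        return self.slots[reg]

ff = FutureFile()
ff.dispatch_writer(reg=3, rob_tag=7)
assert ff.read_source(3) == ("tag", 7)       # consumer waits on tag 7
ff.complete(reg=3, rob_tag=7, value=42)
assert ff.read_source(3) == ("value", 42)    # later consumers read the value directly
```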

    Reverse TLB for providing branch target address in a microprocessor having a physically-tagged cache
    119.
    Granted Invention Patent
    Status: Expired

    Publication No.: US06266752B1

    Publication Date: 2001-07-24

    Application No.: US09550847

    Filing Date: 2000-04-17

    Abstract: A microprocessor employs a branch prediction unit including a branch prediction storage which stores the index portion of branch target addresses and an instruction cache which is virtually indexed and physically tagged. The branch target index (if predicted-taken, or the sequential index if predicted not-taken) is provided as the index to the instruction cache. The selected physical tag is provided to a reverse translation lookaside buffer (TLB) which translates the physical tag to a virtual page number. Concatenating the virtual page number to the virtual index from the instruction cache (and the offset portion, generated from the branch prediction) results in the branch target address being generated. In one embodiment, the process of reading an index from the branch prediction storage, accessing the instruction cache, selecting the physical tag, and reverse translating the physical tag to achieve a virtual page number may require more than a clock cycle to complete. Such an embodiment may employ a current page register which stores the most recently translated virtual page number and the corresponding real page number. The branch prediction unit predicts that each fetch address will continue to reside in the current page and uses the virtual page number from the current page to form the branch target address. The physical tag from the fetched cache line is compared to the corresponding real page number to verify that the fetch address is actually still within the current page. When a mismatch is detected between the corresponding real page number and the physical tag from the fetched cache line, the branch target address is corrected with the linear page number provided by the reverse TLB and the current page register is updated.

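    The address-reconstruction step can be illustrated with a toy model; the bit widths, the reverse_tlb dictionary, and rebuild_branch_target are assumptions for illustration, and the current-page-register optimization is not modeled. The predictor supplies only the cache index and offset, the physically tagged cache supplies a physical tag, and the reverse TLB maps that tag back to a virtual page number so the full virtual branch target can be re-formed.

```python
PAGE_BITS = 12      # assumed 4 KiB pages
INDEX_BITS = 6      # assumed cache-index width within the page offset
OFFSET_BITS = PAGE_BITS - INDEX_BITS

# Reverse TLB: physical page number (the cache's physical tag) -> virtual page number.
reverse_tlb = {0x2A: 0x10}

def rebuild_branch_target(predicted_index, predicted_offset, physical_tag):
    virtual_page = reverse_tlb[physical_tag]   # the reverse translation step
    return (virtual_page << PAGE_BITS) | (predicted_index << OFFSET_BITS) | predicted_offset

target = rebuild_branch_target(predicted_index=0x05, predicted_offset=0x18, physical_tag=0x2A)
assert target == (0x10 << PAGE_BITS) | (0x05 << OFFSET_BITS) | 0x18
```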

    Branch prediction mechanism employing branch selectors to select a branch prediction
    120.
    Granted Invention Patent
    Status: In force

    Publication No.: US06247123B1

    Publication Date: 2001-06-12

    Application No.: US09401561

    Filing Date: 1999-09-22

    Applicant: Thang M. Tran

    Inventor: Thang M. Tran

    CPC classification number: G06F9/3806 G06F9/3844

    Abstract: A branch prediction apparatus is provided which stores multiple branch selectors corresponding to instruction bytes within a cache line of instructions or a portion thereof. The branch selectors identify a branch prediction to be selected if the corresponding instruction byte is the byte indicated by the offset of the fetch address used to fetch the cache line. Instead of comparing pointers to the branch instructions with the offset of the fetch address, the branch prediction is selected simply by decoding the offset of the fetch address and choosing the corresponding branch selector. The branch prediction apparatus may therefore operate at a higher frequency (i.e., with a shorter clock cycle) than if the pointers to the branch instructions and the fetch address were compared (a greater-than or less-than comparison). The branch selectors directly determine which branch prediction is appropriate according to the instructions being fetched, thereby decreasing the amount of logic employed to select the branch prediction.

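    A minimal sketch of the selector lookup, assuming a 16-byte line, two branch predictions per line, and the names predict_next_fetch and branch_selectors (none of which come from the patent): the fetch-address offset directly indexes one selector per instruction byte, and the selected value picks a branch prediction without any pointer comparison.

```python
LINE_SIZE = 16   # assumed cache-line size in bytes

# Predictions attached to this cache line; selector 0 would mean "no branch, sequential".
predictions = {1: 0x4000, 2: 0x8000}

# One selector per instruction byte: bytes up to the first taken branch (offset 5)
# select prediction 1, later bytes select prediction 2.
branch_selectors = [1] * 6 + [2] * (LINE_SIZE - 6)

def predict_next_fetch(fetch_addr, sequential_addr):
    offset = fetch_addr % LINE_SIZE       # decode the offset; no pointer comparison
    selector = branch_selectors[offset]
    return predictions.get(selector, sequential_addr)

assert predict_next_fetch(0x1003, sequential_addr=0x1010) == 0x4000   # byte 3 -> first branch
assert predict_next_fetch(0x100A, sequential_addr=0x1010) == 0x8000   # byte 10 -> second branch
```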
