Store stream prefetching in a microprocessor
    1.
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US07716427B2

    Publication date: 2010-05-11

    Application number: US11969677

    Filing date: 2008-01-04

    IPC classification: G06F12/00 G06F9/38

    Abstract: In a microprocessor having a load/store unit and prefetch hardware, the prefetch hardware includes a prefetch queue containing entries indicative of allocated data streams. A prefetch engine receives an address associated with a store instruction executed by the load/store unit. The prefetch engine determines whether to allocate an entry in the prefetch queue corresponding to the store instruction by comparing entries in the queue to a window of addresses encompassing multiple cache blocks, where the window of addresses is derived from the received address. The prefetch engine compares entries in the prefetch queue to a window of 2M contiguous cache blocks. The prefetch engine suppresses allocation of a new entry when any entry in the prefetch queue is within the address window. The prefetch engine further suppresses allocation of a new entry when the data address of the store instruction is equal to an address in a border area of the address window.
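
The allocation filter described in this abstract can be illustrated with a minimal Python sketch. It is not the patented hardware: the block size, the window size, the alignment of the window, and the width of the border area are all assumptions chosen for illustration, and the names `window_for` and `should_allocate` are hypothetical.

```python
BLOCK_BYTES = 128    # assumed cache block size in bytes
WINDOW_BLOCKS = 8    # assumed window size, in contiguous cache blocks
BORDER_BLOCKS = 1    # assumed border width at each end of the window


def window_for(addr):
    """Return (first_block, last_block) of the aligned window containing addr."""
    block = addr // BLOCK_BYTES
    first = block - (block % WINDOW_BLOCKS)
    return first, first + WINDOW_BLOCKS - 1


def should_allocate(store_addr, queue_blocks):
    """Decide whether a store address may allocate a new prefetch queue entry.

    queue_blocks holds the cache block numbers of already allocated streams.
    """
    first, last = window_for(store_addr)
    # Suppress allocation if any existing entry already lies inside the window.
    if any(first <= b <= last for b in queue_blocks):
        return False
    # Suppress allocation if the store itself lands in the window's border area.
    block = store_addr // BLOCK_BYTES
    if block - first < BORDER_BLOCKS or last - block < BORDER_BLOCKS:
        return False
    return True
```

With these assumed parameters, a store to block 3 allocates a stream when the queue is empty, but is suppressed once an entry for block 5 exists, because both blocks fall in the same window.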

    STORE STREAM PREFETCHING IN A MICROPROCESSOR
    2.
    Type: Invention patent application
    Status: Lapsed

    Publication number: US20090070556A1

    Publication date: 2009-03-12

    Application number: US11969677

    Filing date: 2008-01-04

    IPC classification: G06F9/38

    Abstract: In a microprocessor having a load/store unit and prefetch hardware, the prefetch hardware includes a prefetch queue containing entries indicative of allocated data streams. A prefetch engine receives an address associated with a store instruction executed by the load/store unit. The prefetch engine determines whether to allocate an entry in the prefetch queue corresponding to the store instruction by comparing entries in the queue to a window of addresses encompassing multiple cache blocks, where the window of addresses is derived from the received address. The prefetch engine compares entries in the prefetch queue to a window of 2M contiguous cache blocks. The prefetch engine suppresses allocation of a new entry when any entry in the prefetch queue is within the address window. The prefetch engine further suppresses allocation of a new entry when the data address of the store instruction is equal to an address in a border area of the address window.

    Store stream prefetching in a microprocessor
    3.
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US07380066B2

    Publication date: 2008-05-27

    Application number: US11054871

    Filing date: 2005-02-10

    IPC classification: G06F13/28 G06F12/00

    Abstract: In a microprocessor having a load/store unit and prefetch hardware, the prefetch hardware includes a prefetch queue containing entries indicative of allocated data streams. A prefetch engine receives an address associated with a store instruction executed by the load/store unit. The prefetch engine determines whether to allocate an entry in the prefetch queue corresponding to the store instruction by comparing entries in the queue to a window of addresses encompassing multiple cache blocks, where the window of addresses is derived from the received address. The prefetch engine compares entries in the prefetch queue to a window of 2M contiguous cache blocks. The prefetch engine suppresses allocation of a new entry when any entry in the prefetch queue is within the address window. The prefetch engine further suppresses allocation of a new entry when the data address of the store instruction is equal to an address in a border area of the address window.

    Data stream prefetching in a microprocessor
    4.
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US07904661B2

    Publication date: 2011-03-08

    Application number: US11953637

    Filing date: 2007-12-10

    IPC classification: G06F12/00 G06F13/00

    CPC classification: G06F12/0862 G06F2212/6028

    Abstract: A method of prefetching data in a microprocessor includes identifying a data stream associated with a process and determining a depth associated with the data stream based upon prefetch factors including the number of currently concurrent data streams and data consumption rates associated with the concurrent data streams. Data prefetch requests are allocated with the data stream to reflect the determined depth of the data stream. Allocating data prefetch requests may include allocating prefetch requests for a number of cache lines away from the cache line currently being referenced, wherein the number of cache lines is equal to the determined depth. The method may include, responsive to determining the depth associated with a data stream, configuring prefetch hardware to reflect the determined depth for the identified data stream. Prefetch control bits in an instruction executed by the processor control the prefetch hardware configuration.
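
The abstract names the inputs to the depth decision (number of concurrent streams and their consumption rates) but not a formula. The Python sketch below shows one possible heuristic under stated assumptions: a fixed budget of outstanding prefetch requests is split among streams in proportion to their consumption rates and clamped to a range. The constants and the names `stream_depth` and `prefetch_lines` are hypothetical, and reading "a number of cache lines away" as "every line up to the depth ahead" is one interpretation only.

```python
MAX_OUTSTANDING = 16          # assumed total budget of prefetch requests
MIN_DEPTH, MAX_DEPTH = 1, 8   # assumed per-stream depth limits


def stream_depth(rates, index):
    """Pick a prefetch depth for stream `index`.

    rates lists the consumption rate of every concurrent stream, so both
    the stream count and the relative rates influence the result.
    """
    total = sum(rates)
    if total == 0:
        depth = MAX_OUTSTANDING // len(rates)      # no rate data: split evenly
    else:
        depth = MAX_OUTSTANDING * rates[index] // total
    return max(MIN_DEPTH, min(MAX_DEPTH, depth))


def prefetch_lines(current_line, depth):
    """Cache lines to request: the next `depth` lines after the current one."""
    return [current_line + i for i in range(1, depth + 1)]
```

Under these assumptions, four equally fast streams each get a depth of 4, while eight such streams each get a depth of 2, so the depth shrinks as more streams compete for the same budget.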

    Data stream prefetching in a microprocessor
    5.
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US07350029B2

    Publication date: 2008-03-25

    Application number: US11054889

    Filing date: 2005-02-10

    IPC classification: G06F12/00 G06F13/00

    CPC classification: G06F12/0862 G06F2212/6028

    Abstract: A method of prefetching data in a microprocessor includes identifying a data stream associated with a process and determining a depth associated with the data stream based upon prefetch factors including the number of currently concurrent data streams and data consumption rates associated with the concurrent data streams. Data prefetch requests are allocated with the data stream to reflect the determined depth of the data stream. Allocating data prefetch requests may include allocating prefetch requests for a number of cache lines away from the cache line currently being referenced, wherein the number of cache lines is equal to the determined depth. The method may include, responsive to determining the depth associated with a data stream, configuring prefetch hardware to reflect the determined depth for the identified data stream. Prefetch control bits in an instruction executed by the processor control the prefetch hardware configuration.

    INFORMATION HANDLING SYSTEM WITH REAL AND VIRTUAL LOAD/STORE INSTRUCTION ISSUE QUEUE
    6.
    Type: Invention patent application
    Status: In force

    Publication number: US20100161945A1

    Publication date: 2010-06-24

    Application number: US12341930

    Filing date: 2008-12-22

    IPC classification: G06F9/312

    Abstract: An information handling system includes a processor that may perform issue queue virtual load/store instruction operations. The issue queue maintains load and store instructions with a real/virtual dependency flag. The issue queue provides storage resources for real and virtual load/store instructions. Real load/store instructions execute in a load/store unit (LSU). Virtual load/store instructions are pending execution in the LSU. The LSU may keep track of each virtual load/store instruction within the issue queue by thread, type, and pointer data. Provided that all dependencies are clear for a pending virtual load/store instruction, the LSU marks the pending virtual load/store instruction as real. The pending virtual load/store instruction may then issue to the LSU as a real load/store instruction.
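
The real/virtual flag described in this abstract can be modeled with a small Python sketch. This is a simplified software model under assumptions, not the claimed hardware: dependencies are represented as string tags, the pointer is a plain sequence number, an instruction with no dependencies is treated as real on entry, and the class and method names (`Entry`, `IssueQueue`, `add`, `resolve`, `issue`) are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Entry:
    thread: int                               # owning hardware thread
    kind: str                                 # "load" or "store"
    pointer: int                              # assumed: sequence number in the queue
    deps: set = field(default_factory=set)    # outstanding dependency tags
    real: bool = False                        # False means "virtual"


class IssueQueue:
    def __init__(self):
        self.entries = []
        self._next_pointer = 0

    def add(self, thread, kind, deps=()):
        """Insert a load/store; it starts virtual if it has open dependencies."""
        entry = Entry(thread, kind, self._next_pointer, set(deps))
        entry.real = not entry.deps
        self._next_pointer += 1
        self.entries.append(entry)
        return entry

    def resolve(self, tag):
        """Clear one dependency tag and mark fully resolved entries as real."""
        for entry in self.entries:
            entry.deps.discard(tag)
            if not entry.deps:
                entry.real = True

    def issue(self):
        """Remove and return the oldest real entry, or None if none is ready."""
        for i, entry in enumerate(self.entries):
            if entry.real:
                return self.entries.pop(i)
        return None
```

In this model a virtual entry simply waits in the queue; once `resolve` clears its last dependency it becomes real and the next call to `issue` can hand it to the LSU.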

    Information handling system with real and virtual load/store instruction issue queue
    7.
    Type: Granted invention patent
    Status: In force

    Publication number: US08041928B2

    Publication date: 2011-10-18

    Application number: US12341930

    Filing date: 2008-12-22

    IPC classification: G06F9/00

    Abstract: An information handling system includes a processor that may perform issue queue virtual load/store instruction operations. The issue queue maintains load and store instructions with a real/virtual dependency flag. The issue queue provides storage resources for real and virtual load/store instructions. Real load/store instructions execute in a load/store unit (LSU). Virtual load/store instructions are pending execution in the LSU. The LSU may keep track of each virtual load/store instruction within the issue queue by thread, type, and pointer data. Provided that all dependencies are clear for a pending virtual load/store instruction, the LSU marks the pending virtual load/store instruction as real. The pending virtual load/store instruction may then issue to the LSU as a real load/store instruction.

    Design of provably correct storage arrays
    8.
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US5995425A

    Publication date: 1999-11-30

    Application number: US898826

    Filing date: 1997-07-23

    IPC classification: G01R31/3185 G11C7/00

    CPC classification: G01R31/318536 Y10S257/903

    Abstract: A hardware design technique allows checking of design system language (DSL) specification of an element and schematics of large macros with embedded arrays and registers. The hardware organization reduces CPU time for logical verification by exponential order of magnitude without blowing up a verification process or logic simulation. The hardware organization consists of horizontal word level rather than bit level. Using the elimination process for elements which are difficult to be extracted in Boolean form the logic around and inside a memory structure can be verified. The resultant register array hardware organization can be verified to all pins and nets up to the storage element.

    Method and logical apparatus for rename register reallocation in a simultaneous multi-threaded (SMT) processor
    9.
    Type: Granted invention patent
    Status: In force

    Publication number: US07290261B2

    Publication date: 2007-10-30

    Application number: US10422651

    Filing date: 2003-04-24

    IPC classification: G06F9/46

    Abstract: A circuit and method provide rename register reallocation for simultaneous multi-threaded (SMT) processors that redistributes rename (mapped) resources between one thread during single-threaded (ST) execution and multiple threads during multi-threaded execution. The processor receives an instruction specifying a transition from a single-threaded to a multi-threaded mode or vice-versa and halts execution of all threads executing on the processor. The internal control logic then signals the resources to reallocate the resources. Rename resources are reallocated by directing an action at the rename mapper. When switching from SMT to ST mode, the mapper is directed to drop entries for the dying thread, but on a switch from ST to SMT mode, “dummy” instruction group dispatch indications are sent to the mapper that indicate use of all architected registers for each thread.
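
The two mapper actions named in this abstract (dropping a dying thread's entries, and "dummy" dispatches that claim every architected register per thread) can be sketched in Python. This is a toy model under assumptions: the register counts are arbitrary, a superseded physical register is freed immediately rather than at instruction completion, thread halting is not modeled, and the names `RenameMapper`, `dispatch`, `to_single_thread`, and `to_multi_thread` are hypothetical.

```python
ARCH_REGS = 4   # assumed number of architected registers (kept small)


class RenameMapper:
    def __init__(self, physical_regs):
        self.free = list(range(physical_regs))   # free physical registers
        self.map = {}                            # (thread, arch_reg) -> physical reg

    def dispatch(self, thread, arch_regs):
        """Allocate a fresh physical register for each written architected register."""
        for reg in arch_regs:
            old = self.map.get((thread, reg))
            self.map[(thread, reg)] = self.free.pop(0)
            if old is not None:
                self.free.append(old)            # simplification: free at once

    def to_single_thread(self, dying_thread):
        """SMT -> ST: drop every mapping that belongs to the dying thread."""
        for key in [k for k in self.map if k[0] == dying_thread]:
            self.free.append(self.map.pop(key))

    def to_multi_thread(self, threads):
        """ST -> SMT: a dummy dispatch per thread that uses all architected registers."""
        for thread in threads:
            self.dispatch(thread, range(ARCH_REGS))
```

After `to_multi_thread`, every thread holds a mapping for each architected register, and after `to_single_thread` the dying thread's physical registers are back in the free pool for the surviving thread.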

    Provably correct storage arrays
    10.
    Type: Granted invention patent
    Status: Lapsed

    Publication number: US06279144B1

    Publication date: 2001-08-21

    Application number: US09377389

    Filing date: 1999-08-19

    IPC classification: G06F17/50

    CPC classification: G01R31/318536 Y10S257/903

    Abstract: A hardware design technique allows checking of design system language (DSL) specification of an element and schematics of large macros with embedded arrays and registers. The hardware organization reduces CPU time for logical verification by exponential order of magnitude without blowing up a verification process or logic simulation. The hardware organization consists of horizontal word level rather than bit level. A memory array cell comprises a pair of cross-coupled inverters forming a first latch for storing data. The first latch has an output connected to a read bit line. True and complement write word and bit line input to the first latch. A first set of pass gates connects between the true and complement write word and bit line inputs via gates and the input of said first latch. The first set of pass gates is responsive to a first clock via a second pass gate. A pair of cross-coupled inverters forms a second latch of a Level Sensitive Scan Design (LSSD). The second latch has output connected to an LSSD output for design verification. A second pass gate connects between the output of the first set of pass gates and the input of said first latch. The second pass gate is responsive to said first clock. A third pass gate connects between the output of said first latch and the input of said second latch. The third pass gate is responsive to a second clock. The first and second clocks are responsive to a black boxing process for incremental verification.
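
The two-latch cell in this abstract can be approximated by a behavioral Python sketch. It is a heavily simplified model under assumptions: pass gates are reduced to boolean conditions, the true and complement inputs are collapsed into a single `write_bit`, and the class name `ArrayCell` and its `step` method are hypothetical.

```python
class ArrayCell:
    def __init__(self):
        self.l1 = 0   # first latch: stored data, drives the read bit line
        self.l2 = 0   # second latch: LSSD output used for verification

    def step(self, clk1, clk2, write_word, write_bit):
        """Apply one set of clock and input levels; return (read_bit, lssd_out)."""
        # First clock gates the write path into the first latch.
        if clk1 and write_word:
            self.l1 = write_bit
        # Second clock gates the transfer from the first latch to the second.
        if clk2:
            self.l2 = self.l1
        return self.l1, self.l2
```

Pulsing the first clock with the word line selected stores a bit in the first latch, and a later pulse of the second clock copies it to the LSSD output, which is where a verification tool would observe it.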
