Methods and arrangements to manage on-chip memory to reduce memory latency
    1.
    Patent application
    Status: Active

    Publication No.: US20060155886A1

    Publication date: 2006-07-13

    Application No.: US11032876

    Filing date: 2005-01-11

    IPC classification: G06F3/00

    Abstract: Methods, systems, and media for reducing memory latency seen by processors by providing a measure of control over on-chip memory (OCM) management to software applications, implicitly and/or explicitly, via an operating system are contemplated. Many embodiments allow part of the OCM to be managed by software applications via an application program interface (API), and part managed by hardware. Thus, the software applications can provide guidance regarding address ranges to maintain close to the processor to reduce unnecessary latencies typically encountered when dependent upon cache controller policies. Several embodiments utilize a memory internal to the processor or on a processor node, so the memory block used for this technique is referred to as OCM.

    HARDWARE SUPPORT FOR SUPERPAGE COALESCING
    2.
    Patent application
    Status: Under examination (published)

    Publication No.: US20070067604A1

    Publication date: 2007-03-22

    Application No.: US11551168

    Filing date: 2006-10-19

    IPC classification: G06F12/00

    CPC classification: G06F12/1045

    Abstract: A method of assigning virtual memory to physical memory in a data processing system allocates a set of contiguous physical memory pages for a new page mapping, instructs the memory controller to move the virtual memory pages according to the new page mapping, and then allows access to the virtual memory pages using the new page mapping while the memory controller is still copying the virtual memory pages to the set of physical memory pages. The memory controller can use a mapping table which temporarily stores entries of the old and new page addresses, and releases the entries as copying for each entry is completed. The translation lookaside buffer (TLB) entries in the processor cores are updated for the new page addresses prior to completion of copying of the memory pages by the memory controller. The invention can be extended to non-uniform memory array (NUMA) systems. For systems with cache memory, any cache entry which is affected by the page move can be updated by modifying its address tag according to the new page mapping. This tag modification may be limited to cache entries in a dirty coherency state. The cache can further relocate a cache entry based on a changed congruence class for any modified address tag.
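
    The mapping-table idea in this abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class name MemoryController, its methods, and the dictionary-based "physical memory" are all hypothetical. It shows how an access through a new page address could be served from the old page until that page's copy completes, at which point the table entry is released.

```python
class MemoryController:
    """Toy model of a controller that moves pages in the background."""

    def __init__(self, memory):
        self.memory = memory   # physical page number -> page contents
        self.mapping = {}      # new page -> old page, for copies still pending

    def start_move(self, moves):
        # Record every (old, new) pair; the new addresses are usable at once.
        for old, new in moves:
            self.mapping[new] = old

    def _resolve(self, page):
        # A new-address access to a page not yet copied goes to the old page.
        return self.mapping.get(page, page)

    def read(self, page):
        return self.memory[self._resolve(page)]

    def write(self, page, value):
        self.memory[self._resolve(page)] = value

    def copy_one(self):
        # Copy one pending page and release its table entry.
        if not self.mapping:
            return False
        new, old = next(iter(self.mapping.items()))
        self.memory[new] = self.memory[old]
        del self.mapping[new]
        return True


mc = MemoryController({0: "A", 5: "B", 8: None, 9: None})
mc.start_move([(0, 8), (5, 9)])   # coalesce pages 0 and 5 into 8 and 9
first = mc.read(9)                # served from old page 5 while copy is pending
while mc.copy_one():
    pass
second = mc.read(9)               # served from new page 9 after the copy
```

    In this model the caller never observes whether a copy has finished, which is the property the abstract relies on when it updates the TLB entries before copying completes.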

    Hardware support for superpage coalescing
    3.
    Patent application
    Status: Lapsed

    Publication No.: US20050108496A1

    Publication date: 2005-05-19

    Application No.: US10713733

    Filing date: 2003-11-13

    IPC classification: G06F12/08 G06F12/10

    CPC classification: G06F12/1045

    Abstract: A method of assigning virtual memory to physical memory in a data processing system allocates a set of contiguous physical memory pages for a new page mapping, instructs the memory controller to move the virtual memory pages according to the new page mapping, and then allows access to the virtual memory pages using the new page mapping while the memory controller is still copying the virtual memory pages to the set of physical memory pages. The memory controller can use a mapping table which temporarily stores entries of the old and new page addresses, and releases the entries as copying for each entry is completed. The translation lookaside buffer (TLB) entries in the processor cores are updated for the new page addresses prior to completion of copying of the memory pages by the memory controller. The invention can be extended to non-uniform memory array (NUMA) systems. For systems with cache memory, any cache entry which is affected by the page move can be updated by modifying its address tag according to the new page mapping. This tag modification may be limited to cache entries in a dirty coherency state. The cache can further relocate a cache entry based on a changed congruence class for any modified address tag.

    Method and system for compressing reduced instruction set computer (RISC) executable code
    4.
    Granted patent
    Status: Lapsed

    Publication No.: US06442680B1

    Publication date: 2002-08-27

    Application No.: US09239259

    Filing date: 1999-01-29

    IPC classification: G06F9/30

    CPC classification: G06F9/30156 G06F8/4434

    Abstract: A method and system for a compression scheme used with program executables that run in a reduced instruction set computer (RISC) architecture such as the PowerPC is disclosed. Initially, a RISC instruction set is expanded to produce code that facilitates the removal of redundant fields. The program is then rewritten using this new expanded instruction set. Next, a filter is applied to remove redundant fields from the expanded instructions. The expanded instructions are then clustered into groups, such that instructions belonging to the same cluster show similar bit patterns. Within each cluster, scopes are created such that register usage patterns within each scope are similar. Within each cluster, more scopes are created such that literals within each instruction scope are drawn from the same range of integers. A conventional compression technique such as Huffman encoding is then applied on each instruction scope within each cluster. Dynamic programming techniques are then used to produce the best combination of encoding among all scopes within all the different clusters. Where applicable, instruction scopes that use the same encoding scheme are combined to reduce the size of the resulting dictionary. Similarly, instruction clusters that use the same encoding scheme are combined to reduce the size of the resulting dictionary.

    Method and system for address trace compression through loop detection and reduction
    5.
    Granted patent
    Status: Active

    Publication No.: US06347383B1

    Publication date: 2002-02-12

    Application No.: US09281878

    Filing date: 1999-03-31

    IPC classification: G06F11/00

    CPC classification: G06F11/3636

    Abstract: A method and system for compressing memory address traces based on detecting and reducing the loops that exist in a trace is disclosed. The method and system consist of two steps. In the first step, the trace is analyzed and loops are detected by determining the control flow among the program basic blocks. In the second step, each loop is analyzed to eliminate constant address references, and to apply compiler-like strength reduction on addresses that differ only by a fixed offset between consecutive loop iterations. Addresses that cannot be eliminated using the method and system of the present invention are kept in the trace.
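
    The second step described in this abstract can be sketched in a few lines. The sketch assumes the first step (loop detection) has already run and has delivered one address list per loop iteration, with every iteration issuing the same number of references. The function names compress_loop and expand_loop and the tuple-based record layout are hypothetical. A constant address is treated as the special case of a stride of zero, and any reference slot without a fixed stride is kept verbatim, as the abstract requires.

```python
def compress_loop(iterations):
    """Reduce per-iteration address lists to one record per reference slot."""
    slots = []
    for column in zip(*iterations):          # same slot across all iterations
        deltas = {b - a for a, b in zip(column, column[1:])}
        if len(deltas) <= 1:
            # Constant (stride 0) or fixed-offset address: keep base + stride.
            stride = deltas.pop() if deltas else 0
            slots.append(("stride", column[0], stride))
        else:
            # No fixed offset between iterations: keep the raw addresses.
            slots.append(("raw", list(column)))
    return len(iterations), slots


def expand_loop(count, slots):
    """Rebuild the original per-iteration address lists."""
    trace = []
    for i in range(count):
        row = []
        for slot in slots:
            if slot[0] == "stride":
                row.append(slot[1] + i * slot[2])
            else:
                row.append(slot[1][i])
        trace.append(row)
    return trace


loop_trace = [
    [0x1000, 0x2000, 0x7000],
    [0x1000, 0x2004, 0x3000],
    [0x1000, 0x2008, 0x9000],
]
count, slots = compress_loop(loop_trace)
restored = expand_loop(count, slots)
```

    Here the first slot is constant, the second advances by 4 bytes per iteration, and the third has no regular pattern, so only the third is stored in full.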

    Error detection in a data processing system
    6.
    Patent application
    Status: Lapsed

    Publication No.: US20060156156A1

    Publication date: 2006-07-13

    Application No.: US11034553

    Filing date: 2005-01-13

    CPC classification: G06F11/3636 G06F11/3624

    Abstract: A compiler for incorporating error detection into executable code generates conventional assembler language object code from a source code file. The compiler identifies an error detection segment (EDS) in the assembler code, where the EDS includes a subset of basic blocks in the assembler code. The compiler also identifies register and memory references in the EDS and inserts a set of instructions into the EDS. The inserted instructions record an entry state and an exit state of the referenced registers and memory locations. The state information is stored in a checkpoint portion of system memory. The compiler may generate shadow EDS code including instructions mirroring the instructions in the main EDS and verifying instructions that compare results produced by the mirroring instructions with results produced by the main EDS. The shadow EDS initiates an error recovery process if results produced by the shadow EDS and the main EDS differ.
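
    The main/shadow comparison can be mimicked at run time with ordinary functions. This is a loose analogy rather than the compiler transformation the abstract describes: the abstract inserts instructions into assembler code, whereas the sketch below wraps two Python callables. The names run_with_shadow, segment, shadow, and recover are hypothetical, and a dictionary stands in for the referenced registers and memory locations.

```python
import copy


def run_with_shadow(segment, shadow, state, recover):
    """Run a segment and its mirror on the same entry state and compare."""
    checkpoint = copy.deepcopy(state)            # recorded entry state
    main_exit = segment(copy.deepcopy(state))    # exit state of the main code
    shadow_exit = shadow(copy.deepcopy(state))   # exit state of the mirror
    if main_exit != shadow_exit:
        # Results differ: hand the checkpoint to the recovery routine.
        return recover(checkpoint)
    return main_exit


def increment(state):
    state["r1"] += 1
    return state


def faulty_increment(state):
    state["r1"] += 2      # simulates a corrupted result
    return state


def restore(checkpoint):
    return checkpoint     # simplest recovery: roll back to the entry state


ok = run_with_shadow(increment, increment, {"r1": 10}, restore)
rolled_back = run_with_shadow(faulty_increment, increment, {"r1": 10}, restore)
```

    Rolling back to the checkpoint is only one possible recovery policy; the abstract does not specify what the error recovery process does.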

    Apparatus and method for providing remote access redirect capability in a channel adapter of a system area network
    7.
    Patent application
    Status: Lapsed

    Publication No.: US20060155880A1

    Publication date: 2006-07-13

    Application No.: US11034557

    Filing date: 2005-01-13

    IPC classification: G06F3/00

    Abstract: A method and apparatus for providing remote access redirect in a host channel adapter of a system area network are provided. The apparatus and method provide a mechanism by which a host channel adapter, in response to receiving a marker message, places selected channel(s) of the host channel adapter in a remote access redirect (RAR) mode of operation. During the RAR mode of operation, memory access messages received by the host channel adapter that are destined for portions of an application memory space marked as being protected are converted to RAR receive messages and redirected to a queue pair associated with an operating system rather than the queue pair for the application. The operating system is responsible for serializing access to application memory pages outside of the host channel adapter. The mechanisms of the present invention may be used to perform a checkpoint data integrity operation.
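
    The routing decision made in RAR mode can be written as a single function. The sketch below is a simplified model, not the adapter's actual logic: the function name route, the message dictionaries, the queue-pair labels, and the 4096-byte page size are all assumptions chosen for illustration.

```python
PAGE_SIZE = 4096  # assumed page size, not taken from the abstract


def route(message, rar_mode, protected_pages):
    """Pick the destination queue pair for an incoming memory access message."""
    page = message["address"] // PAGE_SIZE
    if rar_mode and page in protected_pages:
        # Convert to an RAR receive message and send it to the OS queue pair.
        return "os_queue_pair", {"op": "RAR_RECEIVE", "original": message}
    # Normal mode, or an unprotected page: deliver to the application.
    return "app_queue_pair", message


write = {"op": "RDMA_WRITE", "address": 2 * PAGE_SIZE + 16}
redirected = route(write, rar_mode=True, protected_pages={2})
direct = route(write, rar_mode=False, protected_pages={2})
```

    The operating system would then serialize access to the protected page; that part lies outside the adapter and outside this sketch.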

    Method and system for clustering instructions within executable code for compression
    8.
    Granted patent
    Status: Lapsed

    Publication No.: US06317867B1

    Publication date: 2001-11-13

    Application No.: US09239261

    Filing date: 1999-01-29

    IPC classification: G06F9/45

    CPC classification: G06F8/4434 H03M7/30

    Abstract: In accordance with a method and system of the present invention, a compression scheme for program executables is disclosed. First, instruction clustering starts by placing each instruction in a cluster by itself. The method and system then compute in an iterative fashion the distance between clusters, and merge the nearest clusters to form larger clusters. Therefore, instructions are clustered into groups, such that instructions belonging to the same cluster show similar bit patterns. This process stops when the number of clusters reaches a pre-specified goal. This goal is defined empirically, and may be adjusted if better compression can result. After all clusters have been defined, a suitable compressor is applied to each cluster to produce the compressed executable.
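
    The iterative merge described in this abstract is a form of agglomerative clustering, and a minimal version fits in a short function. The abstract does not say how the distance between clusters is measured, so the sketch assumes the average Hamming distance between instruction words; that choice, the function names, and the sample 32-bit words are all illustrative.

```python
from itertools import combinations


def hamming(a, b):
    """Number of bit positions in which two instruction words differ."""
    return bin(a ^ b).count("1")


def cluster_distance(c1, c2):
    # Assumed metric: mean pairwise Hamming distance between the clusters.
    return sum(hamming(a, b) for a in c1 for b in c2) / (len(c1) * len(c2))


def cluster_instructions(instructions, target):
    """Merge the nearest clusters until only `target` clusters remain."""
    clusters = [[ins] for ins in instructions]   # one instruction per cluster
    while len(clusters) > max(target, 1):
        i, j = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: cluster_distance(clusters[p[0]], clusters[p[1]]),
        )
        clusters[i].extend(clusters[j])          # i < j, so index i stays valid
        del clusters[j]
    return clusters


words = [0x38600001, 0x38600002, 0x7C0802A6, 0x7C0902A6]
groups = cluster_instructions(words, target=2)
```

    Recomputing every pairwise distance on each pass keeps the sketch short but costs far more than a production clusterer would; the stopping rule (a pre-specified cluster count) follows the abstract.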

    Method and system for scope-based compression of register and literal encoding in a reduced instruction set computer (RISC)
    9.
    Granted patent
    Status: Lapsed

    Publication No.: US06233674B1

    Publication date: 2001-05-15

    Application No.: US09239258

    Filing date: 1999-01-29

    IPC classification: G06F9/30

    CPC classification: G06F9/3017 G06F8/4434

    Abstract: A compression scheme for program executables that run in a reduced instruction set computer (RISC) architecture such as the PowerPC is disclosed. The method and system utilize scope-based compression for increasing the effectiveness of conventional compression with respect to register and literal encoding. First, discernible patterns are determined by exploiting instruction semantics and conventions that compilers adopt in register and literal usage. Additional conventions may also be set for register usage to facilitate compression. Using this information, separate scopes are created such that in each scope there is a more prevalent usage of a limited set of registers or literal value ranges, or there is an easily discernible pattern of register or literal usage. Each scope then is compressed separately by a conventional compressor. The resulting code is more compact because the small number of registers and literals in each scope makes the encoding sparser than when the compressor operates on the global scope that includes all instructions in a program. Additionally, scope-based compression reveals more frequent patterns within each scope than when considering the entire instruction stream as an opaque stream of bits.
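
    The claimed benefit of compressing each scope separately can be checked with an entropy estimate. The sketch below does not implement the patented scheme; it only compares the ideal (Shannon) code length of register names encoded over the whole program against the sum of the ideal lengths per scope. The scope names, the register sequences, and the function names are invented for illustration, and the estimate ignores the cost of storing one code table per scope.

```python
from collections import Counter
from math import log2


def ideal_bits(symbols):
    """Shannon lower bound, in bits, for coding this symbol sequence."""
    counts = Counter(symbols)
    total = len(symbols)
    return -sum(c * log2(c / total) for c in counts.values())


def scoped_bits(scopes):
    """Total ideal cost when every scope gets its own code."""
    return sum(ideal_bits(regs) for regs in scopes.values())


# Hypothetical register usage, grouped into two scopes.
scopes = {
    "prologue": ["r1", "r1", "r31", "r1"],
    "loop_body": ["r3", "r4", "r3", "r4"],
}
all_registers = [reg for regs in scopes.values() for reg in regs]

global_cost = ideal_bits(all_registers)
scoped_cost = scoped_bits(scopes)
```

    Because each scope draws on fewer distinct registers than the whole program does, the per-scope total cannot exceed the global figure; whether the saving survives the extra code tables depends on the program.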

    Method and system for compressing reduced instruction set computer (RISC) executable code through instruction set expansion
    10.
    Granted patent
    Status: Lapsed

    Publication No.: US06195743B1

    Publication date: 2001-02-27

    Application No.: US09239260

    Filing date: 1999-01-29

    IPC classification: G06F5/00

    Abstract: A compression scheme is disclosed for program executables that run on Reduced Instruction Set Computer (RISC) processors, such as the PowerPC architecture. The RISC instruction set is expanded by adding opcodes to produce code that facilitates the removal of redundant fields. To compress a program, a compressor engine rewrites the executable using the new expanded instruction set. Next, a filter is applied to remove the redundant fields from the expanded instructions. A conventional compression technique such as Huffman encoding is then applied on the resulting code.
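
    The final stage named in this abstract, Huffman encoding, is a standard technique and can be sketched independently of the patent. The example treats opcode mnemonics as the symbols, which is an assumption made for readability; the abstract does not say which units are encoded. The function name huffman_code and the sample stream are likewise illustrative.

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code (symbol -> bit string) from symbol frequencies."""
    counts = Counter(symbols)
    if not counts:
        return {}
    if len(counts) == 1:
        return {next(iter(counts)): "0"}   # a lone symbol still needs one bit
    # Heap entries: (weight, unique tiebreak, partial code table).
    heap = [(n, i, {sym: ""}) for i, (sym, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        n1, _, low = heapq.heappop(heap)    # two least frequent subtrees
        n2, _, high = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in low.items()}
        merged.update({s: "1" + code for s, code in high.items()})
        heapq.heappush(heap, (n1 + n2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]


stream = ["addi"] * 5 + ["lwz"] * 2 + ["stw"]
code = huffman_code(stream)
encoded = "".join(code[op] for op in stream)
```

    The most frequent opcode receives the shortest code word, which is why removing redundant fields first (so that the remaining symbols are more skewed) can help the encoder.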
