Method for fast large-integer arithmetic on IA processors
    1.
    Granted patent
    Method for fast large-integer arithmetic on IA processors (status: active)

    Publication No.: US09292283B2

    Publication date: 2016-03-22

    Application No.: US13707105

    Filing date: 2012-12-06

    Abstract: Methods, systems, and apparatuses are disclosed for implementing fast large-integer arithmetic within an integrated circuit, such as on IA (Intel Architecture) processors, in which such means include receiving a 512-bit value for squaring, the 512-bit value having eight sub-elements each of 64-bits and performing a 512-bit squaring algorithm by: (i) multiplying every one of the eight sub-elements by itself to yield a square of each of the eight sub-elements, the eight squared sub-elements collectively identified as T1, (ii) multiplying every one of the eight sub-elements by the other remaining seven of the eight sub-elements to yield an asymmetric intermediate result having seven diagonals therein, wherein each of the seven diagonals are of a different length, (iii) reorganizing the asymmetric intermediate result having the seven diagonals therein into a symmetric intermediate result having four diagonals each of 7×1 sub-elements of the 64-bits in length arranged across a plurality of columns, (iv) adding all sub-elements within their respective columns, the added sub-elements collectively identified as T2, and (v) yielding a final 512-bit squared result of the 512-bit value by adding the value of T2 twice with the value of T1 once. Other related embodiments are disclosed.
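The decomposition in steps (i) to (v) can be checked numerically. The following is a minimal sketch, not the patented implementation: it relies on Python's arbitrary-precision integers, so it does not model the diagonal reorganization, register allocation, or carry handling that the abstract is concerned with. The names `split_limbs` and `square_512` are illustrative assumptions. Note that the 28 distinct cross products (7 + 6 + ... + 1) are exactly what four diagonals of 7 sub-elements hold, and that the full square of a 512-bit input can occupy up to 1024 bits.

```python
MASK64 = (1 << 64) - 1


def split_limbs(x, n=8):
    """Split x into n 64-bit sub-elements, least significant first."""
    return [(x >> (64 * i)) & MASK64 for i in range(n)]


def square_512(x):
    """Square a 512-bit value as T1 + 2*T2 (illustrative sketch only)."""
    a = split_limbs(x)
    # T1: each sub-element multiplied by itself; a[i]^2 has weight 2^(128*i).
    t1 = sum((a[i] * a[i]) << (128 * i) for i in range(8))
    # T2: each distinct cross product a[i]*a[j] with i < j (28 in total),
    # accumulated in column i + j, i.e. with weight 2^(64*(i+j)).
    t2 = sum(
        (a[i] * a[j]) << (64 * (i + j))
        for i in range(8)
        for j in range(i + 1, 8)
    )
    # Final result: T2 added twice, T1 added once.
    return t1 + 2 * t2


x = (1 << 512) - 1
assert square_512(x) == x * x
```

Computing each cross product once and doubling the sum is what saves work relative to a general 8×8 multiplication, which would need all 64 partial products instead of 8 + 28.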

    METHOD FOR FAST LARGE-INTEGER ARITHMETIC ON IA PROCESSORS
    2.
    Patent application
    METHOD FOR FAST LARGE-INTEGER ARITHMETIC ON IA PROCESSORS (status: active)

    Publication No.: US20140019725A1

    Publication date: 2014-01-16

    Application No.: US13707105

    Filing date: 2012-12-06

    Abstract: Methods, systems, and apparatuses are disclosed for implementing fast large-integer arithmetic within an integrated circuit, such as on IA (Intel Architecture) processors, in which such means include receiving a 512-bit value for squaring, the 512-bit value having eight sub-elements each of 64-bits and performing a 512-bit squaring algorithm by: (i) multiplying every one of the eight sub-elements by itself to yield a square of each of the eight sub-elements, the eight squared sub-elements collectively identified as T1, (ii) multiplying every one of the eight sub-elements by the other remaining seven of the eight sub-elements to yield an asymmetric intermediate result having seven diagonals therein, wherein each of the seven diagonals are of a different length, (iii) reorganizing the asymmetric intermediate result having the seven diagonals therein into a symmetric intermediate result having four diagonals each of 7×1 sub-elements of the 64-bits in length arranged across a plurality of columns, (iv) adding all sub-elements within their respective columns, the added sub-elements collectively identified as T2, and (v) yielding a final 512-bit squared result of the 512-bit value by adding the value of T2 twice with the value of T1 once. Other related embodiments are disclosed.

    Method, system, and program for optimizing code
    4.
    Patent application
    Method, system, and program for optimizing code (status: under examination, published)

    Publication No.: US20050251795A1

    Publication date: 2005-11-10

    Application No.: US10805106

    Filing date: 2004-03-19

    CPC classification number: G06F8/4441

    Abstract: Provided are a method, system, and program for optimizing code. A program is accessed comprising a plurality of instructions including at least one no operation (NOP) instruction. At least one NOP instruction in the program that is not needed to provide a processing delay to ensure data is available to at least one dependent instruction accessing the data is removed.
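The idea of keeping only the NOPs that provide a required delay can be illustrated with a toy model. This is a sketch under stated assumptions, not the method claimed in the application: it assumes one instruction issues per cycle and that a result becomes readable a fixed number of cycles after issue. The `Instr` record, its `latency` field, and both function names are hypothetical.

```python
from typing import NamedTuple, Optional, Tuple


class Instr(NamedTuple):
    op: str
    dest: Optional[str] = None   # register written, if any
    srcs: Tuple[str, ...] = ()   # registers read
    latency: int = 1             # cycles until dest is readable (assumed model)


def is_hazard_free(prog):
    """True if no instruction reads a register before its value is ready."""
    ready = {}  # register -> first cycle at which it may be read
    for cycle, ins in enumerate(prog):
        if any(ready.get(r, 0) > cycle for r in ins.srcs):
            return False
        if ins.dest is not None:
            ready[ins.dest] = cycle + ins.latency
    return True


def remove_unneeded_nops(prog):
    """Drop each NOP whose removal leaves the program hazard-free."""
    out = list(prog)
    i = 0
    while i < len(out):
        if out[i].op == "nop":
            candidate = out[:i] + out[i + 1:]
            if is_hazard_free(candidate):
                out = candidate
                continue  # re-examine the instruction now at index i
        i += 1
    return out


prog = [
    Instr("load", dest="r1", latency=3),
    Instr("nop"),
    Instr("nop"),
    Instr("add", dest="r2", srcs=("r1",)),
    Instr("nop"),
    Instr("store", srcs=("r2",)),
]
optimized = remove_unneeded_nops(prog)
```

In this example the two NOPs after the load are kept, because the add would otherwise read `r1` too early, while the NOP before the store is removed.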

    METHOD AND APPARATUS FOR A DICTIONARY COMPRESSION ACCELERATOR

    Publication No.: US20220308763A1

    Publication date: 2022-09-29

    Application No.: US17214470

    Filing date: 2021-03-26

    Abstract: Apparatus and method for dictionary accelerator compression. For example, one embodiment of an apparatus comprises: a plurality of cores; a compression/decompression accelerator coupled to or integral to one or more of the plurality of cores, the compression/decompression accelerator to perform decompression and compression operations in response to read and write operations, respectively, wherein responsive to notification of a compression job to compress a memory page or a portion thereof, a history buffer associated with the compression/decompression accelerator is to be initialized with pre-configured dictionary data, the compression/decompression accelerator to match portions of the pre-configured dictionary data with portions of the memory page to generate compressed output data.
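A software analogue of seeding the history buffer with pre-configured dictionary data is DEFLATE's preset dictionary, which Python exposes through the `zdict` argument of `zlib.compressobj` and `zlib.decompressobj`. The sketch below only illustrates that general technique; it says nothing about how the accelerator described above is built, and the function names and sample data are assumptions.

```python
import zlib


def compress_with_dictionary(page: bytes, dictionary: bytes) -> bytes:
    """Compress `page` with the match window pre-loaded from `dictionary`."""
    comp = zlib.compressobj(zdict=dictionary)
    return comp.compress(page) + comp.flush()


def decompress_with_dictionary(blob: bytes, dictionary: bytes) -> bytes:
    """Inverse operation; the same dictionary must be supplied."""
    decomp = zlib.decompressobj(zdict=dictionary)
    return decomp.decompress(blob) + decomp.flush()


# Hypothetical dictionary holding byte strings expected to recur in pages.
dictionary = b'{"status": "active", "owner": "", "flags": 0}'
page = b'{"status": "active", "owner": "alice", "flags": 0}' * 4

blob = compress_with_dictionary(page, dictionary)
assert decompress_with_dictionary(blob, dictionary) == page
```

A preset dictionary helps most on small inputs such as single memory pages, where the compressor would otherwise start with an empty history and find few matches.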

    Debug system having assembler correcting register allocation errors
    6.
    Patent application
    Debug system having assembler correcting register allocation errors (status: lapsed)

    Publication No.: US20050210457A1

    Publication date: 2005-09-22

    Application No.: US10807218

    Filing date: 2004-03-22

    Applicant: James Guilford

    Inventor: James Guilford

    CPC classification number: G06F11/3624

    Abstract: An assembler, which can be provided as part of a debugger and/or development system, avoids register allocation errors, such as register bank conflicts and/or insufficient physical registers, automatically.

    Method and system providing virtual resource usage information
    7.
    Patent application
    Method and system providing virtual resource usage information (status: under examination, published)

    Publication No.: US20050278707A1

    Publication date: 2005-12-15

    Application No.: US10864666

    Filing date: 2004-06-09

    Applicant: James Guilford

    Inventor: James Guilford

    CPC classification number: G06F11/3476 G06F11/3495

    Abstract: A method and system to provide virtual resource usage information for assembler programs. In one embodiment, a graphical user interface displays virtual resource usage for portions of an assembler program.

    Instructions for Sliding Window Encoding Algorithms
    8.
    Patent application
    Instructions for Sliding Window Encoding Algorithms (status: active)

    Publication No.: US20140189293A1

    Publication date: 2014-07-03

    Application No.: US13730732

    Filing date: 2012-12-28

    Abstract: A processor is described having an instruction execution pipeline having a functional unit to execute an instruction that compares vector elements against an input value. Each of the vector elements and the input value has a first respective section identifying a location within data and a second respective section having a byte sequence of the data. The functional unit has comparison circuitry to compare respective byte sequences of the input vector elements against the input value's byte sequence to identify a number of matching bytes for each comparison. The functional unit also has difference circuitry to determine respective distances between the input vector elements' byte sequences and the input value's byte sequence within the data.
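The comparison described above can be emulated in scalar software. The sketch below makes two assumptions that the abstract does not confirm: each element is modelled as a `(location, byte_sequence)` pair, and "number of matching bytes" is taken to mean the length of the common prefix, as in LZ77-style match search. The function name `compare_candidates` is illustrative.

```python
def compare_candidates(candidates, current):
    """For each candidate, return (matching_bytes, distance) versus `current`.

    candidates: list of (location, byte_sequence) pairs (the vector elements)
    current:    a (location, byte_sequence) pair (the input value)
    """
    cur_pos, cur_seq = current
    results = []
    for pos, seq in candidates:
        # Comparison step: count leading bytes that agree (assumed meaning).
        matched = 0
        for a, b in zip(seq, cur_seq):
            if a != b:
                break
            matched += 1
        # Difference step: how far back in the data the candidate lies.
        results.append((matched, cur_pos - pos))
    return results


data = b"abcdabcxabcd"
current = (8, data[8:12])                       # b"abcd"
candidates = [(0, data[0:4]), (4, data[4:8])]   # b"abcd", b"abcx"
matches = compare_candidates(candidates, current)
```

An encoder would typically pick the candidate with the longest match and emit its length and distance; in the described processor all comparisons are performed by a single instruction rather than a loop.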

    Virtual microengine systems and methods
    9.
    Patent application
    Virtual microengine systems and methods (status: under examination, published)

    Publication No.: US20060150165A1

    Publication date: 2006-07-06

    Application No.: US11027785

    Filing date: 2004-12-30

    CPC classification number: G06F9/455

    Abstract: Systems and methods are disclosed for supporting virtual microengines in a multithreaded processor, such as a microengine running on a network processor. In one embodiment, code is written for execution by a plurality of virtual microengines. The code is then compiled and linked for execution on a physical microengine, at which time the physical microengine's threads are assigned to thread groups corresponding to the virtual microengines. Internal next neighbor rings are allocated within the physical microengine to facilitate communication between the thread groups. The code can then be loaded onto the physical microengine and executed, with each thread group executing the code written for its corresponding virtual microengine.

    Method and apparatus utilizing non-uniformly distributed DRAM configurations and to detect in-range memory address matches
    10.
    Patent application
    Method and apparatus utilizing non-uniformly distributed DRAM configurations and to detect in-range memory address matches (status: active)

    Publication No.: US20050144413A1

    Publication date: 2005-06-30

    Application No.: US10751263

    Filing date: 2003-12-30

    CPC classification number: G06F13/4243

    Abstract: Methods, software and systems to determine channel ownership and physical block location within the channel in non-uniformly distributed DRAM configurations and also to detect in-range memory address matches are presented. A first method, which may also be implemented in software and/or hardware, allocates memory non-uniformly between a number of memory channels, determines a selected memory channel from the memory channels for a program address, and maps the program address to a physical address within the selected memory channel. A second method, which may also be implemented in software and/or hardware, designates a range of memory to perform address matching, monitors memory accesses and, when a memory access occurs within the specified range, performs a particular function.
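The two methods can be sketched in a few lines. The code below is a simplified illustration, not the scheme in the application: it assumes channels are laid out back to back in the program address space (real hardware often interleaves instead), and the names `make_mapper` and `make_range_watch`, along with the sizes used, are hypothetical.

```python
from bisect import bisect_right
from itertools import accumulate


def make_mapper(channel_sizes):
    """Return a function mapping a program address to (channel, offset).

    channel_sizes may differ per channel (non-uniform distribution).
    """
    ends = list(accumulate(channel_sizes))  # exclusive end address per channel

    def map_address(addr):
        if not 0 <= addr < ends[-1]:
            raise ValueError("address outside populated memory")
        channel = bisect_right(ends, addr)      # first channel whose end > addr
        base = ends[channel - 1] if channel else 0
        return channel, addr - base

    return map_address


def make_range_watch(lo, hi, action):
    """Return an access hook that calls `action` for addresses in [lo, hi)."""
    def on_access(addr):
        if lo <= addr < hi:
            action(addr)
    return on_access


map_address = make_mapper([512, 256, 256])  # three channels of unequal size
hits = []
watch = make_range_watch(500, 520, hits.append)
for a in (0, 511, 512, 1023):
    watch(a)
```

With these sizes, address 511 is the last location in channel 0 and address 512 is the first in channel 1, and the watch records the two accesses that fall inside its range.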
