Efficient function interpolation using SIMD vector permute functionality
    1.
    Granted invention patent
    Status: Lapsed

    Publication No.: US06924802B2

    Publication date: 2005-08-02

    Application No.: US10242566

    Filing date: 2002-09-12

    IPC classification: G06F17/17 G06T11/20 G06F15/76

    CPC classification: G06F17/17 G06T11/203

    Abstract: A system, method, and computer program product are provided for generating display data. The data processing system loads coefficient values corresponding to a behavior of a selected function in pre-defined ranges of input data. The data processing system then determines, responsive to items of input data, the range of input data in which the selected function is to be estimated. The data processing system then selects the coefficient values through the use of a vector permute function and evaluates an index function at each of the items of input data. It then estimates the value of the selected function through parallel mathematical operations on the items of input data, the selected coefficient values, and the values of the index function, and, responsive to the one or more values of the selected function, generates display data.
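
The selection step in this abstract can be sketched in plain Python, with lists standing in for SIMD lanes. This is an illustration under stated assumptions, not the patented method: the target function (square root), the four unit-wide ranges, the linear per-range coefficients, and the names `permute` and `interpolate` are all hypothetical, and the abstract's "index function" is not modeled beyond computing a range index per lane.

```python
import math


def permute(table, indices):
    """Emulate a vector permute: lane i of the result is table[indices[i]]."""
    return [table[i] for i in indices]


# Hypothetical setup: approximate sqrt on [1, 5) using four unit-wide ranges.
LO, WIDTH, N = 1.0, 1.0, 4
KNOTS = [LO + k * WIDTH for k in range(N + 1)]

# Per-range coefficients of the chord y = c0 + c1 * (x - knot).
C0 = [math.sqrt(KNOTS[k]) for k in range(N)]
C1 = [(math.sqrt(KNOTS[k + 1]) - math.sqrt(KNOTS[k])) / WIDTH for k in range(N)]


def interpolate(xs):
    """Estimate sqrt for every lane of xs at once."""
    # Range index per lane, clamped to the coefficient table.
    idx = [min(N - 1, max(0, int((x - LO) // WIDTH))) for x in xs]
    # One permute per table replaces a per-lane branch or memory lookup.
    c0 = permute(C0, idx)
    c1 = permute(C1, idx)
    base = permute(KNOTS, idx)
    # The same arithmetic is applied to all lanes.
    return [a + b * (x - k) for a, b, x, k in zip(c0, c1, xs, base)]
```

On real SIMD hardware the three `permute` calls would each map to a single permute or shuffle instruction over a register-resident table, which is what makes the per-lane range selection cheap.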

    Hiding memory latency
    2.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07620951B2

    Publication date: 2009-11-17

    Application No.: US12049293

    Filing date: 2008-03-15

    IPC classification: G06F9/46

    CPC classification: G06F9/322 G06F8/41 G06F9/3851

    Abstract: An approach to hiding memory latency in a multi-thread environment is presented. Branch Indirect and Set Link (BISL) and/or Branch Indirect and Set Link if External Data (BISLED) instructions are placed in thread code during compilation at instances that correspond to a prolonged instruction. A prolonged instruction is an instruction that instigates latency in a computer system, such as a DMA instruction. When a first thread encounters a BISL or a BISLED instruction, the first thread passes control to a second thread while the first thread's prolonged instruction executes. In turn, the computer system masks the latency of the first thread's prolonged instruction. The system can be optimized based on the memory latency by creating more threads and further dividing a register pool amongst the threads to further hide memory latency in operations that are highly memory bound.
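
The hand-off described in this abstract can be illustrated with cooperative threads. The sketch below uses Python generators, where `yield` is a stand-in for the point at which a compiler would place a BISL or BISLED instruction; the names `worker` and `run` and the logged messages are hypothetical, and no real DMA or instruction set is modeled.

```python
from collections import deque


def worker(name, blocks, log):
    """A cooperative thread that yields after issuing each long-latency request."""
    for block in blocks:
        log.append(f"{name}: start DMA for block {block}")
        yield  # stand-in for a BISL-style branch: pass control to another thread
        log.append(f"{name}: process block {block}")


def run(threads):
    """Round-robin between threads, resuming each where it last yielded."""
    ready = deque(threads)
    while ready:
        thread = ready.popleft()
        try:
            next(thread)
            ready.append(thread)
        except StopIteration:
            pass  # thread finished; drop it from the rotation


log = []
run([worker("T1", [0, 1], log), worker("T2", [0, 1], log)])
```

While T1 waits on its transfer, T2 issues its own request, so the two latencies overlap instead of adding up. Adding more workers corresponds to the abstract's suggestion of creating more threads for memory-bound work.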

    Dynamically partitioning processing across a plurality of heterogeneous processors
    3.
    Granted invention patent
    Status: Lapsed

    Publication No.: US08091078B2

    Publication date: 2012-01-03

    Application No.: US12116628

    Filing date: 2008-05-07

    IPC classification: G06F9/45

    Abstract: A program is compiled into at least two object files: one object file for each of the supported processor environments. During compilation, code characteristics, such as data locality, computational intensity, and data parallelism, are analyzed and recorded in the object file. During run time, the code characteristics are combined with runtime considerations, such as the current load on the processors and the size of the data being processed, to arrive at an overall value. The overall value is then used to determine which of the processors will be assigned the task. The values are assigned based on the characteristics of the various processors. For example, if one processor is better at handling intensive computations against large streams of data, programs that are highly computationally intensive and process large quantities of data are weighted in favor of that processor. The corresponding object is then loaded and executed on the assigned processor.
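
One way to picture the "overall value" in this abstract is a weighted score per processor. The sketch below is an assumption-laden illustration: the processor names, the weights in `PROFILES`, the normalization of traits and data size to the range 0 to 1, and the treatment of load as a simple penalty are all hypothetical and do not come from the patent.

```python
# Hypothetical per-processor weights for the compile-time traits
# (data locality, computational intensity, data parallelism),
# plus a weight for the normalized size of the data being processed.
PROFILES = {
    "general_purpose": {"weights": (0.5, 0.2, 0.1), "size_weight": 0.0},
    "streaming": {"weights": (0.3, 0.8, 0.9), "size_weight": 0.5},
}


def overall_value(traits, profile, load, data_size):
    """Combine recorded code traits with runtime load and data size."""
    static = sum(w * t for w, t in zip(profile["weights"], traits))
    return static + profile["size_weight"] * data_size - load


def choose_processor(traits, loads, data_size):
    """Return the processor name with the highest overall value."""
    scores = {
        name: overall_value(traits, profile, loads[name], data_size)
        for name, profile in PROFILES.items()
    }
    return max(scores, key=scores.get)
```

With both processors idle, a compute-heavy, highly parallel task over a large data set scores highest on the hypothetical `streaming` profile, while the same task falls back to `general_purpose` once the streaming processor's load penalty is large enough.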

    Hiding Memory Latency
    4.
    Invention patent application
    Status: Lapsed

    Publication No.: US20080162906A1

    Publication date: 2008-07-03

    Application No.: US12049293

    Filing date: 2008-03-15

    IPC classification: G06F9/30

    CPC classification: G06F9/322 G06F8/41 G06F9/3851

    Abstract: An approach to hiding memory latency in a multi-thread environment is presented. Branch Indirect and Set Link (BISL) and/or Branch Indirect and Set Link if External Data (BISLED) instructions are placed in thread code during compilation at instances that correspond to a prolonged instruction. A prolonged instruction is an instruction that instigates latency in a computer system, such as a DMA instruction. When a first thread encounters a BISL or a BISLED instruction, the first thread passes control to a second thread while the first thread's prolonged instruction executes. In turn, the computer system masks the latency of the first thread's prolonged instruction. The system can be optimized based on the memory latency by creating more threads and further dividing a register pool amongst the threads to further hide memory latency in operations that are highly memory bound.

    Dynamically partitioning processing across plurality of heterogeneous processors
    5.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07392511B2

    Publication date: 2008-06-24

    Application No.: US10670824

    Filing date: 2003-09-25

    IPC classification: G06F9/45

    Abstract: A program is compiled into at least two object files: one object file for each of the supported processor environments. During compilation, code characteristics, such as data locality, computational intensity, and data parallelism, are analyzed and recorded in the object file. During run time, the code characteristics are combined with runtime considerations, such as the current load on the processors and the size of the data being processed, to arrive at an overall value. The overall value is then used to determine which of the processors will be assigned the task. The values are assigned based on the characteristics of the various processors. For example, if one processor is better at handling intensive computations against large streams of data, programs that are highly computationally intensive and process large quantities of data are weighted in favor of that processor. The corresponding object is then loaded and executed on the assigned processor.

    Dynamically Partitioning Processing Across A Plurality of Heterogeneous Processors
    6.
    Invention patent application
    Status: Lapsed

    Publication No.: US20080250414A1

    Publication date: 2008-10-09

    Application No.: US12116628

    Filing date: 2008-05-07

    IPC classification: G06F9/44 G06F9/46

    Abstract: A program is compiled into at least two object files: one object file for each of the supported processor environments. During compilation, code characteristics, such as data locality, computational intensity, and data parallelism, are analyzed and recorded in the object file. During run time, the code characteristics are combined with runtime considerations, such as the current load on the processors and the size of the data being processed, to arrive at an overall value. The overall value is then used to determine which of the processors will be assigned the task. The values are assigned based on the characteristics of the various processors. For example, if one processor is better at handling intensive computations against large streams of data, programs that are highly computationally intensive and process large quantities of data are weighted in favor of that processor. The corresponding object is then loaded and executed on the assigned processor.

    Supplying cryptographic algorithm constants to a storage-constrained target
    9.
    Granted invention patent
    Status: Lapsed

    Publication No.: US08086865B2

    Publication date: 2011-12-27

    Application No.: US12116258

    Filing date: 2008-05-07

    IPC classification: H04L9/12

    CPC classification: H04L9/3242

    Abstract: The present invention provides for authenticating a message. A security function is performed upon the message. The message is sent to a target. The output of the security function is sent to the target. At least one publicly known constant is sent to the target. The received message is authenticated as a function of at least a shared key, the received publicly known constants, the security function, the received message, and the output of the security function. If the output of the security function received by the target is the same as the output generated as a function of at least the received message, the received publicly known constants, the security function, and the shared key, neither the message nor the constants have been altered.
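
The verification flow in this abstract can be sketched with a keyed hash whose public constants travel alongside the message. The abstract does not name a specific security function, so the example below substitutes an HMAC-style construction over SHA-256 as an assumption, with the two padding bytes playing the role of the publicly known constants; `mac`, `send`, and `verify` are hypothetical names.

```python
import hashlib
import hmac

BLOCK = 64  # SHA-256 block size in bytes


def mac(key, message, ipad, opad):
    """HMAC-style tag computed from the key, the message, and two public constants."""
    if len(key) > BLOCK:
        key = hashlib.sha256(key).digest()
    key = key.ljust(BLOCK, b"\x00")
    inner = hashlib.sha256(bytes(k ^ ipad for k in key) + message).digest()
    return hashlib.sha256(bytes(k ^ opad for k in key) + inner).digest()


def send(key, message, ipad=0x36, opad=0x5C):
    """Sender side: ship the message, the public constants, and the tag."""
    return message, (ipad, opad), mac(key, message, ipad, opad)


def verify(key, message, constants, tag):
    """Target side: recompute the tag from what was received and compare."""
    ipad, opad = constants
    return hmac.compare_digest(mac(key, message, ipad, opad), tag)
```

Because the received constants feed into the recomputed tag, a change to either the message or the constants makes the comparison fail, so the target does not need to store the constants itself. Only the shared key must already be present on the target.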

    Random number generator
    10.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07890561B2

    Publication date: 2011-02-15

    Application No.: US11204402

    Filing date: 2005-08-16

    IPC classification: G06F1/02 G06F7/58

    Abstract: A random number generator, a method, and a computer program product are provided for producing a random number seed. Each oscillator within an array of oscillators operates at a different frequency. The operating frequencies of the oscillators are not harmonically related, such that no integer multiple exists between the frequencies of any two oscillators. In one embodiment, the outputs of the array of oscillators connect to a multiple input latch. The multiple input latch also receives a sample signal, which is a clock signal. The clock signal samples the outputs of the array of oscillators, and the multiple input latch in conjunction with the random number determination logic ("RNDL") produces a digital output (0 or 1) for each oscillator within the array. The RNDL uses these digital outputs to create a random number seed.
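
The sampling structure in this abstract can be modeled in software, with the caveat that a deterministic simulation produces no real randomness; hardware of this kind depends on physical timing variation. In the sketch below the ideal square-wave model, the example frequencies, and the bit-packing used for the seed are assumptions, and `is_non_harmonic`, `sample_latch`, and `pack_bits` are hypothetical names.

```python
from itertools import combinations


def is_non_harmonic(freqs):
    """True if no frequency is an integer multiple of another (integer inputs)."""
    return all(max(a, b) % min(a, b) != 0 for a, b in combinations(freqs, 2))


def oscillator_bit(freq, t):
    """Ideal square wave: 1 during the first half of each period, else 0."""
    phase = (freq * t) % 1.0
    return 1 if phase < 0.5 else 0


def sample_latch(freqs, t):
    """Latch one bit per oscillator at sample time t."""
    return [oscillator_bit(f, t) for f in freqs]


def pack_bits(bits):
    """Assumed combining step: pack the latched bits into one integer seed."""
    seed = 0
    for bit in bits:
        seed = (seed << 1) | bit
    return seed


# Hypothetical oscillator array; prime frequencies are never integer multiples.
FREQS = [7, 11, 13, 17]
seed = pack_bits(sample_latch(FREQS, 0.25))
```

The `is_non_harmonic` check mirrors the abstract's requirement that the oscillators never fall into a fixed integer ratio, which would otherwise correlate the sampled bits.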
