USING A PIXEL OFFSET FOR EVALUATING A PLANE EQUATION
    21.
    Invention application
    USING A PIXEL OFFSET FOR EVALUATING A PLANE EQUATION (in force)

    Publication No.: US20110081100A1

    Publication date: 2011-04-07

    Application No.: US12898537

    Filing date: 2010-10-05

    IPC classification: G06K9/32

    CPC classification: G06T3/4007

    Abstract: One embodiment of the present invention sets forth a technique for controlling the pixel location at which the plane equation is evaluated. Multiple pixel offsets (dx, dy) may be specified, each of which defines a sub-pixel sample position. Attributes are then calculated for each sub-pixel sample position that is covered by a geometric primitive. One advantage of the technique is that anti-aliasing quality may be improved since high frequency color components may be selectively supersampled for particular geometric primitives.
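As a rough illustration of the idea in this abstract, the sketch below evaluates a plane equation at a pixel position shifted by a per-sample offset (dx, dy) and keeps only the covered samples. The plane form A*x + B*y + C, the names `Plane`, `evaluate`, and `shade_samples`, and the 2x2 offset pattern are all assumptions made for the example; the abstract does not specify them.

```python
from typing import NamedTuple


class Plane(NamedTuple):
    """Hypothetical plane-equation coefficients for one attribute."""
    a: float  # change of the attribute per unit x
    b: float  # change of the attribute per unit y
    c: float  # attribute value at the origin


def evaluate(plane, x, y, dx=0.0, dy=0.0):
    """Evaluate the plane equation at pixel (x, y) shifted by (dx, dy)."""
    return plane.a * (x + dx) + plane.b * (y + dy) + plane.c


def shade_samples(plane, x, y, offsets, covered):
    """Return one attribute value per covered sub-pixel sample position."""
    return [
        evaluate(plane, x, y, dx, dy)
        for (dx, dy), hit in zip(offsets, covered)
        if hit
    ]


plane = Plane(a=0.5, b=0.25, c=1.0)
# Assumed 2x2 sample pattern inside the pixel; real patterns may differ.
offsets = [(0.25, 0.25), (0.75, 0.25), (0.25, 0.75), (0.75, 0.75)]
covered = [True, False, True, True]  # assumed coverage of the primitive
samples = shade_samples(plane, 4, 8, offsets, covered)
```

Passing a single offset reproduces ordinary per-pixel evaluation, so supersampling becomes a matter of how many offsets are supplied for a given primitive.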

    Dynamic load balancing of instructions for execution by heterogeneous processing engines
    22.
    Granted invention patent
    Dynamic load balancing of instructions for execution by heterogeneous processing engines (in force)

    Publication No.: US08578387B1

    Publication date: 2013-11-05

    Application No.: US11831873

    Filing date: 2007-07-31

    IPC classification: G06F9/46

    Abstract: An embodiment of a computing system is configured to process data using a multithreaded SIMD architecture that includes heterogeneous processing engines to execute a program. The program is constructed of various program instructions. A first type of the program instructions can only be executed by a first type of processing engine and a third type of program instructions can only be executed by a second type of processing engine. A second type of program instructions can be executed by the first and the second type of processing engines. An assignment unit may be configured to dynamically determine which of the two processing engines executes any program instructions of the second type in order to balance the workload between the heterogeneous processing engines.
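The assignment rule described in this abstract can be sketched as follows. The instruction types 1, 2, and 3 follow the abstract; using queue length as the workload measure, the tie-break toward the first engine, and the name `assign_instructions` are assumptions for illustration only.

```python
def assign_instructions(instruction_types):
    """Assign each instruction (given by its type 1, 2, or 3) to an engine.

    Type 1 runs only on the first engine, type 3 only on the second, and
    type 2 goes to whichever engine currently has the shorter queue
    (an assumed workload metric; ties go to the first engine).
    """
    queues = {"first": [], "second": []}
    for index, kind in enumerate(instruction_types):
        if kind == 1:
            target = "first"
        elif kind == 3:
            target = "second"
        elif kind == 2:
            target = min(queues, key=lambda name: len(queues[name]))
        else:
            raise ValueError(f"unknown instruction type: {kind}")
        queues[target].append(index)
    return queues


queues = assign_instructions([1, 1, 2, 3, 2, 2])
```

In this run the two engines end up with three instructions each, because the flexible type-2 instructions fill whichever queue is behind.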

    Execution of parallel groups of threads with per-instruction serialization
    23.
    Granted invention patent
    Execution of parallel groups of threads with per-instruction serialization (in force)

    Publication No.: US07634637B1

    Publication date: 2009-12-15

    Application No.: US11305803

    Filing date: 2005-12-16

    IPC classification: G06F15/76 G06F9/00

    Abstract: In a processor, a SIMD group (a group of threads for which instructions are issued in parallel using single instruction, multiple data instruction issue techniques) is logically divided into two or more “SIMD subsets,” each containing one or more of the threads in the SIMD group. Each SIMD subset is associated with a different instance of a variable state parameter. The processor determines which of the instructions to be executed for the SIMD group rely on the state variable and serializes execution of such instructions so that the instruction is executed separately for each SIMD subset. Instructions that do not rely on the state variable are advantageously not serialized.
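A minimal sketch of the issue policy in this abstract: instructions that depend on the per-subset state are issued once per SIMD subset, while all others are issued once for the whole group. The instruction names, the `uses_state` flag, and the function `issue_schedule` are hypothetical and only serve the example.

```python
def issue_schedule(instructions, subsets):
    """Build an issue order for a SIMD group split into subsets.

    instructions: list of (name, uses_state) pairs.
    subsets: list of lists of thread ids; each subset has its own state.
    Returns a list of (name, subset_id or None, thread ids) issue slots.
    """
    all_threads = sorted(t for subset in subsets for t in subset)
    schedule = []
    for name, uses_state in instructions:
        if uses_state:
            # Serialized: one issue slot per subset, each with its own state.
            for subset_id, threads in enumerate(subsets):
                schedule.append((name, subset_id, list(threads)))
        else:
            # Not serialized: a single issue slot covers the whole group.
            schedule.append((name, None, all_threads))
    return schedule


schedule = issue_schedule(
    [("add", False), ("tex", True)],  # hypothetical instruction names
    [[0, 1], [2, 3]],
)
```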

    Shared FP and SIMD 3D multiplier
    25.
    Granted invention patent
    Shared FP and SIMD 3D multiplier (in force)

    Publication No.: US06490607B1

    Publication date: 2002-12-03

    Application No.: US09416401

    Filing date: 1999-10-12

    Applicant: Stuart F. Oberman

    Inventor: Stuart F. Oberman

    IPC classification: G06F7/52

    Abstract: A multiplier configured to perform multiplication of both scalar floating point values (X×Y) and packed floating point values (i.e., X1×Y1 and X2×Y2). In addition, the multiplier may be configured to calculate X×Y−Z. The multiplier comprises selection logic for selecting source operands, a partial product generator, an adder tree, and two or more adders configured to sum the results from the adder tree to achieve a final result. The multiplier may also be configured to perform iterative multiplication operations to implement arithmetical operations such as division and square root. The multiplier may be configured to generate two versions of the final result, one assuming there is an overflow, and another assuming there is not an overflow. A computer system and method for performing multiplication are also disclosed.
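To show how division can be built from iterative multiplication and the X×Y−Z operation mentioned in this abstract, here is a sketch using Newton-Raphson reciprocal refinement. The abstract does not name the iteration it uses, so Newton-Raphson, the linear initial estimate, the iteration count, and the names `fmsub`, `reciprocal`, and `divide` are all assumptions.

```python
import math


def fmsub(x, y, z):
    """Software stand-in for the X*Y-Z operation named in the abstract."""
    return x * y - z


def reciprocal(d, iterations=4):
    """Approximate 1/d using only multiplies and X*Y-Z steps."""
    if d == 0:
        raise ZeroDivisionError("reciprocal of zero")
    m, e = math.frexp(abs(d))  # abs(d) == m * 2**e with 0.5 <= m < 1
    # Assumed linear first guess for 1/m (constants precomputed offline).
    x = 48 / 17 - (32 / 17) * m
    for _ in range(iterations):
        err = fmsub(m, x, 1.0)  # m*x - 1, the current relative error
        x = x - x * err         # equivalent to x * (2 - m*x)
    return math.copysign(math.ldexp(x, -e), d)


def divide(a, b):
    """Divide by multiplying with the refined reciprocal."""
    return a * reciprocal(b)
```

Each pass roughly squares the relative error, which is why a handful of multiply steps is enough for double precision in this sketch.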

    Floating point addition pipeline including extreme value, comparison and accumulate functions
    26.
    Granted invention patent
    Floating point addition pipeline including extreme value, comparison and accumulate functions (lapsed)

    Publication No.: US06298367B1

    Publication date: 2001-10-02

    Application No.: US09055916

    Filing date: 1998-04-06

    IPC classification: G06F7/38

    Abstract: A multimedia execution unit configured to perform vectored floating point and integer instructions. The execution unit may include an add/subtract pipeline having far and close data paths. The far path is configured to handle effective addition operations and effective subtraction operations for operands having an absolute exponent difference greater than one. The close path is configured to handle effective subtraction operations for operands having an absolute exponent difference less than or equal to one. The close path is configured to generate two output values, wherein one output value is the first input operand plus an inverted version of the second input operand, while the second output value is equal to the first output value plus one. Selection of the first or second output value in the close path effectuates the round-to-nearest operation for the output of the adder. The execution unit may be configured to perform vectored addition and subtraction, integer/floating point conversion, reverse subtraction, accumulate, extreme value (minimum/maximum), and comparison instructions.
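The two close-path output values described in this abstract can be illustrated on plain fixed-width integers. In two's complement, A plus the bitwise inverse of B equals A − B − 1, and adding one gives A − B. The 8-bit width and the name `close_path_candidates` are assumptions, and the selection logic that performs the rounding is not modeled here.

```python
def close_path_candidates(a, b, width=8):
    """Return the two candidate results of a close-path subtraction.

    first  = a + NOT(b)  = a - b - 1  (mod 2**width)
    second = first + 1   = a - b      (mod 2**width)
    """
    mask = (1 << width) - 1
    first = (a + (~b & mask)) & mask
    second = (first + 1) & mask
    return first, second


first, second = close_path_candidates(200, 100)
```

Computing both candidates in parallel lets the final choice be a simple selection rather than a second carry-propagating addition.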

    Bipartite look-up table with output values having minimized absolute error
    27.
    Granted invention patent
    Bipartite look-up table with output values having minimized absolute error (lapsed)

    Publication No.: US06223192B1

    Publication date: 2001-04-24

    Application No.: US09098482

    Filing date: 1998-06-16

    IPC classification: G06F1/02

    Abstract: A method for generating entries for a bipartite look-up table having base and difference table portions. In one embodiment, these entries are usable to form output values for a mathematical function, f(x), in response to receiving corresponding input values within a predetermined input range. The method first comprises partitioning the input range into I intervals, J subintervals/interval, and K sub-subintervals/subinterval. For a given interval M, the method includes generating K difference table entries and J base table entries. Each of the K difference table entries corresponds to a particular group of sub-subintervals within interval M, each of which has the same relative position within their respective subintervals. Each difference table entry is computed by averaging difference values for the sub-subintervals included in a corresponding group N. Each difference value which makes up this average is equal to f(X1)−f(X2), where X1 is the midpoint of the sub-subinterval within group N, and X2 is the midpoint of a predetermined reference sub-subinterval within the same subinterval as X1. Each of these midpoints is calculated such that maximum absolute error is minimized for all possible input values in the sub-subinterval. Each of the J base table entries, on the other hand, corresponds to a subinterval within interval M. Each entry is equal to f(X2)+adjust, where X2 is the midpoint of the reference sub-subinterval of the subinterval corresponding to the base table entry. The adjust value is calculated so that error introduced by the averaging of the difference table entries is evenly distributed over the entire subinterval.
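The table construction in this abstract can be sketched in simplified form. This version keeps the I/J/K partitioning, the averaged difference entries f(X1)−f(X2), and the base-plus-difference lookup, but it deliberately simplifies two points: it uses plain arithmetic midpoints instead of error-minimizing ones, and it sets the adjust term to zero. The function names and the choice of the first sub-subinterval as the reference are assumptions.

```python
def build_tables(f, lo, hi, I, J, K, ref=0):
    """Build simplified base and difference tables for f on [lo, hi)."""
    width = (hi - lo) / (I * J * K)  # width of one sub-subinterval

    def mid(i, j, k):
        # Arithmetic midpoint of sub-subinterval k of subinterval j of interval i.
        return lo + ((i * J + j) * K + k + 0.5) * width

    # One base entry per (interval, subinterval): f(X2), adjust omitted.
    base = [[f(mid(i, j, ref)) for j in range(J)] for i in range(I)]
    # One difference entry per (interval, position k): average of
    # f(X1) - f(X2) over all subintervals j of that interval.
    diff = [
        [sum(f(mid(i, j, k)) - f(mid(i, j, ref)) for j in range(J)) / J
         for k in range(K)]
        for i in range(I)
    ]
    return base, diff


def lookup(x, base, diff, lo, hi):
    """Approximate f(x) as base entry plus difference entry."""
    I, J, K = len(base), len(base[0]), len(diff[0])
    n = min(int((x - lo) / (hi - lo) * I * J * K), I * J * K - 1)
    i, rest = divmod(n, J * K)
    j, k = divmod(rest, K)
    return base[i][j] + diff[i][k]


base, diff = build_tables(lambda x: 1.0 / x, 1.0, 2.0, 8, 8, 8)
```

With I = J = K = 8 the two tables hold 128 entries in total, while a single direct table at the same input resolution would need 512.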

    Method and apparatus for simultaneously performing arithmetic on two or more pairs of operands
    29.
    Granted invention patent
    Method and apparatus for simultaneously performing arithmetic on two or more pairs of operands (lapsed)

    Publication No.: US6026483A

    Publication date: 2000-02-15

    Application No.: US014455

    Filing date: 1998-01-28

    Abstract: A multiplier capable of performing both signed and unsigned scalar and vector multiplication is disclosed. The multiplier is configured for use in a microprocessor and comprises a partial product generator, a selection logic unit, and an adder. The multiplier is configured to receive signed or unsigned multiplier and multiplicand operands in scalar or packed vector form. The multiplier is also configured to receive a first control signal indicative of whether signed or unsigned multiplication is to be performed and a second control signal indicative of whether vector multiplication is to be performed. The multiplier is configured to calculate an effective sign for the multiplier and multiplicand operands based upon each operand's most significant bit and the control signal. The effective signs may then be used by the partial product generation unit and the selection logic to create and select a number of partial products according to Booth's algorithm. Once the partial products have been created and selected, the adder is configured to sum them and output the results, which may be signed or unsigned. When a vector multiplication is performed, the multiplier is configured to generate and select partial products so as to effectively isolate the multiplication process for each pair of vector components.
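The combination of an effective sign with Booth partial products can be sketched as below. This uses radix-2 Booth recoding for simplicity; the abstract does not say which radix is used, and the vector mode is not modeled. The 8-bit width and the names `booth_partial_products` and `booth_multiply` are assumptions.

```python
def booth_partial_products(multiplicand, multiplier, width=8, signed=True):
    """Generate radix-2 Booth partial products for two width-bit operands.

    The effective sign of each operand is its most significant bit when
    `signed` is true and 0 otherwise, mirroring the control signal in the
    abstract.
    """
    mask = (1 << width) - 1

    def value(op):
        op &= mask
        sign = (op >> (width - 1)) & 1 if signed else 0
        return op - (sign << width)  # apply the effective sign

    m = value(multiplicand)
    r = value(multiplier)
    products = []
    prev = 0
    # width + 1 digits: the extra digit consumes the effective sign bit.
    for i in range(width + 1):
        bit = (r >> i) & 1   # Python's >> sign-extends negative integers
        digit = prev - bit   # Booth digit in {-1, 0, +1}
        products.append((digit * m) << i)
        prev = bit
    return products


def booth_multiply(multiplicand, multiplier, width=8, signed=True):
    """Sum the partial products, as the adder stage would."""
    return sum(booth_partial_products(multiplicand, multiplier, width, signed))
```

The same bit patterns give different products depending on the control flag: 0xFD times 0x05 is -15 when signed and 1265 when unsigned.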

    Multipurpose arithmetic functional unit
    30.
    Granted invention patent
    Multipurpose arithmetic functional unit (in force)

    Publication No.: US08190669B1

    Publication date: 2012-05-29

    Application No.: US10970253

    Filing date: 2004-10-20

    IPC classification: G06F7/44

    Abstract: Multipurpose arithmetic functional units can perform planar attribute interpolation and unary function approximation operations. In one embodiment, planar interpolation operations for coordinates (x, y) are executed by computing A*x+B*y+C, and unary function approximation operations for operand x are executed by computing F2(xb)*xh^2+F1(xb)*xh+F0(xb), where xh=x−xb. Shared multiplier and adder circuits are advantageously used to implement the product and sum operations for both classes of operations.
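Both operations in this abstract have the shape "two products plus a constant", which is what makes sharing the multipliers and adder possible. The sketch below routes both through one helper. The helper name `shared_datapath`, the choice of 1/x as the unary function, the Taylor coefficients standing in for the F0, F1, F2 tables, and the 4-bit baseline xb are assumptions for illustration.

```python
def shared_datapath(m1, u1, m2, u2, c):
    """Model of the shared circuit: two products and a three-input sum."""
    return m1 * u1 + m2 * u2 + c


def interpolate(A, B, C, x, y):
    """Planar interpolation A*x + B*y + C."""
    return shared_datapath(A, x, B, y, C)


def approx_reciprocal(x, bits=4):
    """Quadratic approximation F2(xb)*xh^2 + F1(xb)*xh + F0(xb) of 1/x.

    xb keeps the top `bits` fractional bits of x and xh = x - xb. The
    coefficients are Taylor terms of 1/x at xb, standing in for the
    lookup tables a hardware unit would hold.
    """
    xb = int(x * (1 << bits)) / (1 << bits)
    xh = x - xb
    f0, f1, f2 = 1 / xb, -1 / xb**2, 1 / xb**3
    return shared_datapath(f2, xh * xh, f1, xh, f0)
```

For example, `interpolate(2.0, 3.0, 1.0, 4.0, 5.0)` and `approx_reciprocal(1.3)` exercise the same helper with different operand routing.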
