Method and Apparatus for Executing Instructions
    21.
    Patent application
    Method and Apparatus for Executing Instructions (status: in force)

    Publication number: US20090113181A1

    Publication date: 2009-04-30

    Application number: US11877754

    Filing date: 2007-10-24

    CPC classification number: G06F9/3885 G06F9/3851

    Abstract: A method and apparatus for executing instructions in a processor are provided. In one embodiment of the invention, the method includes receiving a plurality of instructions. The plurality of instructions includes first instructions in a first thread and second instructions in a second thread. The method further includes forming a common issue group including an instruction of a first instruction type and an instruction of a second instruction type. The method also includes issuing the common issue group to a first execution unit and a second execution unit. The instruction of the first instruction type is issued to the first execution unit and the instruction of the second instruction type is issued to the second execution unit.
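
The grouping step described in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `Instr` record, the `"int"` and `"fp"` type labels, and the rule that each thread contributes at most one instruction per group are all assumptions made for illustration.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    thread: int  # hypothetical thread id
    kind: str    # hypothetical instruction type label, e.g. "int" or "fp"
    name: str


def form_issue_group(threads, kinds=("int", "fp")):
    """Build one issue group holding at most one instruction per type.

    Each thread is an in-order queue. For every instruction type, take the
    head of the first thread whose head has that type. A thread contributes
    at most one instruction per group (an assumption of this sketch).
    """
    group = {}
    used = set()
    for kind in kinds:
        for index, queue in enumerate(threads):
            if index in used or not queue:
                continue
            if queue[0].kind == kind:
                group[kind] = queue.popleft()
                used.add(index)
                break
    return group


def issue(group, units):
    """Send each instruction in the group to the unit that handles its type."""
    for kind, instr in group.items():
        units[kind].append(instr)
```

With two thread queues, one headed by an `"int"` instruction and the other by an `"fp"` instruction, `form_issue_group` returns a group containing both, and `issue` appends each to the matching list in `units`.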

    DESIGN STRUCTURE FOR SCALAR PRECISION FLOAT IMPLEMENTATION ON THE "W" LANE OF VECTOR UNIT
    22.
    Patent application
    DESIGN STRUCTURE FOR SCALAR PRECISION FLOAT IMPLEMENTATION ON THE "W" LANE OF VECTOR UNIT (status: under examination, published)

    Publication number: US20090106525A1

    Publication date: 2009-04-23

    Application number: US12048324

    Filing date: 2008-03-14

    CPC classification number: G06F9/30036 G06F9/30014 G06F15/8076

    Abstract: A design structure embodied in a machine readable storage medium for designing, manufacturing, and/or testing a design for image processing, and more specifically to vector units for supporting image processing is provided. A combined vector/scalar unit is provided wherein one or more processing lanes of the vector unit are used for performing scalar operations. An integrated register file is also provided for storing vector and scalar data. Therefore, the transfer of data to memory to exchange data between independent vector and scalar units is obviated and a significant amount of chip area is saved.

    Method and Apparatus Implementing a Floating Point Weighted Average Function
    23.
    Patent application
    Method and Apparatus Implementing a Floating Point Weighted Average Function (status: in force)

    Publication number: US20090083357A1

    Publication date: 2009-03-26

    Application number: US11861518

    Filing date: 2007-09-26

    CPC classification number: G06F7/483

    Abstract: A method, computer-readable medium, and an apparatus for implementing a floating point weighted average function. The method includes receiving an input containing 2^N input values, 2^N weights, and an opcode, where N is a positive integer and each of the input values corresponds to one of the weights. The method also includes using an existing dot product circuit function to generate 2^N addends by multiplying each of the input values by the corresponding weight. In addition, the method includes generating a sum value by adding the 2^N addends, where the sum value includes an exponent value, and generating the weighted average value from the sum value by decreasing the exponent value by N, which divides the sum by 2^N. In this fashion, the same circuit area may be used to carry out both dot product and weighted average calculations, saving circuit area and improving performance.
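
The exponent trick in this abstract can be checked in software. The sketch below assumes the number of inputs is the power of two 2^N, since decreasing a binary exponent by N divides a value by 2^N. The function name `weighted_average_pow2` is hypothetical, and the code models only the arithmetic, not the circuit.

```python
import math


def weighted_average_pow2(values, weights, n):
    """Dot product of values and weights, divided by 2**n via the exponent.

    Assumes len(values) == len(weights) == 2**n (an assumption of this
    sketch). The division is done by lowering the binary exponent of the
    sum by n rather than by a divide operation.
    """
    count = 1 << n
    if len(values) != count or len(weights) != count:
        raise ValueError("expected 2**n values and 2**n weights")
    # Dot product step: one addend per (value, weight) pair.
    addends = [v * w for v, w in zip(values, weights)]
    total = sum(addends)
    # Split the sum into mantissa and exponent, then decrease the exponent by n.
    mantissa, exponent = math.frexp(total)
    return math.ldexp(mantissa, exponent - n)
```

Note that the result is the dot product divided by the number of inputs, not by the sum of the weights, which is what the abstract's exponent adjustment implies.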

    Store Misaligned Vector with Permute
    24.
    Patent application
    Store Misaligned Vector with Permute (status: lapsed)

    Publication number: US20090015589A1

    Publication date: 2009-01-15

    Application number: US11775999

    Filing date: 2007-07-11

    CPC classification number: G06T15/06

    Abstract: Embodiments of the invention provide logic within the store data path between a processor and a memory array. The logic may be configured to misalign vector data as it is stored to memory. By misaligning vector data as it is stored to memory, memory bandwidth may be maximized while processing bandwidth required to store vector data misaligned is minimized. Furthermore, embodiments of the invention provide logic within the load data path which allows vector data which is stored misaligned to be aligned as it is loaded into a vector register. By aligning misaligned vector data as it is loaded into a vector register, memory bandwidth may be maximized while processing bandwidth required to align misaligned vector data may be minimized.

    Area Optimized Full Vector Width Vector Cross Product
    25.
    Patent application
    Area Optimized Full Vector Width Vector Cross Product (status: under examination, published)

    Publication number: US20080082784A1

    Publication date: 2008-04-03

    Application number: US11925064

    Filing date: 2007-10-26

    CPC classification number: G06T15/06 G06T2200/28

    Abstract: The present invention is generally related to integrated circuit devices, and more particularly, to methods, systems and design structures for the field of image processing, and more specifically to vector units for supporting image processing. A dual vector unit implementation is described wherein two vector units are configured to receive data from a common register file. The vector units may independently and simultaneously process instructions. Furthermore, the vector units may be adapted to perform scalar operations, thereby integrating the vector and scalar processing. The vector units may also be configured to share resources to perform an operation, for example, a cross product operation.

    Dual Independent and Shared Resource Vector Execution Units with Shared Register File
    26.
    Patent application
    Dual Independent and Shared Resource Vector Execution Units with Shared Register File (status: in force)

    Publication number: US20080082783A1

    Publication date: 2008-04-03

    Application number: US11924980

    Filing date: 2007-10-26

    CPC classification number: G06T1/20 G06F15/8092 G06T15/005

    Abstract: The present invention is generally related to integrated circuit devices, and more particularly, to methods, systems and design structures for the field of image processing, and more specifically to vector units for supporting image processing. A dual vector unit implementation is described wherein two vector units are configured to receive data from a common register file. The vector units may independently and simultaneously process instructions. Furthermore, the vector units may be adapted to perform scalar operations, thereby integrating the vector and scalar processing. The vector units may also be configured to share resources to perform an operation, for example, a cross product operation.

    Implied storage operation decode using redundant target address detection
    27.
    Granted patent
    Implied storage operation decode using redundant target address detection (status: in force)

    Publication number: US08255674B2

    Publication date: 2012-08-28

    Application number: US12360975

    Filing date: 2009-01-28

    Abstract: A logic arrangement and method to support implied storage operation decode uses redundant target address detection, whereby target addresses of previous instructions are compared with the target address of the current instruction, and if equal, and the target addresses of previous instructions are not used as sources, the current instruction is decoded as a store instruction. This allows a redundant operation in an instruction set architecture to be redefined as a store instruction, freeing up opcodes normally used for store instructions to be used for other instructions.
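
The decode rule in this abstract can be sketched as follows. This is one reading of the abstract, not the patented logic: the `Instr` record, the window of previous instructions passed in as a list, and the choice to check only the most recent matching write are assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Instr:
    target: int     # hypothetical target register address
    sources: tuple  # hypothetical source register addresses


def is_implied_store(previous, current):
    """Return True if `current` would be decoded as an implied store.

    Finds the most recent earlier instruction that wrote the same target.
    If no instruction after it (including `current`) reads that target,
    the write is redundant and `current` is treated as a store.
    """
    for i in range(len(previous) - 1, -1, -1):
        if previous[i].target == current.target:
            readers = list(previous[i + 1:]) + [current]
            return all(current.target not in r.sources for r in readers)
    return False
```

A second write to register 3 with no read of register 3 in between yields `True`. If an intervening instruction uses register 3 as a source, the function returns `False` and the instruction decodes normally.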

    Anisotropic texture filtering with texture data prefetching
    28.
    Granted patent
    Anisotropic texture filtering with texture data prefetching (status: in force)

    Publication number: US08217953B2

    Publication date: 2012-07-10

    Application number: US12110045

    Filing date: 2008-04-25

    CPC classification number: G06T15/04 G06T2200/12

    Abstract: A circuit arrangement and method utilize texture data prefetching to prefetch texture data used by an anisotropic filtering algorithm. In particular, stride-based prefetching may be used to prefetch texture data for use in anisotropic filtering, where the value of the stride, or difference between successive accesses, is based upon a distance in a memory address space between sample points taken along the line of anisotropy used in an anisotropic filtering algorithm.
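
The stride computation described in this abstract can be illustrated with a small sketch. It assumes a row-major texture layout, a fixed texel size, and an integer step `(dx, dy)` between sample points along the line of anisotropy. Real hardware commonly uses tiled layouts and fractional steps, so the function names and parameters here are illustrative only.

```python
def anisotropic_stride(dx, dy, width, texel_bytes=4):
    """Byte distance between successive samples along the line of anisotropy.

    Assumes a row-major texture that is `width` texels wide.
    """
    return (dy * width + dx) * texel_bytes


def prefetch_addresses(base, x, y, dx, dy, width, samples, texel_bytes=4):
    """Addresses a stride-based prefetcher would request for `samples` points."""
    start = base + (y * width + x) * texel_bytes
    stride = anisotropic_stride(dx, dy, width, texel_bytes)
    return [start + k * stride for k in range(samples)]
```

For a 256-texel-wide texture with 4-byte texels and a step of `(2, 1)`, the stride is 1032 bytes, so once the first sample address is known the remaining addresses follow by repeated addition.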

    Scalar precision float implementation on the “W” lane of vector unit
    29.
    Granted patent
    Scalar precision float implementation on the “W” lane of vector unit (status: lapsed)

    Publication number: US08169439B2

    Publication date: 2012-05-01

    Application number: US11877205

    Filing date: 2007-10-23

    Abstract: Embodiments of the invention are generally related to image processing, and more specifically to vector units for supporting image processing. A combined vector/scalar unit is provided wherein one or more processing lanes of the vector unit are used for performing scalar operations. An integrated register file is also provided for storing vector and scalar data. Therefore, the transfer of data to memory to exchange data between independent vector and scalar units is obviated and a significant amount of chip area is saved.

    Store misaligned vector with permute
    30.
    Granted patent
    Store misaligned vector with permute (status: lapsed)

    Publication number: US08161271B2

    Publication date: 2012-04-17

    Application number: US11775999

    Filing date: 2007-07-11

    CPC classification number: G06T15/06

    Abstract: Embodiments of the invention provide logic within the store data path between a processor and a memory array. The logic may be configured to misalign vector data as it is stored to memory. By misaligning vector data as it is stored to memory, memory bandwidth may be maximized while processing bandwidth required to store vector data misaligned is minimized. Furthermore, embodiments of the invention provide logic within the load data path which allows vector data which is stored misaligned to be aligned as it is loaded into a vector register. By aligning misaligned vector data as it is loaded into a vector register, memory bandwidth may be maximized while processing bandwidth required to align misaligned vector data may be minimized.
