VECTOR PROCESSING SYSTEM
    41.
    Patent application
    Status: under examination (published)

    Publication No.: US20090100252A1

    Publication date: 2009-04-16

    Application No.: US12273236

    Filing date: 2008-11-18

    IPC classification: G06F9/30

    Abstract: A vector processing system for executing vector instructions, each instruction defining multiple pairs of values, an operation to be executed on each of said value pairs and a scalar modifier, the vector processing system comprising a plurality of parallel processing units, each arranged to receive one of said pairs of values and to implement the defined operation on said value pair to generate a respective result; and a scalar result unit for receiving the results of the parallel processing units and for using said results in a manner defined by the scalar modifier to generate a single output value for said instruction.
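
    The data flow this abstract describes can be sketched in software as follows. This is a minimal illustration, not the patented design: the function name `execute_vector_instruction` and the modifier names "sum", "max" and "min" are assumptions chosen for the example, since the abstract does not list the actual modifiers.

```python
from operator import mul, sub

# Hypothetical scalar modifiers: each collapses the per-lane results
# into one value. The abstract does not enumerate the real set.
SCALAR_MODIFIERS = {
    "sum": sum,
    "max": max,
    "min": min,
}

def execute_vector_instruction(op, pairs, modifier):
    """Apply `op` to every value pair, then reduce with the named modifier."""
    # Each list element stands in for one parallel processing unit.
    lane_results = [op(a, b) for a, b in pairs]
    # The scalar result unit turns the lane results into a single output.
    return SCALAR_MODIFIERS[modifier](lane_results)

pairs = [(1, 5), (2, 6), (3, 7), (4, 8)]
dot_product = execute_vector_instruction(mul, pairs, "sum")
largest_difference = execute_vector_instruction(sub, pairs, "max")
```

    With a multiply operation and a "sum" modifier, a single instruction yields a dot product, which is one plausible use of such a scalar result unit.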

    Histogram generation with vector operations in SIMD and VLIW processor by consolidating LUTs storing parallel update incremented count values for vector data elements
    42.
    Granted patent
    Status: lapsed

    Publication No.: US07506135B1

    Publication date: 2009-03-17

    Application No.: US10441352

    Filing date: 2003-05-20

    Applicant: Tibet Mimar

    Inventor: Tibet Mimar

    IPC classification: G06F15/80

    Abstract: The present invention provides histogram calculation for image and video applications using a SIMD and VLIW processor with vector Look-Up Table (LUT) operations. This speeds up histogram calculation by a factor of N over a scalar processor, where the SIMD processor can perform N LUT operations per instruction. The histogram operation is partitioned into a vector LUT operation, followed by a vector increment, a vector LUT update, and finally a reduction of the vector histogram components. The present invention could be used for intensity, RGBA, YUV, and other types of multi-component images.
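
    The four steps named in this abstract (vector LUT read, vector increment, vector LUT update, final reduction) can be sketched as below. This is a software model under stated assumptions: giving each lane its own private table is one reading of the abstract that avoids update collisions between lanes, and the function name `vector_histogram` and the default of 4 lanes are hypothetical.

```python
def vector_histogram(pixels, num_bins=256, lanes=4):
    """Histogram built from per-lane tables, then reduced to one table."""
    # Assumption: one private LUT per SIMD lane, so lanes never collide.
    luts = [[0] * num_bins for _ in range(lanes)]
    for start in range(0, len(pixels), lanes):
        chunk = pixels[start:start + lanes]
        # Vector LUT read: each lane fetches the count for its pixel value.
        counts = [luts[lane][value] for lane, value in enumerate(chunk)]
        # Vector increment.
        counts = [count + 1 for count in counts]
        # Vector LUT update: write the incremented counts back.
        for lane, (value, count) in enumerate(zip(chunk, counts)):
            luts[lane][value] = count
    # Reduction: add the per-lane histograms bin by bin.
    return [sum(lut[b] for lut in luts) for b in range(num_bins)]

histogram = vector_histogram([0, 1, 1, 3, 1, 0, 2, 3, 3], num_bins=4)
```

    In this model the loop body handles `lanes` pixels per iteration, which mirrors the factor-of-N speedup the abstract claims for N LUT operations per instruction.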

    Vector register file with arbitrary vector addressing
    43.
    Granted patent
    Status: lapsed

    Publication No.: US07467288B2

    Publication date: 2008-12-16

    Application No.: US10713502

    Filing date: 2003-11-15

    IPC classification: G06F15/76 G06F15/82 G06F15/80

    Abstract: A system and method for processing operations that use data vectors each comprising a plurality of data elements, in accordance with the present invention, includes a vector data file comprising a plurality of storage elements for storing data elements of the data vectors. A pointer array is coupled by a bus to the vector data file. The pointer array includes a plurality of entries wherein each entry identifies at least one storage element in the vector data file. The at least one storage element stores at least one data element of the data vectors, wherein for at least one particular entry in the pointer array, the at least one storage element identified by the particular entry has an arbitrary starting address in the vector data file.
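
    A minimal software model of the pointer-array idea might look like the following. The class name `VectorRegisterFile`, its methods, and the wraparound at the end of the data file are all assumptions made for illustration; the abstract only states that a pointer entry may identify an arbitrary starting address.

```python
class VectorRegisterFile:
    """Flat vector data file addressed through a small pointer array."""

    def __init__(self, num_elements, num_pointers):
        self.data = [0] * num_elements      # vector data file (storage elements)
        self.pointers = [0] * num_pointers  # pointer array (one start address each)

    def set_pointer(self, entry, start):
        # Any element index is a valid start: vectors need not be aligned.
        self.pointers[entry] = start % len(self.data)

    def read_vector(self, entry, length):
        start = self.pointers[entry]
        size = len(self.data)
        # Assumption: reads wrap around at the end of the data file.
        return [self.data[(start + i) % size] for i in range(length)]

vrf = VectorRegisterFile(num_elements=8, num_pointers=2)
vrf.data = [10, 11, 12, 13, 14, 15, 16, 17]
vrf.set_pointer(0, 3)
vrf.set_pointer(1, 6)
unaligned = vrf.read_vector(0, 4)
wrapped = vrf.read_vector(1, 4)
```

    Because two pointer entries can reference overlapping ranges, the same stored elements can appear in several logical vectors without being copied.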

    Dual Independent and Shared Resource Vector Execution Units with Shared Register File
    44.
    Patent application
    Status: in force

    Publication No.: US20080082783A1

    Publication date: 2008-04-03

    Application No.: US11924980

    Filing date: 2007-10-26

    IPC classification: G06F9/02 G06F15/76

    Abstract: The present invention is generally related to integrated circuit devices, and more particularly, to methods, systems and design structures for the field of image processing, and more specifically to vector units for supporting image processing. A dual vector unit implementation is described wherein two vector units are configured to receive data from a common register file. The vector units may independently and simultaneously process instructions. Furthermore, the vector units may be adapted to perform scalar operations, thereby integrating the vector and scalar processing. The vector units may also be configured to share resources to perform an operation, for example, a cross product operation.

    Programmable digital signal processor having a clustered SIMD microarchitecture including a complex short multiplier and an independent vector load unit
    45.
    Patent application
    Status: under examination (published)

    Publication No.: US20070198815A1

    Publication date: 2007-08-23

    Application No.: US11201841

    Filing date: 2005-08-11

    IPC classification: G06F9/44

    Abstract: A programmable digital signal processor with a clustered SIMD microarchitecture includes a plurality of accelerator units, a processor core, and a complex computing unit. Each of the accelerator units may perform one or more dedicated functions. The processor core includes an integer execution unit that may execute integer instructions. The complex computing unit may include a complex arithmetic logic unit execution pipeline that may include one or more datapaths configured to execute complex vector instructions, and a vector load unit. In addition, each datapath may include a complex short multiplier accumulator unit that may be configured to multiply a complex data value by values in the set of numbers including {0, +/−1}+{0, +/−i}. The vector load unit may cause the complex vector instructions to be fetched each clock cycle for use by any datapath in the complex arithmetic logic unit execution pipeline.
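
    Multiplying by a value from {0, +/−1}+{0, +/−i} needs no true multiplier, because every such product reduces to selecting, swapping, negating and adding the real and imaginary parts. The sketch below illustrates that arithmetic only; the function name `short_multiply` and the tuple representation are assumptions, and nothing here reflects the actual hardware datapath.

```python
def short_multiply(z, a, b):
    """Multiply z = (re, im) by a + b*i, where a and b are each -1, 0 or +1."""
    if a not in (-1, 0, 1) or b not in (-1, 0, 1):
        raise ValueError("a and b must each be -1, 0 or +1")
    re, im = z
    out_re, out_im = 0, 0
    # Real part of the factor: pass z through, negate it, or drop it.
    if a == 1:
        out_re, out_im = out_re + re, out_im + im
    elif a == -1:
        out_re, out_im = out_re - re, out_im - im
    # Imaginary part of the factor: z * i = (-im, re), i.e. swap and negate.
    if b == 1:
        out_re, out_im = out_re - im, out_im + re
    elif b == -1:
        out_re, out_im = out_re + im, out_im - re
    return out_re, out_im

product = short_multiply((3, 4), 1, 1)  # (3 + 4i) * (1 + i)
```

    The nine possible factors cover the multiplications by unit-magnitude axis values and their sums, which is why a unit of this kind can be much smaller than a full complex multiplier.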

    Vector register file with arbitrary vector addressing
    47.
    Patent application
    Status: lapsed

    Publication No.: US20040103262A1

    Publication date: 2004-05-27

    Application No.: US10713502

    Filing date: 2003-11-15

    IPC classification: G06F015/00

    Abstract: A system and method for processing operations that use data vectors each comprising a plurality of data elements, in accordance with the present invention, includes a vector data file comprising a plurality of storage elements for storing data elements of the data vectors. A pointer array is coupled by a bus to the vector data file. The pointer array includes a plurality of entries wherein each entry identifies at least one storage element in the vector data file. The at least one storage element stores at least one data element of the data vectors, wherein for at least one particular entry in the pointer array, the at least one storage element identified by the particular entry has an arbitrary starting address in the vector data file.

    Digital signal processor with cascaded SIMD organization
    48.
    Patent application
    Status: lapsed

    Publication No.: US20040078554A1

    Publication date: 2004-04-22

    Application No.: US10456793

    Filing date: 2003-06-07

    IPC classification: G06F015/00

    Abstract: A digital signal processor (DSP) includes dual SIMD units that are connected in cascade, wherein results of a first SIMD stage of the cascade may be stored in a register file of a second SIMD stage in the cascade. Each SIMD stage contains its own resources for storing operands and intermediate results (e.g., its own register file), as well as for decoding the operations that may be executed in that stage. Within each stage, hardware resources are organized to operate in SIMD manner, so that independent SIMD operations can be executed simultaneously, one in each stage of the cascade. Intermediate operands and results flowing through the cascade are stored at the register files of the stages, and may be accessed from those register files. Data may also be brought from memory directly into the register files of the stages in the cascade.
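
    One way to picture the cascade is a two-stage multiply-accumulate, where stage 1 writes its products into the register file of stage 2 and both stages do work in the same cycle. The sketch below is an assumption-laden model: the choice of multiply and accumulate as the two operations, the register names "a", "b", "prod" and "acc", and the function name are all hypothetical.

```python
from operator import add, mul

def simd(op, xs, ys):
    """Apply one operation across all lanes."""
    return [op(x, y) for x, y in zip(xs, ys)]

def cascaded_multiply_accumulate(a_blocks, b_blocks, lanes=4):
    stage1_regs = {}                    # register file of the first SIMD stage
    stage2_regs = {"acc": [0] * lanes}  # register file of the second SIMD stage
    # One extra cycle drains the last product out of the cascade.
    for cycle in range(len(a_blocks) + 1):
        # Stage 2: consume the product stage 1 deposited in the previous cycle.
        if "prod" in stage2_regs:
            stage2_regs["acc"] = simd(add, stage2_regs["acc"], stage2_regs.pop("prod"))
        # Stage 1: load operands from memory, multiply, store into stage 2.
        if cycle < len(a_blocks):
            stage1_regs["a"], stage1_regs["b"] = a_blocks[cycle], b_blocks[cycle]
            stage2_regs["prod"] = simd(mul, stage1_regs["a"], stage1_regs["b"])
    return stage2_regs["acc"]

accumulated = cascaded_multiply_accumulate(
    [[1, 2, 3, 4], [5, 6, 7, 8]],
    [[1, 1, 1, 1], [2, 2, 2, 2]],
)
```

    In steady state each loop iteration performs one SIMD operation per stage, which corresponds to the abstract's statement that independent SIMD operations execute simultaneously, one in each stage.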

    Multiprocessor system with vector pipelines
    50.
    Granted patent
    Status: lapsed

    Publication No.: US5887182A

    Publication date: 1999-03-23

    Application No.: US64678

    Filing date: 1993-05-21

    Applicant: Koji Kinoshita

    Inventor: Koji Kinoshita

    Abstract: In a multiprocessor system having a plurality of processors and a main memory common to the processors, each processor includes at least one vector calculation unit which is specific to a vector calculation and which is independent of the vector calculation units in the other processors. A register holds a configuration signal representative of the configuration of the vector calculation units in each processor. An access control unit controls access operations of the processors on the basis of the configuration signals in the processors to make the processors selectively access the main memory. Thus, the processors individually carry out the vector calculations to individually access the main memory.
