Methods and apparatus for dynamic very long instruction word sub-instruction selection for execution time parallelism in an indirect very long instruction word processor
    22.
    Granted invention patent (status: in force)

    Publication No.: US06467036B1

    Publication date: 2002-10-15

    Application No.: US09717992

    Filing date: 2000-11-21

    IPC classification: G06F15/80

    Abstract: A pipelined data processing unit includes an instruction sequencer and n functional units capable of executing n operations in parallel. The instruction sequencer includes a random access memory for storing very-long-instruction-words (VLIWs) used in operations involving the execution of two or more functional units in parallel. Each VLIW comprises a plurality of short-instruction-words (SIWs), where each SIW corresponds to a unique type of instruction associated with a unique functional unit. VLIWs are composed in the VLIW memory by loading and concatenating SIWs in each address, or entry. VLIWs are executed via the execute-VLIW (XV) instruction. The iVLIWs can be compressed at a VLIW memory address by use of a mask field contained within the XV1 instruction, which specifies which functional units are enabled, or disabled, during the execution of the VLIW. The mask can be changed each time the XV1 instruction is executed, effectively modifying the VLIW every time it is executed. The VLIW memory (VIM) can be further partitioned into separate memories, each associated with a function decode-and-execute unit. With a second execute-VLIW instruction, XV2, each functional unit's VIM can be independently addressed, thereby removing duplicate SIWs within the functional unit's VIM. This provides a further optimization of the VLIW storage, thereby allowing the use of smaller VLIW memories in cost-sensitive applications.
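
    The XV1 mask mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the unit names in UNITS, the one-bit-per-unit mask layout, the SIW strings, and the function name xv1 are all assumptions made for illustration.

```python
from typing import Dict, List, Optional

# Hypothetical functional-unit names; the abstract only says "n functional units".
UNITS = ["store", "load", "alu", "mau", "dsu"]

# One VLIW is modelled as one SIW slot per functional unit (None = empty slot).
VLIW = List[Optional[str]]


def xv1(vim: List[VLIW], address: int, enable_mask: int) -> Dict[str, str]:
    """Return the SIWs that would issue for one XV1 execution.

    Assumed mask layout: bit i enables the unit UNITS[i].
    """
    vliw = vim[address]
    issued: Dict[str, str] = {}
    for slot, unit in enumerate(UNITS):
        enabled = (enable_mask >> slot) & 1
        siw = vliw[slot]
        if enabled and siw is not None:
            issued[unit] = siw
    return issued


# A single stored VLIW (made-up SIW mnemonics) reused with two different masks.
vim = [["sii r1", "lii r2", "add r3", "mpy r4", "shl r5"]]
all_units = xv1(vim, 0, 0b11111)     # every unit enabled
alu_mau_only = xv1(vim, 0, 0b01100)  # only bits 2 (alu) and 3 (mau) set
```

    Because the mask travels with each XV1 rather than with the stored VLIW, one VIM entry can stand in for several distinct instruction combinations, which is the compression the abstract refers to.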

    Methods and apparatus for instruction addressing in indirect VLIW processors

    Publication No.: US06356994B1

    Publication date: 2002-03-12

    Application No.: US09350191

    Filing date: 1999-07-09

    IPC classification: G06F15/00

    Abstract: An indirect VLIW (iVLIW) architecture is described which contains a minimum of two instruction memories. The first instruction memory (SIM) contains short-instruction-words (SIWs) of a fixed length. The second instruction memory (VIM) contains very-long-instruction-words (VLIWs), which allow execution of multiple instructions in parallel. Each SIW may be fetched and executed as an independent instruction by one of the available execution units. A special class of SIW is used to reference the VIM indirectly to either execute or load a specified VLIW instruction (called an “XV” instruction for “eXecute VLIW”, or “LV” for “Load VLIW”). In these cases, the SIW instruction specifies how the location of the VLIW is to be accessed. Other aspects of this approach relate to the application of data memory addressing techniques for execution or loading of VLIWs that parallel the addressing modes used for data memory accesses. These addressing techniques provide tremendous flexibility for VLIW instruction execution.
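
    The indirection between the two instruction memories can be sketched as a toy interpreter. This is only an illustration under stated assumptions: the tuple encoding of SIWs, the base-register name "V0", the base-plus-offset addressing mode, and the function name run are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Tuple


def run(sim: List[tuple], vim_base: Dict[str, int]) -> List[Tuple[str, ...]]:
    """Walk the short-instruction memory (SIM) and record what issues each step.

    Assumed encodings:
      ("LV", base_reg, offset, [siw, ...])  loads a VLIW into the VIM
      ("XV", base_reg, offset)              executes the VLIW stored there
      (mnemonic,)                           an ordinary SIW that issues alone
    The VIM address is formed as base register + offset, mirroring a common
    data-memory addressing mode.
    """
    vim: Dict[int, Tuple[str, ...]] = {}
    trace: List[Tuple[str, ...]] = []
    for siw in sim:
        op = siw[0]
        if op == "LV":
            _, base, offset, slots = siw
            vim[vim_base[base] + offset] = tuple(slots)
        elif op == "XV":
            _, base, offset = siw
            trace.append(vim[vim_base[base] + offset])
        else:
            trace.append((op,))
    return trace


program = [
    ("LV", "V0", 1, ["add", "mpy"]),  # compose a two-slot VLIW at VIM[8 + 1]
    ("sub",),                         # plain SIW, executed on its own
    ("XV", "V0", 1),                  # indirect execution of the stored VLIW
    ("XV", "V0", 1),                  # the same VLIW can be reused
]
trace = run(program, {"V0": 8})
```

    The point of the sketch is that the instruction stream itself stays narrow: the wide instruction is stored once and referenced by address.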

    Methods and apparatus for scalable instruction set architecture with dynamic compact instructions
    24.
    Granted invention patent (status: in force)

    Publication No.: US06321322B1

    Publication date: 2001-11-20

    Application No.: US09543473

    Filing date: 2000-04-05

    IPC classification: G06F15/80

    Abstract: A hierarchical instruction set architecture (ISA) provides pluggable instruction set capability and support of array processors. The term pluggable is from the programmer's viewpoint and relates to groups of instructions that can easily be added to a processor architecture for code density and performance enhancements. One specific aspect addressed herein is the unique compacted instruction set, which gives the programmer the ability to dynamically create a set of compacted instructions on a task-by-task basis for the primary purpose of improving control and parallel code density. These compacted instructions are parallelizable in that they are not specifically restricted to control code application but can be executed in the processing elements (PEs) in an array processor. The ManArray family of processors is designed for this dynamic compacted instruction set capability and also supports a scalable array of from one to N PEs. In addition, the ManArray ISA is defined as a hierarchy of ISAs, which allows for future growth in instruction capability and supports the packing of multiple instructions within a hierarchy of instructions.

    Methods and apparatus for dynamic very long instruction word sub-instruction selection for execution time parallelism in an indirect very long instruction word processor
    25.
    Granted invention patent (status: in force)

    Publication No.: US06173389B2

    Publication date: 2001-01-09

    Application No.: US09205588

    Filing date: 1998-12-04

    IPC classification: G06F15/80

    Abstract: A pipelined data processing unit includes an instruction sequencer and n functional units capable of executing n operations in parallel. The instruction sequencer includes a random access memory for storing very-long-instruction-words (VLIWs) used in operations involving the execution of two or more functional units in parallel. Each VLIW comprises a plurality of short-instruction-words (SIWs), where each SIW corresponds to a unique type of instruction associated with a unique functional unit. VLIWs are composed in the VLIW memory by loading and concatenating SIWs in each address, or entry. VLIWs are executed via the execute-VLIW (XV) instruction. The iVLIWs can be compressed at a VLIW memory address by use of a mask field contained within the XV1 instruction, which specifies which functional units are enabled, or disabled, during the execution of the VLIW. The mask can be changed each time the XV1 instruction is executed, effectively modifying the VLIW every time it is executed. The VLIW memory (VIM) can be further partitioned into separate memories, each associated with a function decode-and-execute unit. With a second execute-VLIW instruction, XV2, each functional unit's VIM can be independently addressed, thereby removing duplicate SIWs within the functional unit's VIM. This provides a further optimization of the VLIW storage, thereby allowing the use of smaller VLIW memories in cost-sensitive applications.
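
    The XV2 idea of addressing each functional unit's VIM independently can be shown with a small Python sketch. The unit names, the SIW mnemonics, and the function name xv2 are hypothetical; the sketch only shows why per-unit addressing removes duplicate SIWs.

```python
from typing import Dict, List

# Assumed layout: each functional unit owns a small VIM of its own.
unit_vims: Dict[str, List[str]] = {
    "alu": ["add", "sub"],
    "mau": ["mpy"],            # stored once, shared by both VLIWs below
    "load": ["ld.w", "ld.d"],
}


def xv2(vims: Dict[str, List[str]], addresses: Dict[str, int]) -> Dict[str, str]:
    """Assemble one VLIW by giving every unit its own VIM address."""
    return {unit: vims[unit][addr] for unit, addr in addresses.items()}


first = xv2(unit_vims, {"alu": 0, "mau": 0, "load": 0})
second = xv2(unit_vims, {"alu": 1, "mau": 0, "load": 1})
```

    With a single shared address, "mpy" would have to be stored in two VIM entries to form these two VLIWs; with per-unit addresses it occupies one slot in the mau memory.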

    Methods and Apparatus for Providing Bit-Reversal and Multicast Functions Utilizing DMA Controller
    26.
    Invention application (status: under examination, published)

    Publication No.: US20140075081A1

    Publication date: 2014-03-13

    Application No.: US14070657

    Filing date: 2013-11-04

    IPC classification: G06F13/28

    CPC classification: G06F13/28

    Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.
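
    Bit-reversed addressing, which the abstract says the DMA unit performs for radix-2 FFTs, can be sketched in software as follows. This is the textbook index permutation, not the DMA controller's actual address generator; the function names are chosen for illustration.

```python
from typing import List


def bit_reverse(index: int, bits: int) -> int:
    """Reverse the lowest `bits` bits of `index`."""
    result = 0
    for _ in range(bits):
        result = (result << 1) | (index & 1)
        index >>= 1
    return result


def bit_reversed_order(n: int) -> List[int]:
    """Source index for each output position of an n-point radix-2 FFT."""
    bits = n.bit_length() - 1
    if n <= 0 or (1 << bits) != n:
        raise ValueError("n must be a power of two")
    return [bit_reverse(i, bits) for i in range(n)]


order = bit_reversed_order(8)  # [0, 4, 2, 6, 1, 5, 3, 7]
```

    Performing this reordering during the transfer means the processing elements do not have to spend compute cycles permuting data before or after the FFT.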

    Methods and Apparatus for Providing Bit-Reversal and Multicast Functions Utilizing DMA Controller
    27.
    Invention application (status: in force)

    Publication No.: US20110302333A1

    Publication date: 2011-12-08

    Application No.: US13205269

    Filing date: 2011-08-08

    IPC classification: G06F13/28

    CPC classification: G06F13/28

    Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.

    Methods and apparatus for providing bit-reversal and multicast functions utilizing DMA controller
    29.
    Granted invention patent (status: in force)

    Publication No.: US06834295B2

    Publication date: 2004-12-21

    Application No.: US09791940

    Filing date: 2001-02-23

    IPC classification: G06F15/167

    CPC classification: G06F13/28

    Abstract: Techniques for providing improved data distribution to and collection from multiple memories are described. Such memories are often associated with and local to processing elements (PEs) within an array processor. Improved data transfer control within a data processing system provides support for radix 2, 4 and 8 fast Fourier transform (FFT) algorithms through data reordering or bit-reversed addressing across multiple PEs, carried out concurrently with FFT computation on a digital signal processor (DSP) array by a DMA unit. Parallel data distribution and collection through forms of multicast and packet-gather operations are also supported.

    Coprocessor instruction loading from port register based on interrupt vector table indication
    30.
    Granted invention patent (status: lapsed)

    Publication No.: US07017029B2

    Publication date: 2006-03-21

    Application No.: US11040358

    Filing date: 2005-01-21

    Applicant: Edwin F. Barry

    Inventor: Edwin F. Barry

    IPC classification: G06F9/48

    Abstract: An interface source system providing at least two paths to load an instruction decode register of a coprocessor is disclosed. The interface source system includes an instruction port register, an instruction memory, an instruction decode register, and an interrupt vector table (IVT) stored in the instruction memory. The IVT stores an external instruction vector containing either a predetermined value indicating that the instruction decode register is to be loaded with contents from the instruction port register or an address of an instruction in the instruction memory. A first one of the at least two paths is used to load the instruction from the instruction memory containing the IVT if the external instruction vector contains the address of the instruction in the instruction memory. A second one of the at least two paths is used to load the instruction from the instruction port register if the external instruction vector contains the predetermined value.
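
    The two load paths selected by the external instruction vector can be sketched as a single decision in Python. The sentinel value PORT_SENTINEL, the dictionary model of instruction memory, and the function name are assumptions; the abstract only says the vector holds "a predetermined value" or an instruction address.

```python
from typing import Dict

# Hypothetical stand-in for the "predetermined value" stored in the IVT entry.
PORT_SENTINEL = 0xFFFFFFFF


def load_instruction_decode_register(
    external_vector: int,
    instruction_memory: Dict[int, str],
    instruction_port_register: str,
) -> str:
    """Choose which path feeds the instruction decode register."""
    if external_vector == PORT_SENTINEL:
        # Second path: take the instruction supplied through the port register.
        return instruction_port_register
    # First path: treat the vector as an address into instruction memory.
    return instruction_memory[external_vector]


instruction_memory = {0x40: "add r1, r2, r3"}
from_memory = load_instruction_decode_register(0x40, instruction_memory, "nop")
from_port = load_instruction_decode_register(
    PORT_SENTINEL, instruction_memory, "mpy r4, r5, r6"
)
```

    Reusing the vector-table entry as the selector means no separate mode bit is needed: software switches between the two paths just by rewriting one IVT entry.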
