Scalable VLIW processor for high-speed Viterbi and trellis coded modulation decoding
    2.
    Granted patent
    Legal status: active

    Publication No.: US08255780B2

    Publication date: 2012-08-28

    Application No.: US12708323

    Filing date: 2010-02-18

    IPC classification: H03M13/03

    Abstract: An application-specific processor implements a Viterbi decode algorithm for channel decoding functions of received symbols. The Viterbi decode algorithm is at least one of a Bit Serial decode algorithm and a block-based decode algorithm. The application-specific processor includes a Load-Store, Logical and De-puncturing (LLD) slot that performs a Load-Store function, a Logical function, a De-puncturing function, and a Trace-back Address generation function; a Branch Metric Compute (BMU) slot that performs Radix-2 branch metric computations, Radix-4 branch metric computations, and Squared Euclidean Branch Metric computations; and an Add-Compare-Select (ACS) slot that performs Radix-2 Path metric computations, Radix-4 Path metric computations, best state computations, and decision bit generation. The LLD slot, the BMU slot, and the ACS slot operate in a software-pipelined manner to enable high-speed Viterbi decoding functions.
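
    The branch metric, add-compare-select, best state, and trace-back steps named in the abstract can be illustrated in software. The following is a minimal sketch, not the patented hardware: it assumes a hypothetical rate-1/2, constraint-length-3 convolutional code with generators (7, 5) octal and hard-decision Hamming branch metrics, and every function name is chosen only for illustration.

```python
# Illustrative assumptions (not from the patent): rate-1/2, K=3 code,
# generators (7, 5) octal, hard-decision Hamming branch metrics.
G = (0b111, 0b101)
NUM_STATES = 4  # 2 ** (K - 1)

def parity(x):
    return bin(x).count("1") & 1

def expected_output(prev_state, bit):
    """Encoder output pair when `bit` is shifted in from `prev_state`."""
    reg = (bit << 2) | prev_state          # newest bit sits in the MSB
    return tuple(parity(reg & g) for g in G)

def encode(bits):
    state, out = 0, []
    for b in bits:
        out.append(expected_output(state, b))
        state = (b << 1) | (state >> 1)
    return out

def branch_metric(received, expected):
    """Hamming distance between a received and an expected symbol pair."""
    return sum(r != e for r, e in zip(received, expected))

def acs_step(path_metrics, received):
    """One radix-2 add-compare-select step; returns metrics and decision bits."""
    new_metrics, decisions = [], []
    for state in range(NUM_STATES):
        bit = state >> 1                   # input bit that leads into `state`
        cands = []
        for d in (0, 1):                   # the two possible predecessors
            prev = ((state & 1) << 1) | d
            bm = branch_metric(received, expected_output(prev, bit))
            cands.append(path_metrics[prev] + bm)      # add
        d = 0 if cands[0] <= cands[1] else 1           # compare
        new_metrics.append(cands[d])                   # select
        decisions.append(d)
    return new_metrics, decisions

def viterbi_decode(symbols):
    inf = float("inf")
    metrics = [0] + [inf] * (NUM_STATES - 1)   # encoder starts in state 0
    history = []
    for sym in symbols:
        metrics, decisions = acs_step(metrics, sym)
        history.append(decisions)
    state = metrics.index(min(metrics))        # best-state selection
    bits = []
    for decisions in reversed(history):        # trace-back over decision bits
        bits.append(state >> 1)
        state = ((state & 1) << 1) | decisions[state]
    return bits[::-1]
```

    In this sketch the three stages run one after another in a single loop. The abstract instead describes them as separate VLIW slots that overlap in a software-pipelined schedule.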


    Testing of modules operating with different characteristics of control signals using scan based techniques
    3.
    Patent application
    Legal status: active

    Publication No.: US20050091562A1

    Publication date: 2005-04-28

    Application No.: US10710451

    Filing date: 2004-07-12

    IPC classification: G01R31/3185 G01R31/28

    CPC classification: G01R31/318563

    Abstract: Testing of modules (such as Intellectual property (IP) cores) in integrated circuits (such as system on a chip units (SOCs)) in situations when different modules operate with different characteristics of a control signal. In an embodiment, another module (“subsystem module”) may be implemented to be tested with any of multiple characteristics of a control signal, and a register, which is programmable to generate a derived control signal of a desired characteristic from an original control signal, is provided. The derived control signal is provided to test the subsystem module. According to an aspect of the invention, the desired characteristic may be determined, for example, to test a path between the two modules at the same speed at which the path would be operated in a functional mode.


    Mechanism for efficient implementation of software pipelined loops in VLIW processors
    4.
    Granted patent
    Legal status: active

    Publication No.: US08447961B2

    Publication date: 2013-05-21

    Application No.: US12708288

    Filing date: 2010-02-18

    IPC classification: G06F9/40

    Abstract: A system to implement a zero overhead software pipelined (SFP) loop includes a Very Long Instruction Word (VLIW) processor having N execution slots. The VLIW processor executes a plurality of instructions in parallel without any limitation of an instruction buffer size. A program memory receives a Program Memory address to fetch an instruction packet. The program memory is closely coupled with the instruction buffer size to implement the zero overhead software pipelined (SFP) loop. The size of the zero overhead software pipelined (SFP) loop can exceed the instruction buffer size. A CPU control register includes a block count and an iteration count. The block count is loaded into a block counter and counts the plurality of instructions executed in the SFP loop, and the iteration count is loaded into an iteration counter and counts a number of iterations of the SFP loop based on the block count.
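
    The interaction of the block counter and the iteration counter can be sketched as a small simulation. This is only an illustration under stated assumptions: the count-down direction, the reload of the block counter at the start of each iteration, and the function name `run_zero_overhead_loop` are hypothetical and are not taken from the patent.

```python
def run_zero_overhead_loop(loop_body, block_count, iteration_count):
    """Simulate two hardware loop counters and return the packets issued.

    loop_body       -- list of instruction packets forming one loop iteration
    block_count     -- number of packets per iteration (assumed == len(loop_body))
    iteration_count -- number of times the loop body repeats
    """
    if block_count != len(loop_body):
        raise ValueError("block_count must match the loop body length")

    issued = []
    iteration_counter = iteration_count      # loaded from the control register
    while iteration_counter > 0:
        block_counter = block_count          # reloaded at the start of each iteration
        pc = 0                               # offset into the loop body
        while block_counter > 0:
            issued.append(loop_body[pc])     # issue one instruction packet
            pc += 1
            block_counter -= 1
        iteration_counter -= 1               # block counter expired: one iteration done
    return issued
```

    No branch or compare instruction appears in the loop body itself. In this model the loop-back decision comes entirely from the counters, which is the sense in which such a loop has zero overhead.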


    Mechanism for Efficient Implementation of Software Pipelined Loops in VLIW Processors
    5.
    Patent application
    Legal status: active

    Publication No.: US20100211762A1

    Publication date: 2010-08-19

    Application No.: US12708288

    Filing date: 2010-02-18

    IPC classification: G06F9/38 G06F9/30

    Abstract: A system to implement a zero overhead software pipelined (SFP) loop includes a Very Long Instruction Word (VLIW) processor having N execution slots. The VLIW processor executes a plurality of instructions in parallel without any limitation of an instruction buffer size. A program memory receives a Program Memory address to fetch an instruction packet. The program memory is closely coupled with the instruction buffer size to implement the zero overhead software pipelined (SFP) loop. The size of the zero overhead software pipelined (SFP) loop can exceed the instruction buffer size. A CPU control register includes a block count and an iteration count. The block count is loaded into a block counter and counts the plurality of instructions executed in the SFP loop, and the iteration count is loaded into an iteration counter and counts a number of iterations of the SFP loop based on the block count.


    Memory access system providing increased throughput rates when accessing large volumes of data by determining worse case throughput rate delays
    6.
    Granted patent
    Legal status: active

    Publication No.: US07200690B2

    Publication date: 2007-04-03

    Application No.: US10709234

    Filing date: 2004-04-22

    IPC classification: G06F13/28 G06F15/76

    CPC classification: G06F13/28

    Abstract: Enhancing the throughput rate of a memory access system by using store and forward buffers (SFBs) in combination with a DMA engine. According to an aspect of the present invention, the worst case throughput rate (without use of SFBs) is computed, and a maximization factor equaling a desired throughput rate divided by the worst case throughput rate is computed. The number of SFBs is determined as one less than the maximization factor. By placing the SFBs at appropriate locations in the data transfer path, the desired throughput rate may be attained when transferring large volumes of data.
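
    The sizing rule in the abstract reduces to simple arithmetic. The sketch below restates it under two assumptions that the abstract does not spell out: a non-integer maximization factor is rounded up, and the result is clamped at zero. The function name `sfb_count` is hypothetical.

```python
import math

def sfb_count(desired_rate, worst_case_rate):
    """Number of store-and-forward buffers: one less than the maximization factor.

    Both rates must use the same unit (for example MB/s). Rounding the
    factor up and clamping at zero are assumptions made for this sketch.
    """
    if worst_case_rate <= 0:
        raise ValueError("worst_case_rate must be positive")
    maximization_factor = desired_rate / worst_case_rate
    return max(0, math.ceil(maximization_factor) - 1)
```

    For example, a desired rate of 400 against a worst case rate of 100 gives a maximization factor of 4 and therefore three SFBs.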


    At-speed ATPG testing and apparatus for SoC designs having multiple clock domain using a VLCT test platform
    7.
    Granted patent
    Legal status: active

    Publication No.: US07134061B2

    Publication date: 2006-11-07

    Application No.: US10731714

    Filing date: 2003-12-09

    IPC classification: G01R31/28

    CPC classification: G01R31/318552

    Abstract: A scan test circuitry design embedded on an SoC having the scan architecture of a VLCT platform is disclosed herein. This BIST circuitry design, which is not limited in the number of scan test ports supported, includes at least one scan chain group having a corresponding clock domain that couples to receive test stimulus data. Each scan chain group has a corresponding test mode signal to shift the test stimulus data at a shift clock rate derived from its corresponding clock domain. A controlling demultiplexer connects to each multiplexer unit within each scan chain group to provide control signals for shifting in the test stimulus. A clock control mechanism provides a control signal for each scan chain to shift test stimulus and capture resultant data. Furthermore, when a simultaneous test mode signal is enabled, the clock control mechanism couples to each scan chain to enable simultaneous capture of each scan chain group.


    Vector Slot Processor Execution Unit for High Speed Streaming Inputs
    8.
    Patent application
    Legal status: active

    Publication No.: US20120284487A1

    Publication date: 2012-11-08

    Application No.: US13462144

    Filing date: 2012-05-02

    IPC classification: G06F15/76 G06F9/302

    Abstract: A vector slot processor that is capable of supporting multiple signal processing operations for multiple demodulation standards is provided. The vector slot processor includes a plurality of micro execution slots (MES) that perform the multiple signal processing operations on the high speed streaming inputs. Each MES includes one or more n-way signal registers that receive the high speed streaming inputs, one or more n-way coefficient registers that store filter coefficients for the multiple signal processing operations, and one or more n-way Multiply and Accumulate (MAC) units that receive the high speed streaming inputs from the one or more n-way signal registers and filter coefficients from the one or more n-way coefficient registers. The one or more n-way MAC units perform a vertical MAC operation and a horizontal multiply and add operation on the high speed streaming inputs.
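
    The abstract does not define the vertical and horizontal operations, so the following sketch rests on one common reading of those terms, stated here as an assumption: a vertical MAC keeps one accumulator per lane, while a horizontal multiply and add reduces the lane products to a single sum. The function names are hypothetical.

```python
def vertical_mac(acc, signal, coeff):
    """Lane-wise multiply-accumulate: acc[i] + signal[i] * coeff[i] for each lane."""
    if not (len(acc) == len(signal) == len(coeff)):
        raise ValueError("all n-way registers must have the same width")
    return [a + s * c for a, s, c in zip(acc, signal, coeff)]

def horizontal_multiply_add(signal, coeff):
    """Multiply lane by lane, then add the products across lanes into one value."""
    if len(signal) != len(coeff):
        raise ValueError("signal and coefficient registers must have the same width")
    return sum(s * c for s, c in zip(signal, coeff))
```

    Under this reading, the horizontal form computes one output sample of an n-tap FIR filter in a single step, whereas the vertical form advances n independent accumulations at once.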


    Zero Overhead Block Floating Point Implementation in CPU's
    9.
    Patent application
    Legal status: active

    Publication No.: US20120284464A1

    Publication date: 2012-11-08

    Application No.: US13461902

    Filing date: 2012-05-02

    IPC classification: G06F12/00

    CPC classification: G06F7/483

    Abstract: A system for computing a block floating point scaling factor by detecting a dynamic range of an input signal in a central processing unit without additional overhead cycles is provided. The system includes a dynamic range monitoring unit that detects the dynamic range of the input signal by snooping outgoing write data and incoming memory read data of the input signal. The dynamic range monitoring unit includes a running maximum count unit that stores a least value of a count of leading zeros and leading ones, and a running minimum count that stores a least value of the count of trailing zeros. The dynamic range is detected based on the least value of the count of leading zeros and leading ones and the count of trailing zeros. The system further includes a scaling factor computation module that computes the block floating point (BFP) scaling factor based on the dynamic range.
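
    The leading-bit part of this dynamic range detection can be sketched in software. The example below assumes 16-bit two's complement samples and derives the available left-shift headroom from the smallest count of leading sign bits seen in a block. The word width, the class and method names, and the use of headroom as the scaling factor are assumptions made for illustration. The trailing-zero count mentioned in the abstract is omitted.

```python
WORD_BITS = 16  # assumed sample width

def leading_sign_bits(value):
    """Count leading bits equal to the sign bit of a 16-bit two's complement word.

    This is the count of leading zeros for non-negative values and of
    leading ones for negative values.
    """
    word = value & 0xFFFF
    sign = word >> (WORD_BITS - 1)
    count = 0
    for i in range(WORD_BITS - 1, -1, -1):
        if (word >> i) & 1 != sign:
            break
        count += 1
    return count

class DynamicRangeMonitor:
    """Tracks the least leading-sign-bit count over the words it observes."""

    def __init__(self):
        self.min_leading = WORD_BITS

    def snoop(self, value):
        # In the patent this observation happens on memory traffic, at no
        # extra cycle cost; here it is an explicit call.
        self.min_leading = min(self.min_leading, leading_sign_bits(value))

    def headroom(self):
        """Left shift that can be applied to every observed word without overflow."""
        return self.min_leading - 1
```

    A block containing 256, -4, and 3 yields a least count of 7 (set by 256), so every sample in the block can be shifted left by 6 bits while still fitting in 16 bits.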


    Scalable VLIW Processor For High-Speed Viterbi and Trellis Coded Modulation Decoding
    10.
    Patent application
    Legal status: active

    Publication No.: US20100211858A1

    Publication date: 2010-08-19

    Application No.: US12708323

    Filing date: 2010-02-18

    Abstract: An application-specific processor implements a Viterbi decode algorithm for channel decoding functions of received symbols. The Viterbi decode algorithm is at least one of a Bit Serial decode algorithm and a block-based decode algorithm. The application-specific processor includes a Load-Store, Logical and De-puncturing (LLD) slot that performs a Load-Store function, a Logical function, a De-puncturing function, and a Trace-back Address generation function; a Branch Metric Compute (BMU) slot that performs Radix-2 branch metric computations, Radix-4 branch metric computations, and Squared Euclidean Branch Metric computations; and an Add-Compare-Select (ACS) slot that performs Radix-2 Path metric computations, Radix-4 Path metric computations, best state computations, and decision bit generation. The LLD slot, the BMU slot, and the ACS slot operate in a software-pipelined manner to enable high-speed Viterbi decoding functions.
