Partitioned shift right logic circuit having rounding support
    1.
    Type: Granted invention patent
    Status: In force

    Publication No.: US06243728B1

    Publication date: 2001-06-05

    Application No.: US09351273

    Filing date: 1999-07-12

    IPC classification: G06F5/01

    Abstract: A partitioned shift right logic circuit that is programmable and contains rounding support. The circuit of the present invention accepts a 32-bit value and a shift amount, performs a right shift operation on the 32 bits, and automatically rounds the result(s). Signed or unsigned values can be accepted. The right shift circuit is partitioned so that the 32-bit value can represent: (1) a single 32-bit number; or (2) two 16-bit values. A 1-bit selection input indicates the particular partition format. In operation, if the input value is not negative, then one (“1”) is added at the guard bit position and a right shift with truncate is performed. If the input is negative and the guard bit is zero, then no addition is done and a right shift with truncate is performed. If the input is negative and the guard bit is one and the sticky bit is zero, then no addition is done and a right shift with truncate is performed. If the input is negative and the guard bit is one and the sticky bit is one, then one is added at the guard bit position and a right shift with truncate is performed. The shift circuitry used by the present invention is fully partitioned to accept word or half-word input, contains multiple cascaded multiplexer stages for performing partitioned right shifting, and supports signed shifting. Each multiplexer stage can be programmed to perform a selected shift amount (including 0 shift). The right shift circuit of the present invention can be used in multi-media applications and can also be used for general purpose and VLIW (very long instruction word) processors without performance degradation.
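
The four rounding cases in the abstract can be sketched in software as follows. This is only a behavioral illustration, not the patented circuit: the function names, the `halfword_mode` flag, and the choice to treat the guard bit as the most significant bit shifted out (with the sticky bit as the OR of all lower shifted-out bits) are assumptions made for the example.

```python
def round_shift_right(value, shift, width, signed=True):
    """Right-shift one `width`-bit lane by `shift` bits with rounding.

    Hypothetical helper: models the add-at-guard-then-truncate rule
    described in the abstract, not the actual multiplexer hardware.
    """
    mask = (1 << width) - 1
    value &= mask
    # Reinterpret the lane as two's complement when signed.
    if signed and value >> (width - 1):
        value -= 1 << width
    if shift == 0:
        return value & mask  # nothing is shifted out, so nothing to round
    guard = (value >> (shift - 1)) & 1                    # top bit shifted out
    sticky = (value & ((1 << (shift - 1)) - 1)) != 0      # OR of the lower bits
    # Add 1 at the guard position for non-negative inputs, or for
    # negative inputs whose guard and sticky bits are both set.
    if value >= 0 or (guard and sticky):
        value += 1 << (shift - 1)
    # Python's >> on negative ints is an arithmetic (truncating) shift.
    return (value >> shift) & mask


def partitioned_round_shift(word, shift, halfword_mode, signed=True):
    """Treat `word` as one 32-bit value or as two independent 16-bit lanes."""
    if not halfword_mode:
        return round_shift_right(word, shift, 32, signed)
    hi = round_shift_right(word >> 16, shift, 16, signed)
    lo = round_shift_right(word, shift, 16, signed)
    return (hi << 16) | lo
```

Under these assumptions the rule rounds ties away from zero: 10 shifted right by 2 (2.5) gives 3, and -10 shifted right by 2 (-2.5) gives -3.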

    Multiplier circuit having an optimized booth encoder/selector
    2.
    Type: Granted invention patent
    Status: In force

    Publication No.: US06301599B1

    Publication date: 2001-10-09

    Application No.: US09280176

    Filing date: 1999-03-29

    IPC classification: G06F7/52

    CPC classification: G06F7/5338

    Abstract: An improved Booth encoder/selector circuit having an optimized critical path. The Booth encoder has a number of inverters coupled to several of the input multiplier bits. The inverted/non-inverted multiplier bits are then fed as inputs to NAND gates as well as a series of pass gates. The outputs of the pass gates are then fed as inputs to other NAND gates. The outputs from the NAND gates serve as control signals for controlling the Booth selector. The Booth selector is comprised of inverters and pass gates. Multiplicand bits are input to the pass gates. The control signals generated by the Booth encoder are selectively coupled to the inverters and pass gates such that they control which one of a plurality of multiplicand bits is selected for output. Basically, the Booth selector functions as a multiplexer whereby one of the following is output: the multiplicand bit is multiplied by zero, multiplied by one, multiplied by negative one, multiplied by two, or multiplied by negative two. The Booth encoder/selector is used in a multiplier circuit to minimize the number of partial products. An adder is then used to sum all of the partial products to arrive at the final answer. In the present invention, the critical path has been optimized such that the overall speed of the multiplier is greatly improved.
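
The five selector outputs listed in the abstract (multiplicand times 0, ±1, ±2) match textbook radix-4 Booth recoding, so a minimal software sketch of that recoding may help. It assumes a signed two's-complement multiplier of even bit width; the function names are hypothetical, and nothing here models the gate-level encoder or its optimized critical path.

```python
def booth_radix4_digits(multiplier, width):
    """Recode a `width`-bit two's-complement multiplier into radix-4
    Booth digits in {-2, -1, 0, 1, 2}, least significant digit first.

    Illustrative helper only; `width` must be even.
    """
    if width % 2:
        raise ValueError("width must be even")
    m = multiplier & ((1 << width) - 1)
    digits = []
    prev = 0  # implicit bit to the right of bit 0
    for i in range(0, width, 2):
        b0 = (m >> i) & 1
        b1 = (m >> (i + 1)) & 1
        # Each overlapping 3-bit group (b1, b0, prev) selects one digit.
        digits.append(b0 + prev - 2 * b1)
        prev = b1
    return digits


def booth_multiply(multiplicand, multiplier, width=16):
    """Sum one partial product per Booth digit (width / 2 in total)."""
    total = 0
    for k, digit in enumerate(booth_radix4_digits(multiplier, width)):
        total += digit * multiplicand * (4 ** k)
    return total
```

For a 4-bit multiplier of 7 the digits are `[-1, 2]`, that is, 7 = -1 + 2·4, so only two partial products are summed instead of four.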

    High performance pipelined data path for a media processor
    3.
    Type: Granted invention patent
    Status: In force

    Publication No.: US06282556B1

    Publication date: 2001-08-28

    Application No.: US09451669

    Filing date: 1999-11-30

    IPC classification: G06F7/52

    Abstract: A pipelined data path architecture for use, in one embodiment, in a multimedia processor. The data path architecture requires a maximum of two execution pipestages to perform all instructions, including wide data format multiply instructions and specially adapted multimedia instructions, such as the sum of absolute differences (SABD) instruction and other multiply with add (MADD) instructions. The data path architecture includes two wide data format input registers that feed four partitioned 32×32 multiplier circuits. Within two pipestages, the multiply circuit can perform one 128×128 multiply operation, four 32×32 multiply operations, eight 16×16 multiply operations or sixteen 8×8 multiply operations in parallel. The multiply circuit contains a compressor tree which generates a 256-bit sum and a 256-bit carry vector. These vectors are supplied to four 64-bit carry propagate adder circuits which generate the multiply results. When the data path architecture is performing specially adapted multimedia instructions, the input registers are supplied to a pipelined logic unit containing adders, subtractors, shifters, average/round/absolute value circuits, and other logic operation circuits, compressor circuits and multiplexers. The output of the pipelined logic unit is then fed to the four 64-bit carry propagate adder circuits. In this way, the adder circuits of the multiply operation can also be used to process the specially adapted multimedia instructions, thereby saving IC area. Multiply circuitry is disabled to save power when the data path architecture is not processing a multiplication instruction.

    High performance universal multiplier circuit
    4.
    Type: Granted invention patent
    Status: In force

    Publication No.: US06353843B1

    Publication date: 2002-03-05

    Application No.: US09415485

    Filing date: 1999-10-08

    IPC classification: G06F7/52

    Abstract: A partitioned multiplier circuit which is designed for high speed operations. The multiplier of the present invention can perform one 32×32 bit multiplication, two 16×16 bit multiplications (simultaneously) or four 8×8 bit multiplications (simultaneously) depending on input partitioning signals. The time required to perform the 32×32 bit, the 16×16 bit or the 8×8 bit multiplications is constant. Therefore, multiplication results are available with a constant latency regardless of operand bit-size. In one embodiment, the latency is two clock cycles but the multiplier circuit has a throughput of one clock cycle due to pipelining. The input operands can be signed or unsigned. The hardware is partitioned without any significant increase in the delay or area and the multiplier can provide six different modes of operation. In one embodiment, Booth encoding is used for the generation of 17 partial products which are compressed using a compression tree into two 64-bit values. This is performed in the first pipeline stage to generate a sum and a carry vector. These values are then added, in the second pipestage, using a carry propagate adder circuit to provide a single 64-bit result. In the case of 16×16 bit multiplication, the 64-bit result contains two 32-bit results. In the case of 8×8 bit multiplication, the 64-bit result contains four 16-bit results. Due to its high operating speed, the multiplier circuit is advantageous for use in multi-media applications, such as audio/visual rendering and playback.
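
The packing of results described in the abstract (one 64-bit product, two 32-bit products, or four 16-bit products in a single 64-bit output) can be illustrated with a short behavioral model. The function name, the `lane_bits` parameter, and the assumption that lane 0 occupies the least significant bits of both operands and of the result are choices made for this sketch; the abstract does not specify lane ordering, and the Booth encoding, compression tree, and pipelining are not modeled.

```python
def partitioned_multiply(a, b, lane_bits, signed=False):
    """Multiply two 32-bit operands lane by lane and pack the products.

    lane_bits = 32 -> one 64-bit product
    lane_bits = 16 -> two 32-bit products
    lane_bits = 8  -> four 16-bit products
    Hypothetical model; lane order is an assumption.
    """
    if lane_bits not in (8, 16, 32):
        raise ValueError("lane_bits must be 8, 16 or 32")
    lane_mask = (1 << lane_bits) - 1
    product_mask = (1 << (2 * lane_bits)) - 1

    def to_signed(x):
        # Reinterpret a lane value as two's complement.
        return x - (1 << lane_bits) if x >> (lane_bits - 1) else x

    result = 0
    for lane in range(32 // lane_bits):
        x = (a >> (lane * lane_bits)) & lane_mask
        y = (b >> (lane * lane_bits)) & lane_mask
        if signed:
            x, y = to_signed(x), to_signed(y)
        # Each product is twice as wide as its operands.
        result |= ((x * y) & product_mask) << (lane * 2 * lane_bits)
    return result
```

Three lane widths combined with the signed/unsigned flag give six combinations, which is one plausible reading of the "six different modes of operation" in the abstract.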

    Encryption processor for performing accelerated computations to establish secure network sessions connections
    6.
    Type: Granted invention patent
    Status: In force

    Publication No.: US07509486B1

    Publication date: 2009-03-24

    Application No.: US09611809

    Filing date: 2000-07-07

    IPC classification: H04L29/00 H04L9/00

    Abstract: Methods and apparatus for an encryption processor for performing accelerated computations to establish secure network sessions. The encryption processor includes an execution unit and a decode unit. The execution unit is configured to execute Montgomery operations and includes at least one adder and at least two multipliers. The decode unit is configured to determine if a square operation or a product operation needs to be performed and to issue the appropriate instructions so that certain multiply and/or addition operations are performed in parallel in the execution unit while performing either the Montgomery square or Montgomery product operation.
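
For readers unfamiliar with the Montgomery square and product operations named in the abstract, the sketch below shows the standard textbook Montgomery reduction and a square-and-multiply exponentiation that chooses between a square and a product at each exponent bit. It is not the patented processor: the function names and the `r_bits` parameter are assumptions, the modulus must be odd and smaller than 2**r_bits, and no parallel issue of multiplies and adds is modeled. It relies on `pow(n, -1, m)`, which requires Python 3.8 or later.

```python
def montgomery_product(a, b, n, r_bits):
    """Return a * b * R^-1 mod n, where R = 2**r_bits.

    Textbook Montgomery reduction; assumes n is odd, n < R, and
    0 <= a, b < n. A real implementation would precompute n_prime.
    """
    r_mask = (1 << r_bits) - 1
    n_prime = (-pow(n, -1, 1 << r_bits)) & r_mask  # n * n_prime = -1 mod R
    t = a * b
    m = ((t & r_mask) * n_prime) & r_mask
    u = (t + m * n) >> r_bits   # exact division: low r_bits are zero
    return u - n if u >= n else u


def montgomery_pow(base, exponent, n, r_bits):
    """Compute base**exponent mod n using Montgomery squares and products."""
    r = 1 << r_bits
    x = (base * r) % n   # operand in Montgomery form
    acc = r % n          # 1 in Montgomery form
    for bit in bin(exponent)[2:]:
        acc = montgomery_product(acc, acc, n, r_bits)      # square step
        if bit == "1":
            acc = montgomery_product(acc, x, n, r_bits)    # product step
    return montgomery_product(acc, 1, n, r_bits)           # leave Montgomery form
```

The square step multiplies a value by itself, which is what would let hardware treat it differently from a general product; how the decode unit exploits that is not stated in the abstract.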

    Data processing unit with hardware assisted context switching capability
    7.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US6128641A

    Publication date: 2000-10-03

    Application No.: US928252

    Filing date: 1997-09-12

    Abstract: The present invention relates to a method of context switching from a first task to a second task in a data processing unit having a register file with a plurality of general purpose registers and a context switch register, a memory comprising a previous context save area and an unused context save area. The memory is coupled with the register file and an instruction control unit with a program counter register and a program status word register coupled with the memory and the register file. The method comprises the steps of acquiring a new save area from said unused save area, storing the context of the first task in said new area, and linking the new area with said previous context save area.
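
The three method steps (acquire a save area from the unused pool, store the outgoing task's context in it, link it to the previous save area) resemble moving a node between two linked lists. The sketch below models that reading in software. The class and attribute names are hypothetical, the use of -1 as an end-of-list marker is an assumption, and the `restore` method is an assumed inverse operation that the abstract does not describe.

```python
class ContextSaveMemory:
    """Toy model of a pool of context save areas kept on two linked lists."""

    def __init__(self, num_areas):
        self.areas = [None] * num_areas
        # Initially every area is chained into the unused list.
        self.link = [i + 1 for i in range(num_areas)]
        if num_areas:
            self.link[-1] = -1
        self.free_head = 0 if num_areas else -1  # head of the unused list
        self.prev_head = -1                      # head of the previous-context list

    def save(self, context):
        """Acquire an unused area, store the context, link it to the previous area."""
        if self.free_head == -1:
            raise MemoryError("no unused context save area")
        area = self.free_head
        self.free_head = self.link[area]   # step 1: acquire from the unused list
        self.areas[area] = dict(context)   # step 2: store the first task's context
        self.link[area] = self.prev_head   # step 3: link to the previous save area
        self.prev_head = area
        return area

    def restore(self):
        """Assumed inverse: pop the most recent context and free its area."""
        if self.prev_head == -1:
            raise LookupError("no saved context")
        area = self.prev_head
        context = self.areas[area]
        self.prev_head = self.link[area]
        self.link[area] = self.free_head
        self.free_head = area
        self.areas[area] = None
        return context
```

Because both lists are updated by rewriting a link and a head index, saving a context never requires searching for free space, which is presumably what makes this scheme attractive for hardware assistance.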

    Flip-flop
    8.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US06232810B1

    Publication date: 2001-05-15

    Application No.: US09208618

    Filing date: 1998-12-08

    IPC classification: H03K3/12

    Abstract: An improved SR latch has two stages. A generation block generates Q and Q̄ signals from a set signal and a reset signal. The generation block also has an inactive state. A storage block receives the Q and Q̄ signals and maintains the Q and Q̄ signals at the voltage level that was output by the generation block prior to when the generation block becomes inactive. In another embodiment, an improved D flip-flop has a sensing block with the improved SR latch of the present invention.

Register selection mechanism and organization of an instruction prefetch buffer
    10.
    Type: Granted invention patent
    Status: Expired

    Publication No.: US4847759A

    Publication date: 1989-07-11

    Application No.: US237615

    Filing date: 1987-08-04

    IPC classification: G06F9/38

    CPC classification: G06F9/3814 G06F9/3802

    Abstract: A register selection mechanism for an instruction prefetch buffer which allows instructions having different lengths to be accessed on the instruction boundaries. The instruction prefetch buffer comprises a one-port-write, two-port-read array (10). Address generation and control logic (16) is responsive to a read pointer (15) for controlling access to odd and even addresses in the array. Additional logic may be provided to provide an indication that the instruction prefetch buffer is empty.
