Processing pipeline with zero loop overhead

    Publication Number: US11989554B2

    Publication Date: 2024-05-21

    Application Number: US17131970

    Application Date: 2020-12-23

    Abstract: Techniques are disclosed for reducing or eliminating loop overhead caused by function calls in processors that form part of a pipeline architecture. The processors in the pipeline process data blocks in an iterative fashion, with each processor in the pipeline completing one of several iterations associated with a processing loop for a commonly-executed function. The described techniques leverage the use of message passing for pipelined processors to enable an upstream processor to signal to a downstream processor when processing has been completed, and thus a data block is ready for further processing in accordance with the next loop processing iteration. The described techniques facilitate a zero loop overhead architecture, enable continuous data block processing, and allow the processing pipeline to function indefinitely within the main body of the processing loop associated with the commonly-executed function where efficiency is greatest.
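The message-passing scheme the abstract describes can be illustrated with a small sketch. This is not the patented implementation, only a minimal software analogue under assumed names: each pipeline stage (a thread standing in for a processor) performs one iteration of a commonly-executed loop body on a data block, then signals the downstream stage that the block is ready by passing it as a message. Each stage stays inside the main body of its loop indefinitely and pays no per-block call or loop-setup overhead beyond waiting for a message.

```python
import queue
import threading

NUM_STAGES = 3          # iterations of the shared loop, one per stage
SENTINEL = None         # end-of-stream marker

def loop_body(block, iteration):
    """The commonly-executed function: here, a toy accumulate step."""
    return [x + iteration for x in block]

def stage(iteration, inbox, outbox):
    # Each stage remains inside this loop; a received message means the
    # upstream processor has finished and the block is ready.
    while True:
        block = inbox.get()          # wait for upstream "done" message
        if block is SENTINEL:
            outbox.put(SENTINEL)     # propagate shutdown downstream
            break
        outbox.put(loop_body(block, iteration))

# Wire up the stages with message queues.
queues = [queue.Queue() for _ in range(NUM_STAGES + 1)]
threads = [threading.Thread(target=stage, args=(i, queues[i], queues[i + 1]))
           for i in range(NUM_STAGES)]
for t in threads:
    t.start()

# Stream data blocks through the pipeline continuously.
for block in ([0, 0], [10, 10]):
    queues[0].put(block)
queues[0].put(SENTINEL)

results = []
while True:
    out = queues[-1].get()
    if out is SENTINEL:
        break
    results.append(out)
for t in threads:
    t.join()

print(results)  # each block has passed through iterations 0, 1, 2
```

Each block accumulates 0 + 1 + 2 = 3 across the three stages, so `[0, 0]` emerges as `[3, 3]`; the pipeline keeps all stages busy on different blocks at once.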

    VECTOR PROCESSOR UTILIZING MASSIVELY FUSED OPERATIONS

    Publication Number: US20230004389A1

    Publication Date: 2023-01-05

    Application Number: US17358231

    Application Date: 2021-06-25

    Abstract: Techniques are disclosed for the use of fused vector processor instructions by a vector processor architecture. Each fused vector processor instruction may include a set of fields associated with individual vector processing instructions. The vector processor architecture may implement local buffers that allow a single fused vector processor instruction to execute each of the individual vector processing instructions without re-accessing vector registers between executions. The vector processor architecture enables less communication across the interconnection network, thereby increasing interconnection network bandwidth and the speed of computations, and decreasing power usage.
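The benefit of fusing can be modeled with a toy sketch (illustrative names, not the patented design): executing a chain of vector operations individually round-trips every intermediate result through the vector register file, while a fused instruction keeps intermediates in a local buffer and touches the register file only at the ends of the chain.

```python
def vec_mul(a, b): return [x * y for x, y in zip(a, b)]
def vec_add(a, b): return [x + y for x, y in zip(a, b)]

class RegisterFile:
    """Counts accesses to stand in for interconnect/register traffic."""
    def __init__(self):
        self.accesses = 0
        self.regs = {}
    def read(self, name):
        self.accesses += 1
        return self.regs[name]
    def write(self, name, value):
        self.accesses += 1
        self.regs[name] = value

def run_unfused(rf):
    # Each individual instruction round-trips through the register file.
    rf.write("t0", vec_mul(rf.read("v0"), rf.read("v1")))
    rf.write("t1", vec_add(rf.read("t0"), rf.read("v2")))
    return rf.read("t1")

def run_fused(rf):
    # One fused instruction: the intermediate stays in a local buffer.
    local = vec_mul(rf.read("v0"), rf.read("v1"))   # held locally
    local = vec_add(local, rf.read("v2"))
    rf.write("t1", local)
    return local

for runner in (run_unfused, run_fused):
    rf = RegisterFile()
    rf.regs.update(v0=[1, 2], v1=[3, 4], v2=[5, 6])  # preload, uncounted
    out = runner(rf)
    print(runner.__name__, out, "register accesses:", rf.accesses)
```

Both variants compute the same result, but the fused version performs 4 register-file accesses against 7 for the unfused chain, mirroring the reduced interconnect traffic the abstract claims.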

    Vector processor having instruction set with sliding window non-linear convolutional function

    Publication Number: US09363068B2

    Publication Date: 2016-06-07

    Application Number: US14168615

    Application Date: 2014-01-30

    Abstract: A processor is provided having an instruction set with a sliding window non-linear convolution function. A processor obtains a software instruction that performs a non-linear convolution function for a plurality of input delayed signal samples. In response to the software instruction for the non-linear convolution function, the processor generates a weighted sum of two or more of the input delayed signal samples, wherein the weighted sum comprises a plurality of variable coefficients defined as a sum of one or more non-linear functions of a magnitude of the input delayed signal samples; and repeats the generating step for at least one time-shifted version of the input delayed signal samples to compute a plurality of consecutive outputs. The software instruction for the non-linear convolution function is optionally part of an instruction set of the processor. The non-linear convolution function can model a non-linear system with memory, such as a power amplifier model and/or a digital pre-distortion function.
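The non-linear convolution the abstract describes can be sketched in software as a memory-polynomial-style model (a common shape for power-amplifier and digital pre-distortion models): each output is a weighted sum of delayed input samples, where each variable coefficient is itself a sum of non-linear functions (here, powers) of the sample's magnitude. The function and parameter names below are illustrative assumptions, not the instruction's actual interface.

```python
def nonlinear_convolution(x, poly_coeffs):
    """x: list of complex input samples.
    poly_coeffs[m][k]: weight on |s|**k for delay tap m.
    Returns one output per full window position (the sliding window)."""
    M = len(poly_coeffs)              # memory depth (number of delay taps)
    out = []
    for n in range(M - 1, len(x)):    # slide the window over the input
        acc = 0
        for m in range(M):
            s = x[n - m]
            # variable coefficient: sum of non-linear functions of |s|
            c = sum(a * abs(s) ** k for k, a in enumerate(poly_coeffs[m]))
            acc += c * s
        out.append(acc)
    return out

samples = [1 + 0j, 0 + 2j, -1 + 0j, 0 - 1j]
# two delay taps, each with coefficients for |s|**0 and |s|**1
coeffs = [[1.0, 0.5], [0.2, 0.0]]
y = nonlinear_convolution(samples, coeffs)
print(y)
```

Because the coefficients depend on the instantaneous magnitude of each delayed sample, the model captures a non-linear system with memory, which is what makes it usable for power-amplifier modeling and pre-distortion.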


    Processor embedded streaming buffer

    Publication Number: US12248333B2

    Publication Date: 2025-03-11

    Application Number: US17358218

    Application Date: 2021-06-25

    Inventor: Joseph Williams

    Abstract: Techniques are disclosed for the use of local buffers integrated into the execution units of a vector processor architecture. The use of local buffers results in less communication across the interconnection network implemented by vector processors, and increases interconnection network bandwidth, increases the speed of computations, and decreases power usage.

    Vector processor supporting linear interpolation on multiple dimensions

    Publication Number: US12106101B2

    Publication Date: 2024-10-01

    Application Number: US17131939

    Application Date: 2020-12-23

    CPC classification number: G06F9/30036 G06F9/3812 G06F9/3873 G06F16/9017

    Abstract: Techniques are disclosed for a vector processor architecture that enables data interpolation in accordance with multiple dimensions, such as one-, two-, and three-dimensional linear interpolation. The vector processor architecture includes a vector processor and accompanying vector addressable memory that enable a simultaneous retrieval of multiple entries in the vector addressable memory to facilitate linear interpolation calculations. The vector processor architecture vastly increases the speed in which such calculations may occur compared to conventional processing architectures. Example implementations include the calculation of digital pre-distortion (DPD) coefficients for use with radio frequency (RF) transmitter chains to support multi-band applications.
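A two-dimensional (bilinear) case of the interpolation the abstract describes can be sketched as follows. This is a hedged software analogue: a vector addressable memory would return all neighbouring table entries in a single access, which the sketch emulates with an ordinary tuple gather; the table contents and names are illustrative.

```python
def bilinear(table, x, y):
    """table: 2-D list indexed [i][j]; x, y: fractional coordinates
    inside the table. Returns the bilinearly interpolated value."""
    i, j = int(x), int(y)
    fx, fy = x - i, y - j
    # Gather the four neighbouring entries; a vector addressable memory
    # would retrieve all four simultaneously in one access.
    c00, c01, c10, c11 = (table[i][j], table[i][j + 1],
                          table[i + 1][j], table[i + 1][j + 1])
    top = c00 * (1 - fy) + c01 * fy        # interpolate along y at row i
    bot = c10 * (1 - fy) + c11 * fy        # interpolate along y at row i+1
    return top * (1 - fx) + bot * fx       # then interpolate along x

lut = [[0.0, 1.0],
       [2.0, 3.0]]
print(bilinear(lut, 0.5, 0.5))  # centre of the cell -> 1.5
```

One- and three-dimensional variants follow the same pattern with two and eight simultaneous table fetches respectively; in a DPD application the table would hold pre-computed coefficient values indexed by signal magnitude per band.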

    PROCESSOR EMBEDDED STREAMING BUFFER

    Publication Number: US20220413852A1

    Publication Date: 2022-12-29

    Application Number: US17358218

    Application Date: 2021-06-25

    Inventor: Joseph Williams

    Abstract: Techniques are disclosed for the use of local buffers integrated into the execution units of a vector processor architecture. The use of local buffers results in less communication across the interconnection network implemented by vector processors, and increases interconnection network bandwidth, increases the speed of computations, and decreases power usage.

    PROCESSING PIPELINE WITH ZERO LOOP OVERHEAD

    Publication Number: US20220197641A1

    Publication Date: 2022-06-23

    Application Number: US17131970

    Application Date: 2020-12-23

    Abstract: Techniques are disclosed for reducing or eliminating loop overhead caused by function calls in processors that form part of a pipeline architecture. The processors in the pipeline process data blocks in an iterative fashion, with each processor in the pipeline completing one of several iterations associated with a processing loop for a commonly-executed function. The described techniques leverage the use of message passing for pipelined processors to enable an upstream processor to signal to a downstream processor when processing has been completed, and thus a data block is ready for further processing in accordance with the next loop processing iteration. The described techniques facilitate a zero loop overhead architecture, enable continuous data block processing, and allow the processing pipeline to function indefinitely within the main body of the processing loop associated with the commonly-executed function where efficiency is greatest.
