Abstract:
Methods, systems, and devices for wireless communication are described. A user equipment (UE) may tune an auxiliary receiver within a first radio to a transmission frequency of a co-located second radio. The auxiliary receiver may downconvert a signal from the second radio so that the UE may generate an interference estimate and perform interference cancellation. In some cases, the auxiliary receiver may also be used to perform transmission corrections for transmissions of the first radio. For example, the auxiliary receiver may be used to enable gain control or digital predistortion. The auxiliary receiver may be selectively tuned to the transmission frequency of the first radio or the second radio based on whether the auxiliary receiver is being used to perform interference cancellation or transmission correction.
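For illustration only, the following is a minimal software sketch of the underlying signal-processing idea: an auxiliary capture of the co-located radio's transmit waveform is used to estimate a coupling channel and subtract the resulting interference estimate from the victim receiver's samples. The least-squares channel fit and all names are assumptions for this sketch, not the disclosed method.

```python
# Hypothetical sketch of auxiliary-receiver interference cancellation.
# The aux receiver captures the aggressor transmit waveform; a leakage
# channel is estimated (here by least squares, an illustrative choice)
# and the regenerated interference is subtracted from the victim signal.
import numpy as np

rng = np.random.default_rng(0)
n = 1024

aggressor_tx = rng.standard_normal(n) + 1j * rng.standard_normal(n)   # captured by the aux receiver
coupling = np.array([0.8, 0.3j, -0.1])                                 # unknown leakage channel
desired = 0.2 * (rng.standard_normal(n) + 1j * rng.standard_normal(n)) # wanted signal at the victim

# Victim receiver observes the desired signal plus leaked interference.
interference = np.convolve(aggressor_tx, coupling)[:n]
victim_rx = desired + interference

# Least-squares fit of the coupling channel from the aux-receiver capture.
taps = 3
X = np.column_stack([
    np.concatenate([np.zeros(k, dtype=complex), aggressor_tx[:n - k]])
    for k in range(taps)
])
h_est, *_ = np.linalg.lstsq(X, victim_rx, rcond=None)

# Regenerate the interference estimate and cancel it.
cleaned = victim_rx - X @ h_est
print("residual interference power:", np.mean(np.abs(cleaned - desired) ** 2))
```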
Abstract:
Embodiments disclosed herein include vector processing engines (VPEs) having programmable data path configurations for providing multi-mode vector processing. Related vector processors, systems, and methods are also disclosed. The VPEs include a vector processing stage(s) configured to process vector data according to a vector instruction executed in the vector processing stage. Each vector processing stage includes vector processing blocks each configured to process vector data based on the vector instruction being executed. The vector processing blocks are capable of providing different vector operations for different types of vector instructions based on data path configurations. Data paths of the vector processing blocks are reprogrammable to process vector data differently according to the particular vector instruction being executed. In this manner, a VPE can be provided whose data path configurations are programmable to execute different types of functions according to the vector instruction being executed.
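As a rough software analogy (a toy model, not the disclosed hardware), a processing block with a programmable data path can be pictured as a unit whose operation is reconfigured per vector instruction; the class name and instruction mnemonics below are assumptions for illustration.

```python
# Toy model of a vector processing block whose data path is reprogrammed
# per vector instruction: the same block provides different vector
# operations depending on its current configuration.
import numpy as np

class VectorProcessingBlock:
    # Each "data path configuration" maps to a different vector operation.
    CONFIGS = {
        "VMULT": lambda a, b, acc: a * b,        # multiply data path
        "VMAC":  lambda a, b, acc: acc + a * b,  # multiply-accumulate data path
        "VADD":  lambda a, b, acc: a + b,        # add data path
    }

    def __init__(self):
        self.op = None
        self.acc = 0

    def program(self, instruction):
        # Reconfigure the data path for the instruction being executed.
        self.op = self.CONFIGS[instruction]

    def execute(self, a, b):
        result = self.op(a, b, self.acc)
        self.acc = result
        return result

block = VectorProcessingBlock()
a, b = np.arange(4.0), np.ones(4)
block.program("VMULT"); print(block.execute(a, b))   # multiply configuration
block.program("VMAC");  print(block.execute(a, b))   # same block, MAC configuration
```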
Abstract:
Vector processing engines (VPEs) having programmable data path configurations for providing multi-mode Radix-2X butterfly vector processing circuits. Related vector processors, systems, and methods are also disclosed. The VPEs disclosed herein include a plurality of vector processing stages each having vector processing blocks that have programmable data path configurations for performing Radix-2X butterfly vector operations to perform Fast Fourier Transform (FFT) vector processing operations efficiently. The data path configurations of the vector processing blocks can be programmed to provide different types of Radix-2X butterfly vector operations as well as other arithmetic logic vector operations. As a result, fewer VPEs can provide desired Radix-2X butterfly vector operations and other types arithmetic logic vector operations in a vector processor, thus saving area in the vector processor while still retaining vector processing advantages of fewer register writes and faster vector instruction execution times over scalar processing engines.
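For reference, the radix-2 butterfly that such data paths provide is the core operation of the FFT. The sketch below is plain Python (not the disclosed circuit) showing the butterfly and an iterative radix-2 decimation-in-time FFT built from it.

```python
# A minimal radix-2 decimation-in-time butterfly and an iterative FFT
# built from it, for illustration of the operation the abstract's
# programmable data paths implement in hardware.
import numpy as np

def radix2_butterfly(a, b, w):
    """Combine two inputs with twiddle factor w: returns (a + w*b, a - w*b)."""
    t = w * b
    return a + t, a - t

def fft_radix2(x):
    """Iterative radix-2 DIT FFT (length must be a power of 2)."""
    x = np.asarray(x, dtype=complex)
    n = len(x)
    bits = n.bit_length() - 1
    # Bit-reversed input ordering.
    rev = np.array([int(format(i, f"0{bits}b")[::-1], 2) for i in range(n)])
    x = x[rev]
    size = 2
    while size <= n:
        w_step = np.exp(-2j * np.pi / size)
        for start in range(0, n, size):
            w = 1.0 + 0j
            for k in range(size // 2):
                i, j = start + k, start + k + size // 2
                x[i], x[j] = radix2_butterfly(x[i], x[j], w)
                w *= w_step
        size *= 2
    return x

data = np.random.default_rng(1).standard_normal(8)
print(np.allclose(fft_radix2(data), np.fft.fft(data)))  # True
```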
Abstract:
Systems and methods for generating twiddle factors are described herein according to various embodiments of the present disclosure. In one embodiment, a method for twiddle factor generation comprises generating a first twiddle phase, wherein the first twiddle phase is from a set of radix-M1 twiddle phases, and M1 is an integer. The method also comprises converting the first twiddle phase into a second twiddle phase, wherein the second twiddle phase is from a set of radix-M2 twiddle phases, and M2 is an integer that is different from M1. The method further comprises generating a twiddle factor based on the second twiddle phase.
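One way to picture the conversion, sketched below under stated assumptions: a twiddle phase is treated as an integer index on a radix-M1 grid of the unit circle (denominator M1^p), re-expressed as the equivalent index on a radix-M2 grid, and then turned into a complex twiddle factor. The exact mapping used in the disclosure is not reproduced here; the function names and the grid interpretation are illustrative assumptions.

```python
# Hedged sketch of twiddle-phase conversion between radix grids.
import cmath

def convert_phase(phase_m1, M1, p, M2, q):
    """Map an index on the radix-M1 grid (0..M1**p - 1) to the radix-M2 grid."""
    n1, n2 = M1 ** p, M2 ** q
    # Same fraction of a full turn, re-expressed with the radix-M2 denominator.
    return (phase_m1 * n2) // n1

def twiddle_factor(phase_m2, M2, q):
    """Complex twiddle factor for a phase index on the radix-M2 grid."""
    n2 = M2 ** q
    return cmath.exp(-2j * cmath.pi * phase_m2 / n2)

# Example: radix-4 phase 5 on a 4**3 = 64-point grid -> radix-2 grid of 2**6 = 64 points.
phase4 = 5
phase2 = convert_phase(phase4, M1=4, p=3, M2=2, q=6)
print(phase2, twiddle_factor(phase2, M2=2, q=6))
```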
Abstract:
Systems and methods for performing on-the-fly format conversion on data vectors during load/store operations are described herein. In one embodiment, a method for loading a data vector from a memory into a vector unit comprises reading a plurality of samples from the memory, wherein the plurality of samples are packed in the memory. The method also comprises unpacking the samples to obtain a plurality of unpacked samples, performing format conversion on the unpacked samples in parallel, and sending at least a portion of the format-converted samples to the vector unit.
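A minimal sketch of that load path is shown below; the sample format (packed 8-bit fixed point converted to float32), buffer names, and lane count are assumptions for illustration, not the disclosed hardware.

```python
# Sketch of a load with on-the-fly format conversion: packed samples are
# read from memory, unpacked, format-converted in one vectorized step, and
# a portion is handed to the "vector unit" (here just the returned array).
import numpy as np

memory = np.arange(-8, 8, dtype=np.int8).tobytes()    # packed 8-bit samples in "memory"

def load_vector(memory_bytes, offset, count, lanes):
    packed = np.frombuffer(memory_bytes, dtype=np.int8, count=count, offset=offset)
    unpacked = packed.astype(np.int32)                 # unpack to a wider integer type
    converted = unpacked.astype(np.float32) / 128.0    # format conversion (Q7 -> float), in parallel
    return converted[:lanes]                           # send only what the vector unit consumes

vector_unit_input = load_vector(memory, offset=0, count=16, lanes=8)
print(vector_unit_input)
```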
Abstract:
Systems and methods for reading data samples in reverse group order are described herein according to various embodiments of the present disclosure. In one embodiment, a method for reading data samples in a memory is provided, wherein the data samples correspond to an operand of a vector operation, the data samples are grouped into a plurality of different groups, and the different groups are spaced apart by a plurality of addresses in the memory. The method comprises reading the groups of data samples in reverse group order, and, for each group, reading the data samples in the group in forward order.
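The addressing pattern can be sketched as follows; the base address, group size, and group stride are made-up values for illustration.

```python
# Illustrative sketch of reverse-group-order reading: groups are visited in
# reverse order, while samples within each group are read in forward order.
import numpy as np

memory = np.arange(64)        # stand-in for the sample memory
base_addr = 4                 # address of the first group
group_size = 5                # samples per group
group_stride = 16             # address spacing between consecutive groups
num_groups = 3

def read_reverse_group_order(mem, base, size, stride, groups):
    out = []
    for g in reversed(range(groups)):              # groups in reverse order
        start = base + g * stride
        out.extend(mem[start:start + size])        # samples within a group in forward order
    return out

print(read_reverse_group_order(memory, base_addr, group_size, group_stride, num_groups))
# -> samples of group 2, then group 1, then group 0, each in forward order
```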
Abstract:
Transmission of data over a serial link based on a unidirectional clock signal is provided. A unidirectional clock signal is generated based on a first clock of a master device. The unidirectional clock signal is sent to a slave device that is connected to the serial link. The master device transmits data to the slave device over the serial link based on the first clock. The slave device receives the unidirectional clock signal from the master device. The slave device transmits data over the serial link to the master device based on the unidirectional clock signal.
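A toy behavioral sketch of the idea (not the link's actual signaling or framing, and all names are assumptions): the master derives the clock from its own timing and drives it toward the slave, and both directions of the serial link shift one bit per edge of that single clock, so the slave never sources its own link clock.

```python
# Toy model: one unidirectional clock, generated by the master, times both
# transmit directions on the serial link.
class Master:
    def __init__(self, tx_bits):
        self.tx = list(tx_bits)
        self.rx = []

    def clock_edge(self, slave):
        # One edge of the master-generated unidirectional clock:
        # both devices shift one bit on the serial link.
        if self.tx:
            slave.rx.append(self.tx.pop(0))   # master -> slave data
        if slave.tx:
            self.rx.append(slave.tx.pop(0))   # slave -> master data, timed by the same clock

class Slave:
    def __init__(self, tx_bits):
        self.tx = list(tx_bits)
        self.rx = []

master = Master([1, 0, 1, 1])
slave = Slave([0, 1, 1, 0])
for _ in range(4):              # four edges of the unidirectional clock
    master.clock_edge(slave)

print("slave received:", slave.rx)
print("master received:", master.rx)
```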
Abstract:
Vector processing engines (VPEs) employing format conversion circuitry in data flow paths between vector data memory and execution units to provide in-flight format conversion of input vector data provided to execution units for vector processing operations are disclosed. Related vector processor systems and methods are also disclosed. Format conversion circuitry is provided in data flow paths between vector data memory and execution units in the VPE. The format conversion circuitry is configured to convert input vector data sample sets fetched from vector data memory in-flight while the input vector data sample sets are being provided over the data flow paths to the execution units to be processed. In this manner, format conversion of the input vector data sample sets does not require pre-processing, storage, and re-fetching from vector data memory, thereby reducing power consumption and avoiding format-conversion pre-processing delays that would otherwise limit the efficiency of the data flow paths.
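As a hedged software analogy (not the hardware data path; the Q15-to-float conversion and the generator structure are assumptions), the conversion can be pictured as happening while samples stream from memory toward the execution unit, rather than as a separate pre-processing pass that writes converted data back to memory and re-fetches it.

```python
# Conversion applied in flight on the path from memory to the execution unit.
import numpy as np

vector_memory = np.array([1024, -2048, 4096, -8192], dtype=np.int16)  # Q15 fixed-point samples

def fetch_with_inflight_conversion(mem):
    for sample in mem:                         # data flow path from memory...
        yield np.float32(sample) / 32768.0     # ...format conversion applied in flight

def execution_unit(stream):
    return sum(x * x for x in stream)          # consumes already-converted samples

print(execution_unit(fetch_with_inflight_conversion(vector_memory)))
```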
Abstract:
Systems, methods, and apparatus for synchronizing timing in devices coupled to a data communication link are disclosed. In one example, a first device programs a future system time value in a second device. The first device launches a low-latency trigger signal that causes the future system time value to be loaded into a timer of the second device when a timer of the first device matches the future system time value. The second device measures the phase difference between the trigger signal and edges of a clock signal used for timing in the second device. The phase difference is measured using an oversampling clock that provides a desired measurement reliability. The measured phase difference permits the first device to accurately determine system time as applied to the second device. The trigger signal can be provided on existing pins used by first and second devices in accordance with communication protocols and specifications.
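The phase-measurement step can be illustrated numerically; all clock rates and edge times below are assumed values chosen for the sketch, not figures from the disclosure.

```python
# Sketch: the phase offset between the trigger edge and the next edge of the
# second device's clock is measured by counting oversampling-clock ticks.
import math

clock_period_ns = 100.0          # second device's timing clock: 10 MHz (assumed)
oversample_period_ns = 2.5       # oversampling clock: 400 MHz (assumed)
trigger_edge_ns = 1234.0         # arrival time of the low-latency trigger (assumed)

# Next rising edge of the timing clock at or after the trigger.
next_clock_edge_ns = math.ceil(trigger_edge_ns / clock_period_ns) * clock_period_ns

# Phase difference expressed in oversampling ticks, then converted back to time.
ticks = round((next_clock_edge_ns - trigger_edge_ns) / oversample_period_ns)
measured_phase_ns = ticks * oversample_period_ns

print(f"phase difference: {ticks} ticks ~= {measured_phase_ns} ns")
# Resolution is one oversampling period (2.5 ns here), so a faster
# oversampling clock gives a more reliable measurement.
```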
Abstract:
Vector processing engines (VPEs) employing a tapped-delay line(s) for providing precision correlation/covariance vector processing operations with reduced sample re-fetching and/or power consumption are disclosed. The VPEs disclosed herein are configured to provide correlation/covariance vector processing operations, such as code division multiple access (CDMA) correlation/covariance vector processing operations as a non-limiting example. A tapped-delay line(s) is included in the data flow paths between memory and execution units in the VPE. The tapped-delay line(s) is configured to receive and provide an input vector data sample set to execution units for performing correlation/covariance vector processing operations. The tapped-delay line(s) is also configured to shift the input vector data sample set for each delay tap and provide the shifted input vector data sample set to the execution units, so the shifted input vector data sample set need not be re-fetched from the vector data file during the correlation/covariance vector processing operations.
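A minimal sketch of the data reuse this enables is given below; the sample values, code, and tap count are assumptions for illustration, not the disclosed hardware.

```python
# Sketch: an input vector data sample set is fetched once into a tapped-delay
# line; for each delay tap the line is shifted and handed to the execution
# units, so the samples are not re-fetched from memory between taps.
import numpy as np

samples = np.arange(16, dtype=float)        # fetched once from vector data memory
code = np.array([1.0, -1.0, 1.0, 1.0])      # e.g. a short CDMA spreading code (assumed)
num_taps = 4

delay_line = list(samples[: len(code) + num_taps - 1])   # single fetch fills the delay line
correlations = []
for tap in range(num_taps):
    window = delay_line[tap: tap + len(code)]             # shifted view for this delay tap
    correlations.append(float(np.dot(window, code)))      # execution units compute the correlation

print(correlations)
```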