Abstract:
This invention presents a unique implementation of the extrinsic block of the turbo decoder that solves the problem of generation and use of precision extension and normalization in the alpha and beta metrics blocks. Both alpha metric inputs and beta metric inputs are processed via a circle boundary detector, which indicates the quadrant of the two's complement input, and a precision extend block, which receives an input and a corresponding circle boundary input. An extrinsics block includes a two's complement adder of the precision extended alpha and beta metrics inputs. The proposed solution obviates the need for normalization in the alpha and beta metric blocks.
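The abstract gives no bit widths or exact rules, so the following is only a rough sketch of how quadrant detection and precision extension can replace normalization. It assumes 8-bit metrics that are allowed to wrap in two's complement, and that the true spread of the metrics in a set is less than a quarter of the number range. All function names are hypothetical.

```python
BITS = 8  # assumed metric width (not stated in the abstract)

def wrap(value, bits=BITS):
    """Store a metric in `bits` bits, letting it wrap (raw bit pattern)."""
    return value & ((1 << bits) - 1)

def quadrant(raw, bits=BITS):
    """Top two bits of the raw pattern: 0b00, 0b01, 0b10 or 0b11."""
    return raw >> (bits - 2)

def crosses_boundary(raws, bits=BITS):
    """True when the set straddles the wrap point between the most
    positive (0b01) and most negative (0b10) quadrants."""
    quads = {quadrant(r, bits) for r in raws}
    return 0b01 in quads and 0b10 in quads

def precision_extend(raw, crossing, bits=BITS):
    """Extend one metric by one bit so that ordering is preserved."""
    signed = raw - (1 << bits) if raw >> (bits - 1) else raw
    if crossing and quadrant(raw, bits) == 0b10:
        # Wrapped value: place it just above the positive range.
        return signed + (1 << bits)
    return signed  # ordinary sign extension

def extrinsic_sums(alpha_raws, beta_raws, bits=BITS):
    """Add precision-extended alpha and beta metrics pairwise."""
    a_cross = crosses_boundary(alpha_raws, bits)
    b_cross = crosses_boundary(beta_raws, bits)
    return [precision_extend(a, a_cross, bits) + precision_extend(b, b_cross, bits)
            for a, b in zip(alpha_raws, beta_raws)]
```

Under these assumptions the sums keep the same differences as the unwrapped metrics, so the metrics themselves never need to be renormalized.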
Abstract:
This invention describes implementation approaches for sliding window turbo decoders. Sliding windows are used for both the beta and alpha state metric calculations. Initialization of the beta/alpha prolog sections with data from a previous iteration is employed in conjunction with a reduced length prolog section. For subsequent sliding windows the trellis values of the prolog sections are dynamically initialized based upon data derived from the signal to noise ratio of the calculated extrinsic data or the difference between the two most probable trellis states.
Abstract:
The concurrent memory control turbo decoder solution of this invention uses a single port main memory and a simplified scratch memory. This approach uses interleaved forward-reverse addressing, which greatly reduces the amount of memory required. This approach is in marked contrast to conventional turbo decoders, which employ either a dual port main memory or a single port main memory in conjunction with a complex ping-ponged scratch memory. In the system of this invention, each cycle accomplishes one read and one write operation in the scratch memories. If a particular location in memory has been read, then that location is free. The next write cycle can use that location to store its data. Similarly, a simplified beta RAM is implemented using a unique addressing scheme, which also obviates the need for a complex ping-ponged beta RAM.
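One possible reading of the read-then-reuse idea is sketched below. A single scratch buffer holds one window; as each location is read in reverse order, the freed location immediately receives a sample of the next window, so the address direction alternates from window to window. The function name and the window model are assumptions for illustration, and the actual addressing scheme of the invention may differ.

```python
def run_windows(windows):
    """Return each window in reversed order using one shared buffer.

    `windows` is a list of equal-length lists (hypothetical data model).
    """
    n = len(windows[0])
    ram = [None] * n
    # Prime the buffer: the first window is written with forward addresses.
    for addr, sample in enumerate(windows[0]):
        ram[addr] = sample
    written_forward = True
    outputs = []
    for k in range(len(windows)):
        nxt = windows[k + 1] if k + 1 < len(windows) else None
        # Read in the opposite direction to the one used for writing.
        addrs = range(n - 1, -1, -1) if written_forward else range(n)
        out = []
        for j, addr in enumerate(addrs):
            out.append(ram[addr])      # one read: the location is now free
            if nxt is not None:
                ram[addr] = nxt[j]     # one write: reuse the freed location
        outputs.append(out)
        written_forward = not written_forward
    return outputs
```

No second (ping-pong) buffer is needed in this sketch because every write targets a location that was read in the same cycle.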
Abstract:
The addition of a specialized instruction to perform the MAX star function improves turbo decoding performance on a digital signal processor. A subtractor forms the difference between inputs A and B. The sign of this difference controls a multiplexer selection of the max function maximum of inputs A and B. The difference is applied to a lookup table built to handle both positive and negative inputs. The lookup table output is summed with the difference to form the MAX star result. The size of the lookup table is selected to match the required resolution.
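For reference, the MAX star function is commonly defined as max*(a, b) = max(a, b) + ln(1 + e^(-|a - b|)). The sketch below follows that standard definition, so the table output is added to the selected maximum. The table step `STEP` and range `MAX_INDEX` are assumed values chosen for illustration, not parameters taken from the abstract.

```python
import math

STEP = 0.125     # assumed table resolution
MAX_INDEX = 64   # table covers differences in [-8.0, +8.0]

# Correction table indexed by the quantized difference; it accepts both
# positive and negative indices, so no absolute-value step is needed.
LUT = {i: math.log1p(math.exp(-abs(i * STEP)))
       for i in range(-MAX_INDEX, MAX_INDEX + 1)}

def max_star(a, b):
    """Approximate ln(e**a + e**b) with a subtract, a select and a lookup."""
    diff = a - b                              # subtractor
    maximum = a if diff >= 0 else b           # sign of diff drives the mux
    index = round(diff / STEP)                # quantize the difference
    index = max(-MAX_INDEX, min(MAX_INDEX, index))  # clamp to table range
    return maximum + LUT[index]
```

A finer `STEP` or a wider table trades memory for accuracy, which mirrors the remark that the table size is matched to the required resolution.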
Abstract:
A digital signal processor (DSP) co-processor according to a clustered architecture with local memories is disclosed. Each cluster in the architecture includes multiple sub-clusters, each sub-cluster capable of executing one or two instructions that may be specifically directed to a particular DSP operation. The sub-clusters in each cluster communicate with global memory resources by way of a crossbar switch in the cluster. One or more of the sub-clusters has a dedicated local memory that can be accessed in a random access manner, in a vector access manner, or in a streaming or stack manner. The local memory is arranged as a plurality of banks. In response to certain vector access instructions, the input data may be permuted among the banks prior to a write, or permuted after being read from the banks, according to a permutation pattern stored in a register.
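The permuted vector access can be pictured with a small model in which each vector lane is steered to a bank chosen by a pattern register. The bank layout, the function names, and the meaning of the pattern (lane index to bank index) are assumptions made for this sketch.

```python
def write_permuted(banks, row, data, pattern):
    """Write one vector, sending lane i to bank pattern[i] at `row`."""
    for lane, value in enumerate(data):
        banks[pattern[lane]][row] = value

def read_permuted(banks, row, pattern):
    """Read one vector, filling lane i from bank pattern[i] at `row`."""
    return [banks[pattern[lane]][row] for lane in range(len(pattern))]

# Usage: four banks with two rows each, and a pattern held in a "register".
banks = [[0, 0] for _ in range(4)]
pattern = [2, 0, 3, 1]
write_permuted(banks, 0, [10, 20, 30, 40], pattern)
```

Reading the row back with the same pattern restores the original lane order, while reading it with the identity pattern exposes the permuted layout.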
Abstract:
This invention ensures that the correct Viterbi decode traceback starting index is obtained for all constraint lengths and frame sizes. Reverse transpose operations depend on the last active add-compare-select unit of a cascade block of the state metric update process. This last active add-compare-select unit controls selection of T counter signals used in the decode.
Abstract:
A programmable logic device, such as a digital signal processor (DSP) (130), having a Chien search unit (116) is disclosed. The Chien search unit (116) is arranged to perform finite field arithmetic functions useful in identifying roots of a polynomial, as is useful in Reed-Solomon decoding, particularly, after the execution of a Euclidean array function. Galois field multipliers (306) perform finite field multiplication of coefficient values (Λ) and powers of symbol values (α); the products of such multiplications are written into the coefficient register (304) for use in connection with the next symbol value. Finite field adders (308, 310; 318, 320) produce a final sum that is interrogated by zero detection circuitry (206) to determine whether a root is presented by the current symbol value. The provision of a Chien search execution unit (116) provides important efficiency so as to enable programmable logic devices, such as digital signal processors (130) and microprocessors, to effect Reed-Solomon decoding.
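The register-update scheme described above can be sketched in software. Each coefficient register is multiplied by a fixed power of α after every evaluation, and the XOR of all registers is tested for zero. The field polynomial 0x11D (x^8 + x^4 + x^3 + x^2 + 1) and the function names are assumptions, since the abstract does not name a specific field polynomial.

```python
PRIM_POLY = 0x11D  # assumed GF(256) field polynomial

def gf_mul(a, b):
    """Multiply two GF(256) elements (shift-and-add with reduction)."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
        b >>= 1
    return result

def gf_pow(a, n):
    """Raise a GF(256) element to a non-negative integer power."""
    result = 1
    for _ in range(n):
        result = gf_mul(result, a)
    return result

def chien_search(locator):
    """Return every i in 0..254 for which alpha**i is a root.

    locator[j] is the coefficient of x**j; alpha is the element 2.
    """
    regs = list(locator)                                 # coefficient register
    steps = [gf_pow(2, j) for j in range(len(locator))]  # alpha**j per term
    roots = []
    for i in range(255):
        total = 0
        for term in regs:                                # finite field adders
            total ^= term
        if total == 0:                                   # zero detection
            roots.append(i)
        # Written back for use with the next symbol value.
        regs = [gf_mul(term, step) for term, step in zip(regs, steps)]
    return roots
```

Because each step needs only one multiplication per coefficient, the polynomial is never re-evaluated from scratch for a new symbol value.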
Abstract:
A combined Chien search and error position circuit (116), for use in Reed-Solomon decoding, is disclosed. The circuit (116) operates in response to a zero signal (ZRO) issued by a root detection block (200) that iteratively evaluates an error locator polynomial Λ(x) over the Galois field used in the coding. A zeroes register (218) and a position register (220) are provided, each of which has a plurality of stages (218_0 through 218_t; 220_0 through 220_t). An index counter (208) maintains a count over the Galois field, corresponding to the Galois field element under evaluation in the root detection block (200). An exponentiation circuit (212) performs a Galois field exponentiation of the count, and applies the result to the inputs of each of the zeroes register stages (218_0 through 218_t); the count is subtracted from the maximum Galois field index (e.g., from 255 for Galois field 256) and, for all but the zeroth iteration, the difference is applied to the inputs of each of the position register stages (220_0 through 220_t). A root counter (207) maintains a count of the number of roots identified by the root detection block (200), which is used to sequentially select the register stages (218_0 through 218_t; 220_0 through 220_t) into which the zeroes and position values are stored.
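The bookkeeping described above can be modeled as follows. For each index at which the zero signal is asserted, the model stores α raised to the index as the zero and 255 minus the index as the position. The field polynomial 0x11D, the function names, and the choice of position 0 for the zeroth iteration are assumptions, as the abstract does not state them.

```python
PRIM_POLY = 0x11D        # assumed GF(256) field polynomial
FIELD_MAX_INDEX = 255    # maximum Galois field index for GF(256)

def alpha_power(exponent):
    """Exponentiation circuit model: return alpha**exponent, alpha = 2."""
    value = 1
    for _ in range(exponent % FIELD_MAX_INDEX):
        value <<= 1
        if value & 0x100:
            value ^= PRIM_POLY
    return value

def record_roots(zero_flags):
    """Collect zeroes and positions for every index whose flag is set.

    zero_flags[i] models the ZRO signal at index-counter value i.
    Appending to the lists plays the role of the root counter that
    selects the next free register stage.
    """
    zeroes, positions = [], []
    for index, zro in enumerate(zero_flags):
        if not zro:
            continue
        zeroes.append(alpha_power(index))
        # Assumption: the zeroth iteration maps to position 0.
        positions.append(0 if index == 0 else FIELD_MAX_INDEX - index)
    return zeroes, positions
```

The subtraction from 255 reflects the identity α^i = α^(-(255 - i)) in GF(256), so a root at index i corresponds to position 255 - i.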
Abstract:
A programmable logic device, such as a digital signal processor (DSP) (130), having a Euclidean array unit (115; 115') is disclosed. The Euclidean array unit (115; 115') is arranged to perform finite field arithmetic functions useful in determining the greatest common factor among two polynomial series, in a sequential fashion beginning with a highest order pair of operands (A_0, B_0) and proceeding along the sequence. A source register (SRC) receives each pair of operands, and the results are stored in a result register (RES) in reverse order, prior to writing the results in memory. As a result, B result values are stored in the same location as the A input operand, and vice versa. This reversal of memory locations permits successive passes of the Euclidean operation to be carried out with simple incrementing of the starting byte address (SBA) at which the operands are located in memory, thus eliminating the need for large memory shifts. The Euclidean array unit (115') may also operate upon more than one A and B input operand at a time, for further efficiency.
Abstract:
A combinatorial polynomial multiplier for Galois Field 256 arithmetic utilizes fewer components than an iterative Galois Field 256 arithmetic multiplier and operates 8 times faster. The combinatorial multiplier employs AND and XOR functions and operates in a single clock cycle. It can reduce the number of transistors required for the Galois Field 256 arithmetic multiplier for a Reed-Solomon decoder by almost 90%.
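To illustrate the contrast between the two styles, the sketch below places a loop-carried (iterative) multiplier next to a version built only from AND and XOR terms over fixed bit positions, which is the kind of logic that could be flattened into a single-cycle circuit. The field polynomial 0x11D and both function names are assumptions, and the sketch makes no claim about the component counts or speed figures quoted above.

```python
PRIM_POLY = 0x11D  # assumed field polynomial x^8 + x^4 + x^3 + x^2 + 1

def gf256_mul_iterative(a, b):
    """Shift-and-add multiplier: one conditional step per bit of b."""
    result = 0
    for _ in range(8):
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= PRIM_POLY
    return result

def gf256_mul_combinational(a, b):
    """AND/XOR-only multiplier over fixed bit positions.

    Both loops have constant bounds and no data-dependent branches, so
    they model logic that can be fully unrolled into gates.
    """
    product = 0
    for i in range(8):
        mask = -((b >> i) & 1)           # all ones when bit i of b is set
        product ^= (a << i) & mask       # AND partial product, XOR sum
    for i in range(14, 7, -1):           # fold bits 14..8 back into 7..0
        mask = -((product >> i) & 1)
        product ^= (PRIM_POLY << (i - 8)) & mask
    return product
```

Both functions return the same product for every pair of byte inputs; only the structure of the computation differs.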