Abstract:
A data sorting apparatus comprising 1) a storage sorter that sorts a data set according to a defined criteria; and 2) a query mechanism that receives intermediate sorted data values from the storage sorter and compares the intermediate sorted data values to at least one key value. The storage sorter comprises a priority queue for sorting the data set, wherein the priority queue comprises M processing elements. The query mechanism receives the intermediate sorted data values from the M processing elements. The query mechanism comprises a plurality of comparison circuits, each of the comparison circuits capable of detecting if one of the intermediate sorted data values is equal to the at least one key value or, if no match exists, extracting the minimal value greater than (or less than according to a defined criteria) the at least one key value.
Abstract:
An electronic system, an integrated circuit and a method for display are disclosed. The electronic system contains a first device, a memory and a video/audio compression/decompression device such as a decoder/encoder. The electronic system is configured to allow the first device and the video/audio compression/decompression device to share the memory. The electronic system may be included in a computer in which case the memory is a main memory. Memory access is accomplished by one or more memory interfaces, direct coupling of the memory to a bus, or direct coupling of the first device and decoder/encoder to a bus. An arbiter selectively provides access for the first device and/or the decoder/encoder to the memory. The arbiter may be monolithically integrated into a memory interface. The decoder may be a video decoder configured to comply with the MPEG-2 standard. The memory may store predicted images obtained from a preceding image.
Abstract:
Full predication of instruction execution is provided by operand predicates, where each operand has an associated predicate bit intuitively indicating the validity of the operand value. In a programmable processor supporting operand predication, an instruction will execute only if the predicate bit of every register containing a source operand is true. The predicate bit, if any, of the destination register is set to the logical AND of the source registers' predicates. Similarly, in a non-programmable processor synthesized with predicated operand support, an operator will perform the associated function depending on the state of inputs' predicates. The output predicate is evaluated as the logical AND of the inputs' predicates. An additional bit for each data register, a change in the semantics of the instructions to include predication, and a few additional instructions to save and restore register predicate bits and to specifically set or reset a register's predicate bit are required.
Abstract:
Clustered VLIW processing elements, each preferably simple and identical, are coupled by a runtime reconfigurable inter-cluster interconnect to form a coprocessor executing only those portions of a program having high instruction level parallelism. The initial portion of each program segment executed by the coprocessor reconfigures the interconnect, if necessary, or is skipped. Clusters may be directly connected to a subset of neighboring clusters, or indirectly connected to any other cluster, a hierarchy exposed to the programming model and enabling a larger number of clusters to be employed. The coprocessor is idled during remaining portions of the program to reduce power dissipation.
Abstract:
A coprocessor executing one among a set of candidate kernel loops within an application operates at the minimal clock frequency satisfying schedule constraints imposed by the compiler and data bandwidth constraints. The optimal clock frequency is statically determined by the compiler and enforced at runtime by software-controlled clock circuitry. Power dissipation savings and optimal resource usage are therefore achieved by the adaptation at runtime of the coprocessor clock rate for each of the various kernel loop implementations.