Abstract:
The data processing system loads three input operands, namely two input vectors and a control vector, into vector registers, permutes the two input vectors as specified by the control vector, and stores the result of the operation as the output operand in an output register. The control vector consists of sixteen indices, each uniquely identifying a single byte of input data in either of the input registers; it can be specified in the operational code or be the result of a computation previously performed within the vector registers. The control vector is specified by calculating the offset of a selected vector element of the input vector relative to a base address of the input vector and loading each element with an index equal to the relative offset. Alternatively, the control vector, here serving as an alignment vector, is generated by performing a look-up in a look-up table. For additional loads from the same vector, the control vector does not change, since the alignment shift amount of the vector from an address boundary does not change. A permutation instruction can then be executed to load and shift the data to realign it in the output register at the vector boundary.
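The realignment described above can be illustrated with a minimal Python sketch. It is not the claimed hardware: the 16-byte vector width is inferred from the sixteen indices, and the names `make_control_vector`, `permute`, `load_unaligned` and `CONTROL_LUT` are hypothetical, chosen only for this illustration.

```python
VLEN = 16  # bytes per vector register (assumed from the sixteen indices)


def make_control_vector(address: int) -> list[int]:
    """Build a control vector from the misalignment of `address`.

    Each index selects one byte of the 32-byte concatenation of the two
    input vectors: 0-15 address the first vector, 16-31 the second.
    """
    shift = address % VLEN  # offset from the preceding vector boundary
    return [shift + i for i in range(VLEN)]


# Look-up-table alternative: one precomputed control vector per shift amount.
CONTROL_LUT = [make_control_vector(shift) for shift in range(VLEN)]


def permute(vec_a: bytes, vec_b: bytes, control: list[int]) -> bytes:
    """Select sixteen bytes from vec_a followed by vec_b, as the control vector directs."""
    combined = vec_a + vec_b
    return bytes(combined[index] for index in control)


def load_unaligned(memory: bytes, address: int) -> bytes:
    """Return the sixteen bytes starting at `address`, using aligned loads only."""
    base = address - (address % VLEN)             # previous vector boundary
    vec_a = memory[base:base + VLEN]              # first aligned load
    vec_b = memory[base + VLEN:base + 2 * VLEN]   # second aligned load
    control = CONTROL_LUT[address % VLEN]         # unchanged for a given shift
    return permute(vec_a, vec_b, control)
```

With `memory = bytes(range(64))`, calling `load_unaligned(memory, 5)` yields the sixteen bytes beginning at offset 5. Because the control vector depends only on the shift amount, later loads at addresses 21, 37, and so on reuse the same table entry.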
Abstract:
A verification system for still images embeds a watermark without creating visual artifacts in the images, thus maintaining the visual quality of the image. The algorithm embeds information in an uncompressed image so that alteration of the image, as well as the location of the alteration, can later be detected. The embedding of information into a source image is based on a defined mapping process. An image plane consists of macroblocks, which are themselves composed of microblocks. A code corresponding to the value of an image property is embedded in each macroblock. The specific sequence of microblocks used for embedding this information in the watermarking image plane is a unique function of this property for the corresponding set of microblocks in the indexing image plane. This information can later be decoded from the stamped image. The watermark is embedded by combining the pixel values of the image with the watermark. The watermark is altered if the image is altered.
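The general idea of detecting and locating an alteration can be sketched in Python with a generic block-wise fragile watermark. This is not the mapping process described above, which the abstract does not specify in detail: here a simple least-significant-bit scheme is swapped in, and the 4x4 block size, the checksum, and the names `embed` and `find_altered_blocks` are all assumptions made for illustration.

```python
BLOCK = 4  # assumed block edge length; 4 x 4 = 16 pixels carry a 16-bit code


def _block_coords(bx: int, by: int) -> list[tuple[int, int]]:
    """Return (row, column) pairs for every pixel in block (bx, by)."""
    return [(by * BLOCK + y, bx * BLOCK + x)
            for y in range(BLOCK) for x in range(BLOCK)]


def _block_code(image: list[list[int]], coords: list[tuple[int, int]]) -> int:
    """16-bit code derived from the upper seven bits of each pixel in the block."""
    code = 0
    for row, col in coords:
        code = (code * 31 + (image[row][col] >> 1)) & 0xFFFF
    return code


def embed(image: list[list[int]]) -> list[list[int]]:
    """Return a stamped copy whose pixel LSBs hold each block's code."""
    stamped = [row[:] for row in image]
    for by in range(len(image) // BLOCK):
        for bx in range(len(image[0]) // BLOCK):
            coords = _block_coords(bx, by)
            code = _block_code(stamped, coords)
            for bit, (row, col) in enumerate(coords):
                # Overwrite only the least significant bit of each pixel.
                stamped[row][col] = (stamped[row][col] & ~1) | ((code >> bit) & 1)
    return stamped


def find_altered_blocks(stamped: list[list[int]]) -> list[tuple[int, int]]:
    """Return (bx, by) for every block whose stored code no longer matches."""
    altered = []
    for by in range(len(stamped) // BLOCK):
        for bx in range(len(stamped[0]) // BLOCK):
            coords = _block_coords(bx, by)
            stored = sum((stamped[row][col] & 1) << bit
                         for bit, (row, col) in enumerate(coords))
            if stored != _block_code(stamped, coords):
                altered.append((bx, by))
    return altered
```

Each pixel of the stamped image differs from the original by at most one grey level, and a change to the upper bits of any pixel causes the stored code of its block to stop matching, which localizes the alteration to that block.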
Abstract:
Dynamic migration of a cache prefetch request is performed. A prefetch candidate table maintains at least one prefetch candidate which may be executed as a prefetch request. The prefetch candidate includes one or more trigger addresses which correspond to locations in the instruction stream where the prefetch candidate is to be executed as a prefetch request. A jump history table maintains a record of target addresses of program branches which have been executed. The trigger addresses in the prefetch candidate are defined by the target addresses of recently executed program branches maintained in the jump history table. A pending prefetch table maintains a record of executed prefetch requests. When an operation such as a cache miss, cache hit, touch instruction or program branch is identified, the pending prefetch table is scanned to determine whether a prefetch request has been executed. If a prefetch request has been executed, the prefetch candidate which was used to execute that prefetch request is updated. That is, a new trigger address in the prefetch candidate is selected in order to reduce access latency.
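The interplay of the three tables can be sketched as a small Python model. It is an illustrative simplification rather than the described mechanism: it reacts only to cache misses, and the migration policy (moving the trigger to the branch target recorded just before the current trigger) is an assumption, since the abstract does not say how the new trigger address is chosen. The names `PrefetchCandidate` and `PrefetchUnit` are hypothetical.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class PrefetchCandidate:
    prefetch_address: int  # address to be prefetched
    trigger_address: int   # branch target at which the prefetch is issued


class PrefetchUnit:
    def __init__(self, history_depth: int = 4):
        self.jump_history = deque(maxlen=history_depth)  # recent branch targets
        self.candidates = []                             # prefetch candidate table
        self.pending = {}                                # pending prefetch table

    def add_candidate(self, prefetch_address: int) -> PrefetchCandidate:
        """Create a candidate triggered by the most recent branch target."""
        if not self.jump_history:
            raise ValueError("no branch has been executed yet")
        candidate = PrefetchCandidate(prefetch_address, self.jump_history[-1])
        self.candidates.append(candidate)
        return candidate

    def on_branch(self, target: int) -> list[int]:
        """Record a branch target and issue every prefetch it triggers."""
        self.jump_history.append(target)
        issued = []
        for candidate in self.candidates:
            if candidate.trigger_address == target:
                self.pending[candidate.prefetch_address] = candidate
                issued.append(candidate.prefetch_address)
        return issued

    def on_cache_miss(self, address: int) -> bool:
        """If a prefetch for `address` was issued too late, migrate its trigger."""
        candidate = self.pending.pop(address, None)
        if candidate is None:
            return False
        history = list(self.jump_history)
        positions = [i for i, t in enumerate(history)
                     if t == candidate.trigger_address]
        if positions and positions[-1] > 0:
            # Assumed policy: trigger one branch earlier to gain lead time.
            candidate.trigger_address = history[positions[-1] - 1]
        return True
```

In this model, a miss on an address that already has a pending prefetch is treated as a sign that the prefetch was issued too late, so the trigger moves to an earlier point in the recorded branch sequence.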
Abstract:
In a computer system having a hierarchical memory, the problem of tolerating cache miss latency is solved by dynamically switching appropriately between two different code sequences, one optimized at compile-time, assuming a cache-hit, and the other optimized at compile-time, assuming a cache-miss. A method for processing instructions and data in a computer system including a hierarchical memory and a static instruction sequence including a memory access instruction and associated memory access latency specific code sequences, each code sequence optimized dependent on an execution of the memory access instruction causing one of a hit or a miss at a level of the memory hierarchy, includes the steps of: decoding and executing the memory access instruction and storing information indicating whether the execution of the memory access instruction caused the hit or the miss; and branching to a cache hit optimized code sequence when the information indicates the hit and a miss optimized code sequence when the information indicates the miss, responsive to the step of storing. Preferably, the memory access latency specific code sequences are associated with one or more identified critical miss-points. The step of branching may be responsive to an inserted branch instruction associated with the memory access instruction. The branch instruction may also specify a level of the cache memory upon which the step of branching is recommended.
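The decode, record, and branch steps can be sketched in Python as a software model. The single-level cache, the stored `last_access_hit` flag, and the two placeholder sequences are assumptions made for illustration; in the described method the two sequences are produced at compile time and the branch is taken by hardware.

```python
class Cache:
    """Toy single-level cache that records the outcome of the last access."""

    def __init__(self):
        self.lines = set()           # addresses currently cached
        self.last_access_hit = None  # stored hit/miss information

    def load(self, memory: dict, address: int) -> int:
        """Execute the memory access and store whether it hit or missed."""
        self.last_access_hit = address in self.lines
        self.lines.add(address)
        return memory[address]


def hit_sequence(value: int) -> tuple[str, int]:
    # Placeholder for code scheduled on the assumption that the data arrives at once.
    return ("hit-optimized", value * 2)


def miss_sequence(value: int) -> tuple[str, int]:
    # Placeholder for code scheduled so independent work overlaps the miss latency.
    return ("miss-optimized", value * 2)


def run(cache: Cache, memory: dict, address: int) -> tuple[str, int]:
    """Perform the access, then branch on the stored hit/miss information."""
    value = cache.load(memory, address)
    if cache.last_access_hit:
        return hit_sequence(value)
    return miss_sequence(value)
```

Both sequences compute the same result; only the scheduling they stand for differs. Running the same access twice takes the miss-optimized path first and the hit-optimized path second.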