Abstract:
An approach is provided for vectorizing misaligned references in compiled code for SIMD architectures that support only aligned loads and stores. In this framework, a loop is first simdized as if the memory unit imposed no alignment constraints. The compiler then inserts data reorganization operations to satisfy the actual alignment requirements of the hardware. Finally, the code generation algorithm generates SIMD code from the data reorganization graph, addressing realistic issues such as runtime alignments, unknown loop bounds, residual iteration counts, and multiple statements with arbitrary alignment combinations. Loop peeling is used to reduce the computational overhead associated with misaligned data. A loop prologue and epilogue are peeled from individual iterations in the simdized loop, and vector-splicing instructions are applied only to the peeled iterations, so the steady-state loop body incurs no additional computational overhead.
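The peeling idea described above can be illustrated with a minimal Python sketch. It is not the patented code generator: the vector length `V`, the helper `splice`, and the function `increment_range` are hypothetical names chosen for illustration, and a list whose length is a multiple of `V` stands in for memory that only allows aligned, `V`-wide loads and stores. The sketch shows only the store side: partial vectors are spliced in a peeled prologue and epilogue, while the steady-state loop stores whole aligned vectors.

```python
V = 4  # assumed vector length in elements (hypothetical)

def splice(old, new, lo, hi):
    """Stand-in for a vector-splice: take `new` in lanes [lo, hi), `old` elsewhere."""
    return [new[k] if lo <= k < hi else old[k] for k in range(V)]

def increment_range(mem, start, n):
    """Add 1 to mem[start:start+n] using only aligned V-wide loads and stores.

    Assumes len(mem) is a multiple of V and start + n <= len(mem).
    """
    end = start + n
    base = start - start % V  # aligned block holding the first element

    # Peeled prologue: only part of the first block belongs to the loop.
    if start % V != 0:
        block = mem[base:base + V]
        result = [x + 1 for x in block]
        mem[base:base + V] = splice(block, result, start - base, min(end - base, V))
        base += V

    # Steady state: whole aligned vectors, no splicing.
    while base + V <= end:
        mem[base:base + V] = [x + 1 for x in mem[base:base + V]]
        base += V

    # Peeled epilogue: residual elements in the last block.
    if base < end:
        block = mem[base:base + V]
        result = [x + 1 for x in block]
        mem[base:base + V] = splice(block, result, 0, end - base)
    return mem
```

For example, calling `increment_range([0] * 12, 3, 7)` touches one lane in the prologue, one full vector in the steady state, and two lanes in the epilogue.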
Abstract:
A method for generating code includes identifying at least one portion of source code that is simdizable and has a dependence; analyzing the dependence for characteristics; selecting, based upon the characteristics, a transformation from a predefined group of transformations; and applying the transformation to the at least one portion to generate SIMD code for that portion.
Abstract:
The present invention provides methods, systems and a machine readable medium including machine readable code for identifying an ill-exposed image. An image including a first image block is received. The luminance data and the texture energy data associated with the first image block are assessed. A determination is made regarding whether the received image is an ill-exposed image based on the assessment of the luminance data and the assessment of the texture energy data associated with the first image block.
Abstract:
A series of AB-type amphiphilic dendritic polyesters have been prepared divergently, in which two hybrids were coupled via the copper(I)-catalyzed triazole formation.
Abstract:
Generating loop code to execute on Single-Instruction Multiple-Datapath (SIMD) architectures, where the loop operates on datatypes having different lengths, is disclosed. Further, a preferred embodiment of the present invention includes a novel technique to efficiently realign or shift arbitrary streams to an arbitrary offset, regardless of whether the alignments or offsets are known at compile time. This technique enables the application of advanced alignment optimizations to runtime alignment. Length conversion operations, for packing and unpacking data values, are included in the alignment handling framework. These operations are formally defined in terms of standard SIMD instructions that are readily available on various SIMD platforms. This allows sequential loop code operating on datatypes of disparate length to be transformed (“simdized”) into optimized SIMD code through a fully automated process.
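As a rough illustration of the length conversion described above, the following Python sketch models SIMD registers as lists of lanes. The function names `unpack`, `pack`, and `widen_add` are hypothetical and do not come from the patent; the sketch assumes a 16-byte register, so one register holds eight 16-bit lanes or four 32-bit lanes, and unpacking one register of 16-bit values yields two registers of 32-bit values.

```python
def unpack(vec, factor=2):
    """Split one register of narrow lanes into `factor` registers of wider lanes."""
    per_reg = len(vec) // factor
    return [vec[k * per_reg:(k + 1) * per_reg] for k in range(factor)]

def pack(regs, bits=16):
    """Merge registers of wide lanes into one register, truncating each lane to `bits`."""
    mask = (1 << bits) - 1
    return [x & mask for reg in regs for x in reg]

def widen_add(short_vec, int_vecs):
    """Compute c[i] = a[i] + b[i], where a is 16-bit (8 lanes per register)
    and b, c are 32-bit (4 lanes per register)."""
    halves = unpack(short_vec)  # one 8-lane register becomes two 4-lane registers
    return [[x + y for x, y in zip(h, b)] for h, b in zip(halves, int_vecs)]
```

The point of the sketch is the change in register count: after unpacking, the narrow stream occupies as many registers per iteration as the wide stream, so the lane-wise addition lines up.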
Abstract:
Loop code is generated to execute on Single-Instruction Multiple-Datapath (SIMD) architectures, where the loop operates on datatypes having different lengths. Further, a preferred embodiment of the present invention includes a novel technique to efficiently realign or shift arbitrary streams to an arbitrary offset, regardless of whether the alignments or offsets are known at compile time. This technique enables the application of advanced alignment optimizations to runtime alignment. This allows sequential loop code operating on datatypes of disparate length to be transformed (“simdized”) into optimized SIMD code through a fully automated process.
Abstract:
A personalized slide show generation system comprises a script generator and a personalized slide show generation engine. The script generator is configured for employing a user interaction associated with an image to generate an interaction script. The personalized slide show generation engine is coupled to the script generator and configured for utilizing the image and the interaction script to generate a personalized slide show.
Abstract:
A process for easily synthesizing a zeolite substance containing an element having a large ionic radius in the framework at a high ratio. This process comprises the following first to fourth steps: First Step: a step of heating a mixture containing a template compound, a compound containing a Group 13 element of the periodic table, a silicon-containing compound and water to obtain a precursor (A); Second Step: a step of acid-treating the precursor (A) obtained in the first step; Third Step: a step of heating the acid-treated precursor (A) obtained in the second step together with a mixture containing a template compound and water to obtain a precursor (B); and Fourth Step: a step of calcining the precursor (B) obtained in the third step to obtain a zeolite substance.
Abstract:
A computer-implemented method, system, and computer program product are provided for aligning vectors to be processed by SIMD code. A pair of vectors to be aligned at runtime and having a known relative alignment at compile time is identified. A modified second memory reference is generated by modifying an address of the second memory reference to be in the same congruence class as the first memory reference, wherein the congruence class is mod V and V is the SIMD byte width. A first SIMD load at the modified second memory reference and a next adjacent SIMD load at a third memory reference, corresponding to the modified second memory reference address plus V, are performed, and the two loaded vectors are concatenated to generate a resultant vector of length 2V. The resultant vector is left shifted by an amount corresponding to the difference between the addresses of the first memory reference and the second memory reference mod V, and the leftmost V bytes of the resultant vector are retained to align the first and second vectors.
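The load, concatenate, shift, and retain sequence described above can be sketched in Python as follows. This is a hedged illustration, not the patented implementation: `V = 16`, `aligned_load`, and `align_second_to_first` are hypothetical names, a `bytes` object stands in for memory, and the hardware is assumed to ignore the low address bits so that every load starts at a multiple of `V`.

```python
V = 16  # assumed SIMD width in bytes (hypothetical)

def aligned_load(mem: bytes, addr: int) -> bytes:
    """Model a load that truncates the address down to a multiple of V."""
    base = addr - addr % V
    return mem[base:base + V]

def align_second_to_first(mem: bytes, addr1: int, addr2: int) -> bytes:
    """Return V bytes of the second stream laid out with the same skew as a
    load at addr1, so that lane k of both vectors holds matching elements."""
    shift = (addr2 - addr1) % V        # relative alignment, known at compile time
    modified = addr2 - shift           # now congruent to addr1 mod V
    lo = aligned_load(mem, modified)       # first SIMD load
    hi = aligned_load(mem, modified + V)   # next adjacent SIMD load
    wide = lo + hi                         # concatenated vector of length 2V
    return wide[shift:shift + V]           # left shift, keep the leftmost V bytes
```

With `mem = bytes(range(64))`, `addr1 = 5`, and `addr2 = 27`, the byte at address 5 sits in lane 5 of `aligned_load(mem, 5)`, and the byte at address 27 sits in lane 5 of `align_second_to_first(mem, 5, 27)`. Only the difference of the two addresses mod `V` is needed to compute the shift, which is why a known relative alignment suffices even when the absolute alignments are determined at runtime.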
Abstract:
A computer-implemented method, system, and program product for optimizing a distributed (software) application are provided. Specifically, a configuration of a target computing environment, in which the distributed application is deployed, is discovered upon deployment of the distributed application. Thereafter, based on a set of rules and the discovered configuration, one or more optimization techniques are applied to optimize the distributed application. In a typical embodiment, the set of rules can be embedded in the distributed application, or they can be accessed from an external source such as a repository. Regardless, the optimization techniques applied can include at least one of the following: (1) identification and replacement of an underperforming component of the distributed application with a new component; (2) generation of interface layers (to allow selection of optimal bindings) between distributed objects of the distributed application; and/or (3) execution of code transformation of the distributed application using program analysis techniques.