Abstract:
A cell broadband engine processor includes a memory, a power processing element (PPE) coupled with the memory, and a plurality of synergistic processing elements (SPEs). The PPE creates an SPE as a computing SPE for an application. The PPE determines idle ones of the plurality of SPEs, and creates an idle one of the plurality of SPEs as a managing SPE. Each of the plurality of SPEs is associated with a local storage. The managing SPE informs the computing SPE of a starting effective address of the local storage of the managing SPE and an effective address for a command queue. The managing SPE manages movement of data associated with computing of the computing SPE based on one or more commands associated with the application. The computing SPE sends the one or more commands to the managing SPE for insertion into the command queue.
Abstract:
A method of managing data movement in a cell broadband engine processor, comprising: determining one or more idle synergistic processing elements (SPEs) among multiple SPEs in the cell broadband engine processor as a managing SPE, and informing a computing SPE among said multiple SPEs of a starting effective address of a local storage (LS) of said managing SPE and an effective address for a command queue; and said managing SPE managing movement of data associated with computing of said computing SPE based on the command queue from the computing SPE.
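The handshake and command queue described above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the names `ManagingSPE`, `MoveCommand`, `announce`, `enqueue`, and `drain` are hypothetical, the addresses are arbitrary, and plain Python objects stand in for SPE hardware, local storage, and DMA transfers.

```python
from collections import deque
from dataclasses import dataclass, field

LS_SIZE = 256  # toy local-storage size in bytes (assumption for this sketch)


@dataclass
class MoveCommand:
    """Hypothetical command: copy `size` bytes of main memory into the manager's LS."""
    src_offset: int
    ls_offset: int
    size: int


@dataclass
class ManagingSPE:
    ls_start_ea: int  # effective address at which this SPE's LS is mapped
    queue_ea: int     # effective address of the command queue
    local_storage: bytearray = field(default_factory=lambda: bytearray(LS_SIZE))
    queue: deque = field(default_factory=deque)

    def announce(self):
        """Tell the computing SPE where the LS and the command queue live."""
        return self.ls_start_ea, self.queue_ea

    def enqueue(self, cmd):
        """Called on behalf of the computing SPE to insert a command."""
        self.queue.append(cmd)

    def drain(self, main_memory):
        """Carry out every queued data movement; return the number of bytes moved."""
        moved = 0
        while self.queue:
            cmd = self.queue.popleft()
            chunk = main_memory[cmd.src_offset:cmd.src_offset + cmd.size]
            self.local_storage[cmd.ls_offset:cmd.ls_offset + cmd.size] = chunk
            moved += cmd.size
        return moved


main_memory = bytes(range(64))
manager = ManagingSPE(ls_start_ea=0x10000, queue_ea=0x20000)
ls_ea, queue_ea = manager.announce()      # step 1: inform the computing SPE
manager.enqueue(MoveCommand(8, 0, 4))     # step 2: computing SPE sends a command
moved = manager.drain(main_memory)        # step 3: managing SPE moves the data
```

The point of the split is that the computing SPE only issues commands, while the otherwise idle SPE performs the transfers.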
Abstract:
A cell broadband engine processor includes memory, a power processing element (PPE) coupled with the memory, and a plurality of synergistic processing elements (SPEs). The PPE creates an SPE as a computing SPE for an application. The PPE determines idle ones of the plurality of SPEs, and creates a managing SPE from one of the idle SPEs. Each of the plurality of SPEs is associated with a local storage. The managing SPE informs the computing SPE of a starting effective address of the local storage of the managing SPE and an effective address for a command queue. The managing SPE manages movement of data associated with computing of the computing SPE based on one or more commands associated with the application. The computing SPE sends the one or more commands to the managing SPE for insertion into the command queue.
Abstract:
An information handling system (IHS) includes a processor that may perform preprocessing on a variable-length code (VLC) bitstream before decoding the bitstream. The bitstream includes multiple codewords. The processor analyzes incoming VLC bitstream information and generates codeword table information for storage in a system memory or a VLC codeword tables location. The processor generates a VLC lookup table from the information in the VLC codeword tables and stores that VLC lookup table in a system memory of the IHS. The VLC lookup table may exhibit two-dimensional indexing by leading zero count and bit-length possibility.
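One way to picture a lookup table indexed in two dimensions is the sketch below. It is an illustration under assumptions, not the patented layout: the codeword set `CODEWORDS` is an invented prefix-free code, and `build_vlc_lookup` and `leading_zeros` are hypothetical helper names. The first index is the number of leading zeros in a codeword and the second is one of the bit lengths possible for that count.

```python
def leading_zeros(bits):
    """Count the '0' characters before the first '1' in a bit string."""
    return len(bits) - len(bits.lstrip("0"))


def build_vlc_lookup(codewords):
    """Group codewords as table[leading_zero_count][bit_length][bits] = symbol."""
    table = {}
    for bits, symbol in codewords.items():
        by_length = table.setdefault(leading_zeros(bits), {})
        by_length.setdefault(len(bits), {})[bits] = symbol
    return table


# Invented prefix-free codeword table (assumption for illustration only).
CODEWORDS = {
    "1": "A",
    "011": "B",
    "0100": "C",
    "0101": "D",
    "0011": "E",
    "0010": "F",
}

table = build_vlc_lookup(CODEWORDS)
# With one leading zero there are two bit-length possibilities, 3 and 4.
lengths_for_one_zero = sorted(table[1])
```

Grouping by leading zero count narrows the candidates for the next codeword before any full comparison is made.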
Abstract:
The present invention provides an overlay instruction accessing unit and method, and a method and apparatus for compressing and storing a program. The overlay instruction accessing unit is used to execute a program stored in a memory in the form of a plurality of compressed program segments, and comprises: a buffer; a processing unit for issuing an instruction reading request, reading an instruction from the buffer, and executing the instruction; and a decompressing unit for reading a requested compressed instruction segment from the memory in response to the instruction reading request of the processing unit, decompressing the compressed instruction segment, and storing the decompressed instruction segment in the buffer, wherein while the processing unit is executing the instruction segment, the decompressing unit reads, according to a storage address of a compressed program segment to be invoked in a header corresponding to the instruction segment, a corresponding compressed instruction segment from the memory, decompresses the compressed instruction segment, and stores the decompressed instruction segment in the buffer for later use by the processing unit.
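The prefetching behaviour described above can be sketched in software. This is a simplified model under assumptions: `pack_program` and `run` are hypothetical names, a list index stands in for the storage address kept in each segment header, `zlib` stands in for whatever compression scheme is actually used, and the loop runs sequentially where the hardware would decompress in parallel with execution. The call chain is assumed to be acyclic.

```python
import zlib


def pack_program(segments, next_segment):
    """Compress each segment and attach a header naming the segment it invokes next."""
    return [
        {"next": next_segment[i], "payload": zlib.compress(code)}
        for i, code in enumerate(segments)
    ]


def run(memory, start=0):
    """Execute segments from the buffer, decompressing the next one ahead of use."""
    buffer = {start: zlib.decompress(memory[start]["payload"])}
    executed = []
    current = start
    while current is not None:
        upcoming = memory[current]["next"]  # read from the segment header
        if upcoming is not None and upcoming not in buffer:
            # Decompressing unit: fetch the segment that will be invoked next.
            buffer[upcoming] = zlib.decompress(memory[upcoming]["payload"])
        # Processing unit: "execute" the current segment straight from the buffer.
        executed.append(buffer[current])
        current = upcoming
    return executed


segments = [b"load r1", b"add r1, r2", b"store r1"]
memory = pack_program(segments, next_segment=[1, 2, None])
trace = run(memory)
```

Because the header names the successor, the next segment is already in the buffer when the processing unit asks for it.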
Abstract:
The present invention provides a method and apparatus for partitioning and sorting a data set on a multi-processor system. The multi-processor system has at least one core processor and a plurality of accelerators. The method for partitioning a data set comprises: partitioning iteratively said data set into a plurality of buckets corresponding to different data ranges by using said plurality of accelerators in parallel, wherein each of the plurality of buckets can be stored in local storage of said plurality of accelerators; wherein in each iteration, the method comprises: roughly partitioning said data set into a plurality of large buckets; obtaining parameters of said data set that can indicate the distribution of data values in that data set; determining a plurality of data ranges for said data set based on said parameters; and partitioning said plurality of large buckets into a plurality of small buckets corresponding to the plurality of data ranges respectively by using said plurality of accelerators in parallel, wherein each of said plurality of accelerators, for each element in the large bucket it is partitioning, determines a data range to which that element belongs among the plurality of data ranges by computation.
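A single iteration of the partitioning steps above might look like the following sketch. The assumptions are considerable: the data are non-empty integers, the distribution parameters are simply the minimum and maximum, the ranges are equal-width, and an ordinary loop stands in for the accelerators working in parallel. The function name `partition_by_range` is hypothetical.

```python
def partition_by_range(data, num_large, num_ranges):
    """One iteration: split into large buckets, then scatter into range buckets."""
    # Step 1: rough partition into large buckets (contiguous chunks here).
    chunk = -(-len(data) // num_large)  # ceiling division
    large = [data[i:i + chunk] for i in range(0, len(data), chunk)]

    # Step 2: parameters that indicate the value distribution (min and max here).
    lo, hi = min(data), max(data)

    # Step 3: equal-width data ranges derived from those parameters.
    width = (hi - lo) // num_ranges + 1

    # Step 4: each large bucket would go to one accelerator; the range of an
    # element is found by arithmetic rather than by comparing against bounds.
    small = [[] for _ in range(num_ranges)]
    for bucket in large:
        for value in bucket:
            small[(value - lo) // width].append(value)
    return small


data = [15, 3, 27, 8, 21, 0, 12, 30]
buckets = partition_by_range(data, num_large=2, num_ranges=4)
# Sorting each small bucket and concatenating yields the sorted data set.
merged = [value for bucket in buckets for value in sorted(bucket)]
```

Computing the range index with one subtraction and one division avoids branching on range boundaries. In the iterative scheme, any small bucket still too large for local storage would be partitioned again.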
Abstract:
An information handling system (IHS) includes a processor that may perform decoding of a variable-length code (VLC) bitstream after preprocessing the bitstream. The bitstream includes multiple VLC symbols as binary codewords. The processor analyzes incoming VLC bitstream information and generates VLC codeword symbol information in conformance with a VLC lookup table. The processor may access a two-dimensional VLC lookup table in real time or on-the-fly. The VLC lookup table may reside in a system memory of the IHS. The single VLC lookup table may exhibit two-dimensional indexing by leading zero count and bit-length possibility.
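Decoding against such a table can be sketched as follows. This is an illustration under assumptions rather than the patented decoder: `TABLE` is a hand-written table for an invented prefix-free code in which every codeword contains a `1`, and `decode` is a hypothetical name. At each position the decoder counts leading zeros, then tries the bit lengths possible for that count, shortest first.

```python
# table[leading_zero_count][bit_length][bits] = symbol (invented example code)
TABLE = {
    0: {1: {"1": "A"}},
    1: {3: {"011": "B"}, 4: {"0100": "C", "0101": "D"}},
    2: {4: {"0010": "F", "0011": "E"}},
}


def decode(bitstream, table=TABLE):
    """Decode a string of '0'/'1' characters into a list of symbols."""
    symbols = []
    pos = 0
    while pos < len(bitstream):
        # First index: number of zeros before the next '1'.
        zeros = 0
        while pos + zeros < len(bitstream) and bitstream[pos + zeros] == "0":
            zeros += 1
        by_length = table.get(zeros)
        if by_length is None:
            raise ValueError(f"no codeword with {zeros} leading zeros at bit {pos}")
        # Second index: try each bit-length possibility for this zero count.
        for length in sorted(by_length):
            symbol = by_length[length].get(bitstream[pos:pos + length])
            if symbol is not None:
                symbols.append(symbol)
                pos += length
                break
        else:
            raise ValueError(f"unrecognized codeword at bit {pos}")
    return symbols


decoded = decode("1" + "011" + "0100" + "0011" + "0101")
```

Each lookup touches only the codewords that share the observed leading zero count, so the decoder never scans the full codeword set.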