Abstract:
A data buffering device contains an input unit adapted to sequentially receive a two-dimensional array of data structures organized by an index pair, with a first index stepwise traversing first-index values in a meandering manner defined by a first and a second meandering direction. The invention further includes a data buffering method and a data processing method and device, each of which incorporates the above-described features of the data buffering device.
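The meandering arrival order described above can be sketched as follows. This is a minimal illustration, not the claimed device: it assumes the first index is a column index running from 0 to num_cols - 1 and that the meandering direction flips after every row. The name meander_order and its parameters are hypothetical.

```python
from typing import Iterator, Tuple


def meander_order(num_rows: int, num_cols: int) -> Iterator[Tuple[int, int]]:
    """Yield (first_index, second_index) pairs in a meandering scan.

    Assumption: even rows are traversed in the first meandering direction
    (ascending first index), odd rows in the second (descending).
    """
    for second in range(num_rows):
        if second % 2 == 0:
            first_values = range(num_cols)              # first meandering direction
        else:
            first_values = range(num_cols - 1, -1, -1)  # second meandering direction
        for first in first_values:
            yield (first, second)


if __name__ == "__main__":
    print(list(meander_order(2, 3)))
```

For a 2 x 3 array the first index runs 0, 1, 2 on the first row and 2, 1, 0 on the second, so consecutive data structures are always spatial neighbours.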
Abstract:
A data processing circuit comprises an instruction execution circuit (14) and a plurality of memory banks. The instruction execution circuit (14) is capable of processing blocks of data values (e.g. pixel values for a two-dimensional block of pixels) in parallel. The data values are stored (preferably cached) in the memory banks and supplied in parallel. A plurality of translation circuits (22) is coupled between block addressing outputs of the instruction execution circuit and address inputs of the memory banks. The translation circuits provide for the possibility of addressing more than one block in parallel from different memory banks. The data is routed to the execution circuit from the selected memory banks by routing circuits. In an embodiment, each translation circuit is able to address all memory of the banks. In another embodiment, the translation circuits support a plurality of ways of distributing the data of a pixel image over the memory banks, using only a few banks, for example, for data that is accessed in small blocks and more banks for data that is accessed with higher parallelism.
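One way to picture the address translation is a simple column interleave, sketched below. The abstract does not specify the mapping, so this is an assumption chosen for illustration: horizontally adjacent pixels go to consecutive banks, and the number of banks is a parameter so that different distributions can be tried. The function name translate and all of its parameters are hypothetical.

```python
from typing import Tuple


def translate(x: int, y: int, width: int, num_banks: int) -> Tuple[int, int]:
    """Map pixel (x, y) to (bank, local_address) using column interleaving.

    Assumption: the image width is a multiple of num_banks, so every bank
    holds the same number of pixels per image line.
    """
    if width % num_banks != 0:
        raise ValueError("width must be a multiple of num_banks in this sketch")
    bank = x % num_banks                                # adjacent pixels hit different banks
    local = y * (width // num_banks) + x // num_banks   # position inside the bank
    return bank, local


if __name__ == "__main__":
    # Four horizontally adjacent pixels can be fetched in parallel from four banks.
    print([translate(x, 0, width=8, num_banks=4) for x in range(4)])
```

With fewer banks the same image occupies fewer banks at the cost of less parallelism per access, which mirrors the trade-off the abstract mentions.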
Abstract:
An array of data values, such as an image of pixel values, is stored in a main memory (12). A processing operation is performed using the pixel values. The processing operation defines time points of movement of a multidimensional region (20, 22) of locations in the image. Pixel values from inside and around the region are cached for processing. At least when a cache miss occurs for a pixel value from outside the region, cache replacement of data in cache locations (142) is performed. Locations that store pixel data for locations in the image outside the region (20, 22) are selected for replacement, selectively exempting from replacement cache locations (142) that store pixel data for locations in the image inside the region. In embodiments, different types of cache structure are used for caching data values inside and outside the region. In an embodiment, the cache locations for pixel data inside the region support a higher level of output parallelism than the cache locations for pixel data around the region. In a further embodiment, the cache for locations inside the region contains sets of banks, each set for a respective line from the image, data from the lines being distributed in a cyclically repeating fashion over the banks.
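The replacement policy can be sketched in software as below. This is a simplified model under stated assumptions: a single fully associative cache, a rectangular region given as (x0, y0, x1, y1) with exclusive upper bounds, and oldest-first selection among the entries outside the region. The class RegionAwareCache and its methods are hypothetical names, not taken from the source.

```python
from collections import OrderedDict
from typing import Dict, Tuple

Pos = Tuple[int, int]


class RegionAwareCache:
    """Cache that prefers to evict pixels lying outside the active region."""

    def __init__(self, capacity: int, region: Tuple[int, int, int, int]):
        self.capacity = capacity
        self.region = region                            # (x0, y0, x1, y1), upper bounds exclusive
        self.lines: "OrderedDict[Pos, int]" = OrderedDict()  # insertion order = age

    def in_region(self, pos: Pos) -> bool:
        x0, y0, x1, y1 = self.region
        return x0 <= pos[0] < x1 and y0 <= pos[1] < y1

    def move_region(self, region: Tuple[int, int, int, int]) -> None:
        """Called at the time points at which the processing operation moves the region."""
        self.region = region

    def read(self, pos: Pos, main_memory: Dict[Pos, int]) -> int:
        if pos in self.lines:                           # cache hit
            return self.lines[pos]
        if len(self.lines) >= self.capacity:            # miss on a full cache
            self._evict()
        self.lines[pos] = main_memory[pos]
        return self.lines[pos]

    def _evict(self) -> None:
        # Oldest entry outside the region, if any; entries inside are exempt.
        victim = next((p for p in self.lines if not self.in_region(p)), None)
        if victim is None:
            self.lines.popitem(last=False)              # fallback: everything is inside
        else:
            del self.lines[victim]
```

The fallback branch is a design choice of this sketch only; the abstract does not say what happens when every cached pixel lies inside the region.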
Abstract:
The present invention relates to a data processing device (10) comprising a processing unit (12) and a memory unit (14), and to a method for controlling operation of a memory unit (14) of a data processing device. The memory unit (14) comprises a main memory (16), a low-level cache memory (20.2), which is directly connected to the processing unit (12) and adapted to hold all pixels of a currently active sliding search area for reading access by the processing unit (12), a high-level cache memory (18), which is connected between the low-level cache memory and the frame memory, and a first pre-fetch buffer (20.1), which is connected between the high-level cache memory and the low-level cache memory and which is adapted to hold one search-area column or one search-area line of pixel blocks, depending on the scan direction and scan [...] Reading and fetching functionalities are decoupled in the memory unit (14). The fetching functionality is concentrated on the higher cache level, while the reading functionality is concentrated on the lower cache level. In this way, concurrent reading and fetching can be achieved, thus enhancing the performance of the data processing device.
Abstract:
The invention relates to a very long instruction word (VLIW) processor comprising a plurality of functional units (110, 130, 135), each for executing an operation, and a VLIW controller (100) connected to each of said functional units (110, 130, 135) and adapted to controlling said functional units (110, 130, 135). The VLIW processor comprises at least one indication means (140) associated with one of said functional units (135) and adapted to registering and indicating to the VLIW controller (100) whether said one functional unit (135) is idle or operating.
Abstract:
In one embodiment, an apparatus includes: a storage having a plurality of entries each to store address information of an instruction and a count value of a number of executions of the instruction during execution of code including the instruction; and at least one comparator circuit to compare a count value from one of the plurality of entries to a threshold value, where the instruction is a tagged instruction of the code, the tagged instruction tagged by a static compiler prior to execution of the code. Other embodiments are described and claimed.
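A software model of the counting storage and comparator might look like the sketch below. It assumes one counter per tagged instruction address and a greater-than-or-equal comparison against the threshold; the abstract specifies neither detail. The class HotInstructionTable and its method names are hypothetical.

```python
from typing import Dict


class HotInstructionTable:
    """Counts executions of compiler-tagged instructions and compares to a threshold."""

    def __init__(self, threshold: int):
        self.threshold = threshold
        self.counts: Dict[int, int] = {}    # instruction address -> execution count

    def record(self, address: int, tagged: bool) -> bool:
        """Record one execution; return True once the count reaches the threshold.

        Untagged instructions are ignored, mirroring the idea that only
        instructions tagged by the static compiler are tracked.
        """
        if not tagged:
            return False
        self.counts[address] = self.counts.get(address, 0) + 1
        return self.counts[address] >= self.threshold   # the comparator step


if __name__ == "__main__":
    table = HotInstructionTable(threshold=3)
    print([table.record(0x400, tagged=True) for _ in range(4)])
```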
Abstract:
The data processing device has a plurality of functional units and issues instructions in successive instruction cycles. Instructions of a first type are each intended for one functional unit at a time. An instruction of a second type causes a combination of functional units to respond in the same instruction execution cycle, a result from one functional unit being used by another as part of the execution of the same instruction. Preferably, the device supports alternative operation at a number of different instruction cycle rates, dependent on whether an executed program segment contains instructions of the second type. The fastest instruction cycle rate does not allow execution of the instruction of the second type, because operation by different functional units does not fit within the instruction execution cycle. When possible, the device saves power by switching to a slower clock rate, in which case instructions of the second type are executed to save additional power, by reducing the number of instructions that have to be issued.
Abstract:
The present invention relates to a data buffering device (600) particularly suited for use in a data processing device (700), which sequentially provides a two-dimensional array of data structures in a meandering manner. The data buffering device (600) comprises a circular buffer memory having a number of memory locations and a buffer-control unit, which is adapted to assign to an index pair of a current incoming data structure a write-pointer value from a pointer-value set in a periodical manner, one write-pointer assignment period having: a first write-pointer assignment phase, during which the first index stepwise traverses the first index-value set in the first meandering direction, and the write pointer stepwise traverses pointer values in a first rotation direction defined within the pointer-value set; a second write-pointer assignment phase, during which the first index stepwise traverses the first index-value set in the second meandering direction, and the write pointer stepwise traverses pointer values in the first rotation direction; a third write-pointer assignment phase, during which the first index stepwise traverses the first index-value set in the first meandering direction, and the write pointer stepwise traverses pointer values in a second rotation direction opposite to the first rotation direction; and a fourth write-pointer assignment phase, during which the first index stepwise traverses the first index-value set in the second meandering direction, and the write pointer stepwise traverses pointer values in the second rotation direction. The invention is particularly useful in the field of video processing, where a motion estimator provides a two-dimensional array of motion vectors in a meandering manner, which is used by a motion compensator having a non-meandering scan order.
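The four-phase write-pointer assignment can be sketched as a generator. Several details are assumptions made only for illustration: each phase covers exactly one row of the array, the pointer starts at 0, the first rotation direction is +1 and the second is -1 modulo the buffer size, and the pointer simply continues from where the previous phase left it. The name write_pointer_schedule and its parameters are hypothetical.

```python
from typing import Iterator, Tuple


def write_pointer_schedule(
    row_width: int, buffer_size: int, num_rows: int
) -> Iterator[Tuple[int, int, int]]:
    """Yield (first_index, row, write_pointer) for each incoming data structure."""
    pointer = 0
    for row in range(num_rows):
        phase = row % 4                       # 0..3 correspond to phases one to four
        first_direction = phase % 2 == 0      # phases one and three: first meandering direction
        step = 1 if phase < 2 else -1         # phases one/two vs. three/four rotation
        if first_direction:
            first_values = range(row_width)
        else:
            first_values = range(row_width - 1, -1, -1)
        for first in first_values:
            yield first, row, pointer
            pointer = (pointer + step) % buffer_size


if __name__ == "__main__":
    for entry in write_pointer_schedule(row_width=3, buffer_size=4, num_rows=4):
        print(entry)
```

Under these assumptions the pointer advances by two row widths and then retreats by the same amount, so it returns to its starting value after every four phases, which gives the periodic behaviour the abstract describes.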
Abstract:
The present invention relates to the field of motion estimation in video processing. Specifically, the invention relates to a video-processing method and device for ascertaining motion vectors for a plurality of first pixel blocks forming a currently processed image region of a currently processed image of an image sequence. The invention addresses the problem of the impact of borders between neighboring image regions in region-based motion estimation on the quality of the video output in video applications like picture-rate up-conversion. The video-processing device (100) of the invention comprises a processing unit (104), which is adapted to perform motion estimation on an image according to a fragmentation of the image into a number of image regions, each image region containing the pixel blocks shared by a first number of pixel-block lines and a second number of pixel-block columns in accordance with an adjustable value of an aspect ratio of the image region, and to set a different aspect-ratio value for processing a next image of the image sequence, such that the number of image regions per image remains constant. The dynamic change of the aspect ratio of the image regions implemented in the motion estimation device of the invention reduces the impact of the borders between neighboring image regions and thus improves the quality of region-based motion estimation.
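Keeping the region count constant while the aspect ratio changes can be illustrated with factor pairs of the region count. This is a sketch under assumptions: the regions form a regular grid, and every grid layout with the same product is an admissible aspect-ratio setting. The functions region_layouts and region_shape are hypothetical names.

```python
from typing import List, Tuple


def region_layouts(num_regions: int) -> List[Tuple[int, int]]:
    """All (grid_rows, grid_cols) layouts that give exactly num_regions regions."""
    return [
        (rows, num_regions // rows)
        for rows in range(1, num_regions + 1)
        if num_regions % rows == 0
    ]


def region_shape(block_lines: int, block_cols: int, layout: Tuple[int, int]) -> Tuple[int, int]:
    """Pixel-block lines and columns per region for a layout (rounded up)."""
    grid_rows, grid_cols = layout
    return (-(-block_lines // grid_rows), -(-block_cols // grid_cols))


if __name__ == "__main__":
    layouts = region_layouts(6)
    # Use a different layout, and hence a different aspect ratio, for each image.
    for image_number in range(4):
        layout = layouts[image_number % len(layouts)]
        print(image_number, layout, region_shape(36, 60, layout))
```

Because the region borders fall at different block positions from one image to the next, border artifacts do not accumulate at fixed locations.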
Abstract:
The present invention relates to the field of motion estimation in video processing. Specifically, the invention relates to a video-processing method and device for ascertaining motion vectors for a plurality of first pixel blocks forming a currently processed image region of a currently processed image of an image sequence. The invention addresses the problem of the impact of region-based motion estimation on the quality of the video output in video applications like picture-rate up-conversion. The video-processing device of the invention comprises a processing unit, which is adapted to ascertain motion vectors for a plurality of first pixel blocks (C), which form a currently processed image region (200.1 to 200.14) of a currently processed image (200) of an image sequence, proceeding from image region to image region and processing a respective image region at least twice before proceeding to a next image region. Ascertaining a motion vector for a currently processed first pixel block (C) of the image region is performed by evaluating a respective set of candidate motion vectors containing at least one temporal candidate vector, which is a motion vector that was ascertained for a second pixel block (T) of a preceding image of the image sequence. The video-processing device of the invention is adapted to update, before processing a respective image region (200.2) of the currently processed image a second time, a temporal candidate vector, which was ascertained for a third pixel block located outside the currently processed image region (200.2) in the preceding image, by ascertaining a motion vector for the third pixel block (216) in the currently processed image and replacing the temporal candidate vector with it.
By updating temporal motion vector candidates assigned to pixel blocks located outside the currently processed region in a first motion estimation pass, the quality of a motion estimation algorithm after the second or further motion estimation pass is improved in comparison with prior-art solutions.