Enhancing performance of a memory unit of a data processing device by separating reading and fetching functionalities
    2.
    Granted invention patent
    Legal status: In force

    Publication number: US07797493B2

    Publication date: 2010-09-14

    Application number: US11815981

    Filing date: 2006-02-13

    IPC classification: G06F13/28

    Abstract: The present invention relates to a data processing device (10) comprising a processing unit (12) and a memory unit (14), and to a method for controlling operation of a memory unit (14) of a data processing device. The memory unit (14) comprises a main memory (16), a low-level cache memory (20.2), which is directly connected to the processing unit (12) and adapted to hold all pixels of a currently active sliding search area for reading access by the processing unit (12), a high-level cache memory (18), which is connected between the low-level cache memory and the frame memory, and a first pre-fetch buffer (20.1), which is connected between the high-level cache memory and the low-level cache memory and which is adapted to hold one search-area column or one search-area line of pixel blocks, depending on the scan direction and scan order followed by the processing unit. Reading and fetching functionalities are decoupled in the memory unit (14). The fetching functionality is concentrated on the higher cache level, while the reading functionality is concentrated on the lower cache level. In this way, concurrent reading and fetching can be achieved, thus enhancing the performance of a data processing device.
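The decoupling described in the abstract can be illustrated with a small software model. This is only a sketch under stated assumptions, not the patented hardware: the class name `SlidingSearchArea`, its methods, and the use of a plain list of columns as a stand-in for the high-level cache are all hypothetical, and the concurrency of reading and fetching is modelled sequentially.

```python
from collections import deque


class SlidingSearchArea:
    """Toy model: reads hit the low-level cache, fetches fill a pre-fetch buffer."""

    def __init__(self, frame_columns, width):
        # frame_columns stands in for the high-level cache (assumption).
        self.frame_columns = frame_columns
        # Low-level cache holds every column of the active search area.
        self.low_level = deque(frame_columns[:width], maxlen=width)
        self.next_column = width
        # Pre-fetch buffer holds at most one search-area column.
        self.prefetch = None

    def fetch_next(self):
        """Fetching side: load the next column into the pre-fetch buffer."""
        if self.next_column < len(self.frame_columns):
            self.prefetch = self.frame_columns[self.next_column]

    def read(self, column, row):
        """Reading side: served only from the low-level cache."""
        return self.low_level[column][row]

    def slide(self):
        """Move the search area by one column using the pre-fetched data."""
        if self.prefetch is None:
            raise RuntimeError("no pre-fetched column available")
        # Appending to a full deque drops the oldest column automatically.
        self.low_level.append(self.prefetch)
        self.prefetch = None
        self.next_column += 1


# Six columns of three pixels each; pixel value encodes (column, row).
frame = [[c * 10 + r for r in range(3)] for c in range(6)]
area = SlidingSearchArea(frame, width=4)
area.fetch_next()          # fetch proceeds without touching low_level
print(area.read(3, 2))     # reads are still served from the active area
area.slide()
print(area.read(3, 0))     # newest column is now the pre-fetched one
```

Because `read` never touches the pre-fetch buffer and `fetch_next` never touches the low-level cache, the two paths do not contend in this model, which is the property the abstract attributes to separating the two functionalities.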

    Device and Method for Composing Codes
    5.
    Invention patent application
    Legal status: Under examination (published)

    Publication number: US20080059551A1

    Publication date: 2008-03-06

    Application number: US10565926

    Filing date: 2004-07-13

    IPC classification: G06F7/38

    CPC classification: H04J13/105

    Abstract: Configurable vector processors can be equipped with code generators, so that they are capable of handling different standards and codes. Furthermore, they can be arranged to provide support for related functions such as cyclic redundancy check (CRC). A configurable vector processor would then be equipped with a plurality of generators which generate basic codes in vector format. However, a disadvantage of such a configurable vector processor is that it cannot provide a composite code which is dependent on such basic codes. Such a composite code is necessary if the configurable vector processor is to be flexible enough to support a variety of CDMA-like standards. The device according to the invention is provided with at least two weighted sum units, which are able to make a selection out of a plurality of incoming basic-code vectors by means of a weighted sum operation, under the control of a configuration word. The elements of this configuration word represent the weighting factors which are used to select or deselect a basic-code vector. The selected basic-code vectors are added together and the result of the weighted sum operation is then output as an intermediate-code vector. Subsequently, the intermediate-code vectors are added together by an add unit and output as a composite-code vector. The ability to make selections out of a plurality of incoming basic-code vectors and to add intermediate-code vectors into a composite-code vector, together with the ability to configure the operations of the functional units of the device by means of configuration words, increases the flexibility of the device significantly. This flexibility is needed to support a variety of transmission standards.
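The data flow in the abstract (weighted sum units producing intermediate-code vectors, followed by an add unit producing the composite-code vector) can be sketched as follows. The function names, the 0/1 weights, and the optional `modulus` argument are assumptions made for illustration; the abstract does not state which arithmetic the units use.

```python
from typing import List, Optional, Sequence

Vector = List[int]


def weighted_sum(basic_codes: Sequence[Vector],
                 config_word: Sequence[int],
                 modulus: Optional[int] = None) -> Vector:
    """One weighted sum unit: weight each basic-code vector and add element-wise."""
    if len(basic_codes) != len(config_word):
        raise ValueError("one weighting factor is needed per basic-code vector")
    result = [0] * len(basic_codes[0])
    for weight, vector in zip(config_word, basic_codes):
        for i, element in enumerate(vector):
            result[i] += weight * element
    if modulus is not None:
        result = [value % modulus for value in result]
    return result


def compose_code(basic_codes: Sequence[Vector],
                 config_words: Sequence[Sequence[int]],
                 modulus: Optional[int] = None) -> Vector:
    """Add unit: sum the intermediate-code vectors of all weighted sum units."""
    intermediates = [weighted_sum(basic_codes, word, modulus)
                     for word in config_words]
    composite = [sum(column) for column in zip(*intermediates)]
    if modulus is not None:
        composite = [value % modulus for value in composite]
    return composite


basic = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]
# Unit 0 selects basic codes 0 and 2; unit 1 selects basic code 1.
words = [[1, 0, 1], [0, 1, 0]]
print(compose_code(basic, words))             # plain integer addition
print(compose_code(basic, words, modulus=2))  # modulo-2 variant (assumption)
```

A weight of 0 deselects a basic-code vector and a weight of 1 selects it, so changing only the configuration words changes which composite code is produced from the same generators.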

    Data Processing Apparatus that Provides Parallel Access to Multi-Dimensional Array of Data Values
    7.
    Invention patent application
    Legal status: In force

    Publication number: US20080282038A1

    Publication date: 2008-11-13

    Application number: US11568004

    Filing date: 2005-04-21

    IPC classification: G06F12/00

    Abstract: An array of data values, such as an image of pixel values, is stored in a main memory (12). A processing operation is performed using the pixel values. The processing operation defines time points of movement of a multidimensional region (20, 22) of locations in the image. Pixel values from inside and around the region are cached for processing. At least when a cache miss occurs for a pixel value from outside the region, cache replacement of data in cache locations (142) is performed. Locations that store pixel data for locations in the image outside the region (20, 22) are selected for replacement, selectively exempting from replacement cache locations (142) that store pixel data for locations in the image inside the region. In embodiments, different types of cache structure are used for caching data values inside and outside the region. In an embodiment, the cache locations for pixel data inside the region support a higher level of output parallelism than the cache locations for pixel data around the region. In a further embodiment, the cache for locations inside the region contains sets of banks, each set for a respective line from the image, data from the lines being distributed in a cyclically repeating fashion over the banks.
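The last embodiment, in which data from each image line is distributed cyclically over a set of banks, can be illustrated with a simple modulo mapping. The mapping below and the names `bank_for` and `conflict_free` are assumptions chosen for illustration; the abstract only states that the distribution repeats cyclically.

```python
def bank_for(x, y, banks_per_set, num_sets):
    """Map pixel (x, y) to (bank set, bank); modulo mapping is an assumption."""
    return (y % num_sets, x % banks_per_set)


def conflict_free(x0, y0, width, height, banks_per_set, num_sets):
    """True if every pixel of the window lands in a different bank."""
    targets = [bank_for(x0 + dx, y0 + dy, banks_per_set, num_sets)
               for dy in range(height)
               for dx in range(width)]
    return len(set(targets)) == len(targets)


# A 4x2 window fits 2 sets of 4 banks at any position, so all 8 pixels
# could be read in parallel; a 5-pixel-wide window cannot.
print(conflict_free(5, 3, width=4, height=2, banks_per_set=4, num_sets=2))
print(conflict_free(5, 3, width=5, height=2, banks_per_set=4, num_sets=2))
```

With this mapping, any window no wider than the number of banks per set and no taller than the number of sets touches each bank at most once, which is one way to obtain the higher output parallelism the abstract mentions for data inside the region.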

    Enhancing Performance of a Memory Unit of a Data Processing Device By Separating Reading and Fetching Functionalities
    8.
    Invention patent application
    Legal status: In force

    Publication number: US20080147980A1

    Publication date: 2008-06-19

    Application number: US11815981

    Filing date: 2006-02-13

    IPC classification: G06F12/08

    Abstract: The present invention relates to a data processing device (10) comprising a processing unit (12) and a memory unit (14), and to a method for controlling operation of a memory unit (14) of a data processing device. The memory unit (14) comprises a main memory (16), a low-level cache memory (20.2), which is directly connected to the processing unit (12) and adapted to hold all pixels of a currently active sliding search area for reading access by the processing unit (12), a high-level cache memory (18), which is connected between the low-level cache memory and the frame memory, and a first pre-fetch buffer (20.1), which is connected between the high-level cache memory and the low-level cache memory and which is adapted to hold one search-area column or one search-area line of pixel blocks, depending on the scan direction and scan order followed by the processing unit. Reading and fetching functionalities are decoupled in the memory unit (14). The fetching functionality is concentrated on the higher cache level, while the reading functionality is concentrated on the lower cache level. In this way, concurrent reading and fetching can be achieved, thus enhancing the performance of a data processing device.

    Data processing apparatus that provides parallel access to multi-dimensional array of data values
    9.
    Granted invention patent
    Legal status: In force

    Publication number: US07694078B2

    Publication date: 2010-04-06

    Application number: US11568004

    Filing date: 2005-04-21

    IPC classification: G06F12/00

    Abstract: An array of data values, such as an image of pixel values, is stored in a main memory (12). A processing operation is performed using the pixel values. The processing operation defines time points of movement of a multidimensional region (20, 22) of locations in the image. Pixel values from inside and around the region are cached for processing. At least when a cache miss occurs for a pixel value from outside the region, cache replacement of data in cache locations (142) is performed. Locations that store pixel data for locations in the image outside the region (20, 22) are selected for replacement, selectively exempting from replacement cache locations (142) that store pixel data for locations in the image inside the region. In embodiments, different types of cache structure are used for caching data values inside and outside the region. In an embodiment, the cache locations for pixel data inside the region support a higher level of output parallelism than the cache locations for pixel data around the region. In a further embodiment, the cache for locations inside the region contains sets of banks, each set for a respective line from the image, data from the lines being distributed in a cyclically repeating fashion over the banks.
