Data processing apparatus address range dependent parallelization of instructions
    1.
    Granted patent (in force)

    Publication number: US08364935B2

    Publication date: 2013-01-29

    Application number: US10530495

    Filing date: 2003-10-01

    IPC classification: G06F15/76 G06F9/30

    Abstract: A data processing apparatus has an instruction memory system arranged to output an instruction word addressed by an instruction address. An instruction execution unit processes a plurality of instructions from the instruction word in parallel. A detection unit detects in which of a plurality of ranges the instruction address lies. The detection unit is coupled to the instruction execution unit and/or the instruction memory system to control the way in which the instruction execution unit parallelizes processing of the instructions from the instruction word, dependent on the detected range. In an embodiment, the instruction execution unit and/or the instruction memory system adjusts the width of the instruction word, which determines the number of instructions from the instruction word that are processed in parallel, dependent on the detected range.
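
    Illustrative sketch (not taken from the patent): the range detection described in the abstract can be pictured as a table lookup that maps an instruction address to an issue width. The table contents, the names RANGE_TABLE, detect_width and fetch_and_issue, and the flat list used as instruction memory are all assumptions made for illustration.

```python
from bisect import bisect_right
from typing import List, Tuple

# Hypothetical range table: (start address of range, issue width in that range).
# The boundaries and widths are made-up values, not figures from the patent.
RANGE_TABLE: List[Tuple[int, int]] = [
    (0x0000, 1),  # e.g. control code: one instruction per word
    (0x4000, 4),  # e.g. inner loops: four instructions per word
    (0x8000, 2),
]
_STARTS = [start for start, _ in RANGE_TABLE]


def detect_width(instruction_address: int) -> int:
    """Play the role of the detection unit: find the range, return its width."""
    if instruction_address < _STARTS[0]:
        raise ValueError("address lies below the first range")
    index = bisect_right(_STARTS, instruction_address) - 1
    return RANGE_TABLE[index][1]


def fetch_and_issue(memory: List[str], instruction_address: int) -> List[str]:
    """Return the instructions that would be processed in parallel."""
    width = detect_width(instruction_address)
    return memory[instruction_address:instruction_address + width]
```

    In this sketch the same fetch path yields a one-instruction word at address 0x0000 and a four-instruction word at address 0x4000, so the degree of parallelism follows the address range rather than a mode bit in the instruction itself.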

    Data processing apparatus with parallel operating functional units
    3.
    Granted patent (in force)

    Publication number: US07664929B2

    Publication date: 2010-02-16

    Application number: US10530375

    Filing date: 2003-09-17

    IPC classification: G06F9/38

    CPC classification: G06F9/3802 G06F9/3853

    Abstract: A program of instruction words is executed with a VLIW data processing apparatus. The apparatus comprises a plurality of functional units capable of executing a plurality of instructions from each instruction word in parallel. The instructions from each of at least some of the instruction words are fetched from respective memory units in parallel, addressed with an instruction address that is common to the functional units. Translation of the instruction address into a physical address can be modified for one or more particular ones of the memory units. Modification is controlled by modification update instructions in the program. Thus, program execution can select which instructions from the memory units are combined into the instruction word in response to the instruction address.
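
    Illustrative sketch (not taken from the patent): one way to picture the per-memory-unit address translation is a common instruction address combined with a per-unit offset that the program can change. The class InstructionMemoryUnit, the additive offset, the "nop" default and the function names are hypothetical; the abstract does not specify the form of the translation.

```python
from typing import Dict, List


class InstructionMemoryUnit:
    """One memory unit supplying the instruction for one functional unit."""

    def __init__(self, contents: Dict[int, str]) -> None:
        self.contents = contents  # physical address -> instruction
        self.offset = 0           # assumed translation state for this unit

    def translate(self, instruction_address: int) -> int:
        # Assumption: translation is the common address plus a per-unit offset.
        return instruction_address + self.offset

    def fetch(self, instruction_address: int) -> str:
        return self.contents.get(self.translate(instruction_address), "nop")


def apply_update(units: List[InstructionMemoryUnit], unit_index: int, offset: int) -> None:
    """Stand-in for a modification update instruction in the program."""
    units[unit_index].offset = offset


def fetch_word(units: List[InstructionMemoryUnit], instruction_address: int) -> List[str]:
    """Fetch from all units in parallel using one common instruction address."""
    return [unit.fetch(instruction_address) for unit in units]


units = [
    InstructionMemoryUnit({0: "add", 1: "sub"}),
    InstructionMemoryUnit({0: "mul", 1: "div", 10: "ld", 11: "st"}),
]
before = fetch_word(units, 0)  # both units use the unmodified translation
apply_update(units, 1, 10)     # remap only the second memory unit
after = fetch_word(units, 0)   # same address, different combined word
```

    Here the same instruction address 0 produces a different instruction word once the second unit's translation has been updated, which is the selection effect the abstract describes.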

    Data Processing Apparatus that Provides Parallel Access to Multi-Dimensional Array of Data Values
    4.
    Patent application (in force)

    Publication number: US20080282038A1

    Publication date: 2008-11-13

    Application number: US11568004

    Filing date: 2005-04-21

    IPC classification: G06F12/00

    Abstract: An array of data values, such as an image of pixel values, is stored in a main memory (12). A processing operation is performed using the pixel values. The processing operation defines time points of movement of a multidimensional region (20, 22) of locations in the image. Pixel values from inside and around the region are cached for processing. At least when a cache miss occurs for a pixel value from outside the region, cache replacement of data in cache locations (142) is performed. Cache locations that store pixel data for image locations outside the region (20, 22) are selected for replacement, while cache locations (142) that store pixel data for image locations inside the region are selectively exempted from replacement. In embodiments, different types of cache structure are used for caching data values inside and outside the region. In an embodiment, the cache locations for pixel data inside the region support a higher level of output parallelism than the cache locations for pixel data around the region. In a further embodiment, the cache for locations inside the region contains sets of banks, each set for a respective line from the image, with data from the lines distributed over the banks in a cyclically repeating fashion.
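
    Illustrative sketch (not taken from the application): the replacement rule in the abstract can be modelled as a small cache that only evicts entries for image locations outside the current region. The rectangular region, the oldest-first victim choice, the bypass when no victim exists, and the names RegionAwareCache and inside are assumptions made for illustration.

```python
from typing import Dict, List, Optional, Tuple

Coord = Tuple[int, int]             # (x, y) location in the image
Region = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive


def inside(region: Region, coord: Coord) -> bool:
    x0, y0, x1, y1 = region
    x, y = coord
    return x0 <= x < x1 and y0 <= y < y1


class RegionAwareCache:
    def __init__(self, capacity: int, region: Region) -> None:
        self.capacity = capacity
        self.region = region
        self.lines: Dict[Coord, int] = {}  # dict keeps insertion order

    def _victim(self) -> Optional[Coord]:
        # Oldest cached location outside the region; inside entries are exempt.
        for coord in self.lines:
            if not inside(self.region, coord):
                return coord
        return None

    def read(self, image: List[List[int]], coord: Coord) -> int:
        if coord in self.lines:
            return self.lines[coord]
        value = image[coord[1]][coord[0]]  # miss: go to main memory
        if len(self.lines) >= self.capacity:
            victim = self._victim()
            if victim is None:
                return value  # assumption: bypass the cache if nothing is replaceable
            del self.lines[victim]
        self.lines[coord] = value
        return value
```

    With a capacity of three and a region covering two pixels, a fourth access evicts the one cached pixel outside the region and leaves both in-region pixels in place.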

    Data processing apparatus that provides parallel access to multi-dimensional array of data values
    5.
    Granted patent (in force)

    Publication number: US07694078B2

    Publication date: 2010-04-06

    Application number: US11568004

    Filing date: 2005-04-21

    IPC classification: G06F12/00

    Abstract: An array of data values, such as an image of pixel values, is stored in a main memory (12). A processing operation is performed using the pixel values. The processing operation defines time points of movement of a multidimensional region (20, 22) of locations in the image. Pixel values from inside and around the region are cached for processing. At least when a cache miss occurs for a pixel value from outside the region, cache replacement of data in cache locations (142) is performed. Cache locations that store pixel data for image locations outside the region (20, 22) are selected for replacement, while cache locations (142) that store pixel data for image locations inside the region are selectively exempted from replacement. In embodiments, different types of cache structure are used for caching data values inside and outside the region. In an embodiment, the cache locations for pixel data inside the region support a higher level of output parallelism than the cache locations for pixel data around the region. In a further embodiment, the cache for locations inside the region contains sets of banks, each set for a respective line from the image, with data from the lines distributed over the banks in a cyclically repeating fashion.
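
    Illustrative sketch (not taken from the patent): the cyclic distribution over sets of banks can be pictured with modulo arithmetic. The mapping below, with one bank set per cached image line and pixels along a line spread round-robin over the banks of that set, is one assumed reading of the abstract; the function names and parameters are hypothetical.

```python
from typing import Tuple


def bank_for_pixel(x: int, y: int, lines_cached: int, banks_per_set: int) -> Tuple[int, int]:
    """Map an image location to (bank set, bank within the set)."""
    bank_set = y % lines_cached  # assumed: one set per cached image line
    bank = x % banks_per_set     # cyclic distribution along the line
    return bank_set, bank


def conflict_free(x0: int, y0: int, width: int, height: int,
                  lines_cached: int, banks_per_set: int) -> bool:
    """True if every pixel of a width x height window lands in a distinct bank."""
    banks = {
        bank_for_pixel(x, y, lines_cached, banks_per_set)
        for y in range(y0, y0 + height)
        for x in range(x0, x0 + width)
    }
    return len(banks) == width * height
```

    Under this mapping any window no wider than the number of banks per set and no taller than the number of sets touches each bank at most once, regardless of where the window starts, which is what would allow all of its pixels to be read out in parallel.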

    Enhancing performance of a memory unit of a data processing device by separating reading and fetching functionalities
    6.
    Granted patent (in force)

    Publication number: US07797493B2

    Publication date: 2010-09-14

    Application number: US11815981

    Filing date: 2006-02-13

    IPC classification: G06F13/28

    Abstract: The present invention relates to a data processing device (10) comprising a processing unit (12) and a memory unit (14), and to a method for controlling operation of a memory unit (14) of a data processing device. The memory unit (14) comprises a main memory (16), a low-level cache memory (20.2), which is directly connected to the processing unit (12) and adapted to hold all pixels of a currently active sliding search area for reading access by the processing unit (12), a high-level cache memory (18), which is connected between the low-level cache memory and the frame memory, and a first pre-fetch buffer (20.1), which is connected between the high-level cache memory and the low-level cache memory and which is adapted to hold one search-area column or one search-area line of pixel blocks, depending on the scan direction and scan order followed by the processing unit. Reading and fetching functionalities are decoupled in the memory unit (14). The fetching functionality is concentrated on the higher cache level, while the reading functionality is concentrated on the lower cache level. In this way, concurrent reading and fetching can be achieved, thus enhancing the performance of a data processing device.
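
    Illustrative sketch (not taken from the patent): the decoupling of reading and fetching can be modelled with a low-level store that serves reads and a pre-fetch buffer that is filled independently. This is a simplification under stated assumptions: the frame list stands in for the main memory and high-level cache together, the search area slides by one column at a time, and the class and method names are hypothetical. No real concurrency is modelled; fetch and read are simply independent calls.

```python
from collections import deque
from typing import Deque, List, Optional


class SlidingSearchAreaMemory:
    def __init__(self, frame: List[List[int]], area_width: int) -> None:
        self.frame = frame  # indexed [column][row]; stands in for the higher levels
        self.next_column = area_width
        # Low-level cache: the columns of the currently active search area.
        self.low_level: Deque[List[int]] = deque(frame[:area_width], maxlen=area_width)
        self.prefetch_buffer: Optional[List[int]] = None

    def fetch(self) -> None:
        """Fetch side: load the next column into the pre-fetch buffer."""
        if self.prefetch_buffer is None and self.next_column < len(self.frame):
            self.prefetch_buffer = list(self.frame[self.next_column])
            self.next_column += 1

    def read(self, column: int, row: int) -> int:
        """Read side: served only from the low-level cache."""
        return self.low_level[column][row]

    def slide(self) -> bool:
        """Move the search area by one column if a pre-fetched column is ready."""
        if self.prefetch_buffer is None:
            return False
        self.low_level.append(self.prefetch_buffer)  # oldest column drops out
        self.prefetch_buffer = None
        return True
```

    Reads never touch the pre-fetch buffer or the higher levels in this model, so a fetch in progress does not stall them; the buffer contents only become visible to the reader when the search area slides.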

    Enhancing Performance of a Memory Unit of a Data Processing Device By Separating Reading and Fetching Functionalities
    7.
    Patent application (in force)

    Publication number: US20080147980A1

    Publication date: 2008-06-19

    Application number: US11815981

    Filing date: 2006-02-13

    IPC classification: G06F12/08

    Abstract: The present invention relates to a data processing device (10) comprising a processing unit (12) and a memory unit (14), and to a method for controlling operation of a memory unit (14) of a data processing device. The memory unit (14) comprises a main memory (16), a low-level cache memory (20.2), which is directly connected to the processing unit (12) and adapted to hold all pixels of a currently active sliding search area for reading access by the processing unit (12), a high-level cache memory (18), which is connected between the low-level cache memory and the frame memory, and a first pre-fetch buffer (20.1), which is connected between the high-level cache memory and the low-level cache memory and which is adapted to hold one search-area column or one search-area line of pixel blocks, depending on the scan direction and scan order followed by the processing unit. Reading and fetching functionalities are decoupled in the memory unit (14). The fetching functionality is concentrated on the higher cache level, while the reading functionality is concentrated on the lower cache level. In this way, concurrent reading and fetching can be achieved, thus enhancing the performance of a data processing device.
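
    Illustrative sketch (not taken from the application): whether the pre-fetch buffer needs a column or a line of pixel blocks follows from the direction in which the search area moves. The function below lists the block coordinates that enter a rectangular search area when it moves by one block; the function name, the direction strings and the block-coordinate convention are assumptions made for illustration.

```python
from typing import List, Tuple


def blocks_to_prefetch(area_x: int, area_y: int, area_w: int, area_h: int,
                       direction: str) -> List[Tuple[int, int]]:
    """Block coordinates (x, y) entering the search area after a one-block move."""
    if direction == "right":  # horizontal scan: one new column
        return [(area_x + area_w, y) for y in range(area_y, area_y + area_h)]
    if direction == "left":
        return [(area_x - 1, y) for y in range(area_y, area_y + area_h)]
    if direction == "down":   # vertical scan: one new line
        return [(x, area_y + area_h) for x in range(area_x, area_x + area_w)]
    if direction == "up":
        return [(x, area_y - 1) for x in range(area_x, area_x + area_w)]
    raise ValueError(f"unknown scan direction: {direction}")
```

    For a search area three blocks wide and two blocks high, a horizontal move needs a column of two blocks and a vertical move needs a line of three blocks, so in this model the buffer only ever has to hold the larger of the two.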
