Translation of SIMD instructions in a data processing system
    1.
    Patent application
    Status: In force

    Publication No.: US20080141012A1

    Publication date: 2008-06-12

    Application No.: US11905160

    Filing date: 2007-09-27

    IPC classification: G06F9/318

    Abstract: A data processing system is provided having a processor and analysing circuitry for identifying a SIMD instruction associated with a first SIMD instruction set and replacing it by a functionally-equivalent scalar representation and marking that functionally-equivalent scalar representation. The marked functionally-equivalent scalar representation is dynamically translated using translation circuitry upon execution of the program to generate one or more corresponding translated instructions corresponding to an instruction set architecture different from the first SIMD architecture corresponding to the identified SIMD instruction.
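
The two steps described in this abstract (replace a SIMD instruction with a marked scalar equivalent, then translate the marked code at run time for a different SIMD architecture) can be illustrated with a toy model. This is only a sketch under stated assumptions: the `Instr` record, the `vadd` opcode, the `group` field used as the marker, and the lane widths are all hypothetical and are not taken from the patent.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass(frozen=True)
class Instr:
    op: str
    args: tuple
    group: Optional[int] = None  # hypothetical marker: id of the SIMD instruction replaced

def scalarize(program):
    """Analysis step: replace each SIMD add with marked scalar adds."""
    out = []
    for index, ins in enumerate(program):
        if ins.op == "vadd":  # hypothetical SIMD opcode: (dst, a, b, lanes)
            dst, a, b, lanes = ins.args
            for i in range(lanes):
                out.append(Instr("add", (f"{dst}[{i}]", f"{a}[{i}]", f"{b}[{i}]"), group=index))
        else:
            out.append(ins)
    return out

def translate(program, target_lanes):
    """Run-time step: regroup each marked scalar run into target-width SIMD adds."""
    out, run = [], []

    def flush():
        for start in range(0, len(run), target_lanes):
            chunk = run[start:start + target_lanes]
            if len(chunk) == 1:
                out.append(Instr("add", chunk[0].args))
            else:
                out.append(Instr(f"vadd{len(chunk)}", tuple(c.args for c in chunk)))
        run.clear()

    for ins in program:
        # A run ends at an unmarked instruction or when the marker changes.
        if ins.group is None or (run and run[-1].group != ins.group):
            flush()
        if ins.group is None:
            out.append(ins)
        else:
            run.append(ins)
    flush()
    return out

program = [Instr("vadd", ("d", "a", "b", 4)), Instr("ret", ())]
scalar = scalarize(program)            # four marked scalar adds, then ret
narrow = translate(scalar, 2)          # target with 2-lane SIMD
print([ins.op for ins in narrow])
```

Because the marked scalar form carries no fixed lane count, the same scalarized program can be retargeted by changing only `target_lanes`.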

    Translation of SIMD instructions in a data processing system
    2.
    Granted patent
    Status: In force

    Publication No.: US08505002B2

    Publication date: 2013-08-06

    Application No.: US11905160

    Filing date: 2007-09-27

    IPC classification: G06F9/45

    Abstract: A data processing system is provided having a processor and analysing circuitry for identifying a SIMD instruction associated with a first SIMD instruction set and replacing it by a functionally-equivalent scalar representation and marking that functionally-equivalent scalar representation. The marked functionally-equivalent scalar representation is dynamically translated using translation circuitry upon execution of the program to generate one or more corresponding translated instructions corresponding to an instruction set architecture different from the first SIMD architecture corresponding to the identified SIMD instruction.

    Data processing apparatus and method for accelerating execution of subgraphs
    3.
    Granted patent
    Status: In force

    Publication No.: US07769982B2

    Publication date: 2010-08-03

    Application No.: US11884362

    Filing date: 2005-06-22

    IPC classification: G06F9/38 G06F9/318

    Abstract: A data processing apparatus and method are provided for processing data under control of a program having program instructions including sequences of individual program instructions corresponding to computational subgraphs identified within the program. Each computational subgraph has a number of input operands and produces one or more output operands. The apparatus comprises an operand store for storing the input and output operands, and processing logic for executing individual program instructions from the program. Also provided is configurable accelerator logic which, in response to reaching an execution point within the program corresponding to a sequence of individual program instructions corresponding to a computational subgraph, evaluates one or more output functions associated with the computational subgraph. The evaluation of each output function generates an output operand for storing in the operand store, and each output operand corresponds to an output that would have been generated had the sequence of individual program instructions corresponding to the computational subgraph been executed by the processing logic. Configuration storage stores a single look-up table (LUT) configuration for each output function, and for each output function to be evaluated, the accelerator logic is configured dependent on the associated single LUT configuration from the configuration storage, such that on receipt of the input operands of the computational subgraph, the accelerator logic will generate the output operand. This technique has been found to provide a particularly efficient accelerator logic for evaluating output functions associated with computational subgraphs.
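
The idea of one look-up table per output function can be sketched as follows. This is a simplified illustration, not the patented design: it assumes single-bit input operands, and the three-instruction subgraph and the names `run_subgraph`, `build_lut`, and `evaluate` are hypothetical.

```python
from itertools import product

def run_subgraph(a, b, c):
    """Reference path: the subgraph executed one instruction at a time."""
    t = a & b
    u = t ^ c
    return u | a

def build_lut(fn, n_inputs):
    """Configuration step: tabulate the output function over every input combination."""
    # product() enumerates inputs with the first operand as the most significant bit,
    # which matches the index packing used in evaluate().
    return [fn(*bits) for bits in product((0, 1), repeat=n_inputs)]

def evaluate(lut, *bits):
    """Accelerated path: pack the input operands into an index and read the table."""
    index = 0
    for bit in bits:
        index = (index << 1) | bit
    return lut[index]

lut = build_lut(run_subgraph, 3)
print(lut)
print(evaluate(lut, 1, 0, 1) == run_subgraph(1, 0, 1))
```

The table replaces the whole instruction sequence with a single lookup, so the result equals what the individual instructions would have produced for the same inputs.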

    Data Processing Apparatus and Method for Accelerating Execution Subgraphs
    4.
    Patent application
    Status: In force

    Publication No.: US20080263332A1

    Publication date: 2008-10-23

    Application No.: US11884362

    Filing date: 2005-06-22

    IPC classification: G06F9/30

    Abstract: A data processing apparatus and method are provided for processing data under control of a program having program instructions including sequences of individual program instructions corresponding to computational subgraphs identified within the program. Each computational subgraph has a number of input operands and produces one or more output operands. The apparatus comprises an operand store for storing the input and output operands, and processing logic for executing individual program instructions from the program. Also provided is configurable accelerator logic which, in response to reaching an execution point within the program corresponding to a sequence of individual program instructions corresponding to a computational subgraph, evaluates one or more output functions associated with the computational subgraph. The evaluation of each output function generates an output operand for storing in the operand store, and each output operand corresponds to an output that would have been generated had the sequence of individual program instructions corresponding to the computational subgraph been executed by the processing logic. Configuration storage stores a single look-up table (LUT) configuration for each output function, and for each output function to be evaluated, the accelerator logic is configured dependent on the associated single LUT configuration from the configuration storage, such that on receipt of the input operands of the computational subgraph, the accelerator logic will generate the output operand. This technique has been found to provide a particularly efficient accelerator logic for evaluating output functions associated with computational subgraphs.

    Instruction subgraph identification for a configurable accelerator
    5.
    Patent application
    Status: Under examination (published)

    Publication No.: US20070220235A1

    Publication date: 2007-09-20

    Application No.: US11375572

    Filing date: 2006-03-15

    IPC classification: G06F9/40

    Abstract: An integrated circuit 2 includes a configurable accelerator 14. An instruction identifier 22 identifies subgraphs of program instructions which are capable of being performed as combined complex operations by the configurable accelerator 14. The subgraph identifier 22 reorders the sequence of fetched instructions to enable larger subgraphs of program instructions to be formed for acceleration and uses a postpone buffer 24 to store any postponed instructions which have been pushed later in the instruction stream by the reordering action of the subgraph identifier 22.
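
The reordering described in this abstract can be pictured with a small greedy pass over a window of fetched instructions. The sketch below is an assumption-laden illustration, not the patented circuit: the `Instr` tuple, the `accelerable` flag, and the register-only hazard checks (memory dependences are ignored) are all hypothetical simplifications.

```python
from collections import namedtuple

# Hypothetical instruction record: one destination register, a tuple of source registers.
Instr = namedtuple("Instr", "name dst srcs accelerable")

def form_subgraph(window):
    """Split a fetch window into an accelerable subgraph and postponed instructions."""
    subgraph, postponed = [], []
    for ins in window:
        # An instruction may only move ahead of postponed ones if no register hazard exists.
        blocked = any(
            p.dst in ins.srcs      # it reads a postponed result (read-after-write)
            or ins.dst in p.srcs   # it overwrites a postponed input (write-after-read)
            or ins.dst == p.dst    # both write the same register (write-after-write)
            for p in postponed
        )
        if ins.accelerable and not blocked:
            subgraph.append(ins)
        else:
            postponed.append(ins)  # pushed later in the stream, original order kept
    return subgraph, postponed

window = [
    Instr("i0", "r1", ("r0", "r0"), True),
    Instr("i1", "r2", ("r9",), False),       # e.g. a load the accelerator cannot perform
    Instr("i2", "r3", ("r1", "r0"), True),
    Instr("i3", "r4", ("r2", "r3"), True),   # depends on the postponed i1
]
subgraph, postponed = form_subgraph(window)
print([i.name for i in subgraph], [i.name for i in postponed])
```

Postponing `i1` lets `i0` and `i2` form one larger subgraph, while `i3` stays behind `i1` because it consumes its result.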

    Methods and systems for coherence protocol tuning
    6.
    Granted patent
    Status: In force

    Publication No.: US08185697B1

    Publication date: 2012-05-22

    Application No.: US11030937

    Filing date: 2005-01-07

    IPC classification: G06F12/00

    CPC classification: G06F12/0815

    Abstract: A method and system for selectively applying one of a plurality of different memory coherence protocols are described. When an application is executed to generate a memory access transaction, a table can be evaluated to determine whether the transaction should be processed in accordance with a first memory coherence protocol or a second memory coherence protocol. Then, the transaction can be processed according to the selected memory coherence protocol. Alternatively, or in conjunction therewith, the application can be modified to execute more efficiently on a particular memory coherence protocol.
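
The table lookup mentioned in this abstract might look like the sketch below. The abstract does not say how the table is keyed, so indexing by non-overlapping address range is an assumption, as are the class name `ProtocolTable` and the protocol labels.

```python
from bisect import bisect_right

class ProtocolTable:
    """Maps address ranges to a coherence protocol; ranges are assumed not to overlap."""

    def __init__(self, default):
        self.default = default
        self.starts = []    # sorted range start addresses
        self.entries = []   # (start, end, protocol), kept parallel to self.starts

    def add_range(self, start, end, protocol):
        i = bisect_right(self.starts, start)
        self.starts.insert(i, start)
        self.entries.insert(i, (start, end, protocol))

    def lookup(self, address):
        # Find the last range starting at or before the address, then check its end.
        i = bisect_right(self.starts, address) - 1
        if i >= 0:
            start, end, protocol = self.entries[i]
            if start <= address < end:
                return protocol
        return self.default

table = ProtocolTable(default="write-invalidate")
table.add_range(0x1000, 0x2000, "write-update")

for address in (0x0800, 0x1800, 0x2000):
    print(hex(address), table.lookup(address))
```

Each memory access transaction would consult `lookup` first and then be handed to whichever protocol handler the table selects.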

    Multiprocessor system
    7.
    Patent application
    Status: Under examination (published)

    Publication No.: US20100153685A1

    Publication date: 2010-06-17

    Application No.: US12622674

    Filing date: 2009-11-20

    Applicant: Sami Yehia

    Inventor: Sami Yehia

    IPC classification: G06F15/76 G06F9/02

    CPC classification: G06F9/3879 G06F15/8007

    Abstract: The invention relates to a multiprocessor system on an electronic chip (300) comprising at least two computing tiles, each of the computing tiles comprising a generalist processor, and means for access to a communication network (320), the said computing tiles being connected together via the said communication network, the said multiprocessor system being characterized in that: a generalist processor uses an instruction set which defines all the operations to be executed by the said processor, and the generalist processors have one and the same instruction set; at least one of the computing tiles also comprises an accelerator coupled to the generalist processor and accelerating computing tasks of the said generalist processor.

    Entry replacement within a data store
    8.
    Patent application
    Status: In force

    Publication No.: US20080183986A1

    Publication date: 2008-07-31

    Application No.: US12010093

    Filing date: 2008-01-18

    IPC classification: G06F12/00 G06F9/38

    CPC classification: G06F12/121 G06F12/126

    Abstract: A data processing system 2 includes a data store 14 having storage locations storing entries which can be used for a variety of purposes, such as operand value prediction, branch prediction, etc. An entry profile store 16 stores profile data in respect of more candidate entries than there are storage locations within the data store 14. The profile data is used to determine replacement policy for entries within the data store 14. The profile data 16 can include hash values used to determine whether or not predictions associated with candidate entries were correct without having to store the full predictions within the profile data.
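
One way to picture profile-driven replacement with hashed predictions is the sketch below. It is illustrative only: last-value prediction, the signed score counter, the 4-bit hash, and the rule "replace the lowest-scoring resident entry when a candidate scores higher" are assumptions chosen for the example, not details stated in the abstract.

```python
class ProfiledStore:
    """Small prediction store guided by a larger profile store (hypothetical model)."""

    def __init__(self, capacity, hash_bits=4):
        self.capacity = capacity
        self.mask = (1 << hash_bits) - 1
        self.store = {}    # key -> full predicted value, at most `capacity` entries
        self.profile = {}  # key -> [hash of last value, score], tracks more keys

    def _hash(self, value):
        return hash(value) & self.mask

    def observe(self, key, actual):
        h = self._hash(actual)
        if key in self.profile:
            record = self.profile[key]
            # Comparing hashes tells whether a last-value prediction would have
            # been correct, without keeping the full value in the profile.
            record[1] += 1 if record[0] == h else -1
            record[0] = h
        else:
            self.profile[key] = [h, 0]

        if key in self.store or len(self.store) < self.capacity:
            self.store[key] = actual
            return
        # Store is full: evict the resident entry with the lowest profile score,
        # but only if the candidate has proved more predictable.
        victim = min(self.store, key=lambda k: self.profile[k][1])
        if self.profile[key][1] > self.profile[victim][1]:
            del self.store[victim]
            self.store[key] = actual

s = ProfiledStore(capacity=1)
s.observe("a", 7)   # "a" fills the only storage location
s.observe("b", 3)   # "b" is profiled but not yet stored
s.observe("b", 3)   # "b" repeats its value, outscores "a", and replaces it
print(s.store, len(s.profile))
```

The profile keeps only a few bits per candidate, which is what allows it to track more candidates than the data store has storage locations.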

    Entry replacement within a data store using entry profile data and runtime performance gain data
    9.
    Granted patent
    Status: In force

    Publication No.: US08271750B2

    Publication date: 2012-09-18

    Application No.: US12010093

    Filing date: 2008-01-18

    CPC classification: G06F12/121 G06F12/126

    Abstract: A data processing system includes a data store having storage locations storing entries which can be used for a variety of purposes, such as operand value prediction, branch prediction, etc. An entry profile store stores profile data for more candidate entries than there are storage locations within the data store. The profile data is used to determine replacement policy for entries within the data store. The profile data can include hash values used to determine whether predictions associated with candidate entries were correct without having to store the full predictions within the profile data.
