Patent search: ap:("Sami Yehia" OR "Krisztian Flautner" OR "Nathan Clark" OR "Amir Hormati" OR "Scott Mahlke") AND inv:"Sami Yehia" (page 1)

1.

Patent application
Translation of SIMD instructions in a data processing system
Legal status: In force

Publication number: US20080141012A1

Publication date: 2008-06-12

Application number: US11905160

Filing date: 2007-09-27

Applicants: Sami Yehia, Krisztian Flautner, Nathan Clark, Amir Hormati, Scott Mahlke

Inventors: Sami Yehia, Krisztian Flautner, Nathan Clark, Amir Hormati, Scott Mahlke

IPC classification: G06F9/318

CPC classification: G06F9/30054, G06F9/30032, G06F9/30036, G06F9/30174, G06F9/3802, G06F9/3808, G06F9/3885, G06F9/3887, G06F9/45516

Abstract: A data processing system is provided having a processor and analysing circuitry for identifying a SIMD instruction associated with a first SIMD instruction set and replacing it by a functionally-equivalent scalar representation and marking that functionally-equivalent scalar representation. The marked functionally-equivalent scalar representation is dynamically translated using translation circuitry upon execution of the program to generate one or more corresponding translated instructions corresponding to an instruction set architecture different from the first SIMD architecture corresponding to the identified SIMD instruction.

2.

Granted patent
Translation of SIMD instructions in a data processing system
Legal status: In force

Publication number: US08505002B2

Publication date: 2013-08-06

Application number: US11905160

Filing date: 2007-09-27

Applicants: Sami Yehia, Krisztian Flautner, Nathan Clark, Amir Hormati, Scott Mahlke

Inventors: Sami Yehia, Krisztian Flautner, Nathan Clark, Amir Hormati, Scott Mahlke

IPC classification: G06F9/45

CPC classification: G06F9/30054, G06F9/30032, G06F9/30036, G06F9/30174, G06F9/3802, G06F9/3808, G06F9/3885, G06F9/3887, G06F9/45516

Abstract: A data processing system is provided having a processor and analysing circuitry for identifying a SIMD instruction associated with a first SIMD instruction set and replacing it by a functionally-equivalent scalar representation and marking that functionally-equivalent scalar representation. The marked functionally-equivalent scalar representation is dynamically translated using translation circuitry upon execution of the program to generate one or more corresponding translated instructions corresponding to an instruction set architecture different from the first SIMD architecture corresponding to the identified SIMD instruction.

3.

Granted patent
Data processing apparatus and method for accelerating execution of subgraphs
Legal status: In force

Publication number: US07769982B2

Publication date: 2010-08-03

Application number: US11884362

Filing date: 2005-06-22

Applicants: Sami Yehia, Krisztian Flautner

Inventors: Sami Yehia, Krisztian Flautner

IPC classification: G06F9/38, G06F9/318

CPC classification: G06F9/3885, G06F9/30181, G06F9/3877

Abstract: A data processing apparatus and method are provided for processing data under control of a program having program instructions including sequences of individual program instructions corresponding to computational subgraphs identified within the program. Each computational subgraph has a number of input operands and produces one or more output operands. The apparatus comprises an operand store for storing the input and output operands, and processing logic for executing individual program instructions from the program. Also provided is configurable accelerator logic which, in response to reaching an execution point within the program corresponding to a sequence of individual program instructions corresponding to a computational subgraph, evaluates one or more output functions associated with the computational subgraph. The evaluation of each output function generates an output operand for storing in the operand store, and each output operand corresponds to an output that would have been generated had the sequence of individual program instructions corresponding to the computational subgraph been executed by the processing logic. Configuration storage stores a single look-up table (LUT) configuration for each output function, and for each output function to be evaluated, the accelerator logic is configured dependent on the associated single LUT configuration from the configuration storage, such that on receipt of the input operands of the computational subgraph, the accelerator logic will generate the output operand. This technique has been found to provide a particularly efficient accelerator logic for evaluating output functions associated with computational subgraphs.

4.

Patent application
Data Processing Apparatus and Method for Accelerating Execution Subgraphs
Legal status: In force

Publication number: US20080263332A1

Publication date: 2008-10-23

Application number: US11884362

Filing date: 2005-06-22

Applicants: Sami Yehia, Krisztian Flautner

Inventors: Sami Yehia, Krisztian Flautner

IPC classification: G06F9/30

CPC classification: G06F9/3885, G06F9/30181, G06F9/3877

Abstract: A data processing apparatus and method are provided for processing data under control of a program having program instructions including sequences of individual program instructions corresponding to computational subgraphs identified within the program. Each computational subgraph has a number of input operands and produces one or more output operands. The apparatus comprises an operand store for storing the input and output operands, and processing logic for executing individual program instructions from the program. Also provided is configurable accelerator logic which, in response to reaching an execution point within the program corresponding to a sequence of individual program instructions corresponding to a computational subgraph, evaluates one or more output functions associated with the computational subgraph. The evaluation of each output function generates an output operand for storing in the operand store, and each output operand corresponds to an output that would have been generated had the sequence of individual program instructions corresponding to the computational subgraph been executed by the processing logic. Configuration storage stores a single look-up table (LUT) configuration for each output function, and for each output function to be evaluated, the accelerator logic is configured dependent on the associated single LUT configuration from the configuration storage, such that on receipt of the input operands of the computational subgraph, the accelerator logic will generate the output operand. This technique has been found to provide a particularly efficient accelerator logic for evaluating output functions associated with computational subgraphs.

5.

Patent application
Instruction subgraph identification for a configurable accelerator
Legal status: Under examination (published)

Publication number: US20070220235A1

Publication date: 2007-09-20

Application number: US11375572

Filing date: 2006-03-15

Applicants: Sami Yehia, Krisztian Flautner

Inventors: Sami Yehia, Krisztian Flautner

IPC classification: G06F9/40

CPC classification: G06F9/3802, G06F9/3836, G06F9/3838, G06F9/3855, G06F9/3879, G06F9/3897

Abstract: An integrated circuit 2 includes a configurable accelerator 14. An instruction identifier 22 identifies subgraphs of program instructions which are capable of being performed as combined complex operations by the configurable accelerator 14. The subgraph identifier 22 reorders the sequence of fetched instructions to enable larger subgraphs of program instructions to be formed for acceleration and uses a postpone buffer 24 to store any postponed instructions which have been pushed later in the instruction stream by the reordering action of the subgraph identifier 22.

6.

Granted patent
Methods and systems for coherence protocol tuning
Legal status: In force

Publication number: US08185697B1

Publication date: 2012-05-22

Application number: US11030937

Filing date: 2005-01-07

Applicants: Jean-Francois Collard, Sami Yehia

Inventors: Jean-Francois Collard, Sami Yehia

IPC classification: G06F12/00

CPC classification: G06F12/0815

Abstract: A method and system for selectively applying one of a plurality of different memory coherence protocols are described. When an application is executed to generate a memory access transaction, a table can be evaluated to determine whether the transaction should be processed in accordance with a first memory coherence protocol or a second memory coherence protocol. Then, the transaction can be processed according to the selected memory coherence protocol. Alternatively, or in conjunction therewith, the application can be modified to execute more efficiently on a particular memory coherence protocol.

7.

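Patent application
Multiprocessor system
Legal status: Under examination (published)

Publication number: US20100153685A1

Publication date: 2010-06-17

Application number: US12622674

Filing date: 2009-11-20

Applicant: Sami Yehia

Inventor: Sami Yehia

IPC classification: G06F15/76, G06F9/02

CPC classification: G06F9/3879, G06F15/8007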

Abstract: The invention relates to a multiprocessor system on an electronic chip (300) comprising at least two computing tiles, each of the computing tiles comprising a generalist processor, and means for access to a communication network (320), the said computing tiles being connected together via the said communication network, the said multiprocessor system being characterized in that: a generalist processor using an instruction set which defines all the operations to be executed by the said processor, the generalist processors have one and the same instruction set; at least one of the computing tiles also comprises an accelerator coupled to the generalist processor accelerating computing tasks of the said generalist processor.

8.

Patent application
Entry replacement within a data store
Legal status: In force

Publication number: US20080183986A1

Publication date: 2008-07-31

Application number: US12010093

Filing date: 2008-01-18

Applicants: Sami Yehia, Marios Kleanthous

Inventors: Sami Yehia, Marios Kleanthous

IPC classification: G06F12/00, G06F9/38

CPC classification: G06F12/121, G06F12/126

Abstract: A data processing system 2 includes a data store 14 having storage locations storing entries which can be used for a variety of purposes, such as operand value prediction, branch prediction, etc. An entry profile store 16 stores profile data in respect of more candidate entries than there are storage locations within the data store 14. The profile data is used to determine replacement policy for entries within the data store 14. The profile data 16 can include hash values used to determine whether or not predictions associated with candidate entries were or were not correct without having to store the full predictions within the profile data.

9.

Granted patent
Entry replacement within a data store using entry profile data and runtime performance gain data
Legal status: In force

Publication number: US08271750B2

Publication date: 2012-09-18

Application number: US12010093

Filing date: 2008-01-18

Applicants: Sami Yehia, Marios Kleanthous

Inventors: Sami Yehia, Marios Kleanthous

IPC classification: G06F12/00, G06F13/00, G06F13/28, G06F12/08, G06F9/26, G06F9/34, G06F9/30, G06F9/40, G06F7/38, G06F9/00, G06F9/44

CPC classification: G06F12/121, G06F12/126

Abstract: A data processing system includes a data store having storage locations storing entries which can be used for a variety of purposes, such as operand value prediction, branch prediction, etc. An entry profile store stores profile data for more candidate entries than there are storage locations within the data store. The profile data is used to determine replacement policy for entries within the data store. The profile data can include hash values used to determine whether predictions associated with candidate entries were correct without having to store the full predictions within the profile data.

10.

Patent application
ULTRA LOW POWER ARCHITECTURE TO SUPPORT ALWAYS ON PATH TO MEMORY
Legal status: In force

Publication number: US20160019936A1

Publication date: 2016-01-21

Application number: US14336128

Filing date: 2014-07-21

Applicants: Suketu R. Partiwala, Prashanth Kalluraya, Bruce L. Fleming, Shreekant S. Thakkar, Kenneth D. Shoemaker, Sridhar Lakshmanamurthy, Sami Yehia, Joydeep Ray

Inventors: Suketu R. Partiwala, Prashanth Kalluraya, Bruce L. Fleming, Shreekant S. Thakkar, Kenneth D. Shoemaker, Sridhar Lakshmanamurthy, Sami Yehia, Joydeep Ray

IPC classification: G11C5/14

CPC classification: G11C5/148, G06F1/189, G06F1/26, G06F1/32, G06F1/3275, Y02D10/14

Abstract: An apparatus with an ultra low power architecture is described herein. The apparatus includes a first power supply rail, wherein a plurality of subsystems are to be powered by the first power supply rail. The apparatus also includes a second power supply rail, wherein a plurality of autonomous subsystems are to be powered by the power supply rail, wherein the second power supply rail is to be always on, always available, and low power.
