Patent search: ap:("Shlomo Raikin" OR "Raanan Sade" OR "Robert Valentine" OR "Julius Yuli Mandelblat" OR "Ron Shalev" OR "Larisa Novakovsky") AND inv:"Ron Shalev" (page 1)

1.

Granted patent
Apparatus and method for memory-hierarchy aware producer-consumer instruction (Status: active)

Publication No.: US09990287B2

Publication date: 2018-06-05

Application No.: US13994122

Filing date: 2011-12-21

Applicants: Shlomo Raikin, Raanan Sade, Robert Valentine, Julius Yuli Mandelblat, Ron Shalev, Larisa Novakovsky

Inventors: Shlomo Raikin, Raanan Sade, Robert Valentine, Julius Yuli Mandelblat, Ron Shalev, Larisa Novakovsky

IPC classification: G06F13/38, G06T1/20, G06F12/0811, G06F9/30, G06F9/38, G06F13/16, G06T1/60, G09G5/00, G06F12/0866

CPC classification: G06F12/0811, G06F9/30043, G06F9/30047, G06F9/30087, G06F9/3881, G06F12/0866, G06F13/1673, G06F13/38, G06T1/20, G06T1/60, G09G5/006

Abstract: An apparatus and method are described for efficiently transferring data from a core of a central processing unit (CPU) to a graphics processing unit (GPU). For example, one embodiment of a method comprises: writing data to a buffer within the core of the CPU until a designated amount of data has been written; upon detecting that the designated amount of data has been written, responsively generating an eviction cycle, the eviction cycle causing the data to be transferred from the buffer to a cache accessible by both the core and the GPU; setting an indication to indicate to the GPU that data is available in the cache; and upon the GPU detecting the indication, providing the data to the GPU from the cache upon receipt of a read signal from the GPU.
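
Illustrative note: the sequence of steps in this abstract (fill a core-local buffer, evict it to a shared cache once a designated amount is written, set an indication, serve the consumer's read) can be modelled in software roughly as below. This is only a toy sketch, not the patented hardware mechanism; the class names `SharedCache`, `ProducerCore` and `Consumer`, the `chunk_size` parameter and the integer tags are all assumptions made for illustration.

```python
class SharedCache:
    """Hypothetical cache visible to both the producer core and the consumer."""

    def __init__(self):
        self.lines = {}     # tag -> chunk of data
        self.ready = set()  # tags whose "data available" indication is set


class ProducerCore:
    """Hypothetical core that buffers writes and evicts full chunks."""

    def __init__(self, cache, chunk_size=4):
        self.cache = cache
        self.chunk_size = chunk_size  # the "designated amount" (assumed unit: items)
        self.buffer = []
        self.next_tag = 0

    def write(self, value):
        """Write one item; return the evicted chunk's tag, or None if no eviction."""
        self.buffer.append(value)
        if len(self.buffer) == self.chunk_size:
            return self._evict()
        return None

    def _evict(self):
        # "Eviction cycle": move the buffered chunk into the shared cache,
        # then set the indication so the consumer can see it.
        tag = self.next_tag
        self.cache.lines[tag] = list(self.buffer)
        self.buffer.clear()
        self.cache.ready.add(tag)
        self.next_tag += 1
        return tag


class Consumer:
    """Hypothetical consumer (the GPU in this abstract)."""

    def __init__(self, cache):
        self.cache = cache

    def poll(self, tag):
        """Check whether the indication for this chunk is set."""
        return tag in self.cache.ready

    def read(self, tag):
        """Read the chunk from the shared cache if it is available."""
        if not self.poll(tag):
            return None
        self.cache.ready.discard(tag)
        return self.cache.lines.pop(tag)
```

In this model the consumer never touches the producer's private buffer; data becomes visible only after the eviction step, which is the ordering the abstract describes.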

2.

Patent application
APPARATUS AND METHOD FOR MEMORY-HIERARCHY AWARE PRODUCER-CONSUMER INSTRUCTION (Status: active)

Publication No.: US20140192069A1

Publication date: 2014-07-10

Application No.: US13994122

Filing date: 2011-12-21

Applicants: Shlomo Raikin, Raanan Sade, Robert Valentine, Julius Yuli Mandelblat, Ron Shalev, Larisa Novakovsky

Inventors: Shlomo Raikin, Raanan Sade, Robert Valentine, Julius Yuli Mandelblat, Ron Shalev, Larisa Novakovsky

IPC classification: G06F13/38, G06F13/16, G06T1/60, G06F12/08, G06T1/20

CPC classification: G06F12/0811, G06F9/30043, G06F9/30047, G06F9/30087, G06F9/3881, G06F12/0866, G06F13/1673, G06F13/38, G06T1/20, G06T1/60, G09G5/006

Abstract: An apparatus and method are described for efficiently transferring data from a core of a central processing unit (CPU) to a graphics processing unit (GPU). For example, one embodiment of a method comprises: writing data to a buffer within the core of the CPU until a designated amount of data has been written; upon detecting that the designated amount of data has been written, responsively generating an eviction cycle, the eviction cycle causing the data to be transferred from the buffer to a cache accessible by both the core and the GPU; setting an indication to indicate to the GPU that data is available in the cache; and upon the GPU detecting the indication, providing the data to the GPU from the cache upon receipt of a read signal from the GPU.

3.

Patent application
APPARATUS AND METHOD FOR MEMORY-HIERARCHY AWARE PRODUCER-CONSUMER INSTRUCTIONS (Status: under examination, published)

Publication No.: US20140208031A1

Publication date: 2014-07-24

Application No.: US13994724

Filing date: 2011-12-21

Applicants: Shlomo Raikin, Robert Valentine, Raanan Sade, Julius Yuli Mandelbalt, Ron Shalev, Larisa Novakovsky

Inventors: Shlomo Raikin, Robert Valentine, Raanan Sade, Julius Yuli Mandelbalt, Ron Shalev, Larisa Novakovsky

IPC classification: G06F12/08, G06T1/60

CPC classification: G06F12/0811, G06F9/3828, G06F9/3891, G06F12/0891, G06T1/60

Abstract: An apparatus and method are described for efficiently transferring data from a producer core to a consumer core within a central processing unit (CPU). For example, one embodiment of a method comprises: A method for transferring a chunk of data from a producer core of a central processing unit (CPU) to consumer core of the CPU, comprising: writing data to a buffer within the producer core of the CPU until a designated amount of data has been written; upon detecting that the designated amount of data has been written, responsively generating an eviction cycle, the eviction cycle causing the data to be transferred from the fill buffer to a cache accessible by both the producer core and the consumer core; and upon the consumer core detecting that data is available in the cache, providing the data to the consumer core from the cache upon receipt of a read signal from the consumer core.

4.

Patent application
METHOD AND APPARATUS FOR CUTTING SENIOR STORE LATENCY USING STORE PREFETCHING (Status: active)

Publication No.: US20140223105A1

Publication date: 2014-08-07

Application No.: US13993508

Filing date: 2011-12-30

Applicants: Stanislav Shwartsman, Melih Ozgul, Sebastien Hily, Shlomo Raikin, Raanan Sade, Ron Shalev

Inventors: Stanislav Shwartsman, Melih Ozgul, Sebastien Hily, Shlomo Raikin, Raanan Sade, Ron Shalev

IPC classification: G06F9/38, G06F12/08

CPC classification: G06F9/3814, G06F9/383, G06F9/3834, G06F9/3861, G06F12/0808, G06F12/0862, G06F2212/6028, G06F2212/62

Abstract: In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for cutting senior store latency using store prefetching. For example, in one embodiment, such means may include an integrated circuit or an out of order processor means that processes out of order instructions and enforces in-order requirements for a cache. Such an integrated circuit or out of order processor means further includes means for receiving a store instruction; means for performing address generation and translation for the store instruction to calculate a physical address of the memory to be accessed by the store instruction; and means for executing a pre-fetch for a cache line based on the store instruction and the calculated physical address before the store instruction retires.
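
Illustrative note: the benefit of issuing a prefetch at address-generation time, before the store retires, can be seen in a toy latency model such as the one below. It is a sketch under loud assumptions, not the patented design: the line size and the hit and miss cycle counts are invented numbers, the function name `senior_store_cycles` is hypothetical, and every prefetch is assumed to complete before its store retires.

```python
LINE_SIZE = 64     # bytes per cache line (assumed)
HIT_CYCLES = 4     # cost of committing a store to a line already in the cache (assumed)
MISS_CYCLES = 100  # cost of committing a store that must first fetch its line (assumed)


def senior_store_cycles(addresses, prefetch_at_address_generation):
    """Total cycles spent committing retired ("senior") stores, in program order.

    addresses: physical addresses of the stores, already translated.
    prefetch_at_address_generation: if True, each store's cache line is
    requested as soon as its address is known, and is assumed to have
    arrived by the time the store retires.
    """
    cache_lines = set()

    # Phase 1: address generation. Optionally prefetch every target line.
    if prefetch_at_address_generation:
        for address in addresses:
            cache_lines.add(address // LINE_SIZE)

    # Phase 2: retirement. Stores commit to the cache strictly in order.
    total = 0
    for address in addresses:
        line = address // LINE_SIZE
        if line in cache_lines:
            total += HIT_CYCLES
        else:
            total += MISS_CYCLES
            cache_lines.add(line)  # the line is fetched on demand
    return total
```

With the assumed numbers, four stores to addresses 0, 8, 64 and 128 cost 304 cycles without prefetching (three misses, one hit) and 16 cycles with it (four hits).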

5.

Granted patent
Method and apparatus for cutting senior store latency using store prefetching (Status: active)

Publication No.: US09405545B2

Publication date: 2016-08-02

Application No.: US13993508

Filing date: 2011-12-30

Applicants: Stanislav Shwartsman, Melih Ozgul, Sebastien Hily, Shlomo Raikin, Raanan Sade, Ron Shalev

Inventors: Stanislav Shwartsman, Melih Ozgul, Sebastien Hily, Shlomo Raikin, Raanan Sade, Ron Shalev

IPC classification: G06F12/08, G06F9/38

CPC classification: G06F9/3814, G06F9/383, G06F9/3834, G06F9/3861, G06F12/0808, G06F12/0862, G06F2212/6028, G06F2212/62

Abstract: In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for cutting senior store latency using store prefetching. For example, in one embodiment, such means may include an integrated circuit or an out of order processor means that processes out of order instructions and enforces in-order requirements for a cache. Such an integrated circuit or out of order processor means further includes means for receiving a store instruction; means for performing address generation and translation for the store instruction to calculate a physical address of the memory to be accessed by the store instruction; and means for executing a pre-fetch for a cache line based on the store instruction and the calculated physical address before the store instruction retires.

6.

Granted patent
Methods and apparatus for efficient communication between caches in hierarchical caching design (Status: active)

Publication No.: US09411728B2

Publication date: 2016-08-09

Application No.: US13994399

Filing date: 2011-12-23

Applicants: Ron Shalev, Yiftach Gilad, Shlomo Raikin, Igor Yanover, Stanislav Shwartsman, Raanan Sade

Inventors: Ron Shalev, Yiftach Gilad, Shlomo Raikin, Igor Yanover, Stanislav Shwartsman, Raanan Sade

IPC classification: G06F13/00, G06F12/08, G06F13/14, G06F13/38

CPC classification: G06F12/0811, G06F12/08, G06F12/0844, G06F12/0897, G06F13/14, G06F13/38

Abstract: In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for implementing efficient communication between caches in hierarchical caching design. For example, in one embodiment, such means may include an integrated circuit having a data bus; a lower level cache communicably interfaced with the data bus; a higher level cache communicably interfaced with the data bus; one or more data buffers and one or more dataless buffers. The data buffers in such an embodiment being communicably interfaced with the data bus, and each of the one or more data buffers having a buffer memory to buffer a full cache line, one or more control bits to indicate state of the respective data buffer, and an address associated with the full cache line. The dataless buffers in such an embodiment being incapable of storing a full cache line and having one or more control bits to indicate state of the respective dataless buffer and an address for an inter-cache transfer line associated with the respective dataless buffer. In such an embodiment, inter-cache transfer logic is to request the inter-cache transfer line from the higher level cache via the data bus and is to further write the inter-cache transfer line into the lower level cache from the data bus.

7.

Patent application
METHODS AND APPARATUS FOR EFFICIENT COMMUNICATION BETWEEN CACHES IN HIERARCHICAL CACHING DESIGN (Status: active)

Publication No.: US20130326145A1

Publication date: 2013-12-05

Application No.: US13994399

Filing date: 2011-12-23

Applicants: Ron Shalev, Yiftach Gilad, Shlomo Raikin, Igor Yanover, Stanislav Shwartsman, Raanan Sade

Inventors: Ron Shalev, Yiftach Gilad, Shlomo Raikin, Igor Yanover, Stanislav Shwartsman, Raanan Sade

IPC classification: G06F12/08

CPC classification: G06F12/0811, G06F12/08, G06F12/0844, G06F12/0897, G06F13/14, G06F13/38

Abstract: In accordance with embodiments disclosed herein, there are provided methods, systems, mechanisms, techniques, and apparatuses for implementing efficient communication between caches in hierarchical caching design. For example, in one embodiment, such means may include an integrated circuit having a data bus; a lower level cache communicably interfaced with the data bus; a higher level cache communicably interfaced with the data bus; one or more data buffers and one or more dataless buffers. The data buffers in such an embodiment being communicably interfaced with the data bus, and each of the one or more data buffers having a buffer memory to buffer a full cache line, one or more control bits to indicate state of the respective data buffer, and an address associated with the full cache line. The dataless buffers in such an embodiment being incapable of storing a full cache line and having one or more control bits to indicate state of the respective dataless buffer and an address for an inter-cache transfer line associated with the respective dataless buffer. In such an embodiment, inter-cache transfer logic is to request the inter-cache transfer line from the higher level cache via the data bus and is to further write the inter-cache transfer line into the lower level cache from the data bus.

8.

Patent application
TASKING SYSTEM INTERFACE METHODS AND APPARATUSES FOR USE IN WIRELESS DEVICES (Status: active)

Publication No.: US20110296415A1

Publication date: 2011-12-01

Application No.: US12896636

Filing date: 2010-10-01

Applicants: Raheel Khan, Joseph C. Chan, Ron Shalev, Naveed U. Zaman

Inventors: Raheel Khan, Joseph C. Chan, Ron Shalev, Naveed U. Zaman

IPC classification: G06F9/46

CPC classification: G06F9/544

Abstract: Techniques are provided which may be implemented in various methods and/or apparatuses that to provide a tasking system buffer interface capability to interface with a plurality of shared processes/engines.

9.

Granted patent
Early energy measurement (Status: active)

Publication No.: US07991085B2

Publication date: 2011-08-02

Application No.: US11427651

Filing date: 2006-06-29

Applicant: Ron Shalev

Inventor: Ron Shalev

IPC classification: H04L27/28

CPC classification: H03G3/3078, H04L27/2607, H04L27/2647

Abstract: In a described implementation of early energy measurement, a wireless device adjusts a receiver gain during each current symbol time responsive to a signal energy level measured in a previous symbol time.
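
Illustrative note: the one-symbol-delayed gain update described in this abstract can be sketched as a simple loop. This is a hedged illustration only; the function name `track_gain`, the `target_energy` and `initial_gain` parameters, the mean-square energy estimate and the square-root gain rule are all assumptions, and the patent may use a different measurement and update rule.

```python
def track_gain(symbols, target_energy=1.0, initial_gain=1.0):
    """Return the receiver gain applied during each symbol time.

    symbols: list of symbols, each a non-empty list of input samples.
    The gain used for symbol k is derived from the energy measured
    during symbol k-1; the first symbol uses initial_gain.
    """
    gain = initial_gain
    applied_gains = []
    for samples in symbols:
        # The gain in effect now was chosen from the previous symbol's energy.
        applied_gains.append(gain)

        # Measure this symbol's mean-square energy at the receiver input.
        energy = sum(s * s for s in samples) / len(samples)

        # Choose the gain for the next symbol so that its scaled energy
        # would match the target if the signal level stayed the same.
        if energy > 0:
            gain = (target_energy / energy) ** 0.5
    return applied_gains
```

A strong symbol therefore lowers the gain applied to the following symbol, and a weak one raises it.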

10.

Patent application
ENERGY AND AREA OPTIMIZED HETEROGENEOUS MULTIPROCESSOR FOR CASCADE CLASSIFIERS (Status: under examination, published)

Publication No.: US20160275043A1

Publication date: 2016-09-22

Application No.: US14662089

Filing date: 2015-03-18

Applicants: Edward T. Grochowski, Michael E. Kounavis, Ron Shalev

Inventors: Edward T. Grochowski, Michael E. Kounavis, Ron Shalev

IPC classification: G06F15/80, G06F9/38

CPC classification: G06F9/3887, G06F9/3836, G06F9/3867, G06F9/3869, G06F9/3885

Abstract: In one embodiment, a heterogeneous multicore processor is described that is optimized to execute multi-stage computer vision algorithms such as cascade classifier workloads. In such embodiment the heterogeneous processor includes at least one SIMD core, such as a vector processor core, coupled with one or more scalar cores. In one embodiment the heterogeneous multiprocessor executes multi-stage compute operations, where the SIMD core computes a first set of stages and the one or more scalar cores compute the second set of stages. In one embodiment, a process for designing a heterogeneous multicore processor is disclosed which optimizes the ratio of scalar to SIMD cores based on execution time of the multi-stage compute operation in relation to processor die area consumed by a processor configuration having the ratio.
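
Illustrative note: the split of a cascade between a SIMD core (first set of stages) and scalar cores (second set of stages) can be mimicked in plain software as below. This is a sketch of the division of labour only and runs sequentially; the function name `run_cascade`, the `simd_stage_count` parameter and the representation of stages as boolean predicates are assumptions for illustration.

```python
def run_cascade(windows, stages, simd_stage_count):
    """Run a cascade classifier with its stages split into two groups.

    windows: candidate inputs (for example, image windows).
    stages: list of predicates; a window is accepted only if every stage passes.
    simd_stage_count: number of leading stages given to the "SIMD" group.
    """
    simd_stages = stages[:simd_stage_count]
    scalar_stages = stages[simd_stage_count:]

    # "SIMD" group: each early stage is applied across all surviving windows
    # in lockstep, which suits wide data-parallel hardware while most
    # windows are still alive.
    survivors = list(windows)
    for stage in simd_stages:
        survivors = [w for w in survivors if stage(w)]

    # "Scalar" group: the few survivors are processed one at a time, with an
    # early exit as soon as any later stage rejects the window.
    accepted = []
    for window in survivors:
        if all(stage(window) for stage in scalar_stages):
            accepted.append(window)
    return accepted
```

The result does not depend on where the split is placed; only the amount of work done by each group changes, which is the quantity a design process of the kind described in the abstract would trade off against die area.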
