专利检索 ap:("Kevin B. Normoyle" OR "Michael A. Csoppenszky" OR "Jaybharat Boddu" OR "Jui-Cheng Su" OR "Alex S. Han" OR "Rajasekhar Cherabuddi" OR "Tzungren Tzeng") AND inv:"Rajasekhar Cherabuddi" 第 1 页

1.

发明授权
DMA transfer method for a system including a single-chip processor with a processing core and a device interface in different clock domains 有权
标题翻译：包括具有处理核心的单芯片处理器和不同时钟域中的器件接口的系统的DMA传输方法

公开(公告)号：US06553435B1

公开(公告)日：2003-04-22

申请号：US09229013

申请日：1999-01-12

申请人： Kevin B. Normoyle , Michael A. Csoppenszky , Jaybharat Boddu , Jui-Cheng Su , Alex S. Han , Rajasekhar Cherabuddi , Tzungren Tzeng

发明人： Kevin B. Normoyle , Michael A. Csoppenszky , Jaybharat Boddu , Jui-Cheng Su , Alex S. Han , Rajasekhar Cherabuddi , Tzungren Tzeng

IPC分类号： G06F1328

CPC分类号： G06F12/0897 , G06F12/0888 , G06F12/1054 , G06F12/1081 , G06F13/28 , G06F15/7832

摘要： A single-chip central processing unit (CPU) includes a processing core and a complete cache-coherent I/O system that operates asynchronously with the processing core. An internal communications protocol uses synchronizers and data buffers to transfer information between a clock domain of the processing core and a clock domain of the I/O system. The synchronizers transfer control and handshake signal between clock domains, but the data buffer transfers data without input or output synchronization circuitry for data bits. Throughput for the system is high because the processing unit has direct access to I/O system so that no delays are incurred for complex mechanisms which are commonly employed between a CPU and an external I/O chip-set. Throughput is further increased by holding data from one DMA transfer in the data buffer for use in a subsequent DMA transfer. In one embodiment, the integrated I/O system contains a dedicated memory management unit including a translation lookaside buffer which converts I/O addresses to physical addresses for the processing core.

摘要翻译： 单芯片中央处理单元（CPU）包括处理核心和与处理核心异步运行的完整高速缓存相干I / O系统。内部通信协议使用同步器和数据缓冲器在处理核心的时钟域和I / O系统的时钟域之间传输信息。同步器在时钟域之间传送控制和握手信号，但是数据缓冲器传输数据，而不需要输入或输出同步电路来进行数据位。系统的吞吐量很高，因为处理单元可以直接访问I / O系统，以便在CPU和外部I / O芯片组之间通常采用的复杂机制不会产生延迟。通过保存来自数据缓冲器中的一个DMA传输的数据以进行后续DMA传输来进一步增加吞吐量。在一个实施例中，集成I / O系统包含专用存储器管理单元，其包括将I / O地址转换为处理核的物理地址的翻译后备缓冲器。

2.

发明授权
Low-latency, high-throughput, integrated cache coherent I/O system for a single-chip processor 失效
标题翻译：用于单芯片处理器的低延迟，高吞吐量的集成缓存一致I / O系统

公开(公告)号：US5884100A

公开(公告)日：1999-03-16

申请号：US660026

申请日：1996-06-06

申请人： Kevin B. Normoyle , Michael A. Csoppenszky , Jaybharat Boddu , Jui-Cheng Su , Alex S. Han , Rajasekhar Cherabuddi , Tzungren Tzeng

发明人： Kevin B. Normoyle , Michael A. Csoppenszky , Jaybharat Boddu , Jui-Cheng Su , Alex S. Han , Rajasekhar Cherabuddi , Tzungren Tzeng

IPC分类号： G06F12/08 , G06F12/10 , G06F13/28 , G06F15/78 , G06F13/00

CPC分类号： G06F12/0897 , G06F12/1081 , G06F13/28 , G06F15/7832 , G06F12/0888 , G06F12/1054

摘要： A single-chip central processing unit (CPU) includes a processing core and a complete cache-coherent I/O system that operates asynchronously with the processing core. An internal communications protocol uses synchronizers and data buffers to transfer information between a clock domain of the processing core and a clock domain of the I/O system. The synchronizers transfer control and handshake signal between clock domains, but the data buffer transfers data without input or output synchronization circuitry for data bits. Throughput for the system is high because the processing unit has direct access to I/O system so that no delays are incurred for complex mechanisms which are commonly employed between a CPU and an external I/O chip-set. Throughput is further increased by holding data from one DMA transfer in the data buffer for use in a subsequent DMA transfer. In one embodiment, the integrated I/O system contains a dedicated memory management unit including a translation lookaside buffer which converts I/O addresses to physical addresses for the processing core.

摘要翻译： 单芯片中央处理单元（CPU）包括处理核心和与处理核心异步运行的完整高速缓存相干I / O系统。内部通信协议使用同步器和数据缓冲器在处理核心的时钟域和I / O系统的时钟域之间传输信息。同步器在时钟域之间传送控制和握手信号，但是数据缓冲器传输数据，而不需要输入或输出同步电路来进行数据位。系统的吞吐量很高，因为处理单元可以直接访问I / O系统，以便在CPU和外部I / O芯片组之间通常采用的复杂机制不会产生延迟。通过保存来自数据缓冲器中的一个DMA传输的数据以进行后续DMA传输来进一步增加吞吐量。在一个实施例中，集成I / O系统包含专用存储器管理单元，其包括将I / O地址转换为处理核的物理地址的翻译后备缓冲器。

3.

发明授权
Method to reduce memory latencies by performing two levels of speculation 有权
标题翻译：通过执行两级投机来减少内存延迟的方法

公开(公告)号：US06496917B1

公开(公告)日：2002-12-17

申请号：US09499264

申请日：2000-02-07

申请人： Rajasekhar Cherabuddi , Kevin B. Normoyle , Brian J. McGee , Meera Kasinathan , Anup Sharma , Sutikshan Bhutani

发明人： Rajasekhar Cherabuddi , Kevin B. Normoyle , Brian J. McGee , Meera Kasinathan , Anup Sharma , Sutikshan Bhutani

IPC分类号： G06F1200

CPC分类号： G06F12/0831 , G06F2212/2542 , G06F2212/507

摘要： A multiprocessor system includes a plurality of central processing units (CPUs) connected to one another by a system bus. Each CPU includes a cache controller to communicate with its cache, and a primary memory controller to communicate with its primary memory. When there is a cache miss in a CPU, the cache controller routes an address request for primary memory directly to the primary memory via the CPU as a speculative request without access the system bus, and also issues the address request to the system bus to facilitate data coherency. The speculative request is queued in the primary memory controller, which in turn retrieves speculative data from a specified primary memory address. The CPU monitors the system bus for a subsequent transaction that requests the specified data in the primary memory. If the subsequent transaction requesting the specified data is a read transaction that corresponds to the speculative address request, the speculative request is validated and becomes non-speculative. If, on the other hand, the subsequent transaction requesting the specified data is a write transaction, the speculative request is canceled.

摘要翻译： 多处理器系统包括通过系统总线相互连接的多个中央处理单元（CPU）。每个CPU包括与其高速缓存通信的高速缓存控制器以及与其主存储器通信的主存储器控制器。当CPU中存在高速缓存未命中时，缓存控制器将主存储器的地址请求直接通过CPU作为推测请求直接发送到主存储器，而无需访问系统总线，并且还向系统总线发出地址请求以方便数据一致性。推测请求在主存储器控制器中排队，主存储器控制器又从指定的主存储器地址检索推测数据。 CPU监视系统总线以用于请求主存储器中指定数据的后续事务。如果请求指定数据的后续事务是与推测地址请求相对应的读事务，则推测请求将被验证并变为非推测性。另一方面，如果请求指定数据的后续事务是写事务，则推测请求被取消。

4.

发明授权
Simplified writeback handling 有权
标题翻译：简化回写处理

公开(公告)号：US06477622B1

公开(公告)日：2002-11-05

申请号：US09670856

申请日：2000-09-26

申请人： Kevin B. Normoyle , Meera Kasinathan , Rajasekhar Cherabuddi

发明人： Kevin B. Normoyle , Meera Kasinathan , Rajasekhar Cherabuddi

IPC分类号： G06F1200

CPC分类号： G06F12/0804

摘要： The main cache of a processor in a multiprocessor computing system is coupled to receive writeback data during writeback operations. In one embodiment, during writeback operations, e.g., for a cache miss, dirty data in the main cache is merged with modified data from an associated write cache, and the resultant writeback data line is loaded into a writeback buffer. The writeback data is also written back into the main cache, and is maintained in the main cache until replaced by new data. Subsequent requests (i.e., snoops) for the data are then serviced from the main cache, rather than from the writeback buffer. In some embodiments, further modifications of the writeback data in the main cache are prevented. The writeback data line in the main cache remains valid until read data for the cache miss is returned, thereby ensuring that the read address reaches the system interface for proper bus ordering before the writeback line is lost. In one embodiment, the writeback operation is paired with the read operation for the cache miss to ensure that upon completion of the read operation, the writeback address has reached the system interface for bus ordering, thereby maintaining cache coherency while allowing requests to be serviced from the main cache.

摘要翻译： 多处理器计算系统中的处理器的主缓存被耦合以在回写操作期间接收回读数据。在一个实施例中，在回写操作期间，例如，对于高速缓存未命中，主缓存器中的脏数据与来自相关联的写入高速缓存的修改的数据合并，并且所得到的写回数据行被加载到写回缓冲器中。写回数据也被写回到主缓存中，并保存在主缓存中，直到被新数据替换为止。然后，从主缓存器而不是从回写缓冲器来服务数据的后续请求（即，窥探）。在一些实施例中，防止主缓存中的回写数据的进一步修改。主缓存中的回写数据线在返回高速缓存未命中的读取数据之前保持有效，从而确保读地址到达系统接口以在回写行丢失之前进行正确的总线排序。在一个实施例中，写回操作与用于高速缓存未命中的读取操作配对，以确保在完成读取操作时，回写地址已经到达用于总线排序的系统接口，从而保持高速缓存一致性，同时允许从主缓存。

5.

发明申请
METHODS AND SYSTEMS FOR HARDWARE ACCELERATION OF DATABASE OPERATIONS AND QUERIES BASED ON MULTIPLE HARDWARE ACCELERATORS 有权
标题翻译：基于多个硬件加速器的数据库操作和查询的硬件加速方法与系统

公开(公告)号：US20120054236A1

公开(公告)日：2012-03-01

申请号：US13172790

申请日：2011-06-29

申请人： Jeremy L. Branscome , Joseph Irawan Chamdani , Rajasekhar Cherabuddi

发明人： Jeremy L. Branscome , Joseph Irawan Chamdani , Rajasekhar Cherabuddi

IPC分类号： G06F17/30

CPC分类号： G06F17/30519

摘要： Embodiments of the present invention provide a hardware accelerator that assists a host database system in processing its queries. The hardware accelerator comprises special purpose processing elements that are capable of receiving database query/operation tasks in the form of machine code database instructions, execute them in hardware without software, and return the query/operation result back to the host system.

摘要翻译： 本发明的实施例提供了一种辅助主机数据库系统处理其查询的硬件加速器。硬件加速器包括能够以机器码数据库指令的形式接收数据库查询/操作任务的专用处理元件，无需软件执行硬件，并将查询/运算结果返回主机系统。

6.

发明授权
Inclusion vector architecture for a level two cache 失效
标题翻译：包含二级缓存的向量架构

公开(公告)号：US5996048A

公开(公告)日：1999-11-30

申请号：US879530

申请日：1997-06-20

申请人： Rajasekhar Cherabuddi , Ricky C. Hetherington

发明人： Rajasekhar Cherabuddi , Ricky C. Hetherington

IPC分类号： G06F12/08

CPC分类号： G06F12/0811

摘要： A cache architecture with a first level cache and a second level cache, with the second level cache lines including an inclusion vector which indicates which portion of that line are stored in the first level cache. In addition, an instruction/data bit in the inclusion vector indicates whether a portion of that line is in the instruction cache at all. Thus, when a snoop is done to the level two cache, additional snoops to the level one cache only need to be done for those lines which are indicated as present by the inclusion vector.

摘要翻译： 具有第一级高速缓存和第二级高速缓存的高速缓存结构，其中第二级高速缓存线包括指示该行的哪一部分被存储在第一级高速缓存中的包含向量。另外，包含向量中的指令/数据位表示该行的一部分是否在指令高速缓存中。因此，当对二级缓存进行窥探时，仅对包含向量表示的那些行需要对一级缓存的附加监听。

7.

发明授权
Apparatus and method to speculatively initiate primary memory accesses 失效
标题翻译：推测性地启动主存储器访问的装置和方法

公开(公告)号：US5761708A

公开(公告)日：1998-06-02

申请号：US658874

申请日：1996-05-31

申请人： Rajasekhar Cherabuddi , Anuradha Moudgal , Kevin Normoyle

发明人： Rajasekhar Cherabuddi , Anuradha Moudgal , Kevin Normoyle

IPC分类号： G06F12/08 , G06F13/16 , G06F13/18

CPC分类号： G06F13/161 , G06F12/0884

摘要： A central processing unit with an external cache controller and a primary memory controller is used to speculatively initiate primary memory access in order to improve average primary memory access times. The external cache controller processes an address request during an external cache latency period and selectively generates an external cache miss signal or an external cache hit signal. If no other primary memory access demands exist at the beginning of the external cache latency period, the primary memory controller is used to speculatively initiate a primary memory access corresponding to the address request. The speculative primary memory access is completed in response to an external cache miss signal. The speculative primary memory access is aborted if an external cache hit signal is generated or a non-speculative primary memory access demand is generated during the external cache latency period.

摘要翻译： 具有外部高速缓存控制器和主存储器控制器的中央处理单元用于推测性地启动主存储器访问，以便提高平均主存储器访问时间。外部高速缓存控制器在外部高速缓存等待期间处理地址请求，并选择性地产生外部高速缓存未命中信号或外部高速缓存命中信号。如果在外部高速缓存等待时间开始时不存在其他主存储器访问需求，则主存储器控制器用于推测地发起对应于地址请求的主存储器访问。响应于外部高速缓存未命中信号完成了推测性主存储器访问。如果外部缓存命中信号被产生或在外部高速缓存等待时间段期间产生非推测性的主存储器访问需求，则推测主存储器访问被中止。

8.

发明授权
Dynamically allocated cache memory for a multi-processor unit 有权
标题翻译：为多处理器单元动态分配高速缓存

公开(公告)号：US06725336B2

公开(公告)日：2004-04-20

申请号：US09838921

申请日：2001-04-20

申请人： Rajasekhar Cherabuddi

发明人： Rajasekhar Cherabuddi

IPC分类号： G06F1208

CPC分类号： G06F12/084

摘要： The resources of a partitioned cache memory are dynamically allocated between two or more processors on a multi-processor unit (MPU). In one embodiment, the MPU includes first and second processors, and the cache memory includes first and second partitions. A cache access circuit selectively transfers data between the cache memory partitions to maximize cache resources. In one mode, both processors are active and may simultaneously execute separate instruction threads. In this mode, the cache access circuit allocates the first cache memory partition as dedicated cache memory for the first processor, and allocates the second cache memory partition as dedicated cache memory for the second processor. In another mode, one processor is active, and the other processor is inactive. In this mode, the cache access circuit allocates both the first and second cache memory partitions as cache memory for the active processor.

摘要翻译： 分区高速缓冲存储器的资源在多处理器单元（MPU）上的两个或多个处理器之间动态分配。在一个实施例中，MPU包括第一和第二处理器，并且高速缓存存储器包括第一和第二分区。缓存访问电路选择性地在高速缓冲存储器分区之间传送数据以最大化高速缓存资源。在一种模式下，两个处理器都是活动的，并且可以同时执行单独的指令线程。在该模式中，高速缓存访问电路将第一高速缓存存储器分区分配为用于第一处理器的专用高速缓存存储器，并且将第二高速缓冲存储器分区分配为用于第二处理器的专用高速缓冲存储器。在另一种模式下，一个处理器处于活动状态，另一个处理器处于非活动状态。在该模式中，高速缓存访问电路将第一高速缓存存储器分区和第二高速缓存存储器分区分配为用于活动处理器的高速缓冲存储器。

9.

发明授权
Method and apparatus for resolving multiple branches 失效
标题翻译：用于解决多个分支的方法和装置

公开(公告)号：US06256729B1

公开(公告)日：2001-07-03

申请号：US09004971

申请日：1998-01-09

申请人： Rajasekhar Cherabuddi , Sanjay Patel , Adam R. Talcott , Ramesh K. Panwar

发明人： Rajasekhar Cherabuddi , Sanjay Patel , Adam R. Talcott , Ramesh K. Panwar

IPC分类号： G06F1500

CPC分类号： G06F9/3861 , G06F9/3806

摘要： A method for repairing a pipeline in response to a branch instruction having a branch, includes the steps of providing a branch repair table having a plurality of entries, allocating an entry in the branch repair table for the branch instruction, storing a target address, a fall-through address, and repair information in the entry in the branch repair table, processing the branch instruction to determine whether the branch was taken, and repairing the pipeline in response to the repair information and the fall-through address in the entry in the branch repair table when the branch was not taken.

摘要翻译： 一种用于响应于具有分支的分支指令来修复流水线的方法，包括以下步骤：提供具有多个条目的分支修复表，在分支指令的分支修复表中分配条目，存储目标地址，分支修复表中的条目中的修复信息和修复信息，处理分支指令以确定是否采用分支，以及修复管道，以响应修复信息和条目中的到达地址分支修复表时未分支。

10.

发明授权
Method and apparatus for branch target prediction 失效
标题翻译：分支目标预测方法和装置

公开(公告)号：US5938761A

公开(公告)日：1999-08-17

申请号：US976826

申请日：1997-11-24

申请人： Sanjay Patel , Adam R. Talcott , Rajasekhar Cherabuddi

发明人： Sanjay Patel , Adam R. Talcott , Rajasekhar Cherabuddi

IPC分类号： G06F9/38 , G06F9/32

CPC分类号： G06F9/3806

摘要： One embodiment of the present invention provides a method and an apparatus for predicting the target of a branch instruction. This method and apparatus operate by using a translation lookaside buffer (TLB) to store page numbers for predicted branch target addresses. In this embodiment, a branch target address table stores a small index to a location in the translation lookaside buffer, and this index is used retrieve a page number from the location in the translation lookaside buffer. This page number is used as the page number portion of a predicted branch target address. Thus, a small index into a translation lookaside buffer can be stored in a predicted branch target address table instead of a larger page number for the predicted branch target address. This technique effectively reduces the size of a predicted branch target table by eliminating much of the space that is presently wasted storing redundant page numbers. Another embodiment maintains coherence between the branch target address table and the translation lookaside buffer. This makes it possible to detect a miss in the translation lookaside buffer at least one cycle earlier by examining the branch target address table.

摘要翻译： 本发明的一个实施例提供了一种用于预测分支指令的目标的方法和装置。该方法和装置通过使用翻译后备缓冲器（TLB）来存储用于预测的分支目标地址的页码。在本实施例中，分支目标地址表将小索引存储到翻译后备缓冲器中的位置，并且使用该索引从翻译后备缓冲器中的位置检索页码。该页码用作预测分支目标地址的页码部分。因此，可以在预测的分支目标地址表中存储向翻译后备缓冲器的小索引，而不是预测的分支目标地址的较大的页码。该技术通过消除存储冗余页码的目前浪费的大部分空间来有效地减小预测分支目标表的大小。另一个实施例维护分支目标地址表和转换后备缓冲器之间的一致性。这使得可以通过检查分支目标地址表来更早地检测翻译后备缓冲区中的未命中至少一个周期。

搜索结果

国家/区域

专利有效性

申请日

公布(公告)日

申请人

申请人所在国/区域

发明人

IPC

IPC部

IPC大类

IPC小类

IPC大组

IPC小组

外观分类