Patent search ap:("Apple Inc.") AND inv:"Rajat Goel" Page 1

1.

发明授权
Load ordering in a weakly-ordered processor 有权
Title translation: 在弱有序处理器中加载排序

公开(公告)号：US09383995B2

公开(公告)日：2016-07-05

申请号：US13750972

申请日：2013-01-25

Applicant: Apple Inc.

Inventor： Pradeep Kanapathipillai , Hari Kannan , Po-Yung Chang , Ming-Ta Hsu , Rajat Goel

IPC: G06F9/30 , G06F13/00 , G06F9/38

CPC classification number: G06F9/30043 , G06F9/3834

Abstract: Techniques are disclosed relating to ordering of load instructions in a weakly-ordered memory model. In one embodiment, a processor includes a cache with multiple cache lines and a store queue configured to maintain status information associated with a store instruction that targets a location in one of the cache lines. In this embodiment, the processor is configured to set an indicator in the status information in response to migration of the targeted cache line. The indicator may be usable to sequence performance of load instructions that are younger than the store instruction. For example, the processor may be configured to wait, based on the indicator, to perform a younger load instruction that targets the same location as the store instruction until the store instruction is removed from the store queue. This may prevent forwarding of the value of the store instruction to the younger load and preserve load-load ordering.

Abstract translation: 公开了关于弱有序存储器模型中的加载指令的排序的技术。在一个实施例中，处理器包括具有多个高速缓存行的高速缓存和存储队列，该存储队列被配置为维护与存储指令相关联的状态信息，所述存储指令针对高速缓存行之一中的位置在该实施例中，处理器被配置为响应于目标高速缓存线的迁移而将状态信息中的指示符设置成。该指示符可用于对比小于存储指令的加载指令的性能进行排序。例如，处理器可以被配置为基于指示符等待执行与存储指令相同的位置的较年轻的加载指令，直到存储指令从存储队列中移除。这可能会阻止将存储指令的值转发到较小的负载并保持负载负载顺序。

2.

发明授权
Execution unit power management 有权

公开(公告)号：US10037073B1

公开(公告)日：2018-07-31

申请号：US15273925

申请日：2016-09-23

Applicant: Apple Inc.

Inventor： Edvin Catovic , Rajat Goel , Richard F. Russo , Matthew R. Johnson , Shingo Suzuki , Pradeep Kanapathipillai , Raghava Rao V. Denduluri , Pankaj Lnu

IPC: G06F1/32

CPC classification number: G06F1/3287 , G06F1/3206 , G06F1/3228 , G06F1/3243

Abstract: A processor includes an instruction issue circuit, and high-utilization and low-utilization execution unit circuits coupled to execute instructions received from the instruction issue unit. On average, utilization of the low-utilization execution unit circuit is lower than utilization of the high-utilization execution unit circuit. The processor also includes a retention circuit coupled to a different power domain than the low-utilization execution unit circuit, and a power management circuit. The power management circuit may be configured to detect that inactivity of the low-utilization execution unit circuit satisfies a threshold inactivity level; upon detecting that the threshold inactivity level is satisfied, cause architecturally-visible state of the low-utilization execution unit circuit to be copied to the retention circuit; and subsequent to copying of the architecturally-visible state to the retention circuit, cause the low-utilization execution unit circuit to enter a power-off state, where the retention circuit retains stored data during the power-off state.

3.

发明授权
Lookahead scheme for prioritized reads 有权

公开(公告)号：US09652371B2

公开(公告)日：2017-05-16

申请号：US14624621

申请日：2015-02-18

Applicant: Apple Inc.

Inventor： Rajat Goel , Hari S. Kannan , Khurram Z. Malik

IPC: G06F13/16 , G06F12/02 , G11C8/06 , G11C8/16 , G06F5/16 , G06F12/00

CPC classification number: G06F12/0215 , G06F5/16 , G06F13/1673 , G06F2212/301 , G11C8/06 , G11C8/16

Abstract: A circular queue implementing a scheme for prioritized reads is disclosed. In one embodiment, a circular queue (or buffer) includes a number of storage locations each configured to store a data value. A multiplexer tree is coupled between the storage locations and a read port. A priority circuit is configured to generate and provide selection signals to each multiplexer of the multiplexer tree, based on a priority scheme. Based on the states of the selection signals, one of the storage locations is coupled to the read port via the multiplexers of the multiplexer tree.

4.

发明申请
CONCURRENT STORE AND LOAD OPERATIONS 有权
Title translation: 当前存储和负载操作

公开(公告)号：US20150199272A1

公开(公告)日：2015-07-16

申请号：US14154122

申请日：2014-01-13

Applicant: Apple Inc.

Inventor： Rajat Goel , Mridul Agarwal

IPC: G06F12/08

CPC classification number: G06F12/0815 , G06F12/0844 , G06F12/0891

Abstract: Systems, processors, and methods for efficiently handling concurrent store and load operations within a processor. A processor comprises a load-store unit (LSU) with a banked level-one (L1) data cache. When a store operation is ready to write data to the L1 data cache, the store operation will skip the write to any banks that have a conflict with a concurrent load operation. A partial write of the store operation will be performed to those banks of the L1 data cache that do not have a conflict with a concurrent load operation. For every attempt to write the store operation, a corresponding store mask will be updated to indicate which portions of the store operation were successfully written to the L1 data cache.

Abstract translation: 用于在处理器内有效处理并发存储和加载操作的系统，处理器和方法。处理器包括具有一级（L1）数据高速缓存的加载存储单元（LSU）。当存储操作准备好将数据写入L1数据高速缓存时，存储操作将跳过对与并发加载操作冲突的任何存储区的写操作。将对与数据并行加载操作不冲突的L1数据高速缓存区进行存储操作的部分写入。对于每次尝试写存储操作时，将更新相应的存储掩码，以指示存储操作的哪些部分已成功写入L1数据高速缓存。

5.

发明申请
Lookahead Scheme for Prioritized Reads 审中-公开
Title translation: 优先阅读的前瞻方案

公开(公告)号：US20150161033A1

公开(公告)日：2015-06-11

申请号：US14624621

申请日：2015-02-18

Applicant: Apple Inc.

Inventor： Rajat Goel , Hari S. Kannan , Khurram Z. Malik

IPC: G06F12/02 , G06F5/16 , G06F13/16

CPC classification number: G06F12/0215 , G06F5/16 , G06F13/1673 , G06F2212/301 , G11C8/06 , G11C8/16

Abstract: A circular queue implementing a scheme for prioritized reads is disclosed. In one embodiment, a circular queue (or buffer) includes a number of storage locations each configured to store a data value. A multiplexer tree is coupled between the storage locations and a read port. A priority circuit is configured to generate and provide selection signals to each multiplexer of the multiplexer tree, based on a priority scheme. Based on the states of the selection signals, one of the storage locations is coupled to the read port via the multiplexers of the multiplexer tree.

Abstract translation: 公开了实现优先读取方案的循环队列。在一个实施例中，循环队列（或缓冲器）包括多个存储位置，每个存储位置被配置为存储数据值。复用器树耦合在存储位置和读端口之间。优先级电路被配置为基于优先级方案来生成并提供对多路复用器树的每个多路复用器的选择信号。基于选择信号的状态，其中一个存储位置经由复用器树的多路复用器耦合到读端口。

6.

发明授权
Completing load and store instructions in a weakly-ordered memory model 有权
Title translation: 在弱有序的内存模型中完成加载和存储指令

公开(公告)号：US09535695B2

公开(公告)日：2017-01-03

申请号：US13750942

申请日：2013-01-25

Applicant: Apple Inc.

Inventor： John H. Mylius , Rajat Goel , Pradeep Kanapathipillai , Hari S. Kannan

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/30043 , G06F9/3834 , G06F9/3836 , G06F9/3838 , G06F9/3855

Abstract: Techniques are disclosed relating to completion of load and store instructions in a weakly-ordered memory model. In one embodiment, a processor includes a load queue and a store queue and is configured to associate queue information with a load instruction in an instruction stream. In this embodiment, the queue information indicates a location of the load instruction in the load queue and one or more locations in the store queue that are associated with one or more store instructions that are older than the load instruction. The processor may determine, using the queue information, that the load instruction does not conflict with a store instruction in the store queue that is older than the load instruction. The processor may remove the load instruction from the load queue while the store instruction remains in the store queue. The queue information may include a wrap value for the load queue.

Abstract translation: 公开了在弱有序存储器模型中完成负载和存储指令的技术。在一个实施例中，处理器包括加载队列和存储队列，并且被配置为将队列信息与指令流中的加载指令相关联。在该实施例中，队列信息指示加载队列中的加载指令的位置和存储队列中与一个或多个比加载指令更早的存储指令相关联的一个或多个位置。处理器可以使用队列信息来确定加载指令不与存储队列中比加载指令更早的存储指令冲突。当存储指令保留在存储队列中时，处理器可以从加载队列中移除加载指令。队列信息可以包括加载队列的换行值。

7.

发明授权
Power switch ramp rate control using selectable daisy-chained connection of enable to power switches or daisy-chained flops providing enables 有权
Title translation: 电源开关斜坡率控制使用可选的菊花链连接启用电源开关或菊花链触发器提供启用

公开(公告)号：US09564898B2

公开(公告)日：2017-02-07

申请号：US14622111

申请日：2015-02-13

Applicant: Apple Inc.

Inventor： Shingo Suzuki , Harsha Krishnamurthy , Edvin Catovic , Rajat Goel , Manoj Gopalan

IPC: H03K19/003 , H03K19/00

CPC classification number: H03K19/00361 , H03K19/0013 , H03K19/0016

Abstract: In an embodiment, an integrated circuit may include one or more power gated blocks and a power manager circuit. The power manager circuit may be configured to generate a block enable for each power gated block and a block enable clock. The power gated block may generate local block enables to various power switch segments in the power gated block. In particular, the power gated block may include a set of series-connected flops that receive the block enable from the power manager circuit. The power gated block may include a set of multiplexors (muxes) that provide the local block enables for each power switch segment. One input of the muxes is coupled to the block enable, and the other input is coupled to another enable propagated through one of the other power switch segments. Accordingly, the muxes may be controlled to select the propagated enables or the input block enable.

Abstract translation: 在一个实施例中，集成电路可以包括一个或多个电源门控块和功率管理器电路。功率管理器电路可以被配置为为每个电源门控块和块使能时钟生成块使能。电源门控块可以在电源门控块中产生各种电源开关段的本地块使能。特别地，电源门控块可以包括从电源管理器电路接收块使能的一组串联的触发器。功率门控块可以包括为每个功率开关段提供本地块使能的一组多路复用器（多路复用器）。多路复用器的一个输入耦合到块使能，另一个输入耦合到通过其它功率开关段之一传播的另一个功能。因此，可以控制多路复用器来选择传播的使能或输入块使能。

8.

发明授权
Register file circuit design process 有权

公开(公告)号：US09824171B2

公开(公告)日：2017-11-21

申请号：US14820223

申请日：2015-08-06

Applicant: Apple Inc.

Inventor： Harsha Krishnamurthy , Mridul Agarwal , Shyam Sundar Balasubramanian , Christopher S. Thomas , Rajat Goel , Rohit Kumar , Muthukumaravelu Velayoudame

IPC: G06F17/50

CPC classification number: G06F17/505 , G06F17/5068

Abstract: In some embodiments, a register file circuit design process includes instructing an automated integrated circuit design program to generate a register file circuit design, including providing a cell circuit design and instructing the automated integrated circuit design program to generate a selection design, a pre-decode design, and a data gating design. The cell circuit design describes a plurality of selection circuits that have a particular arrangement. The selection design describes a plurality of replica circuits that include respective pluralities of selection circuits having the particular arrangement. The pre-decode design describes a pre-decode circuit configured to identify a plurality of entries identified by a portion of a write instruction. The data gating design describes data gating circuits configured, in response to the pre-decode circuit not identifying respective entries, to disable data inputs to respective write selection circuits connected to the respective entries.

9.

发明申请
REGISTER FILE CIRCUIT DESIGN PROCESS 有权
Title translation: 寄存器文件电路设计流程

公开(公告)号：US20170039299A1

公开(公告)日：2017-02-09

申请号：US14820223

申请日：2015-08-06

Applicant: Apple Inc.

Inventor： Harsha Krishnamurthy , Mridul Agarwal , Shyam Sundar Balasubramanian , Christopher S. Thomas , Rajat Goel , Rohit Kumar , Muthukumaravelu Velayoudame

IPC: G06F17/50

CPC classification number: G06F17/505 , G06F17/5068

Abstract: In some embodiments, a register file circuit design process includes instructing an automated integrated circuit design program to generate a register file circuit design, including providing a cell circuit design and instructing the automated integrated circuit design program to generate a selection design, a pre-decode design, and a data gating design. The cell circuit design describes a plurality of selection circuits that have a particular arrangement. The selection design describes a plurality of replica circuits that include respective pluralities of selection circuits having the particular arrangement. The pre-decode design describes a pre-decode circuit configured to identify a plurality of entries identified by a portion of a write instruction. The data gating design describes data gating circuits configured, in response to the pre-decode circuit not identifying respective entries, to disable data inputs to respective write selection circuits connected to the respective entries.

Abstract translation: 在一些实施例中，寄存器文件电路设计过程包括指示自动集成电路设计程序产生寄存器文件电路设计，包括提供单元电路设计并指示自动化集成电路设计程序产生选择设计，预解码设计和数据门控设计。单元电路设计描述了具有特定布置的多个选择电路。选择设计描述了包括具有特定布置的相应多个选择电路的多个复制电路。预解码设计描述了预解码电路，其被配置为识别由写指令的一部分识别的多个条目。数据门控设计描述了数据选通电路，其响应于未识别相应条目的预解码电路而配置，以禁止连接到各个条目的相应写入选择电路的数据输入。

10.

发明授权
Concurrent store and load operations 有权
Title translation: 并行存储和加载操作

公开(公告)号：US09448936B2

公开(公告)日：2016-09-20

申请号：US14154122

申请日：2014-01-13

Applicant: Apple Inc.

Inventor： Rajat Goel , Mridul Agarwal

IPC: G06F12/08

CPC classification number: G06F12/0815 , G06F12/0844 , G06F12/0891

Abstract: Systems, processors, and methods for efficiently handling concurrent store and load operations within a processor. A processor comprises a load-store unit (LSU) with a banked level-one (L1) data cache. When a store operation is ready to write data to the L1 data cache, the store operation will skip the write to any banks that have a conflict with a concurrent load operation. A partial write of the store operation will be performed to those banks of the L1 data cache that do not have a conflict with a concurrent load operation. For every attempt to write the store operation, a corresponding store mask will be updated to indicate which portions of the store operation were successfully written to the L1 data cache.

Abstract translation: 用于在处理器内有效处理并发存储和加载操作的系统，处理器和方法。处理器包括具有一级（L1）数据高速缓存的加载存储单元（LSU）。当存储操作准备好将数据写入L1数据高速缓存时，存储操作将跳过对与并发加载操作冲突的任何存储区的写操作。将对与数据并行加载操作不冲突的L1数据高速缓存区进行存储操作的部分写入。对于每次尝试写存储操作时，将更新相应的存储掩码，以指示存储操作的哪些部分已成功写入L1数据高速缓存。

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification