Least recently used mechanism for cache line eviction from a cache memory
    21.
    Invention Grant
    Least recently used mechanism for cache line eviction from a cache memory (In Force)

    Publication No.: US09563575B2

    Publication Date: 2017-02-07

    Application No.: US14929645

    Filing Date: 2015-11-02

    Applicant: Apple Inc.

    Abstract: A mechanism for evicting a cache line from a cache memory first selects for eviction the least recently used cache line of the group of invalid cache lines. If all cache lines are valid, it selects for eviction the least recently used cache line of the group of cache lines not also stored within a higher-level cache memory such as the L1 cache. Lastly, if all cache lines are valid and there are no such non-inclusive cache lines, it selects for eviction the least recently used cache line stored in the cache memory.

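The tiered victim selection described in the abstract can be sketched as follows. This is a minimal software model, not the patent's hardware implementation; the field names and the "larger age = less recently used" LRU encoding are assumptions for illustration.

```python
from dataclasses import dataclass

@dataclass
class CacheLine:
    lru_age: int           # larger value = less recently used
    valid: bool
    in_higher_level: bool  # also resident in a higher-level cache, e.g. the L1

def select_victim(ways):
    """Return the index of the way to evict from one cache set,
    trying each eviction tier in order."""
    for predicate in (
        lambda l: not l.valid,            # tier 1: prefer invalid lines
        lambda l: not l.in_higher_level,  # tier 2: lines not held in the L1
        lambda l: True,                   # tier 3: fall back to plain LRU
    ):
        candidates = [i for i, l in enumerate(ways) if predicate(l)]
        if candidates:
            # Among the tier's candidates, pick the least recently used.
            return max(candidates, key=lambda i: ways[i].lru_age)
```

Within each tier the least recently used candidate wins, so the policy degrades gracefully to plain LRU only when every line is both valid and inclusive.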

    MARKING VALID RETURN TARGETS
    22.
    Invention Application
    MARKING VALID RETURN TARGETS (Pending, Published)

    Publication No.: US20170024559A1

    Publication Date: 2017-01-26

    Application No.: US14807609

    Filing Date: 2015-07-23

    Applicant: Apple Inc.

    CPC classification number: G06F21/54

    Abstract: Systems, apparatuses, methods, and computer-readable mediums for preventing return oriented programming (ROP) attacks. A compiler may insert landing pads adjacent to valid return targets in an instruction sequence. When a return instruction is executed, the processor may treat the return as suspicious if the target of the return instruction does not have an adjacent landing pad. Additionally, each landing pad may be encoded with a color, and a colored launch pad may be inserted into the instruction stream next to each return instruction. When a return instruction is executed, the processor may determine if the target of the return has a landing pad with the same color as the launch pad of the return instruction. Return-target pairs with color mismatches may be treated as suspicious and the offending process may be killed.

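The color-matched landing/launch pad check from the abstract can be modeled with a simple predicate. The instruction encoding below (a kind tag plus a color value) is an assumption for this sketch, not the patent's actual instruction format.

```python
def return_is_suspicious(target_insn, launch_color):
    """A return is suspicious unless its target is a landing pad whose
    color matches the launch pad placed next to the return instruction."""
    kind, color = target_insn          # e.g. ("landing_pad", 3)
    if kind != "landing_pad":
        return True                    # no landing pad adjacent to the target
    return color != launch_color       # pads present but colors mismatch
```

A processor implementing this check could raise a fault on a suspicious return, letting the OS kill the offending process as the abstract describes.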

    Multi-level dispatch for a superscalar processor
    23.
    Invention Grant
    Multi-level dispatch for a superscalar processor (In Force)

    Publication No.: US09336003B2

    Publication Date: 2016-05-10

    Application No.: US13749999

    Filing Date: 2013-01-25

    Applicant: Apple Inc.

    CPC classification number: G06F9/3836 G06F9/30145 G06F9/4881 G06F9/4887

    Abstract: In an embodiment, a processor includes a multi-level dispatch circuit configured to supply operations for execution by multiple parallel execution pipelines. The multi-level dispatch circuit may include multiple dispatch buffers, each of which is coupled to multiple reservation stations. Each reservation station may be coupled to a respective execution pipeline and may be configured to schedule instruction operations (ops) for execution in the respective execution pipeline. The sets of reservation stations coupled to each dispatch buffer may be non-overlapping. Thus, if a given op is to be executed in a given execution pipeline, the op may be sent to the dispatch buffer which is coupled to the reservation station that provides ops to the given execution pipeline.

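The routing rule in the abstract follows from the non-overlapping reservation-station sets: each pipeline's reservation station belongs to exactly one dispatch buffer, so an op's target pipeline determines its buffer. The concrete mapping below is invented for illustration.

```python
# Hypothetical topology: two dispatch buffers, each feeding two
# reservation stations; station sets are non-overlapping.
DISPATCH_BUFFERS = {
    "buf0": {"rs0", "rs1"},
    "buf1": {"rs2", "rs3"},
}
RS_FOR_PIPE = {"pipe0": "rs0", "pipe1": "rs1", "pipe2": "rs2", "pipe3": "rs3"}

def buffer_for(pipeline):
    """Pick the dispatch buffer for an op bound for `pipeline`."""
    rs = RS_FOR_PIPE[pipeline]
    # Because the station sets are disjoint, exactly one buffer matches.
    (buf,) = [b for b, stations in DISPATCH_BUFFERS.items() if rs in stations]
    return buf
```

The single-element unpacking documents the invariant: overlapping station sets would make the routing ambiguous and raise an error here.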

    Flush engine
    24.
    Invention Grant
    Flush engine (In Force)

    Publication No.: US09128857B2

    Publication Date: 2015-09-08

    Application No.: US13734444

    Filing Date: 2013-01-04

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed related to flushing one or more data caches. In one embodiment an apparatus includes a processing element, a first cache associated with the processing element, and a circuit configured to copy modified data from the first cache to a second cache in response to determining an activity level of the processing element. In this embodiment, the apparatus is configured to alter a power state of the first cache after the circuit copies the modified data. The first cache may be at a lower level in a memory hierarchy relative to the second cache. In one embodiment, the circuit is also configured to copy data from the second cache to a third cache or a memory after a particular time interval. In some embodiments, the circuit is configured to copy data while one or more pipeline elements of the apparatus are in a low-power state.

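The flush sequence in the abstract (copy modified data outward when activity is low, then alter the first cache's power state) can be sketched as below. The activity threshold, dict-based cache model, and power-state names are assumptions, not details from the patent.

```python
def flush_and_power_down(l1, l2, activity, threshold=0.1):
    """l1/l2 map address -> (data, dirty). Returns the resulting
    power state of the first cache."""
    if activity >= threshold:
        return "on"                  # processing element still busy
    for addr, (data, dirty) in l1.items():
        if dirty:
            l2[addr] = (data, True)  # copy modified data to the second cache
    l1.clear()                       # contents are now safe to lose
    return "retention"               # altered (low) power state
```

A second stage could apply the same rule between the second cache and memory after a time interval, matching the multi-level behavior the abstract describes.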

    CACHE POLICIES FOR UNCACHEABLE MEMORY REQUESTS
    25.
    Invention Application
    CACHE POLICIES FOR UNCACHEABLE MEMORY REQUESTS (In Force)

    Publication No.: US20140181403A1

    Publication Date: 2014-06-26

    Application No.: US13725066

    Filing Date: 2012-12-21

    Applicant: APPLE INC.

    CPC classification number: G06F12/0811 G06F12/0815 G06F12/0888

    Abstract: Systems, processors, and methods for keeping uncacheable data coherent. A processor includes a multi-level cache hierarchy, and uncacheable load memory operations can be cached at any level of the cache hierarchy. If an uncacheable load misses in the L2 cache, then allocation of the uncacheable load will be restricted to a subset of the ways of the L2 cache. If an uncacheable store memory operation hits in the L1 cache, then the hit cache line can be updated with the data from the memory operation. If the uncacheable store misses in the L1 cache, then the uncacheable store is sent to a core interface unit. Multiple contiguous store misses are merged into larger blocks of data in the core interface unit before being sent to the L2 cache.

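The store-merging step in the core interface unit can be illustrated with a simple coalescing pass: contiguous store misses become one larger block before heading to the L2. Byte-granular (address, size) records sorted by address are an assumption of this sketch.

```python
def merge_contiguous(stores):
    """stores: list of (address, size) tuples sorted by address.
    Returns a list of merged (address, size) blocks."""
    merged = []
    for addr, size in stores:
        if merged and merged[-1][0] + merged[-1][1] == addr:
            # This store begins exactly where the previous block ends:
            # extend that block instead of emitting a new one.
            merged[-1] = (merged[-1][0], merged[-1][1] + size)
        else:
            merged.append((addr, size))
    return merged
```

Fewer, larger transfers to the L2 reduce per-request overhead, which is the point of merging in the core interface unit.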

    Hashing with soft memory folding
    27.
    Invention Grant

    Publication No.: US11567861B2

    Publication Date: 2023-01-31

    Application No.: US17519284

    Filing Date: 2021-11-04

    Applicant: Apple Inc.

    Abstract: In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.
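Mapping addresses to memory controllers by hashing address bits can be sketched as an XOR-fold of selected bit-fields. The specific bit positions and controller count below are invented; the point is that consecutive blocks land on different controllers, distributing a page across them as the abstract describes.

```python
def controller_for(addr, hash_bits=(8, 14, 20), num_ctls=8):
    """Select a memory controller by XOR-folding address bit-fields.
    num_ctls must be a power of two so the mask extracts a full field."""
    h = 0
    for shift in hash_bits:
        h ^= (addr >> shift) & (num_ctls - 1)  # fold one 3-bit field
    return h
```

With the lowest hashed field at bit 8, consecutive 256-byte blocks of a page rotate through controllers, while the higher fields scramble the mapping across pages. A programmable implementation would let software choose `hash_bits` per level of granularity.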

    Computation engine that operates in matrix and vector modes

    Publication No.: US10754649B2

    Publication Date: 2020-08-25

    Application No.: US16043772

    Filing Date: 2018-07-24

    Applicant: Apple Inc.

    Abstract: In an embodiment, a computation engine is configured to perform vector multiplications, producing either vector results or outer product (matrix) results. The instructions provided to the computation engine specify a matrix mode or a vector mode for the instructions. The computation engine performs the specified operation. The computation engine may perform numerous computations in parallel, in an embodiment. In an embodiment, the instructions may also specify an offset within the input memories, providing additional flexibility in the location of operands. More particularly, the computation engine may be configured to perform numerous multiplication operations in parallel and to accumulate results in a result memory, performing multiply-accumulate operations for each matrix/vector element in the targeted locations of the output memory.
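The two modes in the abstract can be modeled functionally: the same two input vectors produce either an elementwise multiply-accumulate (vector mode) or an outer-product accumulate (matrix mode) into an accumulator `z`. This pure-Python stand-in only illustrates the semantics; the real engine performs these multiplications in parallel hardware.

```python
def compute(x, y, z, mode):
    """Multiply-accumulate vectors x and y into accumulator z."""
    if mode == "vector":
        # Elementwise: z[i] += x[i] * y[i]
        return [zi + xi * yi for zi, xi, yi in zip(z, x, y)]
    if mode == "matrix":
        # Outer product: z[i][j] += x[i] * y[j]
        return [[zij + xi * yj for zij, yj in zip(zrow, y)]
                for zrow, xi in zip(z, x)]
    raise ValueError(mode)
```

Accumulating into `z` rather than overwriting it is what makes repeated invocations perform the multiply-accumulate behavior the abstract describes.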
