Systems and methods for performing memory compression

    Publication number: US10769065B2

    Publication date: 2020-09-08

    Application number: US16436635

    Filing date: 2019-06-10

    Applicant: Apple Inc.

    Abstract: Systems, apparatuses, and methods for efficiently moving data for storage are described. A compression unit within a processor includes multiple hardware lanes; it selects two or more input words to compress and assigns them to two or more of the hardware lanes. As each assigned input word is processed, it is compared against an entry of a plurality of entries of a table. If each of the assigned input words indexes the same table entry, the hardware lane with the oldest input word generates a single read request for the table entry, and the hardware lane with the youngest input word generates a single write request to update the entry upon completing compression. Each hardware lane generates a compressed packet based on its assigned input word.
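    The lane-coalescing idea in the abstract can be illustrated in software. This is a minimal sketch, not the patented hardware: the table size, hash, and names (`compress_group`, `index_of`) are assumptions for illustration.

```python
TABLE_SIZE = 256
table = [0] * TABLE_SIZE  # dictionary of recently seen words


def index_of(word: int) -> int:
    # Hash a word to a table entry (illustrative hash, not the patent's).
    return word % TABLE_SIZE


def compress_group(words: list[int]) -> list[tuple]:
    """Compress a group of words assigned to parallel 'lanes' in one pass.

    Lanes whose words index the same table entry share a single read
    (issued for the oldest word) and a single write (issued for the
    youngest word) instead of one read/write per lane."""
    packets = []
    by_entry: dict[int, list[int]] = {}
    for lane, word in enumerate(words):  # lane 0 holds the oldest word
        by_entry.setdefault(index_of(word), []).append(lane)
    for entry, lanes in by_entry.items():
        stored = table[entry]            # single read for the whole group
        for lane in lanes:
            word = words[lane]
            if word == stored:
                packets.append(("match", entry))   # compressed reference
            else:
                packets.append(("literal", word))  # emit word verbatim
            stored = word                # forward within the group
        table[entry] = stored            # single write: youngest word wins
    return packets
```

    Here 5 and 261 collide on entry 5, so the group performs one table read and one table write rather than two of each.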

    Systems and Methods for Coherent Power Management

    Publication number: US20180232034A1

    Publication date: 2018-08-16

    Application number: US15430699

    Filing date: 2017-02-13

    Applicant: Apple Inc.

    Abstract: In an embodiment, a system includes multiple power management mechanisms operating in different time domains (e.g., with different bandwidths) and control circuitry configured to coordinate operation of the mechanisms. If one mechanism is adding energy to the system, for example, the control circuitry may inform another mechanism that the energy is coming so that the other mechanism need not take as drastic an action as it would if no energy were coming. If a light workload is detected by circuitry near the load, and there is plenty of energy in the system, the control circuitry may cause the power management unit (PMU) to generate less energy or even temporarily turn off. A variety of mechanisms for the coordinated, coherent use of power are described.
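    The coordination described above can be sketched as two mechanisms with a notification hook between them. The class and method names are assumptions for illustration, not interfaces from the patent.

```python
class FastRegulator:
    """Fast, local mechanism (high bandwidth): reacts to instantaneous
    energy deficits near the load."""

    def __init__(self):
        self.incoming_energy = 0.0

    def notify_incoming(self, joules: float) -> None:
        # Coordination hook: told that the slow mechanism is adding energy.
        self.incoming_energy = joules

    def react(self, deficit: float) -> float:
        # Take a less drastic action if energy is already on its way.
        return max(0.0, deficit - self.incoming_energy)


class SlowPMU:
    """Slow mechanism (low bandwidth): supplies bulk energy."""

    def __init__(self, fast: FastRegulator):
        self.fast = fast
        self.enabled = True

    def supply(self, joules: float) -> float:
        self.fast.notify_incoming(joules)  # announce before acting
        return joules if self.enabled else 0.0

    def light_load(self) -> None:
        # Light workload + plenty of stored energy: temporarily turn off.
        self.enabled = False
```

    Because the fast regulator is told that 3 J is on the way, it corrects a 5 J deficit with only 2 J instead of the full amount.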

    Processing multi-destination instruction in pipeline by splitting for single destination operations stage and merging for opcode execution operations stage

    Publication number: US09223577B2

    Publication date: 2015-12-29

    Application number: US13627884

    Filing date: 2012-09-26

    Applicant: Apple Inc.

    Abstract: Various techniques for processing instructions that specify multiple destinations. A first portion of a processor pipeline is configured to split a multi-destination instruction into a plurality of single-destination operations. A second portion of the pipeline is configured to process the plurality of single-destination operations. A third portion of the pipeline is configured to merge the plurality of single-destination operations into one or more multi-destination operations. The one or more multi-destination operations may be performed. The first portion of the pipeline may include a decode unit. The second portion of the pipeline may include a map unit, which may in turn include circuitry configured to maintain a list of free architectural registers and a mapping table that maps physical registers to architectural registers. The third portion of the pipeline may comprise a dispatch unit. In some embodiments, this may provide certain advantages such as reduced area and/or power consumption.
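    The split/process/merge flow above can be sketched in software. The `Uop` structure and stage functions are illustrative assumptions, not structures from the patent.

```python
from dataclasses import dataclass


@dataclass
class Uop:
    opcode: str
    dests: list   # architectural destination registers
    group: int    # tag linking uops split from one instruction


def decode_split(instr: dict, group_id: int) -> list:
    """First portion (decode): split a multi-destination instruction
    into single-destination operations."""
    return [Uop(instr["opcode"], [d], group_id) for d in instr["dests"]]


def map_rename(uops: list, free_regs: list, map_table: dict) -> list:
    """Second portion (map): rename each single-destination uop using a
    free-register list and an architectural->physical mapping table."""
    for u in uops:
        map_table[u.dests[0]] = free_regs.pop(0)
    return uops


def dispatch_merge(uops: list) -> list:
    """Third portion (dispatch): merge uops carrying the same group tag
    back into one multi-destination operation for execution."""
    merged: dict[int, Uop] = {}
    for u in uops:
        m = merged.setdefault(u.group, Uop(u.opcode, [], u.group))
        m.dests.extend(u.dests)
    return list(merged.values())
```

    A two-destination load-pair-style instruction, for instance, becomes two single-destination uops through rename and re-forms into one operation at dispatch.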

    Cache policies for uncacheable memory requests

    Publication number: US09043554B2

    Publication date: 2015-05-26

    Application number: US13725066

    Filing date: 2012-12-21

    Applicant: Apple Inc.

    CPC classification number: G06F12/0811 G06F12/0815 G06F12/0888

    Abstract: Systems, processors, and methods for keeping uncacheable data coherent. A processor includes a multi-level cache hierarchy, and uncacheable load memory operations can be cached at any level of the cache hierarchy. If an uncacheable load misses in the L2 cache, then allocation of the uncacheable load will be restricted to a subset of the ways of the L2 cache. If an uncacheable store memory operation hits in the L1 cache, then the hit cache line can be updated with the data from the memory operation. If the uncacheable store misses in the L1 cache, then the uncacheable store is sent to a core interface unit. Multiple contiguous store misses are merged into larger blocks of data in the core interface unit before being sent to the L2 cache.
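    The store-merging behavior in the core interface unit can be sketched as a small coalescing buffer. The class name and the 64-byte block size are assumptions for illustration, not details from the patent.

```python
BLOCK = 64  # bytes coalesced before forwarding to the L2 (assumed size)


class CoreInterfaceUnit:
    """Coalesces contiguous uncacheable store misses into larger blocks
    before sending them to the L2 cache."""

    def __init__(self):
        self.pending: dict[int, dict[int, int]] = {}  # base -> {offset: byte}
        self.sent: list[tuple[int, bytes]] = []       # blocks sent to L2

    def store_miss(self, addr: int, data: bytes) -> None:
        base = addr - addr % BLOCK
        buf = self.pending.setdefault(base, {})
        for i, b in enumerate(data):
            buf[addr - base + i] = b
        if len(buf) == BLOCK:  # fully covered block: send as one unit
            self.sent.append((base, bytes(buf[i] for i in range(BLOCK))))
            del self.pending[base]
```

    Two contiguous 32-byte store misses to the same block thus leave the unit as a single 64-byte write rather than two smaller ones.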

    Access map-pattern match based prefetch unit for a processor

    Publication number: US09015422B2

    Publication date: 2015-04-21

    Application number: US13942780

    Filing date: 2013-07-16

    Applicant: Apple Inc.

    CPC classification number: G06F12/0862 G06F2212/6026 Y02D10/13

    Abstract: In an embodiment, a processor may implement an access map-pattern match (AMPM)-based prefetcher in which patterns may include wild cards for some cache blocks. The wild card may match any access for the corresponding cache block (e.g. no access, demand access, prefetch, successful prefetch, etc.). Furthermore, patterns with irregular strides and/or irregular access patterns may be included in the matching patterns and may be detected for prefetch generation. In an embodiment, the AMPM prefetcher may implement a chained access map for large streaming prefetches. If a stream is detected, the AMPM prefetcher may allocate a pair of map entries for the stream and may reuse the pair for subsequent access map regions within the stream. In some embodiments, a quality factor may be associated with each access map and may control the rate of prefetch generation.
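    The wild-card matching described above can be sketched as pattern matching over an access map. The symbol encoding ('.' no access, 'D' demand, '*' wild card) and function names are assumptions for illustration, not the patent's encodings.

```python
def matches(access_map: str, pattern: str) -> bool:
    """A pattern matches when every non-wildcard symbol agrees with the
    access map; a '*' matches any access state for that cache block
    (no access, demand access, prefetch, etc.)."""
    return len(access_map) == len(pattern) and all(
        p == "*" or p == a for a, p in zip(access_map, pattern)
    )


def generate_prefetches(access_map: str, patterns: dict) -> list:
    """Return the not-yet-accessed blocks that the first matching
    pattern predicts will be accessed."""
    for name, (pattern, predicted) in patterns.items():
        if matches(access_map, pattern):
            return [i for i in predicted if access_map[i] == "."]
    return []
```

    With a stride-2 pattern `"D.D.*..."`, for example, the wild card at block 4 lets the map `"D.D.D..."` match whether or not block 4 was demand-accessed, and block 6 is prefetched.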
