Inter Cluster Snoop Latency Reduction

    Publication number: US20210303486A1

    Publication date: 2021-09-30

    Application number: US17242051

    Filing date: 2021-04-27

    Applicant: Apple Inc.

    Abstract: In one embodiment, a cache coherent system includes one or more agents (e.g., coherent agents) that may cache data used by the system. The system may include a point of coherency in a memory controller in the system, and thus the agents may transmit read requests to the memory controller to coherently read data. The point of coherency may determine if the data is cached in another agent, and may transmit a copy back request to the other agent if the other agent has modified the data. The system may include an interconnect between the agents and the memory controller. At a point on the interconnect at which traffic from the agents converges, a copy back response may be converted to a fill for the requesting agent.

    Communications fabric with split paths for control and data packets

    Publication number: US10206175B2

    Publication date: 2019-02-12

    Application number: US15817564

    Filing date: 2017-11-20

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to a split communications fabric topology. In some embodiments, an apparatus includes a communications fabric structure with multiple fabric units. The fabric units may be configured to arbitrate among control packets of different messages. In some embodiments, a processing element is configured to generate a message that includes a control packet and one or more data packets. In some embodiments, the processing element is configured to transmit the control packet to a destination processing element (e.g., a memory controller) via the communications fabric structure and transmit the data packets to a data buffer. In some embodiments, the destination processing element is configured to retrieve the data packets from the data buffer in response to receiving the control packet via the hierarchical fabric structure. In these embodiments, bypassing the fabric structure for data packets may reduce power consumption.
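
The split between the control path and the data path described in this abstract can be sketched as follows. This is a minimal illustration, not the patented design: the class and method names (`SplitFabric`, `send`, `deliver_next`) are hypothetical, and the fabric's arbitration is modeled as a plain FIFO, which the abstract does not specify.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class ControlPacket:
    msg_id: int   # identifies the message whose data sits in the buffer
    dest: str     # destination processing element, e.g. a memory controller


class SplitFabric:
    """Toy model: control packets cross the fabric, data packets bypass it."""

    def __init__(self):
        self.control_queue = deque()  # stands in for the fabric units (FIFO is an assumption)
        self.data_buffer = {}         # data packets wait here, keyed by message id

    def send(self, msg_id, dest, data_packets):
        # Data goes straight to the buffer; only the control packet enters the fabric.
        self.data_buffer[msg_id] = list(data_packets)
        self.control_queue.append(ControlPacket(msg_id, dest))

    def deliver_next(self):
        # On receiving the control packet, the destination pulls the data from the buffer.
        ctrl = self.control_queue.popleft()
        data = self.data_buffer.pop(ctrl.msg_id)
        return ctrl, data
```

Calling `send(1, "memory_controller", [b"a", b"b"])` and then `deliver_next()` returns the control packet together with the two buffered data packets, and the buffer entry is released.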

    COMMUNICATIONS FABRIC WITH SPLIT PATHS FOR CONTROL AND DATA PACKETS

    Publication number: US20180077649A1

    Publication date: 2018-03-15

    Application number: US15817564

    Filing date: 2017-11-20

    Applicant: Apple Inc.

    Abstract: Techniques are disclosed relating to a split communications fabric topology. In some embodiments, an apparatus includes a communications fabric structure with multiple fabric units. The fabric units may be configured to arbitrate among control packets of different messages. In some embodiments, a processing element is configured to generate a message that includes a control packet and one or more data packets. In some embodiments, the processing element is configured to transmit the control packet to a destination processing element (e.g., a memory controller) via the communications fabric structure and transmit the data packets to a data buffer. In some embodiments, the destination processing element is configured to retrieve the data packets from the data buffer in response to receiving the control packet via the hierarchical fabric structure. In these embodiments, bypassing the fabric structure for data packets may reduce power consumption.

    PROCESSOR TO MEMORY BYPASS
    34.
    Invention application
    Status: pending (published)

    Publication number: US20160328322A1

    Publication date: 2016-11-10

    Application number: US14705506

    Filing date: 2015-05-06

    Applicant: Apple Inc.

    Abstract: An apparatus for processing memory requests from a functional unit in a computing system is disclosed. The apparatus may include an interface that may be configured to receive a request from the functional unit. Circuitry may be configured to initiate a speculative read access command to a memory in response to a determination that the received request is a request for data from the memory. The circuitry may be further configured to determine, in parallel with the speculative read access, whether the speculative read will result in an ordering or coherence violation.
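
A rough software analogy of the speculative read is sketched below: the memory read and the violation check are started together, and the speculative data is used only if the check passes. The function name, the `pending_writes` dictionary, and the fallback of returning the pending value on a violation are all assumptions made for illustration; the abstract does not say how a violation is resolved.

```python
from concurrent.futures import ThreadPoolExecutor


def read_with_speculation(addr, memory, pending_writes):
    """Return (data, used_speculative_result).

    memory         -- dict modeling the memory contents
    pending_writes -- dict of not-yet-committed writes (hypothetical stand-in
                      for whatever ordering/coherence state the hardware tracks)
    """
    with ThreadPoolExecutor(max_workers=2) as pool:
        # Start the speculative read and the violation check at the same time.
        speculative = pool.submit(memory.get, addr)
        violation = pool.submit(lambda: addr in pending_writes)
        data, conflict = speculative.result(), violation.result()
    if conflict:
        # Assumed recovery: discard the speculative data and use the ordered value.
        return pending_writes[addr], False
    return data, True
```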

    Duplicate tag structure employing single-port tag RAM and dual-port state RAM
    35.
    Granted invention patent
    Status: in force

    Publication number: US09454482B2

    Publication date: 2016-09-27

    Application number: US13928636

    Filing date: 2013-06-27

    Applicant: Apple Inc.

    CPC classification number: G06F12/0815

    Abstract: An apparatus for processing cache requests in a computing system is disclosed. The apparatus may include a single-port memory, a dual-port memory, and a control circuit. The single-port memory may be configured to store tag information associated with a cache memory, and the dual-port memory may be configured to store state information associated with the cache memory. The control circuit may be configured to receive a request that includes a tag address and to access the tag and state information stored in the single-port memory and the dual-port memory, respectively, dependent upon the received tag address. The control circuit may determine whether the data associated with the received tag address is contained in the cache memory, and may update and store state information in the dual-port memory responsive to the determination.

    Cross dependency checking logic
    36.
    Granted invention patent
    Status: in force

    Publication number: US09158691B2

    Publication date: 2015-10-13

    Application number: US13715623

    Filing date: 2012-12-14

    Applicant: Apple Inc.

    CPC classification number: G06F12/0828 G06F12/0822

    Abstract: Systems and methods for maintaining an order of transactions in the coherence point. The coherence point stores attributes associated with received transactions in an input request queue (IRQ). When a new transaction is received by the coherence point, the IRQ is searched for other entries with the same request address or the same victim address as the new transaction. If one or more matches are found, the new transaction entry points to the entry storing the most recently received transaction with the same address. The new transaction is stalled until the transaction it points to has been completed in the coherence point.
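
The dependency linking described in this abstract can be sketched as a small queue model. This is an illustration under assumptions: the names `Entry`, `InputRequestQueue`, `insert`, `complete`, and `is_stalled` are hypothetical, and a match is taken to be any overlap between the request and victim addresses of two entries, which is one reading of the "cross dependency" check.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Entry:
    txn_id: int
    request_addr: int
    victim_addr: Optional[int] = None
    depends_on: Optional[int] = None  # txn_id of the entry this one waits on
    done: bool = False

    def addresses(self):
        addrs = {self.request_addr}
        if self.victim_addr is not None:
            addrs.add(self.victim_addr)
        return addrs


class InputRequestQueue:
    def __init__(self):
        self.entries: List[Entry] = []  # ordered oldest to newest

    def insert(self, txn_id, request_addr, victim_addr=None):
        new = Entry(txn_id, request_addr, victim_addr)
        # Search newest-first so the new entry points at the most recent match.
        for old in reversed(self.entries):
            if not old.done and new.addresses() & old.addresses():
                new.depends_on = old.txn_id
                break
        self.entries.append(new)
        return new

    def complete(self, txn_id):
        for entry in self.entries:
            if entry.txn_id == txn_id:
                entry.done = True

    def is_stalled(self, txn_id):
        by_id = {entry.txn_id: entry for entry in self.entries}
        dep = by_id[txn_id].depends_on
        return dep is not None and not by_id[dep].done
```

With transactions 1 (request `0x100`), 2 (request `0x200`, victim `0x100`), and 3 (request `0x100`) inserted in that order, transaction 3 points at transaction 2 and stays stalled until transaction 2 completes.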

    DEBUG ACCESS MECHANISM FOR DUPLICATE TAG STORAGE
    37.
    Invention application
    Status: in force

    Publication number: US20140173342A1

    Publication date: 2014-06-19

    Application number: US13713654

    Filing date: 2012-12-13

    Applicant: APPLE INC.

    CPC classification number: G06F11/273 G06F11/221 G06F11/2236

    Abstract: A coherence system includes a storage array that may store duplicate tag information associated with a cache memory of a processor. The system may also include a pipeline unit that includes a number of stages to control accesses to the storage array. The pipeline unit may pass an input/output (I/O) request that is received on a fabric bus through the pipeline stages without generating an access to the storage array. The system may also include a debug engine that may reformat the I/O request from the pipeline unit into a debug request. The debug engine may send the debug request to the pipeline unit via a debug bus. In response to receiving the debug request, the pipeline unit may access the storage array. The debug engine may return a result of the access to the storage array to the source of the I/O request via the fabric bus.

    CROSS DEPENDENCY CHECKING LOGIC
    38.
    Invention application
    Status: in force

    Publication number: US20140173218A1

    Publication date: 2014-06-19

    Application number: US13715623

    Filing date: 2012-12-14

    Applicant: APPLE INC.

    CPC classification number: G06F12/0828 G06F12/0822

    Abstract: Systems and methods for maintaining an order of transactions in the coherence point. The coherence point stores attributes associated with received transactions in an input request queue (IRQ). When a new transaction is received by the coherence point, the IRQ is searched for other entries with the same request address or the same victim address as the new transaction. If one or more matches are found, the new transaction entry points to the entry storing the most recently received transaction with the same address. The new transaction is stalled until the transaction it points to has been completed in the coherence point.

    Address hashing in a multiple memory controller system

    Publication number: US12236130B2

    Publication date: 2025-02-25

    Application number: US18318672

    Filing date: 2023-05-16

    Applicant: Apple Inc.

    Abstract: In an embodiment, a system may support programmable hashing of address bits at a plurality of levels of granularity to map memory addresses to memory controllers and ultimately at least to memory devices. The hashing may be programmed to distribute pages of memory across the memory controllers, and consecutive blocks of the page may be mapped to physically distant memory controllers. In an embodiment, address bits may be dropped from each level of granularity, forming a compacted pipe address to save power within the memory controller. In an embodiment, a memory folding scheme may be employed to reduce the number of active memory devices and/or memory controllers in the system when the full complement of memory is not needed.
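
One common way to build a programmable address hash is to XOR together a mask-selected set of address bits for each bit of the controller index. The sketch below assumes that scheme; the abstract does not name the hash function, and the `MASKS` values and the function name `select_controller` are hypothetical.

```python
def parity(value):
    """XOR of all bits in value (1 if an odd number of bits are set)."""
    return bin(value).count("1") & 1


def select_controller(addr, masks):
    """Map an address to a memory controller index.

    Each programmable mask picks the address bits that are XORed together
    to form one bit of the index.
    """
    index = 0
    for bit, mask in enumerate(masks):
        index |= parity(addr & mask) << bit
    return index


# Hypothetical programming for four controllers:
# index bit 0 = addr[6] ^ addr[12], index bit 1 = addr[7] ^ addr[13].
MASKS = [0x1040, 0x2080]
```

With these masks, consecutive 64-byte blocks starting at address 0 map to controllers 0, 1, 2, 3, so the blocks of a page are spread across the controllers.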

    Scalable Cache Coherency Protocol
    40.
    Invention publication

    Publication number: US20240273024A1

    Publication date: 2024-08-15

    Application number: US18582333

    Filing date: 2024-02-20

    Applicant: Apple Inc.

    CPC classification number: G06F12/0815 G06F12/0831 G06F2212/1032

    Abstract: A scalable cache coherency protocol for a system including a plurality of coherent agents coupled to one or more memory controllers is described. The memory controller may implement a precise directory for cache blocks from the memory to which the memory controller is coupled. Multiple requests to a cache block may be outstanding, and snoops and completions for requests may include an expected cache state at the receiving agent, as indicated by a directory in the memory controller when the request was processed, to allow the receiving agent to detect race conditions. In an embodiment, the cache states may include a primary shared and a secondary shared state. The primary shared state may apply to a coherent agent that bears responsibility for transmitting a copy of the cache block to a requesting agent. In an embodiment, at least two types of snoops may be supported: snoop forward and snoop back.
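
The race detection mentioned in this abstract rests on comparing the state the directory expected with the state the agent actually holds. A minimal sketch of that comparison follows. Only the primary shared and secondary shared states come from the abstract; the other state names, the `Snoop` and `Agent` classes, and the `race_detected` method are assumptions for illustration, and what the agent does after detecting a race is not modeled.

```python
from dataclasses import dataclass, field
from enum import Enum, auto


class State(Enum):
    INVALID = auto()           # assumed
    PRIMARY_SHARED = auto()    # from the abstract
    SECONDARY_SHARED = auto()  # from the abstract
    EXCLUSIVE = auto()         # assumed
    MODIFIED = auto()          # assumed


@dataclass
class Snoop:
    addr: int
    expected: State  # state the directory recorded for the receiving agent


@dataclass
class Agent:
    cache: dict = field(default_factory=dict)  # addr -> State

    def race_detected(self, snoop):
        # A mismatch means another message changed (or will change) this block's
        # state between the directory lookup and the snoop's arrival.
        actual = self.cache.get(snoop.addr, State.INVALID)
        return actual != snoop.expected
```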
