-
Publication Number: US20220308997A1
Publication Date: 2022-09-29
Application Number: US17212804
Application Date: 2021-03-25
Applicant: Arm Limited
Inventor: Kishore Kumar Jagadeesha, Jamshed Jalal, Tushar P Ringe, Mark David Werkheiser, Premkishore Shivakumar, Lauren Elise Guckert
IPC: G06F12/0815, G06F13/40
Abstract: A data processing network includes request nodes with local memories accessible as a distributed virtual memory (DVM) and coupled by an interconnect fabric. Multiple DVM domains are assigned, each containing a DVM node for handling DVM operation requests from request nodes in the domain. On receipt of a request from a first request node, a DVM node sends a snoop message to other request nodes in its domain and sends a snoop message to one or more peer DVM nodes in other DVM domains. The DVM node receives snoop responses from the request nodes and from the one or more peer DVM nodes, and sends a completion message to the first request node. Each peer DVM node sends snoop messages to the request nodes in its domain, collects snoop responses, and sends a single response to the originating DVM node. In this way, DVM operations are performed in parallel.
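The parallel fan-out that this abstract describes can be pictured with a small software model. Below is a minimal sketch, assuming purely illustrative class and message names (DvmNode, RequestNode, handle_peer_snoop); nothing here is taken from the patent claims or from any Arm specification.

```python
# Illustrative model of DVM domains: the originating DVM node snoops its own
# domain, forwards one snoop per peer DVM node, and each peer returns a single
# aggregated response, so the domains are processed in parallel.

class RequestNode:
    def __init__(self, name):
        self.name = name

    def snoop(self, op):
        # A real request node would, e.g., invalidate a TLB entry here.
        return f"{self.name}:done({op})"


class DvmNode:
    def __init__(self, name, request_nodes):
        self.name = name
        self.request_nodes = request_nodes  # request nodes in this DVM domain
        self.peers = []                     # DVM nodes of the other domains

    def handle_request(self, requester, op):
        # Snoop every other request node in the local domain.
        local = [rn.snoop(op) for rn in self.request_nodes if rn is not requester]
        # One snoop per peer DVM node; each peer fans out inside its own
        # domain and answers with a single aggregated response.
        remote = [peer.handle_peer_snoop(op) for peer in self.peers]
        # All responses collected: send one completion to the requester.
        return {"completion_to": requester.name, "responses": local + remote}

    def handle_peer_snoop(self, op):
        replies = [rn.snoop(op) for rn in self.request_nodes]
        return f"{self.name}:aggregated({len(replies)} replies)"


domain_a = [RequestNode("RN0"), RequestNode("RN1")]
domain_b = [RequestNode("RN2"), RequestNode("RN3")]
dvm_a, dvm_b = DvmNode("DVM-A", domain_a), DvmNode("DVM-B", domain_b)
dvm_a.peers, dvm_b.peers = [dvm_b], [dvm_a]

print(dvm_a.handle_request(domain_a[0], "TLB-invalidate"))
```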
-
Publication Number: US20240273026A1
Publication Date: 2024-08-15
Application Number: US18109454
Application Date: 2023-02-14
Applicant: Arm Limited
Inventor: Devi Sravanthi Yalamarthy, Jamshed Jalal, Mark David Werkheiser, Wenxuan Zhang, Ritukar Khanna, Rajani Pai, Gurunath Ramagiri, Mukesh Patel, Tushar P Ringe
IPC: G06F12/084, G06F12/0811, G06F12/0891
CPC classification number: G06F12/084, G06F12/0811, G06F12/0891
Abstract: A data processing apparatus includes one or more cache configuration data stores, a coherence manager, and a shared cache. The coherence manager is configured to track and maintain coherency of cache lines accessed by local caching agents and one or more remote caching agents. The cache lines include local cache lines accessed from a local memory region and remote cache lines accessed from a remote memory region. The shared cache is configured to store local cache lines in a first partition and to store remote cache lines in a second partition. The sizes of the first and second partitions are determined based on values in the one or more cache configuration data stores, and the partitions may or may not overlap. The cache configuration data stores may be programmed by a user or dynamically programmed in response to local memory and remote memory access patterns.
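As a rough illustration of the partitioning idea, the sketch below splits a way-partitioned shared cache between local and remote lines according to two configuration values; the class name, the way-based encoding, and the eviction policy are assumptions made for this example only.

```python
# Illustrative way-partitioned shared cache: one budget of ways for local
# lines and one for remote lines, both taken from configuration data stores.

class PartitionedSharedCache:
    def __init__(self, total_ways, local_ways_cfg, remote_ways_cfg):
        # If local_ways_cfg + remote_ways_cfg exceeds total_ways, the two
        # partitions overlap; if it is smaller, some ways stay unused.
        self.total_ways = total_ways
        self.local_ways = local_ways_cfg
        self.remote_ways = remote_ways_cfg
        self.lines = {}  # address -> "local" or "remote"

    def allocate(self, address, is_remote):
        cls = "remote" if is_remote else "local"
        budget = self.remote_ways if is_remote else self.local_ways
        used = sum(1 for c in self.lines.values() if c == cls)
        if used >= budget:
            self._evict_one(cls)      # victim chosen within the same class
        self.lines[address] = cls

    def _evict_one(self, cls):
        # Evict the oldest line of the same class (dict preserves insertion order).
        victim = next((a for a, c in self.lines.items() if c == cls), None)
        if victim is not None:
            del self.lines[victim]


cache = PartitionedSharedCache(total_ways=16, local_ways_cfg=12, remote_ways_cfg=4)
cache.allocate(0x1000, is_remote=False)
cache.allocate(0x2000, is_remote=True)
# The split can be reprogrammed, e.g. when remote accesses start to dominate.
cache.remote_ways = 8
```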
-
Publication Number: US20240256460A1
Publication Date: 2024-08-01
Application Number: US18101806
Application Date: 2023-01-26
Applicant: Arm Limited
Inventor: Jamshed Jalal, Ashok Kumar Tummala, Wenxuan Zhang, Daniel Thomas Pinero, Tushar P Ringe
IPC: G06F12/0888
CPC classification number: G06F12/0888, G06F2212/1024
Abstract: Efficient data transfer between caching domains of a data processing system is achieved by a local coherency node (LCN) of a first caching domain receiving, from a requesting node of the first caching domain, a read request for data associated with a second caching domain. The LCN requests the data from the second caching domain via a transfer agent. In response to receiving a cache line containing the data from the second caching domain, the transfer agent sends the cache line to the requesting node, bypassing the LCN and, optionally, sends a read-receipt indicating the state of the cache line to the LCN. The LCN updates a coherency state for the cache line in response to receiving the read-receipt from the transfer agent and a completion acknowledgement from the requesting node. Optionally, the transfer agent may send the cache line via the LCN when congestion is detected in a response channel of the data processing system.
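The sequence of messages can be sketched as below; the class names (LocalCoherencyNode, TransferAgent, RemoteDomain) and the simple event bookkeeping are illustrative assumptions, not the patent's implementation.

```python
# Illustrative read flow: the transfer agent delivers the cache line directly
# to the requester (bypassing the LCN) and sends the LCN only a read-receipt;
# the LCN updates coherency state once it has both the receipt and the
# requester's completion acknowledgement.

class RequestingNode:
    def __init__(self):
        self.data = None

    def receive_cache_line(self, line, lcn):
        self.data = line                   # data arrived from the transfer agent
        lcn.completion_ack(line["addr"])   # acknowledge completion to the LCN


class TransferAgent:
    def __init__(self, remote_domain):
        self.remote_domain = remote_domain

    def fetch(self, addr, requester, lcn, congested=False):
        line = self.remote_domain.read(addr)
        if congested:
            lcn.forward(line, requester)              # fall back to routing via the LCN
        else:
            requester.receive_cache_line(line, lcn)   # bypass the LCN
            lcn.read_receipt(addr, line["state"])     # state only, no payload


class LocalCoherencyNode:
    def __init__(self, agent):
        self.agent = agent
        self.pending = {}                  # addr -> set of events still awaited

    def read_request(self, addr, requester):
        self.pending[addr] = {"receipt", "ack"}
        self.agent.fetch(addr, requester, self)

    def read_receipt(self, addr, state):
        self._note(addr, "receipt")

    def completion_ack(self, addr):
        self._note(addr, "ack")

    def forward(self, line, requester):
        requester.receive_cache_line(line, self)
        self._note(line["addr"], "receipt")

    def _note(self, addr, event):
        self.pending[addr].discard(event)
        if not self.pending[addr]:
            print(f"coherency state for {hex(addr)} updated")


class RemoteDomain:
    def read(self, addr):
        return {"addr": addr, "state": "UniqueClean", "payload": b"\x00" * 64}


lcn = LocalCoherencyNode(TransferAgent(RemoteDomain()))
lcn.read_request(0x80, RequestingNode())
```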
-
Publication Number: US11483260B2
Publication Date: 2022-10-25
Application Number: US17051028
Application Date: 2019-05-02
Applicant: Arm Limited
Inventor: Jamshed Jalal, Tushar P Ringe, Phanindra Kumar Mannava, Dimitrios Kaseridis
Abstract: An improved protocol is provided for data transfer between a request node and a home node of a data processing network in which a number of devices are coupled via an interconnect fabric; the protocol minimizes the number of response messages transported through the interconnect fabric. When congestion is detected in the interconnect fabric, a home node sends a combined response to a write request from a request node. The response is delayed until a data buffer is available at the home node and the home node has completed an associated coherence action. When the request node receives a combined response, the data to be written and the acknowledgment are coalesced in the data message.
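As a back-of-the-envelope illustration of why the combined response helps, the sketch below counts the messages in a single write transaction with and without coalescing; the message names are illustrative and are not meant to match CHI channel or opcode names exactly.

```python
# Illustrative message count for one write transaction. When the home node
# sends a combined response, the request node coalesces the write data and
# the acknowledgement into one data message, saving fabric traffic.

def write_transaction(congested):
    messages = [("req->home", "WriteRequest")]
    if congested:
        # Home node waits until a buffer is free and its coherence action is
        # complete, then answers with a single combined response.
        messages.append(("home->req", "CombinedBufferGrant+Complete"))
        messages.append(("req->home", "WriteData+Ack"))   # coalesced
    else:
        messages.append(("home->req", "BufferGrant"))
        messages.append(("home->req", "Complete"))
        messages.append(("req->home", "WriteData"))
        messages.append(("req->home", "Ack"))
    return messages

print(len(write_transaction(congested=False)))  # 5 messages
print(len(write_transaction(congested=True)))   # 3 messages
```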
-
Publication Number: US11934334B2
Publication Date: 2024-03-19
Application Number: US17244182
Application Date: 2021-04-29
Applicant: Arm Limited
Inventor: Tushar P Ringe, Mark David Werkheiser, Jamshed Jalal, Sai Kumar Marri, Ashok Kumar Tummala, Rishabh Jain
CPC classification number: G06F13/4221, G06F13/4068, G06F13/4027, G06F2213/0026
Abstract: The present disclosure advantageously provides a method and system for transferring data over a chip-to-chip interconnect (CCI). The method includes, at a request node of a coherent interconnect (CHI) of a first chip, receiving at least one peripheral component interconnect express (PCIe) transaction from a PCIe master device, the PCIe transaction including a stream identifier; selecting a CCI port of the CHI of the first chip based on the stream identifier of the PCIe transaction; and sending the PCIe transaction to the selected CCI port.
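A minimal sketch of the port-selection step is shown below, assuming a simple modulo mapping from stream identifier to CCI port; the mapping policy and the class names are assumptions for illustration, not the patented selection logic.

```python
# Illustrative steering of PCIe transactions to chip-to-chip interconnect
# (CCI) ports by stream identifier: transactions of one stream always take
# the same port, while different streams spread across the available ports.

from dataclasses import dataclass

@dataclass
class PcieTransaction:
    stream_id: int
    payload: bytes

class RequestNode:
    def __init__(self, cci_ports):
        self.cci_ports = cci_ports          # identifiers of the CCI ports

    def select_port(self, txn):
        # Same stream id -> same port, preserving per-stream ordering.
        return self.cci_ports[txn.stream_id % len(self.cci_ports)]

    def send(self, txn):
        port = self.select_port(txn)
        print(f"stream {txn.stream_id} -> CCI port {port}")
        return port

rn = RequestNode(cci_ports=[0, 1])
rn.send(PcieTransaction(stream_id=7, payload=b"write"))
rn.send(PcieTransaction(stream_id=8, payload=b"write"))
```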
-
Publication Number: US20220350771A1
Publication Date: 2022-11-03
Application Number: US17244182
Application Date: 2021-04-29
Applicant: Arm Limited
Inventor: Tushar P Ringe, Mark David Werkheiser, Jamshed Jalal, Sai Kumar Marri, Ashok Kumar Tummala, Rishabh Jain
Abstract: The present disclosure advantageously provides a method and system for transferring data over a chip-to-chip interconnect (CCI). The method includes, at a request node of a coherent interconnect (CHI) of a first chip, receiving at least one peripheral component interconnect express (PCIe) transaction from a PCIe master device, the PCIe transaction including a stream identifier; selecting a CCI port of the CHI of the first chip based on the stream identifier of the PCIe transaction; and sending the PCIe transaction to the selected CCI port.
-
Publication Number: US11181957B1
Publication Date: 2021-11-23
Application Number: US17102963
Application Date: 2020-11-24
Applicant: Arm Limited
Inventor: Ramamoorthy Guru Prasadh, Tushar P Ringe, Kishore Kumar Jagadeesha, David Joseph Hawkins, Saira Samar Malik
Abstract: An improved apparatus and method are provided for the protection of reset in systems with stringent safety goals that employ primary and shadow logic blocks with a lock-step checker to achieve functional safety, including systems having a very large fanout of the primary and shadow reset signal trees. The disclosed apparatus and method support assertion of reset that is asynchronous to the system clock and deassertion of reset that is synchronous to the system clock. Shadow logic blocks have reset deasserted a fixed number of clock cycles after their respective primary logic blocks, thereby avoiding the requirement to synchronize the primary and shadow reset signal trees at each of their end points to ensure lock-step operation between the primary and shadow logic blocks.
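The timing relationship between the primary and shadow resets can be illustrated with a small cycle-level model; the two-cycle lag and the signal names below are assumptions chosen for the example, and only the synchronous deassertion side is modelled.

```python
# Illustrative timing model: the shadow reset deasserts a fixed number of
# clock cycles after the primary reset, so the lock-step checker only needs
# to compare outputs once both logic blocks are out of reset.

SHADOW_DELAY = 2  # assumed fixed lag, in clock cycles, of shadow deassertion

def simulate(cycles, deassert_request_cycle):
    primary_reset = 1                      # active-high reset, asserted
    pipeline = [1] * SHADOW_DELAY          # delay line feeding the shadow copy
    for cycle in range(cycles):
        # Synchronous deassertion: the request is sampled on the clock edge.
        primary_next = 0 if cycle >= deassert_request_cycle else 1
        # The shadow reset follows the primary through the delay line.
        pipeline.append(primary_next)
        shadow_reset = pipeline.pop(0)
        primary_reset = primary_next
        print(f"cycle {cycle}: primary_reset={primary_reset} shadow_reset={shadow_reset}")

# Primary deasserts at cycle 2; shadow follows at cycle 4.
simulate(cycles=6, deassert_request_cycle=2)
```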
-
Publication Number: US11593025B2
Publication Date: 2023-02-28
Application Number: US16743409
Application Date: 2020-01-15
Applicant: Arm Limited
Inventor: Gurunath Ramagiri, Jamshed Jalal, Mark David Werkheiser, Tushar P Ringe, Klas Magnus Bruce, Ritukar Khanna
IPC: G06F3/06
Abstract: A request node is provided comprising request circuitry to issue write requests to write data to storage circuitry. The write requests are issued to the storage circuitry via a coherency node. Status receiving circuitry receives a write status regarding write operations at the storage circuitry from the coherency node, and throttle circuitry throttles a rate at which the write requests are issued to the storage circuitry in dependence on the write status. A coherency node is also provided, comprising access circuitry to receive a write request from a request node to write data to storage circuitry and to access the storage circuitry to write the data to the storage circuitry. Receive circuitry receives, from the storage circuitry, an incoming write status regarding write operations at the storage circuitry, and transmit circuitry transmits an outgoing write status to the request node based on the incoming write status.
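The feedback loop between the coherency node's write status and the request node's issue rate can be sketched as follows; the credit-style window, the queue depth, and the class names are assumptions made purely for illustration.

```python
# Illustrative throttling loop: the coherency node reports the occupancy of
# the storage write path, and the request node shrinks its in-flight write
# window as that occupancy grows.

class CoherencyNode:
    def __init__(self, storage_queue_depth):
        self.depth = storage_queue_depth
        self.outstanding = 0

    def write(self):
        self.outstanding += 1                  # forwarded to storage circuitry

    def write_status(self):
        # Incoming status from storage, summarised and passed upstream.
        return self.outstanding / self.depth   # 0.0 = idle, 1.0 = full


class RequestNode:
    def __init__(self, coherency_node):
        self.cn = coherency_node
        self.max_in_flight = 8

    def issue_writes(self, count):
        issued = 0
        for _ in range(count):
            status = self.cn.write_status()
            # Throttle: shrink the in-flight window as reported occupancy grows.
            window = max(1, int(self.max_in_flight * (1.0 - status)))
            if self.cn.outstanding >= window:
                break                           # hold back further writes
            self.cn.write()
            issued += 1
        return issued

rn = RequestNode(CoherencyNode(storage_queue_depth=16))
print(rn.issue_writes(20))   # issues only as many writes as the window allows
```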
-
Publication Number: US11550720B2
Publication Date: 2023-01-10
Application Number: US17102997
Application Date: 2020-11-24
Applicant: Arm Limited
Inventor: Gurunath Ramagiri, Jamshed Jalal, Mark David Werkheiser, Tushar P Ringe, Mukesh Patel, Sakshi Verma
IPC: G06F12/0815, G06F12/0831
Abstract: Entries in a cluster-to-caching agent map table of a data processing network identify one or more caching agents in a caching agent cluster. A snoop filter cache stores coherency information that includes coherency status information and a presence vector, where a bit position in the presence vector is associated with a caching agent cluster in the cluster-to-caching agent map table. In response to a data request, a presence vector in the snoop filter cache is accessed to identify a caching agent cluster and the map table is accessed to identify target caching agents for snoop messages. In order to reduce message traffic, snoop messages are sent only to the identified targets.
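A minimal sketch of the cluster-based lookup follows; the table layout, cluster sizes, and presence-vector encoding are illustrative assumptions rather than the patented structures.

```python
# Illustrative cluster-based snoop filtering: the presence vector holds one
# bit per cluster, and the map table expands a set cluster bit into the
# caching agents that actually need a snoop.

# Cluster-to-caching-agent map table: cluster id -> caching agents in it.
CLUSTER_MAP = {
    0: ["CPU0", "CPU1"],
    1: ["CPU2", "CPU3"],
    2: ["GPU0"],
}

# Snoop filter entry: coherency state plus a presence vector with one bit
# per cluster rather than one bit per caching agent.
snoop_filter = {
    0x4000: {"state": "Shared", "presence": 0b011},  # clusters 0 and 1
}

def snoop_targets(addr):
    entry = snoop_filter.get(addr)
    if entry is None:
        return []                       # no cached copies: no snoops at all
    targets = []
    for cluster_id, agents in CLUSTER_MAP.items():
        if entry["presence"] & (1 << cluster_id):
            targets.extend(agents)      # snoop only clusters marked present
    return targets

print(snoop_targets(0x4000))   # ['CPU0', 'CPU1', 'CPU2', 'CPU3']
print(snoop_targets(0x8000))   # []
```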
-
Publication Number: US11531620B2
Publication Date: 2022-12-20
Application Number: US17212804
Application Date: 2021-03-25
Applicant: Arm Limited
Inventor: Kishore Kumar Jagadeesha, Jamshed Jalal, Tushar P Ringe, Mark David Werkheiser, Premkishore Shivakumar, Lauren Elise Guckert
IPC: G06F12/0815, G06F12/0831, G06F13/40
Abstract: A data processing network includes request nodes with local memories accessible as a distributed virtual memory (DVM) and coupled by an interconnect fabric. Multiple DVM domains are assigned, each containing a DVM node for handling DVM operation requests from request nodes in the domain. On receipt of a request from a first request node, a DVM node sends a snoop message to other request nodes in its domain and sends a snoop message to one or more peer DVM nodes in other DVM domains. The DVM node receives snoop responses from the request nodes and from the one or more peer DVM nodes, and sends a completion message to the first request node. Each peer DVM node sends snoop messages to the request nodes in its domain, collects snoop responses, and sends a single response to the originating DVM node. In this way, DVM operations are performed in parallel.