Abstract:
In some embodiments, a system includes a shared, high-bandwidth resource (e.g., a memory system), multiple agents configured to communicate with the shared resource, and a communication fabric coupling the multiple agents to the shared resource. The communication fabric may be equipped with limiters configured to limit bandwidth from the various agents based on one or more performance metrics measured with respect to the shared, high-bandwidth resource. For example, the performance metrics may include one or more of latency, number of outstanding transactions, resource utilization, etc. The limiters may dynamically modify their limit configurations based on the performance metrics. In an embodiment, the system may include multiple thresholds for the performance metrics, and exceeding a given threshold may result in modification of the limiters in the communication fabric. Hysteresis may also be implemented in some embodiments to reduce the frequency of transitions between configurations.
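As a rough illustration of the threshold-and-hysteresis behavior described in this abstract, the following minimal Python sketch models a single limiter; the latency thresholds, credit limits, and class/field names are hypothetical assumptions, not values from the embodiment.

```python
class BandwidthLimiter:
    """Illustrative model of a fabric limiter that tightens or relaxes its
    credit limit based on a measured latency metric, with hysteresis."""

    # Hypothetical (threshold, limit) pairs: higher latency -> lower credit limit.
    CONFIGS = [
        (0,   32),   # latency below 100: allow up to 32 outstanding transactions
        (100, 16),   # latency 100..199: allow 16
        (200, 8),    # latency 200 and above: allow 8
    ]
    HYSTERESIS = 10  # latency must drop this far below a threshold before relaxing

    def __init__(self):
        self.level = 0  # index into CONFIGS

    def update(self, measured_latency):
        # Tighten immediately when the next threshold is exceeded.
        while (self.level + 1 < len(self.CONFIGS)
               and measured_latency >= self.CONFIGS[self.level + 1][0]):
            self.level += 1
        # Relax only when latency falls below the current threshold minus the
        # hysteresis margin, reducing the frequency of configuration transitions.
        while (self.level > 0
               and measured_latency < self.CONFIGS[self.level][0] - self.HYSTERESIS):
            self.level -= 1
        return self.CONFIGS[self.level][1]  # current outstanding-transaction limit


limiter = BandwidthLimiter()
print(limiter.update(150))  # 16: first threshold exceeded, limiter tightens
print(limiter.update(95))   # 16: within the hysteresis band, no relaxation yet
print(limiter.update(85))   # 32: clearly below the threshold, limiter relaxes
```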
Abstract:
A multiple processor device generates a control packet for at least one connectionless-based packet partially in accordance with a control packet format of a connection-based point-to-point link and partially not in accordance with that format. For instance, the multiple processor device generates the control packet to include, in noncompliance with the control packet format, one or more of: an indication that at least one connectionless-based packet is being transported, an indication of a virtual channel of a plurality of virtual channels associated with the at least one connectionless-based packet, an indication of an amount of data included in the associated data packet, a status of the at least one connectionless-based packet, and an error status indication. The multiple processor device then generates the associated data packet in accordance with a data packet format of the connection-based point-to-point link, wherein the data packet includes at least a portion of the at least one connectionless-based packet.
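Purely as an illustration of the control-packet/data-packet pairing described above, here is a minimal Python sketch; the field names, the `build_control_and_data` helper, and the example values are hypothetical and are not drawn from the actual link format.

```python
from dataclasses import dataclass


@dataclass
class ControlPacket:
    """Illustrative control-packet fields; names and meanings are assumptions."""
    is_connectionless: bool   # indicates a connectionless-based packet is transported
    virtual_channel: int      # which of the plurality of virtual channels is used
    data_count: int           # amount of data included in the associated data packet
    status: int               # status of the connectionless-based packet
    error: bool               # error status indication


def build_control_and_data(payload: bytes, vc: int):
    """Pair a control packet with its associated data packet, which carries at
    least a portion of the connectionless-based packet as its payload."""
    ctrl = ControlPacket(
        is_connectionless=True,
        virtual_channel=vc,
        data_count=len(payload),
        status=0,
        error=False,
    )
    return ctrl, payload


# Example: wrap a 64-byte connectionless fragment for transport over the link.
ctrl, data = build_control_and_data(b"\x00" * 64, vc=3)
print(ctrl, len(data))
```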
Abstract:
According to the present invention, the multiple processor device determines routing for a plurality of data segments. In determining the routing, the multiple processor device first receives the plurality of data segments. The plurality of data segments includes multiplexed data fragments from at least one of a plurality of virtual channels, and a data segment of the plurality of data segments corresponds to one of the multiplexed data fragments. The multiple processor device then applies at least one routing rule to one of the plurality of data segments to produce at least one result corresponding to the one of the plurality of data segments. The multiple processor device then interprets the at least one result to determine whether sufficient information is available to render a routing decision for the one of the plurality of data segments. When the multiple processor device determines that there is sufficient information to render a routing decision, the multiple processor device determines the routing of the one of the plurality of data segments. When there is insufficient information to render a routing decision, the one of the plurality of data segments is stored in a buffer corresponding to the packet in which the one of the plurality of data segments was received. These operations are repeated for subsequent data segments.
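The per-segment decision loop described in this abstract can be sketched as follows; the rule representation, the `route_segments` helper, and the treatment of "sufficient information" as "at least one rule returns a destination" are illustrative assumptions.

```python
def route_segments(segments, routing_rules, buffers):
    """Illustrative per-segment routing loop: apply the rules, route when the
    results are sufficient, otherwise buffer by originating packet."""
    decisions = []
    for seg in segments:  # each segment corresponds to one multiplexed data fragment
        results = [rule(seg) for rule in routing_rules]
        destinations = [r for r in results if r is not None]
        if destinations:
            # Sufficient information: render a routing decision for this segment.
            decisions.append((seg["id"], destinations[0]))
        else:
            # Insufficient information: store the segment in the buffer for the
            # packet in which it was received.
            buffers.setdefault(seg["packet_id"], []).append(seg)
    return decisions


def rule_by_vc(seg):
    # Example routing rule: route by virtual channel if the segment carries one.
    return f"port{seg['vc']}" if "vc" in seg else None


buffers = {}
decisions = route_segments(
    [{"id": 0, "packet_id": 7, "vc": 1}, {"id": 1, "packet_id": 7}],
    [rule_by_vc],
    buffers,
)
print(decisions, buffers)  # segment 0 routed to port1, segment 1 buffered under packet 7
```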
Abstract:
An apparatus includes a first interface circuit, a second interface circuit, a memory controller configured to interface to a memory, and a packet DMA circuit. The first interface circuit is configured to couple to a first interface for receiving and transmitting packet data. Similarly, the second interface circuit is configured to couple to a second interface for receiving and transmitting packet data. The packet DMA circuit is coupled to receive a first packet from the first interface circuit and a second packet from the second interface circuit. The packet DMA circuit is configured to transmit the first packet and the second packet in write commands to the memory controller to be written to the memory. In some embodiments, a switch is coupled to the first interface circuit, the second interface circuit, and the packet DMA circuit.
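A minimal software model of the packet DMA behavior described above, assuming a hypothetical `PacketDMA` class that splits each received packet into block-sized write commands for the memory controller; the class names, block size, and base address are illustrative, not part of the apparatus.

```python
class MemoryController:
    """Toy memory controller that records write commands into a dict-backed memory."""
    def __init__(self):
        self.memory = {}

    def write(self, addr, data):
        self.memory[addr] = data


class PacketDMA:
    """Illustrative model: packets received from the interface circuits are
    turned into write commands addressed to the memory controller."""
    def __init__(self, memory_controller, block_size=64):
        self.mc = memory_controller
        self.block_size = block_size
        self.next_addr = 0x1000  # hypothetical base of the packet buffer region

    def receive(self, packet: bytes):
        # Split the packet into write commands no larger than one block.
        for off in range(0, len(packet), self.block_size):
            chunk = packet[off:off + self.block_size]
            self.mc.write(self.next_addr, chunk)
            self.next_addr += len(chunk)


mc = MemoryController()
dma = PacketDMA(mc)
dma.receive(b"A" * 100)  # first packet, e.g. from the first interface circuit
dma.receive(b"B" * 40)   # second packet, from the second interface circuit
print(sorted(mc.memory))  # write commands at increasing memory addresses
```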
Abstract:
A multiplier capable of performing complex iterative calculations, such as division and square root, concurrently with simple independent multiplication operations is disclosed. The division and square root operations are performed using iterative multiplication operations such as Newton-Raphson iteration and series expansion. These iterative calculations may require a number of passes through the multiplier. Since the multiplier may be pipelined, it may experience a number of idle cycles during the iterative calculations. The multiplier is configured to utilize these idle cycles to perform independent simple multiplication operations. The multiplier may be configured to assert a control signal that is indicative of future idle cycles in the first stages of the multiplier pipeline. The control signal may be used by control logic to dispatch independent simple multiplication operations to the multiplier for execution during the idle clock cycles. The multiplier may also be configured to concurrently execute two independent iterative operations.
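For reference, the Newton-Raphson reciprocal iteration mentioned in the abstract reduces division to repeated multiplication; the sketch below shows the arithmetic only (the seed selection, iteration count, and helper names are illustrative assumptions, not the disclosed hardware).

```python
def reciprocal_newton_raphson(d, x0, iterations=4):
    """Each iteration x = x * (2 - d*x) uses two multiplications plus a
    subtraction, roughly doubling the number of correct bits per pass."""
    x = x0
    for _ in range(iterations):
        x = x * (2.0 - d * x)
    return x


def divide(a, b):
    """Division as a multiplication by an iteratively refined reciprocal,
    the way a pipelined multiplier might evaluate it over several passes."""
    assert b > 0, "sketch handles positive divisors only"
    # Rough initial seed; real hardware would use a small lookup table (ROM)
    # indexed by the leading bits of b.
    x0 = 2.0 ** -(int(b).bit_length()) if b >= 1 else 1.0
    return a * reciprocal_newton_raphson(b, x0)


print(divide(355.0, 113.0))  # ~3.14159..., four passes through the "multiplier"
```

Each pass corresponds to traversals of the multiplier pipeline; the idle stages between dependent passes are exactly where the disclosed multiplier schedules independent simple multiplications.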
Abstract:
In one embodiment, an SOC includes multiple components, including a CPU complex and one or more non-CPU components such as peripheral interface controllers, memory controllers, media components, etc. The SOC also includes an SOC debug control unit, which is coupled to receive detected debug events from the components. Each component may include a local debug control unit that is configured to monitor for various debug events within that component. The debug events may be specific to the component. The local debug control units may transmit detected events to the SOC debug control unit. The SOC debug control unit may detect one or more selected events from one or more components, and may halt the components of the SOC responsive to detecting the selected events.
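A minimal sketch of the event flow described above, assuming hypothetical class and event names; it models local debug control units reporting to an SOC-level unit that halts the registered components when a selected event is detected.

```python
class SocDebugControl:
    """Collects detected events and halts the components on selected events."""
    def __init__(self, halt_on):
        self.halt_on = set(halt_on)
        self.components = []
        self.halted = False

    def register(self, component):
        self.components.append(component)

    def report(self, source, event):
        print(f"debug event '{event}' from {source}")
        if event in self.halt_on and not self.halted:
            self.halted = True
            for c in self.components:
                print(f"halting {c.name}")


class LocalDebugControl:
    """Per-component monitor; the watched events are specific to the component."""
    def __init__(self, name, watched_events, soc_debug):
        self.name = name
        self.watched = set(watched_events)
        self.soc_debug = soc_debug

    def on_event(self, event):
        if event in self.watched:
            self.soc_debug.report(self.name, event)


soc_dbg = SocDebugControl(halt_on={"watchpoint_hit"})
cpu = LocalDebugControl("cpu_complex", {"watchpoint_hit", "breakpoint"}, soc_dbg)
memc = LocalDebugControl("memory_controller", {"protocol_error"}, soc_dbg)
for c in (cpu, memc):
    soc_dbg.register(c)
cpu.on_event("watchpoint_hit")  # a selected event: both registered components halt
```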
Abstract:
A system and method for efficiently storing traces of multiple components in an embedded system. A system-on-a-chip (SOC) includes a trace unit for collecting and storing trace history, bus event statistics, or both. The SOC may transfer cache coherent messages across multiple buses between a shared memory and a cache coherent controller. The trace unit includes a trace buffer with multiple physical partitions assigned to subsets of the multiple buses. The number of partitions is less than the number of multiple buses. One or more trace instructions may cause a trace history, trace bus event statistics, local time stamps and a global time-base value to be stored in a physical partition within the trace buffer.
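As a rough model of the partitioned trace buffer described above, the sketch below maps more buses than partitions onto a smaller set of physical partitions and stores a trace record together with a local time stamp and a global time-base value; the modulo bus-to-partition mapping and the record fields are illustrative assumptions.

```python
import time


class TraceBuffer:
    """Illustrative trace buffer: N buses share M < N physical partitions."""
    def __init__(self, num_buses, num_partitions, global_timebase):
        assert num_partitions < num_buses
        self.partitions = [[] for _ in range(num_partitions)]
        # Assign each bus to a partition (simple modulo mapping for illustration).
        self.bus_to_partition = {b: b % num_partitions for b in range(num_buses)}
        self.global_timebase = global_timebase

    def trace(self, bus, record, stats=None):
        part = self.bus_to_partition[bus]
        self.partitions[part].append({
            "bus": bus,
            "record": record,                    # trace history entry
            "stats": stats,                      # optional bus event statistics
            "local_ts": time.monotonic_ns(),     # local time stamp
            "global_tb": self.global_timebase(), # global time-base value
        })


tb = TraceBuffer(num_buses=8, num_partitions=2, global_timebase=time.time_ns)
tb.trace(bus=5, record="coherence msg: shared memory -> coherence controller")
print(len(tb.partitions[1]))  # bus 5 maps to partition 1 in this sketch
```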
Abstract:
Smart routing between peers in a point-to-point link based system begins when a device of a plurality of devices in a point-to-point link interconnected system receives a packet from an upstream link or a downstream link. Processing continues when the device interprets the packet to determine a destination of the packet. If the device is the destination of the packet, the device accepts the packet. If, however, the device is not the destination of the packet, the device forwards the packet on another upstream link or another downstream link without altering at least one of the source information and the destination information of the packet.
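The accept-or-forward decision described above can be sketched as follows; the direction heuristic (comparing device and destination identifiers) and the dictionary packet representation are illustrative assumptions.

```python
def handle_packet(device_id, packet, forward):
    """Accept the packet if this device is its destination; otherwise forward it
    unchanged (source and destination fields are not rewritten)."""
    if packet["dest"] == device_id:
        return packet  # accepted locally
    # Forward on another upstream or downstream link without altering the
    # packet's source or destination information.
    direction = "downstream" if packet["dest"] > device_id else "upstream"
    forward(direction, packet)
    return None


# Example: device 2 in a chain of devices 0..3 relays a packet bound for device 3.
log = []
accepted = handle_packet(
    2,
    {"src": 0, "dest": 3, "payload": b"hi"},
    forward=lambda direction, pkt: log.append((direction, pkt["dest"])),
)
print(accepted, log)  # None [('downstream', 3)]
```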
Abstract:
Techniques are disclosed relating to integrated circuits that implement a virtual memory. In one embodiment, an integrated circuit is disclosed that includes a translation lookaside buffer configured to store non-prefetched translations and a translation table configured to store prefetched translations. In such an embodiment, the translation lookaside buffer and the translation table share table walk circuitry. In some embodiments, the table walk circuitry is configured to store a translation in the translation table in response to a prefetch request and without updating the translation lookaside buffer. In some embodiments, the translation lookaside buffer, the translation table, and table walk circuitry are included within a memory management unit configured to service memory requests received from a plurality of client circuits via a plurality of direct memory access (DMA) channels.
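A minimal Python model of the structure described above, assuming hypothetical class names; prefetches fill a separate translation table through the shared table walker without touching the TLB, and the choice to have demand lookups also consult the prefetch table is an assumption made here for illustration.

```python
class TableWalker:
    """Shared table-walk logic; the page-table lookup is modeled as a dict."""
    def __init__(self, page_table):
        self.page_table = page_table

    def walk(self, virt_page):
        return self.page_table.get(virt_page)


class MMU:
    def __init__(self, walker):
        self.walker = walker          # shared by both structures
        self.tlb = {}                 # non-prefetched (demand) translations
        self.prefetch_table = {}      # prefetched translations, kept separate

    def translate(self, virt_page):
        # Demand translation: check both structures, then walk and fill the TLB.
        if virt_page in self.tlb:
            return self.tlb[virt_page]
        if virt_page in self.prefetch_table:
            return self.prefetch_table[virt_page]
        phys = self.walker.walk(virt_page)
        self.tlb[virt_page] = phys
        return phys

    def prefetch(self, virt_page):
        # Prefetch fills the translation table only, without updating the TLB.
        if virt_page not in self.prefetch_table:
            self.prefetch_table[virt_page] = self.walker.walk(virt_page)


mmu = MMU(TableWalker({0x10: 0xA0, 0x11: 0xA1}))
mmu.prefetch(0x11)  # stored in the translation table, not the TLB
print(hex(mmu.translate(0x10)), hex(mmu.translate(0x11)))  # 0xa0 (demand), 0xa1 (prefetched)
```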