Abstract:
Coherency driven enhancements to a PCIe transaction layer are disclosed. In an exemplary aspect, a coherency agent is added to a PCIe system to support a relaxed consistency model for use of memory therein. In particular, endpoints can request ownership of portions of the memory to read from and write to the memory. The coherency agent assigns an address range including the requested portions. The requesting endpoint copies the contents of the memory corresponding to the assigned address range into local endpoint memory to perform read and write operations locally. The owning endpoint may provide an updated snapshot of the copied memory contents upon request. At completion of use of the copied memory contents, or upon request from the coherency agent, ownership of the address range reverts back to the root complex, and the endpoint sends the updated contents back to the address range in the system memory element.
Abstract:
Extended message signaled interrupts (MSI) data are disclosed. In one aspect, MSI bits are modified to include a system level identifier. In an exemplary aspect, an upper sixteen bits of the MSI message data are modified to be the system level identifier. By providing the system level identifier within the MSI message data, an interrupt controller can verify the interrupt source.
Abstract:
Link speed control systems for power optimization are disclosed.In one aspect, a communication link adjusts a data transfer speed based on link utilization levels.In a second exemplary aspect, one or more conditions affecting a link speed are weighted and collectively evaluated to determine an efficient or optimal link speed. By adjusting the link speed in this fashion, lower link speeds may be used, and net power savings may be effectuated.
Abstract:
Certain aspects of the present disclosure provide techniques for adaptively executing machine learning models on a computing device. An example method generally includes receiving weight information for a machine learning model to be executed on a computing device. The received weight information is reduced into quantized weight information having a reduced bit size relative to the received weight information. First inferences using the machine learning model and the received weight information, and second inferences are performed using the machine learning model and the quantized weight information. Results of the first and second inferences are compared, it is determined that results of the second inferences are within a threshold performance level of results of the first inferences, and based on the determination, one or more subsequent inferences are performed using the machine learning model and the quantized weight information.
Abstract:
A replacement physical layer (PHY) for low-speed Peripheral Component Interconnect (PCI) Express (PCIe) systems is disclosed. In one aspect, an analog PHY of a conventional PCIe system is replaced with a digital PHY. The digital PHY is coupled to a media access control (MAC) logic by a PHY interface for PCIe (PIPE) directly. In further exemplary aspects, the digital PHY may be a complementary metal oxide semiconductor (CMOS) PHY that includes a serializer and a deserializer. Replacing the analog PHY with the digital PHY allows entry and exit from low-power modes to occur much quicker, resulting in substantial power savings and reduced latency. Because the digital PHY is operable with low-speed communication, the digital PHY can maintain sufficient bandwidth that communication is not unnecessarily impacted by digital logic of the digital PHY.
Abstract:
A PCIe system includes a host system (304) and at least one PCIe endpoint (302). The PCIe endpoint is configured to determine one or more transaction-specific attributes that can improve efficiency and performance of a predefined host transaction. The PCIe endpoint encodes the transaction-specific attributes in a transaction layer packet (TLP) prefix of at least one PCIe TLP and provides the PCIe TLP to the host system. A PCIe root complex (316) in the host system is configured to detect and extract the transaction-specific attributes from the TLP prefix of the PCIe TLP received from the PCIe endpoint. By communicating the transaction-specific attributes in the TLP prefix of the PCIe TLP, it is possible to improve efficiency and performance of the PCIe system without violating the existing PCIe standard.