SYSTEMS AND METHODS TO SKIP INCONSEQUENTIAL MATRIX OPERATIONS

    Publication No.: US20230070579A1

    Publication Date: 2023-03-09

    Application No.: US17878427

    Application Date: 2022-08-01

    Abstract: Disclosed embodiments relate to systems and methods to skip inconsequential matrix operations. In one example, a processor includes decode circuitry to decode an instruction having fields to specify an opcode and locations of first source, second source, and destination matrices, the opcode indicating that the processor is to multiply each element at row M and column K of the first source matrix with a corresponding element at row K and column N of the second source matrix, and accumulate a resulting product with previous contents of a corresponding element at row M and column N of the destination matrix, the processor to skip multiplications that, based on detected values of corresponding multiplicands, would generate inconsequential results; scheduling circuitry to schedule execution of the instruction; and execution circuitry to execute the instruction as per the opcode.
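
    Illustrative sketch: the multiply-accumulate behavior described in the abstract can be modeled in software roughly as follows. This is not the claimed circuitry. It assumes that an "inconsequential" multiplication is one where either multiplicand is zero (the abstract only says the decision is based on detected multiplicand values), and the function name and the skipped-operation counter are hypothetical.

```python
def matmul_accumulate_skip(a, b, c):
    """Accumulate A x B into C in place, skipping products with a zero multiplicand.

    ASSUMPTION: "inconsequential" means a zero multiplicand, so the product
    would leave C[m][n] unchanged. Returns the number of skipped multiplications.
    """
    rows, inner, cols = len(a), len(b), len(b[0])
    skipped = 0
    for m in range(rows):
        for k in range(inner):
            a_mk = a[m][k]
            if a_mk == 0:
                # Every product in this row of B would be zero: skip them all.
                skipped += cols
                continue
            for n in range(cols):
                b_kn = b[k][n]
                if b_kn == 0:
                    skipped += 1
                    continue
                # C[m][n] keeps its previous contents and gains the new product.
                c[m][n] += a_mk * b_kn
    return skipped


a = [[1, 0], [2, 3]]
b = [[4, 5], [0, 6]]
c = [[1, 1], [1, 1]]
skipped = matmul_accumulate_skip(a, b, c)
```

    Skipping a zero product leaves the destination element identical to what a full multiply-accumulate would produce, so only the amount of work changes.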

    HIGH-PERFORMANCE INPUT-OUTPUT DEVICES SUPPORTING SCALABLE VIRTUALIZATION

    Publication No.: US20200012530A1

    Publication Date: 2020-01-09

    Application No.: US16351396

    Application Date: 2019-03-12

    Abstract: Techniques for scalable virtualization of an Input/Output (I/O) device are described. An electronic device composes a virtual device comprising one or more assignable interface (AI) instances of a plurality of AI instances of a hosting function exposed by the I/O device. The electronic device emulates device resources of the I/O device via the virtual device. The electronic device intercepts a request from a guest pertaining to the virtual device, and determines whether the request from the guest is a fast-path operation to be passed directly to one of the one or more AI instances of the I/O device or a slow-path operation that is to be at least partially serviced via software executed by the electronic device. For a slow-path operation, the electronic device services the request at least partially via the software executed by the electronic device.
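
    Illustrative sketch: the fast-path versus slow-path decision described in the abstract might look like the following in a software model. All class names, the operation names, and the rule used to classify an operation as fast-path are hypothetical; the abstract does not specify them.

```python
# ASSUMPTION: which operations count as fast-path is invented for illustration.
FAST_PATH_OPS = {"submit_work"}


class AssignableInterface:
    """Stand-in for one assignable interface (AI) instance of the I/O device."""

    def __init__(self):
        self.submitted = []

    def submit(self, payload):
        self.submitted.append(payload)


class VirtualDevice:
    """Virtual device composed of one or more AI instances."""

    def __init__(self, interfaces):
        self.interfaces = interfaces
        self.emulated_registers = {}  # device resources emulated in software

    def handle_guest_request(self, op, ai_index=0, payload=None):
        if op in FAST_PATH_OPS:
            # Fast path: pass the request directly to the AI instance.
            self.interfaces[ai_index].submit(payload)
            return "fast-path"
        # Slow path: service the request (at least partly) in software.
        self.emulated_registers[op] = payload
        return "slow-path"


vdev = VirtualDevice([AssignableInterface(), AssignableInterface()])
fast_result = vdev.handle_guest_request("submit_work", ai_index=1, payload="desc")
slow_result = vdev.handle_guest_request("set_config", payload=7)
```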

    HIGH-PERFORMANCE INPUT-OUTPUT DEVICES SUPPORTING SCALABLE VIRTUALIZATION

    Publication No.: US20230251912A1

    Publication Date: 2023-08-10

    Application No.: US18301733

    Application Date: 2023-04-17

    Abstract: Techniques for scalable virtualization of an Input/Output (I/O) device are described. An electronic device composes a virtual device comprising one or more assignable interface (AI) instances of a plurality of AI instances of a hosting function exposed by the I/O device. The electronic device emulates device resources of the I/O device via the virtual device. The electronic device intercepts a request from a guest pertaining to the virtual device, and determines whether the request from the guest is a fast-path operation to be passed directly to one of the one or more AI instances of the I/O device or a slow-path operation that is to be at least partially serviced via software executed by the electronic device. For a slow-path operation, the electronic device services the request at least partially via the software executed by the electronic device.

    METHOD AND APPARATUS FOR DYNAMICALLY ADJUSTING PIPELINE DEPTH TO IMPROVE EXECUTION LATENCY

    Publication No.: US20230040226A1

    Publication Date: 2023-02-09

    Application No.: US17559612

    Application Date: 2021-12-22

    Abstract: Apparatus and method for managing pipeline depth of a data processing device. For example, one embodiment of an apparatus comprises: an interface to receive a plurality of work requests from a plurality of clients; and a plurality of engines to perform the plurality of work requests; wherein the work requests are to be dispatched to the plurality of engines from a plurality of work queues, the work queues to store a work descriptor per work request, each work descriptor to include information needed to perform a corresponding work request, wherein the plurality of work queues include a first work queue to store work descriptors associated with first latency characteristics and a second work queue to store work descriptors associated with second latency characteristics; engine configuration circuitry to configure a first engine to have a first pipeline depth based on the first latency characteristics and to configure a second engine to have a second pipeline depth based on the second latency characteristics.
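
    Illustrative sketch: one way to picture the per-engine configuration described in the abstract is a table that maps a work queue's latency characteristics to a pipeline depth. The latency class names, the depth values, and the reading of "pipeline depth" as the number of descriptors an engine takes in flight at once are all assumptions made for this example.

```python
from collections import deque

# ASSUMPTION: class names and depth values are invented for illustration.
DEPTH_BY_LATENCY = {"low_latency": 1, "throughput": 8}


class WorkQueue:
    """Holds one work descriptor per work request."""

    def __init__(self, latency_class):
        self.latency_class = latency_class
        self.descriptors = deque()


class Engine:
    def __init__(self, name):
        self.name = name
        self.pipeline_depth = None


def configure_engine(engine, queue):
    """Set the engine's pipeline depth from the queue's latency characteristics."""
    engine.pipeline_depth = DEPTH_BY_LATENCY[queue.latency_class]
    return engine


def dispatch(engine, queue):
    """Pull at most pipeline_depth descriptors from the queue for the engine."""
    batch = []
    while queue.descriptors and len(batch) < engine.pipeline_depth:
        batch.append(queue.descriptors.popleft())
    return batch


low_q, bulk_q = WorkQueue("low_latency"), WorkQueue("throughput")
low_q.descriptors.extend(["a", "b"])
bulk_q.descriptors.extend(range(10))
fast_engine = configure_engine(Engine("engine0"), low_q)
bulk_engine = configure_engine(Engine("engine1"), bulk_q)
fast_batch = dispatch(fast_engine, low_q)
bulk_batch = dispatch(bulk_engine, bulk_q)
```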

    POSTED INTERRUPT PROCESSING IN VIRTUAL MACHINE MONITOR

    Publication No.: US20190121658A1

    Publication Date: 2019-04-25

    Application No.: US16226367

    Application Date: 2018-12-19

    Abstract: A processor includes a processor core, a processor cache to store reporting data structures including a queue structure, and an interrupt posting circuit coupled to the processor core and the processor cache. The interrupt posting circuit receives an interrupt request directed to a virtual processor (VP) of a virtual machine (VM) executed by the processor core. The VM is managed by a virtual machine monitor (VMM) executed by the processor core. The interrupt posting circuit determines the VP is in an inactive state and records the interrupt request in a first posted data structure allocated by the VMM for the VP in main memory coupled to the processor. The interrupt posting circuit updates location information stored in the reporting data structures based on recording the interrupt request in the first posted data structure to generate updated location information that identifies a location of the interrupt request.
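
    Illustrative sketch: the posting flow described in the abstract can be modeled in software as below. This is a simplified model, not the claimed hardware. The class name, the use of a set per VP as the posted data structure, and the (VP, vector) tuples used as location information are all hypothetical.

```python
class InterruptPostingModel:
    """Toy model of posting interrupts for inactive virtual processors (VPs)."""

    def __init__(self):
        self.active_vps = set()   # VPs currently running on a core
        self.posted = {}          # per-VP posted structure: vp_id -> pending vectors
        self.report_queue = []    # reporting queue: where posted requests live

    def post(self, vp_id, vector):
        if vp_id in self.active_vps:
            # Active VP: in this model the request is delivered right away.
            return "delivered"
        # Inactive VP: record the request in the VP's posted structure...
        self.posted.setdefault(vp_id, set()).add(vector)
        # ...and update the location information so the VMM can find it later.
        self.report_queue.append((vp_id, vector))
        return "posted"


model = InterruptPostingModel()
model.active_vps.add("vp0")
active_result = model.post("vp0", 32)
inactive_result = model.post("vp1", 33)
```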
