Software-based instruction scoreboard for arithmetic logic units

    Publication number: US11847462B2

    Publication date: 2023-12-19

    Application number: US17122089

    Filing date: 2020-12-15

    Inventor: Brian Emberling

    CPC classification number: G06F9/3838 G06F7/57

    Abstract: A software-based instruction scoreboard indicates dependencies between instructions issued in close succession to an arithmetic logic unit (ALU) pipeline. The scoreboard inserts one or more control words into the command stream between the dependent instructions, and the ALU pipeline then executes that command stream. The control words identify the instruction(s) upon which the dependent instructions depend (parent instructions) so that the GPU hardware can ensure that the ALU pipeline does not stall while the dependent instruction waits for results from the parent instruction.
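The control-word insertion described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the `Instr` and `ControlWord` types, the distance-based encoding of parents, and the `window` size are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Instr:
    """Hypothetical ALU instruction: one destination register, some sources."""
    op: str
    dest: str
    srcs: Tuple[str, ...]


@dataclass(frozen=True)
class ControlWord:
    """Hypothetical control word: parents named by distance back in the stream."""
    parent_distances: Tuple[int, ...]


def insert_control_words(
    stream: List[Instr], window: int = 4
) -> List[Union[Instr, ControlWord]]:
    """Emit a control word before each instruction that reads a register
    written by one of the previous `window` instructions (assumed policy)."""
    out: List[Union[Instr, ControlWord]] = []
    for i, instr in enumerate(stream):
        distances = set()
        for src in instr.srcs:
            # Only the nearest earlier writer of `src` is its parent.
            for back in range(1, min(window, i) + 1):
                if stream[i - back].dest == src:
                    distances.add(back)
                    break
        if distances:
            out.append(ControlWord(tuple(sorted(distances))))
        out.append(instr)
    return out


program = [
    Instr("mul", "r1", ("r0", "r0")),
    Instr("add", "r2", ("r1", "r0")),  # depends on the mul one slot back
    Instr("sub", "r3", ("r4", "r5")),  # no nearby parent
]
annotated = insert_control_words(program)
```

Here the only control word lands immediately before the `add`, naming the `mul` one instruction back as its parent.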

    Hypervisor secure event handling at a processor

    Publication number: US11842227B2

    Publication date: 2023-12-12

    Application number: US16712190

    Filing date: 2019-12-12

    Abstract: A virtualized computing environment is protected from a malicious hypervisor by restricting the hypervisor's access to one or more portions of an event (interrupt or exception) handling pathway of a guest virtual machine, wherein the guest virtual machine includes both a secure layer to manage security for the guest and one or more non-secure layers to handle event processing. The hypervisor is restricted from providing normal exception information to the guest virtual machine (referred to simply as a “guest” herein), and instead is only permitted to provide an event signal to the secure layer of the guest. In response to the event signal, the secure layer of the guest accesses a specified region of memory for the event information, reviews the information, and provides the information to another, non-secure, layer of the guest for processing only if the event information complies with specified security protocols.
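As a rough illustration of the gating step in this abstract, the sketch below models a secure layer that receives only a bare signal, reads the event information from a designated region, and forwards it only if it passes a check. The `EventInfo` fields, the `SecureLayer` class, and the allow-list policy are hypothetical; the abstract does not say what the "specified security protocols" are.

```python
from dataclasses import dataclass
from typing import Callable, Optional, Set


@dataclass(frozen=True)
class EventInfo:
    """Hypothetical event record (interrupt or exception)."""
    vector: int
    error_code: int


class SecureLayer:
    """Hypothetical secure layer of a guest."""

    def __init__(self, allowed_vectors: Set[int],
                 handler: Callable[[EventInfo], None]) -> None:
        self.allowed_vectors = allowed_vectors  # assumed policy: an allow-list
        self.handler = handler                  # non-secure layer's handler
        # Stands in for the specified region of memory holding event info.
        self.event_region: Optional[EventInfo] = None

    def on_event_signal(self) -> bool:
        """Called when the hypervisor raises the (content-free) event signal.
        Returns True only if the event was passed to the non-secure layer."""
        info = self.event_region
        self.event_region = None
        if info is None or info.vector not in self.allowed_vectors:
            return False  # drop events that fail the assumed check
        self.handler(info)
        return True


delivered = []
layer = SecureLayer(allowed_vectors={32, 33}, handler=delivered.append)
layer.event_region = EventInfo(vector=32, error_code=0)
accepted = layer.on_event_signal()
layer.event_region = EventInfo(vector=14, error_code=0)
rejected = layer.on_event_signal()
```

The point of the design is that the signal itself carries no data, so the non-secure layer only ever sees event information that the secure layer has reviewed.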

    Concurrent training of functional subnetworks of a neural network

    Publication number: US11836610B2

    Publication date: 2023-12-05

    Application number: US15841030

    Filing date: 2017-12-13

    CPC classification number: G06N3/08 G06N3/045

    Abstract: An artificial neural network that includes first subnetworks to implement known functions and second subnetworks to implement unknown functions is trained. The first subnetworks are trained separately and in parallel on corresponding known training datasets to determine first parameter values that define the first subnetworks. The first subnetworks execute on a plurality of processing elements in a processing system. Input values from a network training data set are provided to the artificial neural network including the trained first subnetworks. Error values are generated by comparing output values produced by the artificial neural network to labeled output values of the network training data set. The second subnetworks are trained by back propagating the error values to modify second parameter values that define the second subnetworks without modifying the first parameter values. The first and second parameter values are stored in a storage component.
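The two-phase training in this abstract can be illustrated with a deliberately tiny toy. Everything here is an assumption made for the example: each subnetwork is a single scalar weight, the "known functions" are `2x` and `3x`, the network output is `u * (a(x) + b(x))`, and a thread pool stands in for the plurality of processing elements.

```python
from concurrent.futures import ThreadPoolExecutor


def fit_scale(samples, lr=0.01, epochs=200):
    """Fit y = w * x by stochastic gradient descent on squared error."""
    w = 0.0
    for _ in range(epochs):
        for x, target in samples:
            w -= lr * 2.0 * (w * x - target) * x
    return w


def train_second(first_params, samples, lr=0.001, epochs=200):
    """Train the second (unknown) weight u; the first parameters stay frozen."""
    u = 0.0
    for _ in range(epochs):
        for x, label in samples:
            hidden = sum(w * x for w in first_params)  # frozen first subnetworks
            error = u * hidden - label                 # output vs. labeled value
            u -= lr * 2.0 * error * hidden             # gradient reaches only u
    return u


xs = [1.0, 2.0, 3.0]
known_datasets = [
    [(x, 2.0 * x) for x in xs],  # known function a(x) = 2x
    [(x, 3.0 * x) for x in xs],  # known function b(x) = 3x
]

# Phase 1: train the first subnetworks separately and concurrently.
with ThreadPoolExecutor() as pool:
    first_params = list(pool.map(fit_scale, known_datasets))

# Phase 2: train the second subnetwork on whole-network data (target 10x).
network_data = [(x, 10.0 * x) for x in xs]
second_param = train_second(first_params, network_data)

# Keep both parameter sets together, as the abstract's final step describes.
stored = {"first": list(first_params), "second": second_param}
```

With these toy targets, the first weights converge toward 2 and 3 and the second toward 2, since `2 * (2x + 3x) = 10x`.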

    System and method for load fusion
    Granted invention patent

    Publication number: US11835988B2

    Publication date: 2023-12-05

    Application number: US15828708

    Filing date: 2017-12-01

    Inventor: John M. King

    Abstract: A system and method for load fusion fuses small load operations into fewer, larger load operations. The system detects that a pair of adjacent micro-operations are consecutive load micro-operations, where adjacent micro-operations are micro-operations flowing through adjacent dispatch slots and consecutive load micro-operations are adjacent micro-operations that are both load micro-operations. The consecutive load operations are then checked to determine whether their data sizes are the same and whether their addresses are consecutive. The two load operations are then fused together to form one load micro-operation with twice the data size and one load data micro-operation with no load component.
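The fusion check in this abstract can be sketched as follows. The `MicroOp` record and its field names are hypothetical, and the sketch assumes "consecutive" means the second load starts exactly where the first one ends (ascending order only).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class MicroOp:
    """Hypothetical micro-operation in a dispatch slot."""
    kind: str                   # "load", "load_data", or another kind
    addr: Optional[int] = None  # byte address for loads
    size: int = 0               # data size in bytes


def try_fuse(first: MicroOp, second: MicroOp) -> Optional[Tuple[MicroOp, MicroOp]]:
    """Return (fused load, load-data op) if the adjacent pair can be fused,
    otherwise None."""
    if first.kind != "load" or second.kind != "load":
        return None  # not consecutive load micro-operations
    if first.size != second.size:
        return None  # data sizes must match
    if second.addr != first.addr + first.size:
        return None  # addresses must be consecutive (assumed ascending)
    fused_load = MicroOp("load", first.addr, first.size * 2)
    # The second micro-op keeps only the data-delivery role, with no load component.
    load_data = MicroOp("load_data", None, second.size)
    return fused_load, load_data


fused = try_fuse(MicroOp("load", 0x100, 4), MicroOp("load", 0x104, 4))
not_adjacent = try_fuse(MicroOp("load", 0x100, 4), MicroOp("load", 0x108, 4))
```

In this example two 4-byte loads at `0x100` and `0x104` become one 8-byte load at `0x100` plus a load-data micro-operation, while the pair with a gap is left unfused.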

    Data routing for efficient decompression of compressed data stored in a cache

    Publication number: US11829190B2

    Publication date: 2023-11-28

    Application number: US17557815

    Filing date: 2021-12-21

    CPC classification number: G06F12/0802 G06T9/00 G06F2212/401

    Abstract: Data routing for efficient decompressor use is described. In accordance with the described techniques, a cache controller receives requests from multiple requestors for elements of data stored in a compressed format in a cache. The requests include at least a first request from a first requestor and a second request from a second requestor. A decompression routing system identifies a redundant element of data requested by both the first requestor and the second requestor and causes decompressors to decompress the requested elements of data. The decompression includes performing a single decompression of the redundant element. After the decompression, the decompression routing system routes the decompressed elements to the plurality of requestors, which includes routing the decompressed redundant element to both the first requestor and the second requestor.
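A software analogy of the routing described in this abstract might look like the sketch below. It is only a model under stated assumptions: the cache is a dictionary of `zlib`-compressed blobs, requests are `(requestor, key)` pairs, and `route_decompressed` is a hypothetical name; the patent concerns hardware decompressors, not `zlib`.

```python
import zlib
from collections import defaultdict
from typing import Dict, Iterable, List, Tuple


def route_decompressed(
    cache: Dict[str, bytes],
    requests: Iterable[Tuple[str, str]],
) -> Tuple[Dict[str, Dict[str, bytes]], int]:
    """Decompress each requested element once and route it to every
    requestor that asked for it. Returns (deliveries, decompression count)."""
    # Group requestors by element so redundant requests are identified.
    requestors_by_key: Dict[str, List[str]] = defaultdict(list)
    for requestor, key in requests:
        requestors_by_key[key].append(requestor)

    delivered: Dict[str, Dict[str, bytes]] = defaultdict(dict)
    decompress_count = 0
    for key, requestors in requestors_by_key.items():
        data = zlib.decompress(cache[key])  # single decompression per element
        decompress_count += 1
        for requestor in requestors:        # fan the result out
            delivered[requestor][key] = data
    return dict(delivered), decompress_count


cache = {
    "a": zlib.compress(b"alpha"),
    "b": zlib.compress(b"beta"),
    "c": zlib.compress(b"gamma"),
}
requests = [("r1", "a"), ("r1", "b"), ("r2", "b"), ("r2", "c")]
deliveries, count = route_decompressed(cache, requests)
```

Four requests arrive, but element `"b"` is shared, so only three decompressions are performed and both requestors receive the same decompressed `"b"`.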

    Performing store-to-load forwarding of a return address for a return instruction

    Publication number: US11822923B1

    Publication date: 2023-11-21

    Application number: US16451783

    Filing date: 2019-06-25

    Inventor: David Kaplan

    CPC classification number: G06F9/3834 G06F9/3842 G06F9/3861

    Abstract: A load/store unit includes a first queue including a first entry for a store operation and a second queue including a second entry for a load operation that includes a return instruction that redirects a program flow to a location indicated by the return instruction. The load/store unit also includes a processor to determine that the store operation matches the load operation and selectively perform store-to-load forwarding (STLF) of a return address for the return instruction from the first entry to the second entry based on whether the store operation is associated with a call instruction. The load/store unit forwards the return address to the second entry in response to the store operation being associated with the call instruction. The load/store unit blocks forwarding until the store operation retires in response to the store operation not being associated with the call instruction.
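The selective forwarding decision in this abstract can be modeled with a short sketch. The `StoreEntry` and `ReturnLoadEntry` records and the address-equality match are assumptions for illustration, and treating a retired store as no longer blocked is one reading of "blocks forwarding until the store operation retires".

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class StoreEntry:
    """Hypothetical store-queue entry."""
    addr: int
    data: int
    from_call: bool        # store was produced by a call instruction
    retired: bool = False


@dataclass
class ReturnLoadEntry:
    """Hypothetical load-queue entry for a return instruction's load."""
    addr: int
    return_address: Optional[int] = None


def try_forward(store: StoreEntry, load: ReturnLoadEntry) -> bool:
    """Forward the return address from the store to the load if allowed.
    Returns True when forwarding happened, False when there is no match
    or forwarding is currently blocked."""
    if store.addr != load.addr:
        return False  # store does not match the load (assumed: same address)
    if store.from_call or store.retired:
        load.return_address = store.data
        return True
    return False      # non-call store: blocked until it retires


call_store = StoreEntry(addr=0x7FF0, data=0x401000, from_call=True)
ret_load = ReturnLoadEntry(addr=0x7FF0)
forwarded = try_forward(call_store, ret_load)

other_store = StoreEntry(addr=0x7FF0, data=0xBAD, from_call=False)
blocked_load = ReturnLoadEntry(addr=0x7FF0)
blocked = try_forward(other_store, blocked_load)
```

A store paired with a call forwards immediately, while a matching store from any other instruction leaves the return load waiting until that store has retired.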
