REGISTER SAVING FOR FUNCTION CALLING

    Publication No.: US20210096864A1

    Publication Date: 2021-04-01

    Application No.: US16584775

    Filing Date: 2019-09-26

    Abstract: Described herein are techniques for saving registers in the event of a function call. The techniques include modifying a program including a block of code designated as a calling code that calls a function. The modifying includes modifying the calling code to set a register usage mask indicating which registers are in use at the time of the function call. The modifying also includes modifying the function to combine the information of the register usage mask with information indicating registers used by the function to generate registers to be saved and save the registers to be saved.
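
The mask combination described in this abstract can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes the masks are plain bit fields (bit i set means register i) and that "combine" means a bitwise AND, so only registers that are both live in the caller and used by the function get saved. The function name and the example masks are hypothetical.

```python
def registers_to_save(caller_live_mask: int, callee_used_mask: int) -> list:
    """Return indices of registers that are live in the caller AND used by the callee.

    Assumption: both masks are bit fields where bit i stands for register i.
    """
    to_save = caller_live_mask & callee_used_mask
    return [i for i in range(to_save.bit_length()) if (to_save >> i) & 1]


# Hypothetical example: the caller has r0, r2, r3, r5 live; the callee uses r2, r3, r4.
caller_live_mask = 0b101101
callee_used_mask = 0b011100
saved = registers_to_save(caller_live_mask, callee_used_mask)
print(saved)
```

Under these assumptions only r2 and r3 need saving: r4 is not live in the caller, and r0 and r5 are untouched by the callee.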

    ELECTRO-OPTIC WAVEFORM ANALYSIS PROCESS

    Publication No.: US20210096174A1

    Publication Date: 2021-04-01

    Application No.: US16585963

    Filing Date: 2019-09-27

    Abstract: A reconfigurable optic probe is used to measure signals from a device under test. The reconfigurable optic probe is positioned at a target probe location within a cell of the device under test. The cell includes a target net to be measured and non-target nets. A test pattern is applied to the cell and a laser probe (LP) waveform is obtained in response. A target net waveform is extracted from the LP waveform by: (i) configuring the reconfigurable optic probe to produce a ring-shaped beam having a relatively low-intensity region central to the ring-shaped beam; (ii) re-applying the test pattern to the cell at the target probe location with the relatively low-intensity region applied to the target net and obtaining a cross-talk LP waveform in response; (iii) normalizing the cross-talk LP waveform; and (iv) determining a target net waveform by subtracting the normalized cross-talk LP waveform from the LP waveform.
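
Steps (iii) and (iv) amount to a scale-and-subtract operation on sampled waveforms. The sketch below is illustrative only: the abstract does not say how the cross-talk waveform is normalized, so a single multiplicative factor (`scale`, a hypothetical calibration value) is assumed, and the sample values are made up.

```python
def extract_target_waveform(lp_waveform, crosstalk_waveform, scale):
    """Subtract the normalized cross-talk waveform from the LP waveform.

    Assumption: normalization is a single multiplicative factor `scale`;
    the abstract does not specify the actual normalization method.
    """
    if len(lp_waveform) != len(crosstalk_waveform):
        raise ValueError("waveforms must have the same number of samples")
    normalized = [scale * c for c in crosstalk_waveform]
    return [lp - n for lp, n in zip(lp_waveform, normalized)]


# Made-up samples: the full-beam LP waveform contains target plus cross-talk,
# while the ring-shaped beam (dark centre on the target net) sees only
# attenuated cross-talk.
lp_waveform = [0.5, 1.0, 1.5, 0.0]
crosstalk_waveform = [0.25, 0.0, 0.25, 0.0]
target = extract_target_waveform(lp_waveform, crosstalk_waveform, scale=2.0)
print(target)
```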

    Hardware accelerated dynamic work creation on a graphics processing unit

    Publication No.: US10963299B2

    Publication Date: 2021-03-30

    Application No.: US16134695

    Filing Date: 2018-09-18

    Abstract: A processor core is configured to execute a parent task that is described by a data structure stored in a memory. A coprocessor is configured to dispatch a child task to the at least one processor core in response to the coprocessor receiving a request from the parent task concurrently with the parent task executing on the at least one processor core. In some cases, the parent task registers the child task in a task pool and the child task is a future task that is configured to monitor a completion object and enqueue another task associated with the future task in response to detecting the completion object. The future task is configured to self-enqueue by adding a continuation future task to a continuation queue for subsequent execution in response to the future task failing to detect the completion object.
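
The future-task behaviour described in this abstract can be modelled in software as a check-or-requeue step. The sketch below only illustrates that control flow; in the patent the dispatch is done by a coprocessor, and the class, the queue names, and the dictionary-based completion object are all hypothetical.

```python
from collections import deque


class FutureTask:
    """A task that waits on a completion object before releasing a follow-up task."""

    def __init__(self, completion, follow_up):
        self.completion = completion  # hypothetical completion object: {"done": bool}
        self.follow_up = follow_up    # task to enqueue once the completion object is detected


def run_future_task(task, ready_queue, continuation_queue):
    """Run one monitoring pass of a future task."""
    if task.completion["done"]:
        # Completion object detected: enqueue the associated task.
        ready_queue.append(task.follow_up)
    else:
        # Not detected yet: self-enqueue a continuation for later execution.
        continuation_queue.append(task)


ready_queue, continuation_queue = deque(), deque()
completion = {"done": False}
future = FutureTask(completion, follow_up="child-task")

run_future_task(future, ready_queue, continuation_queue)   # not done: re-queued
completion["done"] = True
run_future_task(continuation_queue.popleft(), ready_queue, continuation_queue)
print(list(ready_queue))
```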

    EXCEPTION HANDLER FOR SAMPLING DRAW DISPATCH IDENTIFIERS

    Publication No.: US20210090205A1

    Publication Date: 2021-03-25

    Application No.: US16580654

    Filing Date: 2019-09-24

    Abstract: A shader/wavefront that raises an exception is tied back to the draw command from which it originated by identifying the address of the responsible draw or dispatch packet. In various embodiments, a method of operating a graphics pipeline and exception handling includes receiving, at a command processor of a graphics processing unit (GPU), an exception signal indicating an occurrence of a pipeline exception at a shader stage of a graphics pipeline. The shader stage generates the exception signal in response to the pipeline exception and transmits the exception signal to the command processor. The command processor determines, based on the exception signal, an address of a command packet responsible for the occurrence of the pipeline exception.

    MATRIX MULTIPLICATION UNIT WITH FLEXIBLE PRECISION OPERATIONS

    Publication No.: US20210089304A1

    Publication Date: 2021-03-25

    Application No.: US16581252

    Filing Date: 2019-09-24

    Abstract: A processing unit such as a graphics processing unit (GPU) includes a plurality of vector signal processors (VSPs) that include multiply/accumulate elements. The processing unit also includes a plurality of registers associated with the plurality of VSPs. First portions of first and second matrices are fetched into the plurality of registers prior to a first round that includes a plurality of iterations. The multiply/accumulate elements perform matrix multiplication and accumulation on different combinations of subsets of the first portions of the first and second matrices in the plurality of iterations prior to fetching second portions of the first and second matrices into the plurality of registers for a second round. The accumulated results of multiplying the first portions of the first and second matrices are written into an output buffer in response to completing the plurality of iterations.
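
The round/iteration structure in this abstract resembles a blocked matrix multiply: fetch a portion of each matrix once, reuse it across several iterations, then write the accumulated block out. The following is a software sketch under that reading, not the hardware design; the function name, the `portion` and `subset` sizes, and the use of Python lists as "registers" are assumptions.

```python
def matmul_in_rounds(a, b, portion=2, subset=1):
    """Multiply a (n x k) by b (k x m) in rounds of fetched portions."""
    n, k, m = len(a), len(b), len(b[0])
    out = [[0] * m for _ in range(n)]  # stands in for the output buffer
    for r0 in range(0, n, portion):
        for c0 in range(0, m, portion):
            # Start of a round: fetch a portion of rows of A and columns of B
            # into the (simulated) registers.
            a_regs = a[r0:r0 + portion]
            b_regs = [[b[x][c] for x in range(k)]
                      for c in range(c0, min(c0 + portion, m))]
            acc = [[0] * len(b_regs) for _ in a_regs]
            # Iterations: each pairs one subset of the fetched rows with one
            # subset of the fetched columns, with no further fetches.
            for i0 in range(0, len(a_regs), subset):
                for j0 in range(0, len(b_regs), subset):
                    for i in range(i0, min(i0 + subset, len(a_regs))):
                        for j in range(j0, min(j0 + subset, len(b_regs))):
                            acc[i][j] += sum(x * y for x, y in zip(a_regs[i], b_regs[j]))
            # All iterations of the round are done: write the accumulated results.
            for i, row in enumerate(acc):
                for j, value in enumerate(row):
                    out[r0 + i][c0 + j] = value
    return out


a = [[1, 2], [3, 4], [5, 6]]
b = [[1, 0, 2], [0, 1, 3]]
print(matmul_in_rounds(a, b))
```

The point of the structure is data reuse: each fetched portion takes part in several multiply/accumulate iterations before the next fetch.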

    Device and method for accelerating matrix multiply operations

    Publication No.: US10956536B2

    Publication Date: 2021-03-23

    Application No.: US16176662

    Filing Date: 2018-10-31

    Abstract: A processing device is provided which comprises memory configured to store data and a plurality of processor cores in communication with each other via first and second hierarchical communication links. Processor cores of a first hierarchical processor core group are in communication with each other via the first hierarchical communication links and are configured to store, in the memory, a sub-portion of data of a first matrix and a sub-portion of data of a second matrix. The processor cores are also configured to determine a product of the sub-portion of data of the first matrix and the sub-portion of data of the second matrix, receive, from another processor core, another sub-portion of data of the second matrix and determine a product of the sub-portion of data of the first matrix and the other sub-portion of data of the second matrix.

    Taint protection during speculative execution

    Publication No.: US10956157B1

    Publication Date: 2021-03-23

    Application No.: US16293154

    Filing Date: 2019-03-05

    Abstract: A subset of a set of architectural registers in a processing system is marked (or “tainted”) to indicate that speculative use of data in the subset of the architectural registers is constrained based on a taint handling policy. One or more speculation features supported by the processing system are disabled for the instruction so that the one or more speculation features cannot be used on data in the subset. In some cases, values of bits associated with the subset of architectural registers are modified to indicate that the subset is tainted. The taint handling policy can be indicated by values stored in a policy register. Taint markings are tracked in response to values stored in the tainted architectural registers being written to a memory or read from the memory.

    TERMINATION CALIBRATION SCHEME USING A CURRENT MIRROR

    Publication No.: US20210083677A1

    Publication Date: 2021-03-18

    Application No.: US16570334

    Filing Date: 2019-09-13

    Abstract: Systems, apparatuses, and methods for conveying and receiving information as electrical signals in a computing system are disclosed. A computing system includes multiple transmitters sending single-ended data signals to multiple receivers. A termination voltage is generated and sent to the multiple receivers. The termination voltage is coupled to each of signal termination circuitry and signal sampling circuitry within each of the multiple receivers. Any change in the termination voltage affects the termination circuitry and affects comparisons performed by the sampling circuitry. Received signals are reconstructed at the receivers using the received signals, the signal termination circuitry and the signal sampling circuitry.

    Block level rate control
    Granted invention patent

    Publication No.: US10951892B2

    Publication Date: 2021-03-16

    Application No.: US16263630

    Filing Date: 2019-01-31

    Inventor: Adam H. Li

    Abstract: Systems, apparatuses, and methods for performing efficient bitrate control of video compression are disclosed. Logic in a bitrate controller of a video encoder receives a target block bitstream length for a block of pixels of a video frame. When the logic determines a count of previously compressed blocks does not exceed a count threshold, the logic selects a quantization parameter from a full range of available quantization parameters. After encoding the block, the logic determines a parameter based on a first ratio of the achieved block bitstream length to an exponential value of an actual quantization parameter used to generate the achieved block bitstream length. For another block, when the count exceeds the count threshold, the logic generates a quantization parameter based on a ratio of the target block bitstream length to an average of parameters of previously encoded blocks.
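
One way to read the two ratios in this abstract is as a simple exponential rate model. The sketch below assumes, purely for illustration, that block length follows length = parameter * exp(-K * QP); the constant `K`, the QP range, and the function names are hypothetical and are not taken from the patent. It covers only the path taken once the count threshold has been exceeded.

```python
import math

K = 0.1                   # hypothetical decay constant of the assumed rate model
QP_MIN, QP_MAX = 0, 51    # assumed range of available quantization parameters


def block_parameter(achieved_len, qp_used):
    """Ratio of the achieved length to an exponential of the QP actually used."""
    return achieved_len / math.exp(-K * qp_used)


def next_qp(target_len, parameters):
    """Derive a QP from the ratio of the target length to the average parameter."""
    average = sum(parameters) / len(parameters)
    qp = -math.log(target_len / average) / K
    return min(QP_MAX, max(QP_MIN, qp))


# Made-up history: one block encoded at QP 30 produced 1000 bits.
history = [block_parameter(1000, 30)]
print(next_qp(1000, history))  # same target as before: close to QP 30
print(next_qp(500, history))   # smaller target: a higher (coarser) QP
```

Under this assumed model, asking for the same length as a previous block returns roughly the QP that produced it, and halving the target raises the QP.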
