REDUCED BANDWIDTH TESSELLATION FACTORS

    Publication No.: US20210374898A1

    Publication date: 2021-12-02

    Application No.: US17318523

    Application date: 2021-05-12

    Abstract: A graphics pipeline reduces the number of tessellation factors written to and read from a graphics memory. A hull shader stage of the graphics pipeline detects whether at least a threshold percentage of the tessellation factors for a thread group of patches are the same and, in some embodiments, whether at least the threshold percentage of the tessellation factors for a thread group of patches have a same value that either indicates that the plurality of patches are to be culled or that the plurality of patches are to be passed to a tessellator stage of the graphics pipeline. In response to detecting that at least the threshold percentage of the tessellation factors for the thread group are the same (or, additionally, that at least the threshold percentage of the tessellation factors have a value that either indicates that the plurality of patches are to be culled or that the plurality of patches are to be passed to a tessellator stage of the graphics pipeline), the hull shader stage bypasses writing at least a subset of the tessellation factors for the thread group of patches to the graphics memory, thus reducing bandwidth and increasing efficiency of the graphics pipeline.
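
    Illustrative sketch (not part of the patent text): the threshold check described in the abstract can be modelled as below. The function name, the tuple representation of a patch's factors, the default threshold, and the idea of returning one shared value plus the remaining per-patch writes are all assumptions made for illustration; the abstract does not say how the shared value is signalled to later stages.

```python
from collections import Counter


def plan_tess_factor_writes(factors, threshold=0.9):
    """Decide which tessellation factors of a thread group must be written.

    factors   -- list of per-patch factor tuples (hypothetical representation)
    threshold -- fraction of patches that must share the same factors

    Returns (shared_value, writes). If enough patches agree, shared_value is
    the common factor tuple and writes lists only the patches that differ.
    Otherwise shared_value is None and every patch is written.
    """
    if not factors:
        return None, []
    value, count = Counter(factors).most_common(1)[0]
    if count / len(factors) >= threshold:
        # Bypass the write for every patch that matches the common value.
        writes = [(i, f) for i, f in enumerate(factors) if f != value]
        return value, writes
    return None, list(enumerate(factors))
```

    In this model, a group of ten patches in which nine share the same factors needs one per-patch write instead of ten when the threshold is 0.8.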

    WORKGROUP SYNCHRONIZATION AND PROCESSING

    Publication No.: US20210373975A1

    Publication date: 2021-12-02

    Application No.: US17029935

    Application date: 2020-09-23

    Abstract: A processing system monitors and synchronizes parallel execution of workgroups (WGs). One or more of the WGs perform (e.g., periodically or in response to a trigger such as an indication of oversubscription) a waiting atomic instruction. In response to a comparison between an atomic value produced as a result of the waiting atomic instruction and an expected value, WGs that fail to produce a correct atomic value are identified as being in a waiting state (e.g., waiting for a synchronization variable). Execution of WGs in the waiting state is prevented (e.g., by a context switch) until corresponding synchronization variables are released.
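
    Illustrative sketch (not part of the patent text): a toy software model of the behaviour the abstract describes. The `Scheduler` class, its method names, and the dictionary of synchronization variables are hypothetical; real hardware would use atomic memory operations and context switching, which this model only imitates with plain bookkeeping.

```python
from dataclasses import dataclass, field


@dataclass
class Scheduler:
    sync_vars: dict                               # variable name -> current value
    waiting: dict = field(default_factory=dict)   # wg id -> (variable, expected value)

    def waiting_atomic(self, wg_id, var, expected):
        """Compare the observed value with the expected one.

        A mismatch marks the workgroup as waiting on that variable.
        """
        if self.sync_vars[var] == expected:
            self.waiting.pop(wg_id, None)
            return True
        self.waiting[wg_id] = (var, expected)
        return False

    def release(self, var, value):
        """Update a synchronization variable and wake the workgroups it satisfies."""
        self.sync_vars[var] = value
        resumed = [wg for wg, (v, e) in self.waiting.items() if v == var and e == value]
        for wg in resumed:
            del self.waiting[wg]
        return resumed

    def runnable(self, wg_ids):
        """Workgroups in the waiting state are withheld from execution."""
        return [wg for wg in wg_ids if wg not in self.waiting]
```

    A workgroup whose comparison fails is left out of `runnable` until `release` sets the variable to the value it expects.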

    TASK GRAPH GENERATION FOR WORKLOAD PROCESSING

    Publication No.: US20210373892A1

    Publication date: 2021-12-02

    Application No.: US16888521

    Application date: 2020-05-29

    Abstract: Techniques for generating a task graph for workload scheduling based on a task graph specification program are provided. The techniques include executing control flow instructions of the task graph specification program to traverse the task graph specification program; generating pass nodes of the task graph based on pass instructions of the task graph specification program; generating resource nodes and directed edges based on resource declarations of the task graph specification program; and outputting the task graph to a command scheduler for scheduling.
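
    Illustrative sketch (not part of the patent text): one way to picture the technique is a specification program written as an ordinary function whose loops and branches are executed while a builder records nodes and edges. `TaskGraphBuilder`, `add_pass`, `resource`, and the example render and blur passes are hypothetical names chosen for this sketch, and the edge direction (read resource to pass, pass to written resource) is an assumption.

```python
class TaskGraphBuilder:
    """Collects pass nodes, resource nodes, and directed edges."""

    def __init__(self):
        self.pass_nodes = []
        self.resource_nodes = []
        self.edges = []

    def resource(self, name):
        # A resource declaration yields one resource node per distinct name.
        if name not in self.resource_nodes:
            self.resource_nodes.append(name)
        return name

    def add_pass(self, name, reads=(), writes=()):
        # A pass instruction yields a pass node plus edges to its resources.
        self.pass_nodes.append(name)
        for r in reads:
            self.edges.append((self.resource(r), name))
        for w in writes:
            self.edges.append((name, self.resource(w)))


def example_spec(g, blur_iterations):
    """Hypothetical specification program; its loop is plain control flow."""
    g.add_pass("render", writes=["scene_color"])
    src = "scene_color"
    for i in range(blur_iterations):
        dst = f"blur_{i}"
        g.add_pass(f"blur_pass_{i}", reads=[src], writes=[dst])
        src = dst
    g.add_pass("present", reads=[src])


def build_task_graph(spec, **params):
    """Run the specification program and return the populated builder."""
    g = TaskGraphBuilder()
    spec(g, **params)
    return g
```

    Calling `build_task_graph(example_spec, blur_iterations=2)` unrolls the loop into two blur passes; the resulting graph, not the specification program, is what a scheduler would consume.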

    REFRESH MANAGEMENT FOR DRAM

    Publication No.: US20210358540A1

    Publication date: 2021-11-18

    Application No.: US16875281

    Application date: 2020-05-15

    Abstract: A memory controller interfaces with a dynamic random access memory (DRAM) over a memory channel. A refresh control circuit monitors an activate counter which counts a rolling number of activate commands sent over the memory channel to a memory region of the DRAM. In response to the activate counter being above an intermediate management threshold value, the refresh control circuit issues a refresh management (RFM) command only if there is no refresh (REF) command currently held at the refresh command circuit for the memory region.
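
    Illustrative sketch (not part of the patent text): the decision rule in the abstract reduces to a two-condition check. The function and parameter names are hypothetical, and the sketch covers only the intermediate threshold mentioned in the abstract; how the counter is updated after a refresh is not modelled.

```python
def should_issue_rfm(activate_count, intermediate_threshold, ref_pending):
    """Return True when an RFM command should be issued for a memory region.

    activate_count         -- rolling count of activate commands to the region
    intermediate_threshold -- intermediate management threshold value
    ref_pending            -- True if a REF command is already held for the region
    """
    return activate_count > intermediate_threshold and not ref_pending
```

    With a threshold of 32, a region at 40 activates gets an RFM command only when no REF is already held for it.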

    MEMORY CONTEXT RESTORE, REDUCTION OF BOOT TIME OF A SYSTEM ON A CHIP BY REDUCING DOUBLE DATA RATE MEMORY TRAINING

    Publication No.: US11176986B2

    Publication date: 2021-11-16

    Application No.: US16730086

    Application date: 2019-12-30

    Abstract: Methods for reducing boot time of a system-on-a-chip (SOC) by reducing double data rate (DDR) memory training and memory context restore. Dynamic random access memory (DRAM) controller and DDR physical interface (PHY) settings are stored into a non-volatile memory and the DRAM controller and DDR PHY are powered down. On system resume, a basic input/output system restores the DRAM controller and DDR PHY settings from non-volatile memory, and finalizes the DRAM controller and DDR PHY settings for operation with the SOC. Reducing the boot time of the SOC by reducing DDR training includes setting DRAMs into self-refresh mode, and programming a self-refresh state machine memory operation (MOP) array to exit self-refresh mode and update any DRAM device state for the target power management state. The DRAM device is reset, and the self-refresh state machine MOP array reinitializes the DRAM device state for the target power management state.

    ENCODING OF SYMBOLS FOR A COMPUTER INTERCONNECT BASED ON FREQUENCY OF SYMBOL VALUES

    Publication No.: US20210342285A1

    Publication date: 2021-11-04

    Application No.: US16863149

    Application date: 2020-04-30

    Abstract: Data are serially communicated over an interconnect between an encoder and a decoder. The encoder includes a first training unit to count a frequency of symbol values in symbol blocks of a set of N number of symbol blocks in an epoch. A circular shift unit of the encoder stores a set of most-recently-used (MRU) amplitude values. An XOR unit is coupled to the first training unit and the first circular shift unit as inputs and to the interconnect as output. A transmitter is coupled to the encoder XOR unit and the interconnect and thereby contemporaneously sends symbols and trains on the symbols. In a system, a device includes a receiver and decoder that receive, from the encoder, symbols over the interconnect. The decoder includes its own training unit for decoding the transmitted symbols.

    MEMORY OPERATIONS USING COMPOUND MEMORY COMMANDS

    Publication No.: US20210326063A1

    Publication date: 2021-10-21

    Application No.: US16848920

    Application date: 2020-04-15

    Abstract: Memory operations using compound memory commands, including: receiving, by a memory module, a compound memory command indicating one or more operations to be applied to each portion of a plurality of portions of contiguous memory in the memory module; generating, based on the compound memory command, a plurality of memory commands to apply the one or more operations to each portion of the plurality of portions of contiguous memory; and executing the plurality of memory commands.
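
    Illustrative sketch (not part of the patent text): expanding one compound command into per-portion memory commands might look like the following. The `CompoundCommand` fields, the operation names, and the tuple format of the generated commands are hypothetical; the abstract does not define a command encoding.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class CompoundCommand:
    base: int           # start address of the contiguous region
    portion_size: int   # size of each portion in bytes
    portion_count: int  # number of portions covered
    ops: tuple          # operations applied to every portion, in order


def expand(cmd):
    """Generate one (op, address, size) command per operation per portion."""
    return [
        (op, cmd.base + i * cmd.portion_size, cmd.portion_size)
        for i in range(cmd.portion_count)
        for op in cmd.ops
    ]
```

    A single compound command covering two 64-byte portions with two operations expands into four individual commands, so only one command has to cross the interface to the memory module.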
