Highly flexible performance counter and system debug module

    Publication number: US10386410B2

    Publication date: 2019-08-20

    Application number: US15464334

    Filing date: 2017-03-20

    Abstract: According to one general aspect, an apparatus may include a plurality of performance and debug monitoring circuits (PDMCs). Each performance and debug monitoring circuit (PDMC) may include an input stage, a combinatorial stage, and a counter. The input stage may be configured to receive a plurality of input signals, wherein the input signals include: signals from other performance and debug monitoring circuits, signals from combinatorial logic circuits, and configuration values. The combinatorial stage may be configured to perform one or more logical operations on a selected sub-set of the input signals. The counter may be configured to increment based, at least in part, upon a result of the combinatorial stage.
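The three stages named in this abstract can be sketched as a small software model. This is only an illustration under stated assumptions: the class name `PDMC`, the `selected` and `combine` fields, and the sample trace are hypothetical and do not come from the patent, and real hardware would evaluate the circuits in parallel rather than in a loop.

```python
from dataclasses import dataclass
from typing import Callable, Sequence


@dataclass
class PDMC:
    """Toy model of one performance and debug monitoring circuit (hypothetical)."""
    selected: Sequence[int]                     # configuration: which inputs to use
    combine: Callable[[Sequence[bool]], bool]   # combinatorial stage (e.g. all, any)
    count: int = 0                              # counter

    def clock(self, inputs: Sequence[bool]) -> bool:
        # Input stage: pick the configured subset of the incoming signals.
        subset = [inputs[i] for i in self.selected]
        # Combinatorial stage: reduce the subset to a single result.
        result = self.combine(subset)
        # Counter: increment when the combinatorial result is true.
        if result:
            self.count += 1
        return result


# Two chained circuits: the second receives the first circuit's output as an input.
first = PDMC(selected=[0, 1], combine=all)    # counts cycles where a AND b
second = PDMC(selected=[2, 3], combine=any)   # counts cycles where c OR first's output

trace = [(True, True, False), (True, False, False),
         (False, False, True), (True, True, True)]
for a, b, c in trace:
    out = first.clock([a, b])
    second.clock([a, b, c, out])
```

After the four-cycle trace, `first.count` is 2 and `second.count` is 3, which shows how one circuit's result can feed another circuit's input stage.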

    Method for performing shader occupancy for small primitives

    Publication number: US11748933B2

    Publication date: 2023-09-05

    Application number: US17168168

    Filing date: 2021-02-04

    CPC classification number: G06T15/005 G06T1/20 G06T15/80

    Abstract: A GPU includes shader cores and a shader warp packer unit. The shader warp packer unit may receive a first primitive associated with a first partially covered quad, and a second primitive associated with a second partially covered quad. The shader warp packer unit may determine that the first partially covered quad and the second partially covered quad have non-overlapping coverage. The shader warp packer unit may pack the first partially covered quad and the second partially covered quad into a packed quad. The shader warp packer unit may send the packed quad to the shader cores. The first partially covered quad and the second partially covered quad may be spatially disjoint from each other. The shader cores may receive and process the packed quad with no loss of information relative to the shader cores individually processing the first partially covered quad and the second partially covered quad.
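The non-overlap check and the packing step can be illustrated with coverage bitmasks. The sketch below assumes a 2x2 quad whose four pixels are represented by the low four bits of an integer. The function name `pack_quads` and the `owners` list are hypothetical and are not the patent's data layout.

```python
from typing import Optional


def pack_quads(mask_a: int, mask_b: int) -> Optional[dict]:
    """Pack two partially covered quads if their coverage does not overlap.

    Each mask uses bits 0..3 for the four pixels of a quad. Returns None when
    the two quads cover a common pixel position and therefore cannot share
    one packed quad.
    """
    if mask_a & mask_b:
        return None  # overlapping coverage: keep the quads separate
    owners = []
    for pixel in range(4):
        bit = 1 << pixel
        if mask_a & bit:
            owners.append("A")    # lane shades primitive A
        elif mask_b & bit:
            owners.append("B")    # lane shades primitive B
        else:
            owners.append(None)   # lane stays idle
    # Recording the owner of every lane is what keeps the packing lossless.
    return {"mask": mask_a | mask_b, "owners": owners}


packed = pack_quads(0b0011, 0b0100)    # disjoint coverage: can be packed
rejected = pack_quads(0b0011, 0b0010)  # both cover pixel 1: cannot be packed
```

In this sketch the packed quad occupies three of four lanes instead of two quads that would occupy two lanes and one lane respectively, which is the occupancy gain the abstract describes.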

    Method and apparatus for displaying multiple devices on shared screen

    Publication number: US11360732B1

    Publication date: 2022-06-14

    Application number: US17209209

    Filing date: 2021-03-22

    Abstract: A system and method are disclosed that allow multiple casting devices to work together to populate a large display screen. The system includes a receiving device that includes two or more screen-cast receivers and a controller. Each screen-cast receiver receives from a corresponding casting device at least a portion of a frame of original content of the corresponding casting device generated in a native resolution of the corresponding casting device. The controller synchronizes each received portion of the frame of the original content of the corresponding casting device to form a video output signal that comprises a combination of each received portion, in addition to any internally generated content derived by the receiving display. A casting device may be a smartphone, a tablet, or a computing device, such as a laptop computer.

    Method and apparatus for graphics driver optimization using daemon-based resources

    Publication number: US11321907B1

    Publication date: 2022-05-03

    Application number: US17227270

    Filing date: 2021-04-09

    Abstract: A system and a method are disclosed that optimize a graphics driver. The system may be embodied as a computing device that includes a storage that is internal to the computing device, a graphics processing unit (GPU) that includes a driver, and a controller. The controller may be configured to run a daemon process that optimizes a shader and/or a shader pipeline for an application that is resident on the computing device when the computing device is not running the application and stores at least one optimization for the shader in the storage. The at least one optimization may be based on the application. The daemon process may further receive a request from the driver of the GPU for an optimization for the shader/shader pipeline during a runtime compilation of the shader and provide the at least one optimization to the driver of the GPU from the storage.

    Lightweight, low overhead debug bus

    Publication number: US10310012B2

    Publication date: 2019-06-04

    Application number: US15473593

    Filing date: 2017-03-29

    Abstract: According to one general aspect, an apparatus may include an interconnect bus, an interconnect-to-debug bus interface, and a debug bus. The interconnect bus may be configured to connect and manage combinatorial logical blocks during normal operation of a processor and operate synchronous to a core clock. The interconnect-to-debug bus interface may be configured to translate communications between the interconnect bus and the debug bus. The debug bus may include a plurality of debug wrapper circuits arranged in a daisy chain for unidirectional communication, and configured to operate synchronous to the core clock. Each of the plurality of debug wrapper circuits may be configured to: identify if the respective debug wrapper circuit is activated by the debug bus, receive a non-invasive input from a respective combinatorial logic block, and place the non-invasive input from the respective combinatorial logic block on the debug bus.
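The daisy-chain behavior can be pictured with a short behavioral model. This is a loose sketch, not the patented circuit: `DebugWrapper`, `read_debug_bus`, and the integer wrapper IDs are hypothetical names, and clocking and the interconnect-to-debug interface are left out.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DebugWrapper:
    wrapper_id: int
    block_signal: int  # non-invasive copy of the attached logic block's output

    def step(self, bus_in: Optional[int], selected_id: int) -> Optional[int]:
        # An activated wrapper places its block's signal on the bus;
        # every other wrapper forwards what it received from upstream.
        if self.wrapper_id == selected_id:
            return self.block_signal
        return bus_in


def read_debug_bus(chain: List[DebugWrapper], selected_id: int) -> Optional[int]:
    """Walk the chain in one direction and return the value at its end."""
    bus = None
    for wrapper in chain:
        bus = wrapper.step(bus, selected_id)
    return bus


chain = [DebugWrapper(0, 0x11), DebugWrapper(1, 0x22), DebugWrapper(2, 0x33)]
observed = read_debug_bus(chain, selected_id=1)
nothing = read_debug_bus(chain, selected_id=7)  # no wrapper has this ID
```

Because each wrapper only reads from its block, selecting a different wrapper changes what appears on the bus without disturbing the logic being observed.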

    Systems and methods of adaptive, variable-rate, hybrid ray tracing

    Publication number: US11869117B2

    Publication date: 2024-01-09

    Application number: US17576796

    Filing date: 2022-01-14

    CPC classification number: G06T11/00 G06T7/40

    Abstract: A hybrid ray tracing system includes: a processor; and memory including instructions that, when executed by the processor, cause the processor to: identify a subset of pixels of an image to be ray-traced based on variable rate shading (VRS) screenspace image data; set, based on the VRS screenspace image data, one or more material properties of at least one object corresponding to the subset of pixels; and perform ray-tracing for the subset of pixels to generate a ray-traced image. The ray-tracing includes performing a limited ray casting process based on the set one or more material properties.

    Vertex attribute compression and decompression in hardware

    Publication number: US20180150991A1

    Publication date: 2018-05-31

    Application number: US15432782

    Filing date: 2017-02-14

    CPC classification number: G06T15/005 G06T2210/08

    Abstract: One or more embodiments of the present disclosure provide an apparatus used in source data compression, comprising a memory and at least one processor. The memory is configured to store vertex attribute data and a set of instructions. The processor is coupled to the memory. The processor is configured to receive a source data stream that includes one or more values corresponding to the vertex attribute data. The processor is also configured to provide a dictionary for the one or more values in the source data stream, wherein the dictionary includes a plurality of index values corresponding to the one or more values in the source data stream. The processor is also configured to replace at least some of the one or more values in the source data stream with corresponding index values of the plurality of index values.
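A generic dictionary coder gives a feel for the scheme: each distinct value in the stream gets an index, and the stream is stored as indices plus the table. The sketch below is ordinary dictionary encoding written for illustration, not the hardware design claimed in the application. The function names and the sample stream are hypothetical.

```python
from typing import Dict, List, Tuple


def compress(values: List[float]) -> Tuple[List[float], List[int]]:
    """Return (dictionary, indices) for a stream of vertex attribute values."""
    dictionary: List[float] = []
    index_of: Dict[float, int] = {}
    indices: List[int] = []
    for value in values:
        if value not in index_of:
            # First occurrence: add the value to the dictionary.
            index_of[value] = len(dictionary)
            dictionary.append(value)
        # Store the index in place of the value itself.
        indices.append(index_of[value])
    return dictionary, indices


def decompress(dictionary: List[float], indices: List[int]) -> List[float]:
    """Rebuild the original stream by looking each index up in the dictionary."""
    return [dictionary[i] for i in indices]


stream = [1.0, 0.5, 1.0, 1.0, 0.5, 0.25]
dictionary, indices = compress(stream)
restored = decompress(dictionary, indices)
```

The saving comes from repeated values: six 32-bit floats become three dictionary entries plus six small indices, and the round trip is exact.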

    Methods and apparatus for atomic operations with multiple processing paths

    Publication number: US11620222B2

    Publication date: 2023-04-04

    Application number: US17086323

    Filing date: 2020-10-30

    Abstract: A method for performing an atomic memory operation may include receiving an atomic input, receiving an address for an atomic memory location, and performing an atomic operation on the atomic memory location based on the atomic input, wherein performing the atomic operation may include performing a first operation on a first portion of the atomic input, and performing a second operation, which may be different from the first operation, on a second portion of the atomic input. The method may further include storing a result of the first operation in a first portion of the atomic memory location, and storing a result of the second operation in a second portion of the atomic memory location. The method may further include returning an original content of the first portion of the atomic memory location concatenated with an original content of the second portion of the atomic memory location.
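The split operation can be emulated in software to make the data flow concrete. This sketch assumes a 32-bit location divided into two 16-bit portions and uses a lock to stand in for hardware atomicity. `SplitAtomicCell`, the field widths, and the choice of `max` and `add` as the two operations are all hypothetical.

```python
import operator
import threading

HALF_BITS = 16
HALF_MASK = (1 << HALF_BITS) - 1


class SplitAtomicCell:
    """One memory location whose two halves are updated by different operations."""

    def __init__(self, initial: int = 0) -> None:
        self._value = initial
        self._lock = threading.Lock()  # stands in for hardware atomicity

    def update(self, atomic_input: int, first_op, second_op) -> int:
        in_hi, in_lo = atomic_input >> HALF_BITS, atomic_input & HALF_MASK
        with self._lock:
            old = self._value
            old_hi, old_lo = old >> HALF_BITS, old & HALF_MASK
            # Each portion goes through its own processing path.
            new_hi = first_op(old_hi, in_hi) & HALF_MASK
            new_lo = second_op(old_lo, in_lo) & HALF_MASK
            self._value = (new_hi << HALF_BITS) | new_lo
        # Return the original halves, still concatenated as one word.
        return old

    def load(self) -> int:
        with self._lock:
            return self._value


cell = SplitAtomicCell(initial=(5 << HALF_BITS) | 10)
returned = cell.update((9 << HALF_BITS) | 3, first_op=max, second_op=operator.add)
```

Here the upper half takes the maximum of 5 and 9 while the lower half adds 3 to 10, and the caller receives the original word `(5 << 16) | 10` in a single return value.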

    Systems and methods of adaptive, variable-rate, hybrid ray tracing

    Publication number: US20220301233A1

    Publication date: 2022-09-22

    Application number: US17576796

    Filing date: 2022-01-14

    Abstract: A hybrid ray tracing system includes: a processor; and memory including instructions that, when executed by the processor, cause the processor to: identify a subset of pixels of an image to be ray-traced based on variable rate shading (VRS) screenspace image data; set, based on the VRS screenspace image data, one or more material properties of at least one object corresponding to the subset of pixels; and perform ray-tracing for the subset of pixels to generate a ray-traced image. The ray-tracing includes performing a limited ray casting process based on the set one or more material properties.
