SYSTEM AND METHOD USING HASH TABLE WITH A SET OF FREQUENTLY-ACCESSED BUCKETS AND A SET OF LESS FREQUENTLY-ACCESSED BUCKETS

    Publication Number: US20210182262A1

    Publication Date: 2021-06-17

    Application Number: US16717027

    Application Date: 2019-12-17

    Inventor: Nuwan Jayasena

    Abstract: A method and apparatus perform a first hash operation on a first key, wherein the first hash operation is biased to map the first key and associated value to a set of frequently-accessed buckets in a hash table. An entry for the first key and associated value is stored in the set of frequently-accessed buckets. A second hash operation is performed on a second key, wherein the second hash operation is biased to map the second key and associated value to a set of less frequently-accessed buckets in the hash table. An entry for the second key and associated value is stored in the set of less frequently-accessed buckets. The method and apparatus perform a hash table lookup for a requested key in the set of frequently-accessed buckets; if the requested key is not found there, a lookup is then performed in the set of less frequently-accessed buckets.
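
    Below is a minimal Python sketch of the two-level lookup this abstract describes. The bucket counts, the hash functions, and the caller-supplied hot/cold decision at insert time are illustrative assumptions, not details taken from the patent.

class BiasedHashTable:
    """Hash table split into frequently- and less frequently-accessed buckets."""

    def __init__(self, hot_buckets=16, cold_buckets=240):
        # Assumed sizes: a small hot region and a larger cold region.
        self.hot = [[] for _ in range(hot_buckets)]    # frequently-accessed set
        self.cold = [[] for _ in range(cold_buckets)]  # less frequently-accessed set

    def _hot_index(self, key):
        # First hash operation: biased to the frequently-accessed buckets.
        return hash(key) % len(self.hot)

    def _cold_index(self, key):
        # Second hash operation: biased to the less frequently-accessed buckets.
        return hash((key, "cold")) % len(self.cold)

    def insert(self, key, value, frequent):
        # The caller indicates which set the key belongs to (an assumption;
        # the patent biases the choice via the hash operation applied).
        if frequent:
            self.hot[self._hot_index(key)].append((key, value))
        else:
            self.cold[self._cold_index(key)].append((key, value))

    def lookup(self, key):
        # Probe the frequently-accessed buckets first; fall back to the
        # less frequently-accessed buckets only on a miss.
        for k, v in self.hot[self._hot_index(key)]:
            if k == key:
                return v
        for k, v in self.cold[self._cold_index(key)]:
            if k == key:
                return v
        return None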

    CACHE LINE RE-REFERENCE INTERVAL PREDICTION USING PHYSICAL PAGE ADDRESS

    Publication Number: US20210182213A1

    Publication Date: 2021-06-17

    Application Number: US16716165

    Application Date: 2019-12-16

    Abstract: Systems, apparatuses, and methods for implementing cache line re-reference interval prediction using a physical page address are disclosed. When a cache line is accessed, a controller retrieves a re-reference interval counter value associated with the line. If the counter is less than a first threshold, then the address of the cache line is stored in a small re-use page buffer. If the counter is greater than a second threshold, then the address is stored in a large re-use page buffer. When a new cache line is inserted in the cache, if its address is stored in the small re-use page buffer, then the controller assigns a high priority to the line to cause it to remain in the cache to be re-used. If a match is found in the large re-use page buffer, then the controller assigns a low priority to the line to bias it towards eviction.
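
    A hedged Python sketch of this predictor follows. The two thresholds, the page size, the buffer capacity, and the FIFO replacement of buffer entries are assumptions for illustration.

from collections import OrderedDict

SMALL_REUSE_THRESHOLD = 4    # assumed first threshold
LARGE_REUSE_THRESHOLD = 64   # assumed second threshold
PAGE_SHIFT = 12              # assumed 4 KiB physical pages

class ReuseIntervalPredictor:
    def __init__(self, buffer_entries=64):
        self.small_reuse_pages = OrderedDict()  # pages whose lines are re-used quickly
        self.large_reuse_pages = OrderedDict()  # pages whose lines are re-used slowly
        self.capacity = buffer_entries

    def _record(self, buf, page):
        buf[page] = None
        if len(buf) > self.capacity:
            buf.popitem(last=False)              # drop the oldest entry (FIFO)

    def on_access(self, phys_addr, interval_counter):
        # Compare the line's re-reference interval counter against the
        # two thresholds and record its physical page accordingly.
        page = phys_addr >> PAGE_SHIFT
        if interval_counter < SMALL_REUSE_THRESHOLD:
            self._record(self.small_reuse_pages, page)
        elif interval_counter > LARGE_REUSE_THRESHOLD:
            self._record(self.large_reuse_pages, page)

    def priority_for_insert(self, phys_addr):
        # On cache fill: high priority keeps the line resident,
        # low priority biases it toward eviction.
        page = phys_addr >> PAGE_SHIFT
        if page in self.small_reuse_pages:
            return "high"
        if page in self.large_reuse_pages:
            return "low"
        return "default"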

    MARKER-BASED PROCESSOR INSTRUCTION GROUPING

    Publication Number: US20210182072A1

    Publication Date: 2021-06-17

    Application Number: US16713432

    Application Date: 2019-12-13

    Abstract: A system includes a processing unit such as a GPU that itself includes a command processor configured to receive instructions for execution from a software application. A processor pipeline coupled to the processing unit includes a set of parallel processing units for executing the instructions in sets. A set manager is coupled to one or more of the processor pipeline and the command processor. The set manager includes at least one table for storing a set start time, a set end time, and a set execution time. The set manager determines an execution time for one or more sets of instructions of a first window of sets of instructions submitted to the processor pipeline. Based on the execution time of the one or more sets of instructions, a set limit is determined and applied to one or more sets of instructions of a second window subsequent to the first window.
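
    The sketch below models the set manager's timing table and one plausible way a set limit could be derived from observed execution times; the averaging policy and the use of wall-clock time are assumptions, not the patent's method.

import time

class SetManager:
    def __init__(self):
        self.table = {}   # set_id -> [start_time, end_time, execution_time]

    def begin_set(self, set_id):
        self.table[set_id] = [time.monotonic(), None, None]

    def end_set(self, set_id):
        entry = self.table[set_id]
        entry[1] = time.monotonic()
        entry[2] = entry[1] - entry[0]   # set execution time

    def set_limit_for_next_window(self, first_window_ids, target_time):
        # Assumed policy: size the limit for the second window so that an
        # average set from the first window fits within the target time.
        times = [self.table[i][2] for i in first_window_ids]
        avg = max(sum(times) / len(times), 1e-9)
        return max(1, int(target_time / avg))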

    SCHEDULER QUEUE ASSIGNMENT BURST MODE

    Publication Number: US20210173702A1

    Publication Date: 2021-06-10

    Application Number: US16709527

    Application Date: 2019-12-10

    Abstract: Systems, apparatuses, and methods for implementing scheduler queue assignment burst mode are disclosed. A scheduler queue assignment unit receives a dispatch packet with a plurality of operations from a decode unit in each clock cycle. The scheduler queue assignment unit determines if the number of operations in the dispatch packet for any class of operations is greater than a corresponding threshold for dispatching to the scheduler queues in a single cycle. If the number of operations for a given class is greater than the corresponding threshold, and if a burst mode counter is less than a burst mode window threshold, the scheduler queue assignment unit dispatches the extra number of operations for the given class in a single cycle. By operating in burst mode for a given operation class during a small number of cycles, processor throughput can be increased without starving the processor of other operation classes.
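
    A minimal sketch of the dispatch decision follows; the per-class thresholds, the burst window length, and the counter-reset policy (not described in the abstract) are assumptions.

PER_CLASS_THRESHOLD = {"alu": 4, "mem": 2, "branch": 1}  # assumed ops/cycle limits
BURST_WINDOW_THRESHOLD = 3                               # assumed burst-cycle budget

class SchedulerQueueAssigner:
    def __init__(self):
        self.burst_mode_counter = 0

    def dispatch(self, packet):
        # packet: dict mapping an operation class to the ops decoded this cycle.
        dispatched = {}
        for op_class, ops in packet.items():
            limit = PER_CLASS_THRESHOLD[op_class]
            if len(ops) > limit and self.burst_mode_counter < BURST_WINDOW_THRESHOLD:
                # Burst mode: send the extra operations to the scheduler
                # queues in this single cycle.
                self.burst_mode_counter += 1
                dispatched[op_class] = list(ops)
            else:
                dispatched[op_class] = ops[:limit]   # normal per-cycle cap
        return dispatched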

    APPARATUS AND METHODS FOR MANAGING PACKET TRANSFER ACROSS A MEMORY FABRIC PHYSICAL LAYER INTERFACE

    Publication Number: US20210165606A1

    Publication Date: 2021-06-03

    Application Number: US16701794

    Application Date: 2019-12-03

    Abstract: An apparatus and method manage packet transfer between a memory fabric whose physical layer interface has a higher data rate than the physical layer interface of another device. The apparatus and method receive incoming packets from the memory fabric physical layer interface, at least some of which include different instruction types. The apparatus and method determine the packet type of each incoming packet received from the memory fabric physical layer interface, and when an incoming packet contains an atomic request, they prioritize transfer of that packet over other incoming packet types to memory access logic that accesses local memory within the apparatus.
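
    A short Python sketch of the prioritization step is given below; the packet representation and the two-queue arbitration are illustrative assumptions.

from collections import deque

class FabricPhyReceiver:
    def __init__(self):
        self.atomic_queue = deque()   # always served first
        self.other_queue = deque()

    def receive(self, packet):
        # Determine the packet type of the incoming packet.
        if packet.get("type") == "atomic_request":
            self.atomic_queue.append(packet)
        else:
            self.other_queue.append(packet)

    def next_for_memory_access(self):
        # Packets carrying atomic requests are transferred to the local-memory
        # access logic ahead of all other packet types.
        if self.atomic_queue:
            return self.atomic_queue.popleft()
        if self.other_queue:
            return self.other_queue.popleft()
        return None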

    Static Random Access Memory Read Path with Latch

    Publication Number: US20210158855A1

    Publication Date: 2021-05-27

    Application Number: US16692714

    Application Date: 2019-11-22

    Abstract: A read path for reading data from a memory includes a sense amplifier having data (SAT) and data complement (SAC) output nodes and a latch. The latch includes an input tri-state inverter comprising first and second PMOS transistors connected between VDD and an intermediate node, and first and second NMOS transistors connected between VSS and the intermediate node. The gates of the first PMOS and first NMOS transistors are connected to the SAT node; the gate of the second PMOS transistor is connected to a sense amplifier enable complement input; and the gate of the second NMOS transistor is connected to a sense amplifier enable input. The latch also includes an output driver with an input connected to the intermediate node and an output connected to a data output node. The latch thus has two gate delays between the SAT node and the data output node.
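
    The behavioral Python model below abstracts the transistor-level circuit to logic level: when the sense amplifier enable is asserted, the tri-state inverter drives the intermediate node; otherwise the node holds its last value. Modeling the output driver as a plain inverter is an assumption.

class ReadPathLatch:
    def __init__(self):
        self.intermediate = 0   # holds state while the tri-state inverter is off

    def evaluate(self, sat, sae):
        # sat: sense-amp data (SAT) node; sae: sense amplifier enable
        # (the complement input SAEC is implicitly `not sae`).
        if sae:
            # SAE high and SAEC low: both series transistors conduct, so the
            # tri-state inverter drives the inverse of SAT (first gate delay).
            self.intermediate = 1 - sat
        # else: the inverter is high-impedance and the node keeps its value.
        return 1 - self.intermediate   # output driver (second gate delay)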

    DATA FLOW IN A DISTRIBUTED GRAPHICS PROCESSING UNIT ARCHITECTURE

    Publication Number: US20210158599A1

    Publication Date: 2021-05-27

    Application Number: US16698624

    Application Date: 2019-11-27

    Abstract: An apparatus includes a command buffer configured to temporarily store commands. The apparatus also includes processing units disposed at a substrate. The processing units are configured to access a plurality of copies of a command from the command buffer. The processing units include first processing units (such as fixed-function hardware blocks) that perform geometry operations, indicated by the command, on a set of primitives. The geometry operations are performed concurrently by the first processing units. The processing units also include second processing units (such as shaders) that process mutually exclusive sets of pixels generated by rasterizing the set of primitives. The apparatus also includes a cache to temporarily store the pixels after shading by the shaders. The processing units stop or interrupt processing commands in response to detecting a synchronization point and resume processing the commands in response to all the processing units completing the commands before the synchronization point.
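
    The sketch below illustrates only the synchronization-point behavior: every processing unit pauses at a sync marker and resumes once all units have completed the commands that precede it. Modeling this with Python threads and a barrier is an assumption for illustration; the abstract does not describe the hardware mechanism at this level.

import threading

class DistributedGpuModel:
    def __init__(self, commands, num_units):
        self.commands = commands                    # shared command buffer
        self.barrier = threading.Barrier(num_units)
        self.num_units = num_units

    def run_unit(self, unit_id):
        for cmd in self.commands:                   # each unit reads its own copy
            if cmd == "SYNC":
                # Stop at the synchronization point; the barrier releases only
                # after every unit finishes the commands before it.
                self.barrier.wait()
            else:
                pass  # placeholder for geometry work or pixel shading

    def run(self):
        threads = [threading.Thread(target=self.run_unit, args=(i,))
                   for i in range(self.num_units)]
        for t in threads:
            t.start()
        for t in threads:
            t.join()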
