-
Publication No.: US20240220315A1
Publication date: 2024-07-04
Application No.: US18091443
Filing date: 2022-12-30
CPC classification: G06F9/4881, G06F9/52
Abstract: A processing system includes a scheduling mechanism for producing data for fine-grained reordering of workgroups of a kernel to produce blocks of data, such as for communication across devices to enable overlapping of a producer computation with an all-reduce communication across the network. This scheduling mechanism enables a first parallel processor to schedule and execute a set of workgroups of a producer operation to generate data for transmission to a second parallel processor in a desired traffic pattern. At the same time, the second parallel processor schedules and executes a different set of workgroups of the producer operation to generate data for transmission in a desired traffic pattern to a third parallel processor or back to the first parallel processor.
-
Publication No.: US20240220314A1
Publication date: 2024-07-04
Application No.: US18091441
Filing date: 2022-12-30
Inventors: Harris Gasparakis
CPC classification: G06F9/4881, G06F9/522
Abstract: A processing system flexibly schedules workgroups across kernels based on data dependencies between workgroups to enhance processing efficiency. The workgroups are partitioned into subsets based on the data dependencies, and workgroups of a first subset that produces data are scheduled to execute immediately before workgroups of a second subset that consumes the data generated by the first subset. Thus, the processing system does not execute one kernel at a time, but instead schedules workgroups across kernels based on data dependencies across kernels. By limiting the sizes of the subsets to the amount of data that can be stored at local caches, the processing system increases the probability that data to be consumed by workgroups of a subset will be resident in a local cache and will not require a memory access.
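The interleaving described in this abstract can be illustrated with a minimal Python sketch. It is not the patented scheduler: the `Workgroup` type, the `out_bytes` cost model, the `cache_bytes` budget, and the assumption that each consumer workgroup reads from exactly one producer workgroup are all hypothetical simplifications introduced here.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Workgroup:
    kernel: str     # name of the kernel this workgroup belongs to
    index: int      # position of the workgroup within its kernel
    out_bytes: int  # bytes of output it produces (hypothetical cost model)


def schedule(producers, consumers_of, cache_bytes):
    """Return an execution order that runs each producer subset
    immediately before the consumers of that subset.

    Assumption: every consumer depends on exactly one producer, so a
    consumer is ready as soon as its producer's subset has run.
    """
    order, subset, used = [], [], 0

    def flush():
        nonlocal subset, used
        order.extend(subset)                      # run the producer subset
        for p in subset:                          # then its consumers
            order.extend(consumers_of.get(p, []))
        subset, used = [], 0

    for p in producers:
        # Close the current subset once its output would overflow the cache.
        if subset and used + p.out_bytes > cache_bytes:
            flush()
        subset.append(p)
        used += p.out_bytes
    if subset:
        flush()
    return order


if __name__ == "__main__":
    prods = [Workgroup("A", i, 32) for i in range(4)]
    cons = {p: [Workgroup("B", p.index, 0)] for p in prods}
    for wg in schedule(prods, cons, cache_bytes=64):
        print(wg.kernel, wg.index)
```

With four 32-byte producers and a 64-byte budget, the order alternates two workgroups of kernel A with their two consumers from kernel B, rather than running all of A before any of B.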
-
Publication No.: US20240220296A1
Publication date: 2024-07-04
Application No.: US18090605
Filing date: 2022-12-29
IPC classification: G06F9/455, G06F12/1081
CPC classification: G06F9/45558, G06F12/1081, G06F2009/45587
Abstract: A processor manages memory-mapped input/output (MMIO) accesses, in a secure fashion, at an input/output memory management unit (IOMMU). The processor is configured to ensure that, for a given MMIO request issued by a processor core and associated with a particular executing VM, the request is targeted to an MMIO address that has been assigned to the VM by a security module (e.g., a security co-processor). The processor thus prevents a malicious entity from accessing confidential information of a VM via MMIO requests.
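The check described in this abstract amounts to an ownership lookup before an MMIO request is allowed through. The Python sketch below models only that idea; the `MmioGuard` class, its method names, and the range-based bookkeeping are hypothetical and do not reflect the actual IOMMU data structures.

```python
class MmioGuard:
    """Toy model of the check: a request passes only if its target
    address lies in a range the security module assigned to that VM."""

    def __init__(self):
        self._ranges = {}  # vm_id -> list of (start, end) half-open ranges

    def assign(self, vm_id, start, length):
        # Stands in for the security module recording an assignment.
        self._ranges.setdefault(vm_id, []).append((start, start + length))

    def allows(self, vm_id, addr):
        # Stands in for the check performed on each MMIO request.
        return any(s <= addr < e for s, e in self._ranges.get(vm_id, []))


if __name__ == "__main__":
    guard = MmioGuard()
    guard.assign(vm_id=1, start=0x1000, length=0x100)
    print(guard.allows(1, 0x1080))  # request from the owning VM
    print(guard.allows(2, 0x1080))  # request from a different VM
```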
-
Publication No.: US20240220108A1
Publication date: 2024-07-04
Application No.: US18147963
Filing date: 2022-12-29
Inventors: Jayesh Hari Joshi, Alicia Wen Ju Yurie Leong, William Robert Alverson, Joshua Taylor Knight, Jerry Anton Ahrens, Grant Evan Ley, Amitabh Mehra, Anil Harwani
IPC classification: G06F3/06
CPC classification: G06F3/061, G06F3/0653, G06F3/0673
Abstract: Automated memory overclocking is described. In accordance with the described techniques, one or more sets of overclocked memory settings of a memory are automatically selected for performance testing and stability testing of the memory. The one or more sets of the overclocked memory settings are tested for performance of the memory and a performance indication is output for each of the one or more sets of the overclocked memory settings. The one or more sets of the overclocked memory settings are tested for stability of the memory and a stability indication is output for each of the one or more sets of the overclocked memory settings. One of the one or more sets of the overclocked memory settings is selected as optimized overclocked memory settings for the memory.
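The test-then-select flow in this abstract can be sketched in a few lines of Python. Everything concrete here is an assumption made for illustration: the `MemorySettings` fields, the stub test functions, and the selection rule (highest performance score among the sets that pass the stability test) are not taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class MemorySettings:
    speed_mts: int     # transfer rate in MT/s (hypothetical field)
    cas_latency: int   # CAS latency in cycles (hypothetical field)
    voltage: float     # DRAM voltage in volts (hypothetical field)


def select_settings(candidates, perf_test, stability_test):
    """Test every candidate set and pick one as the optimized settings.

    Returns (best, results) where results holds one
    (settings, performance indication, stability indication) per set.
    Assumed rule: best = highest-scoring set that is stable.
    """
    results = [(s, perf_test(s), stability_test(s)) for s in candidates]
    stable = [(s, perf) for s, perf, ok in results if ok]
    best = max(stable, key=lambda r: r[1])[0] if stable else None
    return best, results


if __name__ == "__main__":
    candidates = [
        MemorySettings(6000, 30, 1.35),
        MemorySettings(6400, 30, 1.40),
        MemorySettings(7200, 34, 1.45),
    ]
    # Stub tests standing in for real benchmark and stress runs.
    perf = lambda s: s.speed_mts / s.cas_latency
    stable = lambda s: s.speed_mts <= 6400
    best, results = select_settings(candidates, perf, stable)
    print(best)
```

In a real tool the two callbacks would run a memory benchmark and a stress test; the stubs only make the selection logic visible.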
-
Publication No.: US12028190B1
Publication date: 2024-07-02
Application No.: US18086960
Filing date: 2022-12-22
CPC classification: H04L25/03038, H04L25/4917, H04L2025/03471
Abstract: A driver circuit includes a feed-forward equalization (FFE) circuit. The FFE circuit receives a plurality of pulse-amplitude modulation (PAM) symbol values to be transmitted at one of multiple PAM levels. The FFE circuit includes a first partial lookup table, one or more additional partial lookup tables, and an adder circuit. The first partial lookup table contains partial finite impulse-response (FIR) values and is indexed based on a current PAM symbol value, a precursor PAM symbol value, and a postcursor PAM symbol value. The one or more additional partial lookup tables each contain partial FIR values and are indexed based on a respective additional one or more of the PAM symbol values. The adder circuit adds results of lookups from the first partial lookup table and the additional partial lookup tables to produce an output value.
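Splitting an FIR computation across partial lookup tables works because the filter output is a sum of per-tap terms. The Python sketch below shows the arithmetic for a 4-tap filter on PAM4 symbols. The tap weights, the choice of PAM4, and the use of a single additional table indexed by a second postcursor symbol are assumptions made for illustration, not values from the patent.

```python
from itertools import product

# Amplitude for PAM4 symbol values 0..3.
PAM4_LEVELS = [-3, -1, 1, 3]

# Hypothetical integer tap weights.
TAPS = {"pre": -2, "main": 12, "post1": -3, "post2": 1}

# First partial LUT: indexed by (precursor, current, postcursor) symbols.
LUT_MAIN = {
    (pre, cur, post): (
        TAPS["pre"] * PAM4_LEVELS[pre]
        + TAPS["main"] * PAM4_LEVELS[cur]
        + TAPS["post1"] * PAM4_LEVELS[post]
    )
    for pre, cur, post in product(range(4), repeat=3)
}

# Additional partial LUT: indexed by one further symbol (second postcursor).
LUT_EXTRA = {s: TAPS["post2"] * PAM4_LEVELS[s] for s in range(4)}


def ffe_output(pre, cur, post1, post2):
    """Adder stage: sum the results of the partial lookups."""
    return LUT_MAIN[(pre, cur, post1)] + LUT_EXTRA[post2]


def fir_direct(pre, cur, post1, post2):
    """Reference: the same 4-tap FIR computed with multiplies."""
    symbols = {"pre": pre, "main": cur, "post1": post1, "post2": post2}
    return sum(TAPS[k] * PAM4_LEVELS[v] for k, v in symbols.items())


if __name__ == "__main__":
    print(ffe_output(0, 3, 1, 2), fir_direct(0, 3, 1, 2))
    print(len(LUT_MAIN) + len(LUT_EXTRA), "entries instead of", 4 ** 4)
```

Under these assumptions the two partial tables hold 64 + 4 entries, whereas a single table indexed by all four symbols would need 256.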
-
Publication No.: US20240212777A1
Publication date: 2024-06-27
Application No.: US18146558
Filing date: 2022-12-27
IPC classification: G11C29/14
CPC classification: G11C29/14, G11C2029/1208
Abstract: Memory verification using processing-in-memory is described. In accordance with the described techniques, memory testing logic is loaded into a processing-in-memory component. The processing-in-memory component executes the memory testing logic to test a memory. An indication is output of a detected fault in the memory based on testing the memory.
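The load, execute, and report steps can be modeled in software as follows. This is a toy simulation, not the patented design: the `Memory` and `PIMComponent` classes are hypothetical, the fault model is limited to injected stuck-at bits, and the test logic is a MATS+-style march sequence chosen here as a familiar example of memory testing logic.

```python
class Memory:
    """Bit-addressable memory model with optional injected stuck-at faults."""

    def __init__(self, size, stuck_at=None):
        self.size = size
        self.stuck_at = dict(stuck_at or {})  # address -> forced bit value
        self.cells = [self.stuck_at.get(a, 0) for a in range(size)]

    def write(self, addr, bit):
        # A stuck cell ignores the written value.
        self.cells[addr] = self.stuck_at.get(addr, bit)

    def read(self, addr):
        return self.cells[addr]


def march_test(mem):
    """MATS+-style march: up(w0); up(r0, w1); down(r1, w0).
    Returns the sorted list of addresses where a read mismatched."""
    faults = set()
    for a in range(mem.size):
        mem.write(a, 0)
    for a in range(mem.size):
        if mem.read(a) != 0:
            faults.add(a)
        mem.write(a, 1)
    for a in reversed(range(mem.size)):
        if mem.read(a) != 1:
            faults.add(a)
        mem.write(a, 0)
    return sorted(faults)


class PIMComponent:
    """Stand-in for a processing-in-memory unit attached to one memory."""

    def __init__(self, mem):
        self.mem = mem
        self.logic = None

    def load(self, logic):
        self.logic = logic           # "load memory testing logic"

    def execute(self):
        return self.logic(self.mem)  # run the test next to the memory


if __name__ == "__main__":
    pim = PIMComponent(Memory(8, stuck_at={2: 1, 5: 0}))
    pim.load(march_test)
    print("faulty addresses:", pim.execute())
```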
-
Publication No.: US20240211134A1
Publication date: 2024-06-27
Application No.: US18087964
Filing date: 2022-12-23
IPC classification: G06F3/06
CPC classification: G06F3/061, G06F3/0656, G06F3/0659, G06F3/0673
Abstract: A memory controller includes an arbiter, a vector arithmetic logic unit (VALU), a read buffer and a write buffer both coupled to the VALU, and an atomic memory operation scheduler. The VALU performs scattered atomic memory operations on arrays of data elements responsive to selected memory access commands. The atomic memory operation scheduler is for scheduling atomic memory operations at the VALU; identifying a plurality of scattered atomic memory operations with commutative and associative properties, the plurality of scattered atomic memory operations on at least one element of an array of data elements associated with an address; and commanding the VALU to perform the plurality of scattered atomic memory operations.
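The abstract does not say what the scheduler does with the commutative and associative operations it identifies. One plausible use, assumed here purely for illustration, is to combine operations that target the same address so that each address is updated once. The Python sketch below shows why that is safe for an operation such as addition; the function names and the single-operation-type batch are hypothetical.

```python
import operator


def apply_ops(memory, ops, combine):
    """Apply (address, operand) atomics one at a time, in order."""
    for addr, val in ops:
        memory[addr] = combine(memory[addr], val)
    return memory


def coalesce(ops, combine):
    """Merge operands that target the same address.

    Valid only when `combine` is commutative and associative, so the
    order and grouping of operands do not change the final value.
    """
    merged = {}
    for addr, val in ops:
        merged[addr] = combine(merged[addr], val) if addr in merged else val
    return list(merged.items())


if __name__ == "__main__":
    ops = [(3, 1), (5, 2), (3, 4), (3, 7), (5, 1)]  # scattered atomic adds
    direct = apply_ops([0] * 8, ops, operator.add)
    merged_ops = coalesce(ops, operator.add)
    batched = apply_ops([0] * 8, merged_ops, operator.add)
    print(merged_ops, direct == batched)
```

Five scattered adds collapse into two updates, and the final memory contents match the one-at-a-time result.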
-
Publication No.: US20240211023A1
Publication date: 2024-06-27
Application No.: US18146811
Filing date: 2022-12-27
Inventors: Gia Tung Phan, Ashish Jain, Shang Yang
IPC classification: G06F1/3296, G06F12/0875, G06T1/20, G06T1/60
CPC classification: G06F1/3296, G06F12/0875, G06T1/20, G06T1/60, G06F2212/45
Abstract: An apparatus and method for efficiently managing power consumption among multiple, replicated functional blocks of an integrated circuit are described. An integrated circuit includes multiple, replicated functional blocks that use separate power domains. Data of a given type is stored in an interleaved manner among at least two of the multiple functional blocks. In one implementation, a prior static allocation determines that only a subset of the functional blocks store the data of the given type. In another implementation, each of the functional blocks stores the data of the given type, and when an idle state has occurred, data of the given type is moved between the multiple functional blocks until one or more functional blocks no longer store data of the given type. When a transition to the idle state has occurred, the functional blocks that do not store the data of the given type are transitioned to a sleep state.
-
Publication No.: US12019566B2
Publication date: 2024-06-25
Application No.: US16938364
Filing date: 2020-07-24
IPC classification: G06F13/16, G06F9/30, H04L45/122
CPC classification: G06F13/1642, G06F9/3004, G06F9/30098, G06F13/1663, H04L45/122
Abstract: Arbitrating atomic memory operations, including: receiving, by a media controller, a plurality of atomic memory operations; determining, by an atomics controller associated with the media controller, based on one or more arbitration rules, an ordering for issuing the plurality of atomic memory operations; and issuing the plurality of atomic memory operations to a memory module according to the ordering.
-
Publication No.: US20240203036A1
Publication date: 2024-06-20
Application No.: US18083298
Filing date: 2022-12-16
CPC classification: G06T15/08, G06T15/10, G06T2210/12
Abstract: A technique for building a bounding volume hierarchy is disclosed. The technique includes subdividing a candidate box node based on a resolution to generate a plurality of cells of the candidate box node; identifying a plurality of nodes of a triangle set collection that fit within the cells; generating a plurality of candidate splits based on the plurality of nodes; selecting a candidate split based on a selection criterion to obtain a selected candidate split; and generating child box nodes for a box node of a bounding volume hierarchy under construction, based on the selected candidate split.
-