-
Publication No.: US20240345903A1
Publication Date: 2024-10-17
Application No.: US18298833
Filing Date: 2023-04-11
Applicant: Arm Limited
IPC Classification: G06F9/54
CPC Classification: G06F9/544
Abstract: The present disclosure relates to a data processing apparatus for a processing resource to perform a transform operation on an input tensor for the processing resource, said input tensor being formed of a plurality of blocks, each block being a portion of said input tensor capable of being operated on independently of each other, said data processing apparatus comprising: communication circuitry to communicate with a control module and a shared storage of said processing resource; processing circuitry to perform said transform operation, said processing circuitry comprising sub-block processing circuitry and transformation circuitry; and a local storage to store transform operation output from said processing circuitry; wherein said communication circuitry is configured to: receive one or more transform parameters; read a first input sub-block from said shared storage, said first input sub-block being a portion of a first block of said input tensor corresponding to a processing unit of said processing circuitry; and write a first output sub-block to said shared storage, wherein said sub-block processing circuitry is configured to: divide said first block of said input tensor into one or more input sub-blocks capable of being operated on independently of each other based on said one or more transform parameters; and wherein said transformation circuitry is configured to: perform said transform operation on said first input sub-block based on said one or more transform parameters to generate said first output sub-block; and write said first output sub-block to said local storage.
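A minimal C++ sketch of the split-then-transform flow the abstract describes; all names (TransformParams, the sub-block size, the element-wise scale) are hypothetical stand-ins, not Arm's actual parameters or transform:

    #include <algorithm>
    #include <cstddef>
    #include <vector>

    // Hypothetical transform parameters: how finely to split a block, plus a
    // stand-in coefficient for the transform itself.
    struct TransformParams {
        std::size_t subBlockSize; // elements per independently processed sub-block
        float scale;              // placeholder for the real transform's parameters
    };

    // Divide one block of the input tensor into independently processable
    // sub-blocks (the role of the sub-block processing circuitry).
    std::vector<std::vector<float>> divideIntoSubBlocks(
            const std::vector<float>& block, const TransformParams& p) {
        std::vector<std::vector<float>> subBlocks;
        for (std::size_t i = 0; i < block.size(); i += p.subBlockSize) {
            std::size_t end = std::min(i + p.subBlockSize, block.size());
            subBlocks.emplace_back(block.begin() + i, block.begin() + end);
        }
        return subBlocks;
    }

    // Apply the transform to one sub-block (the role of the transformation
    // circuitry); the output would be staged in local storage before the
    // communication circuitry writes it back to shared storage.
    std::vector<float> transformSubBlock(const std::vector<float>& in,
                                         const TransformParams& p) {
        std::vector<float> out(in.size());
        for (std::size_t i = 0; i < in.size(); ++i)
            out[i] = in[i] * p.scale; // placeholder for the real transform
        return out;
    }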
-
Publication No.: US20240345868A1
Publication Date: 2024-10-17
Application No.: US18753113
Filing Date: 2024-06-25
IPC Classification: G06F9/46, G06F9/30, G06F9/38, G06F9/448, G06F9/48, G06F9/54, G06F11/30, G06F12/0804, G06F12/0811, G06F12/0813, G06F12/0817, G06F12/0831, G06F12/0855, G06F12/0871, G06F12/0888, G06F12/0891, G06F12/12, G06F12/121, G06F13/16
CPC Classification: G06F9/467, G06F9/30047, G06F9/30079, G06F9/30098, G06F9/30101, G06F9/30189, G06F9/3867, G06F9/4498, G06F9/4881, G06F9/544, G06F11/3037, G06F12/0811, G06F12/0813, G06F12/0824, G06F12/0828, G06F12/0831, G06F12/0855, G06F12/0871, G06F12/0888, G06F12/0891, G06F12/12, G06F13/1668, G06F12/0804, G06F12/121, G06F2212/1016, G06F2212/1044, G06F2212/621
Abstract: A method includes receiving a first request to allocate a line in an N-way set associative cache and, in response to a cache coherence state of a way indicating that a cache line stored in the way is invalid, allocating the way for the first request. The method also includes, in response to no ways in the set having a cache coherence state indicating that the cache line stored in the way is invalid, randomly selecting one of the ways in the set. The method also includes, in response to a cache coherence state of the selected way indicating that another request is not pending for the selected way, allocating the selected way for the first request.
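The allocation policy in the abstract maps directly onto a short C++ sketch; the coherence states and the Way struct are simplified stand-ins:

    #include <cstddef>
    #include <cstdlib>
    #include <optional>
    #include <vector>

    // Simplified per-way state for one set of an N-way set associative cache.
    enum class CoherenceState { Invalid, Shared, Modified };

    struct Way {
        CoherenceState state = CoherenceState::Invalid;
        bool requestPending = false; // another request already targets this way
    };

    // Policy from the abstract: prefer any invalid way; otherwise pick a way
    // at random and allocate it only if no other request is pending on it.
    std::optional<std::size_t> allocateWay(std::vector<Way>& set) {
        for (std::size_t i = 0; i < set.size(); ++i)
            if (set[i].state == CoherenceState::Invalid)
                return i; // invalid line: allocate immediately
        std::size_t candidate = std::rand() % set.size(); // random victim
        if (!set[candidate].requestPending)
            return candidate; // no pending request: safe to allocate
        return std::nullopt;  // retry or stall, per the implementation's policy
    }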
-
Publication No.: US12117947B2
Publication Date: 2024-10-15
Application No.: US17205090
Filing Date: 2021-03-18
Inventors: Changchun Ouyang, Shui Cao, Zihao Xiang
IPC Classification: G06F13/28, G06F3/06, G06F9/4401, G06F9/455, G06F9/54
CPC Classification: G06F13/28, G06F9/4411, G06F9/45558, G06F9/544, G06F2009/4557, G06F2009/45583, G06F2213/0026
Abstract: The present disclosure relates to information processing methods, physical machines, and peripheral component interconnect express (PCIE) devices. In one example method, a PCIE device receives, in a live migration process of a to-be-migrated virtual machine (VM), a packet corresponding to the to-be-migrated VM, where the to-be-migrated VM is one of a plurality of VMs. The PCIE device determines a direct memory access (DMA) address based on the packet. The PCIE device sends the DMA address to a physical function (PF) driver.
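A hedged C++ sketch of the packet-to-DMA-address reporting path; the Packet layout, the address-translation callback, and the PF-driver notification are all hypothetical, standing in for device logic the abstract only names:

    #include <cstdint>
    #include <functional>

    // Hypothetical stand-in for a packet received during live migration.
    struct Packet {
        std::uint16_t vmId;        // identifies the to-be-migrated VM
        std::uint64_t guestOffset; // location the payload targets in guest memory
    };

    // The PCIE device derives the DMA address a packet will touch and reports
    // it to the physical function (PF) driver, so the host side knows which
    // memory the device dirtied while the VM is being migrated.
    void onPacket(const Packet& pkt,
                  const std::function<std::uint64_t(std::uint16_t,
                                                    std::uint64_t)>& toDma,
                  const std::function<void(std::uint64_t)>& notifyPfDriver) {
        std::uint64_t dmaAddr = toDma(pkt.vmId, pkt.guestOffset); // per-VM mapping
        notifyPfDriver(dmaAddr); // PF driver can mark the page for re-copy
    }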
-
Publication No.: US12113678B2
Publication Date: 2024-10-08
Application No.: US17376785
Filing Date: 2021-07-15
Applicant: VMware LLC
IPC Classification: G06F9/44, G06F8/60, G06F9/38, G06F9/4401, G06F9/455, G06F9/50, G06F9/54, G06F11/34, G06F30/331, G06N20/00, H04B7/0452, H04L41/122, H04L41/40, H04L43/10, H04L69/324, H04W8/18, H04W8/20, H04W12/037, H04W12/08, H04W24/02, H04W28/086, H04W28/16, H04W40/24, H04W48/14, H04W72/044, H04W72/0453, H04W72/20, H04W72/29, H04W72/51, H04W72/52, H04W84/04
CPC Classification: H04L41/122, G06F8/60, G06F9/3877, G06F9/4411, G06F9/45533, G06F9/45545, G06F9/5077, G06F9/541, G06F9/544, G06F9/546, G06F11/3409, G06F30/331, G06N20/00, H04B7/0452, H04L41/40, H04L43/10, H04L69/324, H04W8/18, H04W8/186, H04W8/20, H04W12/037, H04W12/08, H04W24/02, H04W28/0865, H04W28/16, H04W40/246, H04W48/14, H04W72/0453, H04W72/046, H04W72/20, H04W72/29, H04W72/51, H04W72/52, G06F9/45558, G06F2009/4557, G06F2009/45579, G06F2009/45583, G06F2009/45595, G06F2209/548, H04L2212/00, H04W84/042
Abstract: Some embodiments provide various methods for offloading operations in an O-RAN (Open Radio Access Network) onto control plane (CP) or edge applications that execute on host computers with hardware accelerators in software defined datacenters (SDDCs). At the CP or edge application operating on a machine executing on a host computer with a hardware accelerator, the method of some embodiments receives data, from an O-RAN E2 unit, to perform an operation. The method uses a driver of the machine to communicate directly with the hardware accelerator to direct the hardware accelerator to perform a set of computations associated with the operation. This driver allows the communication with the hardware accelerator to bypass an intervening set of drivers executing on the host computer between the machine's driver and the hardware accelerator. Through this driver, the application in some embodiments receives the computation results, which it then provides to one or more O-RAN components (e.g., to the E2 unit that provided the data, another E2 unit or another control plane or edge application).
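A rough C++ shape of the passthrough flow; AcceleratorDriver and handleE2Request are invented names, and the driver body is a placeholder where a real passthrough driver would program the device directly:

    #include <cstdint>
    #include <vector>

    // Hypothetical passthrough driver handle: the machine's own driver talks
    // to the hardware accelerator directly, bypassing the host's driver stack.
    class AcceleratorDriver {
    public:
        std::vector<std::uint8_t> run(const std::vector<std::uint8_t>& data) {
            // Placeholder: a real driver would DMA `data` to the device via
            // mapped registers/queues and poll for completion; here we echo.
            return data;
        }
    };

    // Sketch of the CP/edge application's flow described in the abstract:
    // take data from an E2 unit, compute on the accelerator, return results.
    std::vector<std::uint8_t> handleE2Request(
            AcceleratorDriver& drv, const std::vector<std::uint8_t>& e2Data) {
        // Direct submission: no intervening host drivers in the data path.
        std::vector<std::uint8_t> result = drv.run(e2Data);
        return result; // forwarded to the E2 unit or another O-RAN component
    }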
-
Publication No.: US20240330085A1
Publication Date: 2024-10-03
Application No.: US18621895
Filing Date: 2024-03-29
Applicant: REBELLIONS INC.
Inventor: Hongyun Kim
IPC Classification: G06F9/48, G06F9/30, G06F15/173
CPC Classification: G06F9/544, G06F9/461, G06F9/4812, G06F9/4881, G06F9/5027, G06F9/5066, G06F2209/486, G06F2209/5017
Abstract: An apparatus comprising neural processors, a command processor, and a shared memory. The command processor receives a context start signal indicating a start of a context of a neural network model from a host system. The command processor determines whether the neural network model data is entirely or partially updated based on the context start signal, and updates the neural network model data in the shared memory based on that determination. The command processor generates a plurality of task descriptors describing neural network model tasks based on the neural network model data, and distributes the plurality of task descriptors to the neural processors.
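A minimal C++ sketch of that command-processor flow, assuming invented shapes for the signal and the descriptors (ContextStartSignal, TaskDescriptor) and a naive append for partial updates:

    #include <cstddef>
    #include <cstdint>
    #include <vector>

    // Hypothetical shapes for the signal and descriptors in the abstract.
    struct ContextStartSignal {
        bool fullUpdate;                     // entire model data vs. a delta
        std::vector<std::uint8_t> modelData; // (portion of) model data
    };

    struct TaskDescriptor {
        std::size_t processorId; // which neural processor runs this task
        std::size_t offset;      // where its slice lives in shared memory
        std::size_t size;
    };

    // Command-processor flow: update shared memory, then fan tasks out,
    // one descriptor per neural processor (numProcessors assumed > 0).
    std::vector<TaskDescriptor> onContextStart(
            const ContextStartSignal& sig,
            std::vector<std::uint8_t>& sharedMemory,
            std::size_t numProcessors) {
        if (sig.fullUpdate)
            sharedMemory = sig.modelData;           // replace everything
        else
            sharedMemory.insert(sharedMemory.end(), // naive partial update
                                sig.modelData.begin(), sig.modelData.end());

        std::vector<TaskDescriptor> tasks;
        std::size_t chunk = sharedMemory.size() / numProcessors;
        for (std::size_t i = 0; i < numProcessors; ++i)
            tasks.push_back({i, i * chunk,
                             i + 1 == numProcessors
                                 ? sharedMemory.size() - i * chunk
                                 : chunk});
        return tasks;
    }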
-
Publication No.: US20240330084A1
Publication Date: 2024-10-03
Application No.: US18525553
Filing Date: 2023-11-30
Applicant: Intel Corporation
Inventors: Mario Flajslik, James Dinan
Abstract: Systems, apparatuses and methods may provide for detecting an outbound communication and identifying a context of the outbound communication. Additionally, a completion status of the outbound communication may be tracked relative to the context. In one example, tracking the completion status includes incrementing a sent messages counter associated with the context in response to the outbound communication, detecting an acknowledgement of the outbound communication based on a network response to the outbound communication, incrementing a received acknowledgements counter associated with the context in response to the acknowledgement, comparing the sent messages counter to the received acknowledgements counter, and triggering a per-context memory ordering operation if the sent messages counter and the received acknowledgements counter have matching values.
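The counter scheme translates almost line-for-line into C++; this sketch uses a plain fence as the "memory ordering operation" and elides the races a hardware implementation would handle atomically:

    #include <atomic>
    #include <cstdint>

    // Per-context completion tracking: count messages sent and
    // acknowledgements received; when they match, every outstanding
    // communication for this context has completed.
    struct ContextTracker {
        std::atomic<std::uint64_t> sent{0};
        std::atomic<std::uint64_t> acked{0};

        void onSend() { sent.fetch_add(1, std::memory_order_relaxed); }

        void onAck() {
            acked.fetch_add(1, std::memory_order_relaxed);
            // Counters match: trigger the per-context memory ordering
            // operation (a fence here; the patent leaves the mechanism open).
            if (sent.load(std::memory_order_relaxed) ==
                acked.load(std::memory_order_relaxed))
                std::atomic_thread_fence(std::memory_order_seq_cst);
        }
    };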
-
Publication No.: US20240330041A1
Publication Date: 2024-10-03
Application No.: US18621936
Filing Date: 2024-03-29
Applicant: REBELLIONS INC.
Inventors: Hongyun Kim, Chang-Hyo Yu, Yoonho Boo
IPC Classification: G06F9/48, G06F9/54, G06F12/0831
CPC Classification: G06F9/4856, G06F9/544, G06F12/0835
Abstract: A command processor determines whether a command descriptor describing a current command is in a first format or in a second format, wherein the first format includes a source memory address pointing to a memory area in a shared memory having a binary code to be accessed according to a direct memory access (DMA) scheme, and the second format includes one or more object indices, a respective one of the one or more object indices indicating an object in an object database. If the command descriptor describing the current command is in the second format, the command processor converts a format of the command descriptor to the first format, generates one or more task descriptors describing neural network model tasks based on the command descriptor in the first format, and distributes the one or more task descriptors to the one or more neural processors.
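A hedged C++ sketch of the two formats and the conversion step; the struct names and the single-object simplification are assumptions, and the object database is modeled as a simple index-to-address map:

    #include <cstdint>
    #include <unordered_map>
    #include <variant>
    #include <vector>

    // First format: a DMA-able source address for the binary code.
    struct DmaDescriptor { std::uint64_t sourceAddress; };
    // Second format: indices into an object database instead of an address.
    struct ObjectDescriptor { std::vector<std::uint32_t> objectIndices; };

    using CommandDescriptor = std::variant<DmaDescriptor, ObjectDescriptor>;

    // Object database: index -> shared-memory address of the stored object.
    using ObjectDatabase = std::unordered_map<std::uint32_t, std::uint64_t>;

    // Convert a second-format descriptor to the first format by resolving
    // its object index through the database (simplified to one object);
    // task-descriptor generation would then proceed from the first format.
    DmaDescriptor toFirstFormat(const CommandDescriptor& cmd,
                                const ObjectDatabase& db) {
        if (const auto* dma = std::get_if<DmaDescriptor>(&cmd))
            return *dma; // already in the first format
        const auto& obj = std::get<ObjectDescriptor>(cmd);
        return DmaDescriptor{db.at(obj.objectIndices.at(0))};
    }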
-
Publication No.: US20240320777A1
Publication Date: 2024-09-26
Application No.: US18434319
Filing Date: 2024-02-06
Abstract: A graphics pipeline includes a first shader that generates first wave groups, a shader processor input (SPI) that launches the first wave groups for execution by shaders, and a scan converter that generates second waves for execution on the shaders based on results of processing the first wave groups by the one or more shaders. The first wave groups are selectively throttled based on a comparison of in-flight first wave groups and second waves pending execution on the shaders. A cache holds information that is written to the cache in response to the first wave groups finishing execution on the shaders. Information is read from the cache in response to read requests issued by the second waves. In some cases, the first wave groups are selectively throttled by comparing how many first wave groups are in-flight and how many read requests to the cache are pending.
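The throttling decision reduces to a counter comparison; a minimal C++ sketch, with the threshold as an assumed tuning knob rather than a value from the patent:

    #include <cstddef>

    // Decide whether the SPI should stop launching first wave groups:
    // compare how many first wave groups are in flight against how many
    // cache read requests from second waves are still pending, and throttle
    // when the combined backlog exceeds a tunable threshold.
    bool shouldThrottleFirstWaves(std::size_t inFlightFirstWaveGroups,
                                  std::size_t pendingSecondWaveReads,
                                  std::size_t threshold) {
        return inFlightFirstWaveGroups + pendingSecondWaveReads > threshold;
    }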
-
Publication No.: US12099875B2
Publication Date: 2024-09-24
Application No.: US18162704
Filing Date: 2023-01-31
CPC Classification: G06F9/5016, G06F9/355, G06F9/455, G06F9/45533, G06F9/45558, G06F9/50, G06F9/5005, G06F9/5022, G06F9/54, G06F9/544, G06F9/546, G06F21/572, G06F2009/45583, G06F2009/45587
Abstract: A method of memory deallocation across a trust boundary between a first software component and a second software component is described. Some memory is shared between the first and second software components. An in-memory message passing facility is implemented using the shared memory. The first software component is used to deallocate memory from the shared memory which has been allocated by the second software component. The deallocation is done by: taking at least one allocation to be freed from the message passing facility; and freeing the at least one allocation using a local deallocation mechanism while validating that memory accesses to memory owned by data structures related to memory allocation within the shared memory are within the shared memory.
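A C++ sketch of the validated free path, assuming a std::deque as the message passing facility and a bounds check as the validation; a real implementation would also validate the allocator's internal metadata, which this elides:

    #include <cstddef>
    #include <cstdint>
    #include <deque>

    // The shared region both components can see; pointers from the peer are
    // only trusted if they fall inside it.
    struct SharedRegion {
        std::uintptr_t base;
        std::size_t size;
        bool contains(const void* p) const {
            auto a = reinterpret_cast<std::uintptr_t>(p);
            return a >= base && a < base + size;
        }
    };

    // First component drains allocations the second component queued for
    // freeing, validating each pointer before handing it to the local
    // deallocation mechanism, so a hostile peer cannot reach outside the
    // trust boundary.
    void drainFreeQueue(std::deque<void*>& messageQueue,
                        const SharedRegion& shared,
                        void (*localFree)(void*)) {
        while (!messageQueue.empty()) {
            void* alloc = messageQueue.front();
            messageQueue.pop_front();
            if (shared.contains(alloc)) // validate before trusting the pointer
                localFree(alloc);       // local deallocation mechanism
            // out-of-range pointers are dropped rather than dereferenced
        }
    }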
-
Publication No.: US12086993B2
Publication Date: 2024-09-10
Application No.: US17696024
Filing Date: 2022-03-16
Applicant: Robert Bosch GmbH
Inventor: Cosmin Ionut Bercea
CPC Classification: G06T7/20, G06F9/544, G06N3/045, G06T2207/20084, G06T2207/30236, G06T2207/30252
Abstract: A method for tracking and/or characterizing multiple objects in a sequence of images. The method includes: assigning a neural network to each object to be tracked; providing a memory shared by all neural networks, and designed to map an address vector of address components, via differentiable operations, onto one or multiple memory locations, and to read data from these memory locations or write data into these memory locations; supplying images from the sequence, and/or details of these images, to each neural network; during the processing of each image and/or image detail by one of the neural networks, generating an address vector from at least one processing product of this neural network; based on this address vector, writing at least one further processing product of the neural network into the shared memory, and/or reading out data from this shared memory and further processing the data by the neural network.
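A sketch of differentiable (soft) addressing in the style the abstract describes, written in C++ to match the other examples; the key/slot layout and dot-product-softmax scoring are common neural-memory conventions assumed here, not taken from the patent:

    #include <cmath>
    #include <cstddef>
    #include <vector>

    // Soft read over a shared memory: the address vector is scored against
    // each slot's key, scores become softmax weights, and the read is the
    // weighted sum of slot contents, so the whole lookup is differentiable.
    // Assumes at least one slot and matching key/address dimensions.
    std::vector<float> softRead(const std::vector<std::vector<float>>& keys,
                                const std::vector<std::vector<float>>& slots,
                                const std::vector<float>& address) {
        std::vector<float> w(keys.size());
        float norm = 0.0f;
        for (std::size_t i = 0; i < keys.size(); ++i) {
            float dot = 0.0f;
            for (std::size_t j = 0; j < address.size(); ++j)
                dot += keys[i][j] * address[j];
            w[i] = std::exp(dot); // unnormalized softmax weight
            norm += w[i];
        }
        std::vector<float> out(slots[0].size(), 0.0f);
        for (std::size_t i = 0; i < slots.size(); ++i)
            for (std::size_t j = 0; j < out.size(); ++j)
                out[j] += (w[i] / norm) * slots[i][j]; // weighted read
        return out;
    }

A write works the same way: the same weights blend new data into every slot in proportion to its match with the address vector.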