Patent search: cpc:"G06F9/3887", page 1

1.

Published invention application
SYSTEM AND METHOD FOR ENERGY-EFFICIENT IMPLEMENTATION OF NEURAL NETWORKS [Under examination, published]

Publication number: US20240362456A1

Publication date: 2024-10-31

Application number: US18770518

Filing date: 2024-07-11

Applicant: UNTETHER AI CORPORATION

Inventors: William Martin SNELGROVE, Darrick WIEBE

IPC classification: G06N3/045, G06F9/38, G06F13/40, G06N3/063

CPC classification: G06N3/045, G06F9/3887, G06F13/4022, G06N3/063, Y02D10/00

Abstract: A system and method for enhancing C*RAM, improving its performance for known applications such as video processing but also making it well suited to low-power implementation of neural nets. The required computing engine is decomposed into banks of enhanced C*RAM each having a SIMD controller, thus allowing operations at several scales simultaneously. Several configurations of suitable controllers are discussed, along with communication structures and enhanced processing elements.

2.

Published invention application
Vector Based Matrix Multiplication [Under examination, published]

Publication number: US20240354259A1

Publication date: 2024-10-24

Application number: US18732865

Filing date: 2024-06-04

Applicant: Texas Instruments Incorporated

Inventors: Asheesh Bhardwaj, Mujibur Rahman, Timothy David Anderson

IPC classification: G06F12/1045, G06F7/24, G06F7/487, G06F7/499, G06F7/53, G06F7/57, G06F9/30, G06F9/32, G06F9/345, G06F9/38, G06F9/48, G06F11/00, G06F11/10, G06F12/0862, G06F12/0875, G06F12/0897, G06F12/1009, G06F15/78, G06F17/16, H03H17/06

CPC classification: G06F12/1045, G06F7/24, G06F7/487, G06F7/4876, G06F7/49915, G06F7/53, G06F7/57, G06F9/3001, G06F9/30014, G06F9/30021, G06F9/30032, G06F9/30036, G06F9/30065, G06F9/30072, G06F9/30098, G06F9/30112, G06F9/30145, G06F9/30149, G06F9/3016, G06F9/32, G06F9/345, G06F9/3802, G06F9/3818, G06F9/383, G06F9/3836, G06F9/3851, G06F9/3856, G06F9/3867, G06F9/3887, G06F9/48, G06F11/00, G06F11/1048, G06F12/0862, G06F12/0875, G06F12/0897, G06F12/1009, G06F17/16, H03H17/0664, G06F9/30018, G06F9/325, G06F9/381, G06F9/3822, G06F11/10, G06F15/7807, G06F15/781, G06F2212/452, G06F2212/60, G06F2212/602, G06F2212/68

Abstract: A method is provided that includes multiplying, by a processor in response to a vector matrix multiply instruction, an m×n matrix (A matrix) and an n×p matrix (B matrix) to generate elements of an m×p matrix (R matrix), and storing the elements of the R matrix in a storage location specified by the vector matrix multiply instruction.
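
Illustrative note (not part of the patent record): the arithmetic named in this abstract is the ordinary matrix product, where each element of R is R[i][j] = sum over k of A[i][k] * B[k][j]. The sketch below shows only that arithmetic in plain Python. The function name `matmul` and the list-of-lists representation are assumptions made for illustration; it says nothing about how the claimed instruction or processor performs the operation.

```python
def matmul(a, b):
    """Multiply an m x n matrix by an n x p matrix and return the m x p result.

    Matrices are lists of rows. This is a hypothetical helper for illustration,
    not the instruction described in the patent.
    """
    m, n = len(a), len(a[0])
    if len(b) != n:
        raise ValueError("inner dimensions must match (A is m x n, B is n x p)")
    p = len(b[0])
    r = [[0] * p for _ in range(m)]
    for i in range(m):
        for j in range(p):
            # Dot product of row i of A with column j of B.
            r[i][j] = sum(a[i][k] * b[k][j] for k in range(n))
    return r
```

For example, `matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])` returns `[[19, 22], [43, 50]]`.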

3.

Granted invention patent
Programmable coarse grained and sparse matrix compute hardware with advanced scheduling [In force]

Publication number: US12112397B2

Publication date: 2024-10-08

Application number: US18334733

Filing date: 2023-06-14

Applicant: Intel Corporation

Inventors: Eriko Nurvitadhi, Balaji Vembu, Nicolas C. Galoppo Von Borries, Rajkishore Barik, Tsung-Han Lin, Kamal Sinha, Nadathur Rajagopalan Satish, Jeremy Bottleson, Farshad Akhbari, Altug Koker, Narayan Srinivasa, Dukhwan Kim, Sara S. Baghsorkhi, Justin E. Gottschlich, Feng Chen, Elmoustapha Ould-Ahmed-Vall, Kevin Nealis, Xiaoming Chen, Anbang Yao

IPC classification: G06T1/20, G06F9/30, G06F9/38, G06N3/04, G06N3/044, G06N3/045, G06N3/063, G06N3/08, G06N3/084

CPC classification: G06T1/20, G06F9/3001, G06F9/3017, G06F9/3851, G06F9/3887, G06F9/3895, G06N3/04, G06N3/044, G06N3/045, G06N3/063, G06N3/08, G06N3/084

Abstract: One embodiment provides a parallel processor comprising a hardware scheduler to schedule pipeline commands for compute operations to one or more of multiple types of compute units, a plurality of processing resources including a first sparse compute unit configured for input at a first level of sparsity and hybrid memory circuitry including a memory controller, a memory interface, and a second sparse compute unit configured for input at a second level of sparsity that is greater than the first level of sparsity.

4.

Granted invention patent
Task execution in a SIMD processing unit with parallel groups of processing lanes [In force]

Publication number: US12112396B2

Publication date: 2024-10-08

Application number: US18236036

Filing date: 2023-08-21

Applicant: Imagination Technologies Limited

Inventors: John Howson, Jonathan Redshaw, Yoong Chert Foo

IPC classification: G06T1/20, G06F9/30, G06F9/38, G06F15/80, G06F9/48

CPC classification: G06T1/20, G06F9/30036, G06F9/3822, G06F9/3836, G06F9/3887, G06F15/8007, G06F9/4881, G06F2209/507

Abstract: A SIMD processing unit processes a plurality of tasks which each include up to a predetermined maximum number of work items. The work items of a task are arranged for executing a common sequence of instructions on respective data items. The data items are arranged into blocks, with some of the blocks including at least one invalid data item. Work items which relate to invalid data items are invalid work items. The SIMD processing unit comprises a group of processing lanes configured to execute instructions of work items of a particular task over a plurality of processing cycles. A control module assembles work items into the tasks based on the validity of the work items, so that invalid work items of the particular task are temporally aligned across the processing lanes. In this way the number of wasted processing slots due to invalid work items may be reduced.
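
Illustrative note (not part of the patent record): a toy model can show why aligning invalid work items in time may reduce wasted slots. The sketch assumes, purely for illustration, that work item i runs on lane `i % lanes` during cycle `i // lanes`, and that a cycle whose work items are all invalid can be skipped. The function name `cycles_needed` and this mapping are hypothetical and are not taken from the patent.

```python
def cycles_needed(valid, lanes):
    """Count the cycles that must execute for one task in a toy SIMD model.

    valid: list of booleans, one per work item (True means valid).
    lanes: number of processing lanes in the group.
    Assumption: a cycle is skipped only if every work item in it is invalid.
    """
    cycles = [valid[i:i + lanes] for i in range(0, len(valid), lanes)]
    return sum(1 for cycle in cycles if any(cycle))


# Same task contents (4 valid, 4 invalid work items), two different orderings.
scattered = [True, False, True, False, True, False, True, False]
aligned = [True, True, True, True, False, False, False, False]
```

With 4 lanes, `cycles_needed(scattered, 4)` is 2 while `cycles_needed(aligned, 4)` is 1: once the invalid work items share a cycle, that whole cycle can be dropped in this model.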

5.

Published invention application
MULTI-TILE ARCHITECTURE FOR GRAPHICS OPERATIONS [Under examination, published]

Publication number: US20240320184A1

Publication date: 2024-09-26

Application number: US18620284

Filing date: 2024-03-28

Applicant: Intel Corporation

Inventors: Altug Koker, Ben Ashbaugh, Scott Janus, Aravindh Anantaraman, Abhishek R. Appu, Niranjan Cooray, Varghese George, Arthur Hunter, Brent E. Insko, Elmoustapha Ould-Ahmed-Vall, Selvakumar Panneer, Vasanth Ranganathan, Joydeep Ray, Kamal Sinha, Lakshminarayanan Striramassarma, Prasoonkumar Surti, Saurabh Tangri

IPC classification: G06F15/78, G06F7/544, G06F7/575, G06F7/58, G06F9/30, G06F9/38, G06F9/50, G06F12/02, G06F12/06, G06F12/0802, G06F12/0804, G06F12/0811, G06F12/0862, G06F12/0866, G06F12/0871, G06F12/0875, G06F12/0882, G06F12/0888, G06F12/0891, G06F12/0893, G06F12/0895, G06F12/0897, G06F12/1009, G06F12/128, G06F15/80, G06F17/16, G06F17/18, G06N3/08, G06T1/20, G06T1/60, G06T15/06, H03M7/46

CPC classification: G06F15/7839, G06F7/5443, G06F7/575, G06F7/588, G06F9/3001, G06F9/30014, G06F9/30036, G06F9/3004, G06F9/30043, G06F9/30047, G06F9/30065, G06F9/30079, G06F9/3887, G06F9/5011, G06F9/5077, G06F12/0215, G06F12/0238, G06F12/0246, G06F12/0607, G06F12/0802, G06F12/0804, G06F12/0811, G06F12/0862, G06F12/0866, G06F12/0871, G06F12/0875, G06F12/0882, G06F12/0888, G06F12/0891, G06F12/0893, G06F12/0895, G06F12/0897, G06F12/1009, G06F12/128, G06F15/8046, G06F17/16, G06F17/18, G06T1/20, G06T1/60, H03M7/46, G06F9/3802, G06F9/3818, G06F9/3867, G06F2212/1008, G06F2212/1021, G06F2212/1044, G06F2212/302, G06F2212/401, G06F2212/455, G06F2212/60, G06N3/08, G06T15/06

Abstract: Embodiments are generally directed to a multi-tile architecture for graphics operations. An embodiment of an apparatus includes a multi-tile architecture for graphics operations including a multi-tile graphics processor. The multi-tile graphics processor includes one or more dies; multiple processor tiles installed on the one or more dies; and a structure to interconnect the processor tiles on the one or more dies, wherein the structure is to enable communications between the processor tiles.

6.

Granted invention patent
Multi-tile memory management [In force]

Publication number: US12099461B2

Publication date: 2024-09-24

Application number: US17431034

Filing date: 2020-03-14

Applicant: Intel Corporation

Inventors: Abhishek R. Appu, Altug Koker, Aravindh Anantaraman, Elmoustapha Ould-Ahmed-Vall, Valentin Andrei, Nicolas Galoppo Von Borries, Varghese George, Mike Macpherson, Subramaniam Maiyuran, Joydeep Ray, Lakshminarayanan Striramassarma, Scott Janus, Brent Insko, Vasanth Ranganathan, Kamal Sinha, Arthur Hunter, Prasoonkumar Surti, David Puffer, James Valerio, Ankur N. Shah

IPC classification: G06F16/00, G06F7/544, G06F7/575, G06F7/58, G06F9/30, G06F9/38, G06F9/50, G06F12/02, G06F12/06, G06F12/0802, G06F12/0804, G06F12/0811, G06F12/0862, G06F12/0866, G06F12/0871, G06F12/0875, G06F12/0882, G06F12/0888, G06F12/0891, G06F12/0893, G06F12/0895, G06F12/0897, G06F12/1009, G06F12/128, G06F15/78, G06F15/80, G06F17/16, G06F17/18, G06T1/20, G06T1/60, H03M7/46, G06N3/08, G06T15/06

CPC classification: G06F15/7839, G06F7/5443, G06F7/575, G06F7/588, G06F9/3001, G06F9/30014, G06F9/30036, G06F9/3004, G06F9/30043, G06F9/30047, G06F9/30065, G06F9/30079, G06F9/3887, G06F9/5011, G06F9/5077, G06F12/0215, G06F12/0238, G06F12/0246, G06F12/0607, G06F12/0802, G06F12/0804, G06F12/0811, G06F12/0862, G06F12/0866, G06F12/0871, G06F12/0875, G06F12/0882, G06F12/0888, G06F12/0891, G06F12/0893, G06F12/0895, G06F12/0897, G06F12/1009, G06F12/128, G06F15/8046, G06F17/16, G06F17/18, G06T1/20, G06T1/60, H03M7/46, G06F9/3802, G06F9/3818, G06F9/3867, G06F2212/1008, G06F2212/1021, G06F2212/1044, G06F2212/302, G06F2212/401, G06F2212/455, G06F2212/60, G06N3/08, G06T15/06

Abstract: Methods and apparatus relating to techniques for multi-tile memory management. In an example, an apparatus comprises a cache memory, a high-bandwidth memory, a shader core communicatively coupled to the cache memory and comprising a processing element to decompress a first data element extracted from an in-memory database in the cache memory and having a first bit length to generate a second data element having a second bit length, greater than the first bit length, and an arithmetic logic unit (ALU) to compare the data element to a target value provided in a query of the in-memory database. Other embodiments are also disclosed and claimed.

7.

Published invention application
Method and Apparatus for Dual Issue Multiply Instructions [Under examination, published]

Publication number: US20240311313A1

Publication date: 2024-09-19

Application number: US18660120

Filing date: 2024-05-09

Applicant: Texas Instruments Incorporated

Inventors: Timothy David Anderson, Mujibur Rahman

IPC classification: G06F12/1045, G06F7/24, G06F7/487, G06F7/499, G06F7/53, G06F7/57, G06F9/30, G06F9/32, G06F9/345, G06F9/38, G06F9/48, G06F11/00, G06F11/10, G06F12/0862, G06F12/0875, G06F12/0897, G06F12/1009, G06F15/78, G06F17/16, H03H17/06

CPC classification: G06F12/1045, G06F7/24, G06F7/487, G06F7/4876, G06F7/49915, G06F7/53, G06F7/57, G06F9/3001, G06F9/30014, G06F9/30021, G06F9/30032, G06F9/30036, G06F9/30065, G06F9/30072, G06F9/30098, G06F9/30112, G06F9/30145, G06F9/30149, G06F9/3016, G06F9/32, G06F9/345, G06F9/3802, G06F9/3818, G06F9/383, G06F9/3836, G06F9/3851, G06F9/3856, G06F9/3867, G06F9/3887, G06F9/48, G06F11/00, G06F11/1048, G06F12/0862, G06F12/0875, G06F12/0897, G06F12/1009, G06F17/16, H03H17/0664, G06F9/30018, G06F9/325, G06F9/381, G06F9/3822, G06F11/10, G06F15/7807, G06F15/781, G06F2212/452, G06F2212/60, G06F2212/602, G06F2212/68

Abstract: Various configurations of processors are provided. In a configuration, the processor comprises first and second multiplication units. The first multiplication unit includes first multiply circuitry including a first set of outputs; and first multiplexing logic coupled to the first set of outputs and configured to generate a first partial sum and a first partial carry. The second multiplication unit includes second multiply circuitry including a second set of outputs; and second multiplexing logic coupled to the second set of outputs and configured to generate a second partial sum and a second partial carry.

8.

Granted invention patent
Look-ahead teleportation for reliable computation in multi-SIMD quantum processor [In force]

Publication number: US12079634B2

Publication date: 2024-09-03

Application number: US16794124

Filing date: 2020-02-18

Applicant: Advanced Micro Devices, Inc.

Inventors: Onur Kayiran, Jieming Yin, Yasuko Eckert

IPC classification: G06F9/38, G06F8/41, G06N10/00

CPC classification: G06F9/3887, G06F8/41, G06N10/00

Abstract: A technique for processing qubits in a quantum computing device is provided. The technique includes determining that, in a first cycle, a first quantum processing region is to perform a first quantum operation that does not use a qubit that is stored in the first quantum processing region, identifying a second quantum processing region that is to perform a second quantum operation at a second cycle that is later than the first cycle, wherein the second quantum operation uses the qubit, determining that between the first cycle and the second cycle, no quantum operations are performed in the second quantum processing region, and moving the qubit from the first quantum processing region to the second quantum processing region.

9.

Published invention application
Pipeline Techniques for Dependent Graphics Kicks [Under examination, published]

Publication number: US20240272940A1

Publication date: 2024-08-15

Application number: US18450978

Filing date: 2023-08-16

Applicant: Apple Inc.

Inventors: Benjamin Bowman, Ali Rabbani Rankouhi, Jonathan M. Redshaw, Steven Fishwick

IPC classification: G06F9/48, G06F9/38

CPC classification: G06F9/4881, G06F9/3887, G06F9/485

Abstract: Disclosed techniques relate to scheduling sets of graphics work with dependencies. In some embodiments, a first set of graphics work depends on a second set of graphics work. Control circuitry may, in response to a release signal that indicates the second set reaching a first processing point, initiate processing of the first set. Control circuitry may, in response to reaching a kick gate point, stall processing of the first set. Control circuitry may, in response to an end signal for the second set, resume processing of the first set.

10.

Granted invention patent
Method and arrangement for handling memory access for a TCF-aware processor [In force]

Publication number: US12056495B2

Publication date: 2024-08-06

Application number: US17415890

Filing date: 2019-12-20

Applicant: Teknologian tutkimuskeskus VTT Oy

Inventors: Martti Forsell, Jussi Roivainen

IPC classification: G06F9/38, G06F9/52, G06F9/54

CPC classification: G06F9/3851, G06F9/3824, G06F9/3867, G06F9/3887, G06F9/522, G06F9/544

Abstract: An arrangement for handling shared data memory access for a TCF-aware processor. The arrangement comprises at least a flexible latency handling unit (601) comprising local memory (602) and related control logic, said local memory being provided for storing shared data memory access related data. The arrangement is configured to receive at least one TCF comprising at least one instruction, the at least one instruction being associated with at least one fiber, wherein the flexible latency handling unit is configured to determine if shared data memory access is required by the at least one instruction, if shared data memory access is required, send a shared data memory access request, via the flexible latency handling unit, observe, essentially continuously, if a reply to the shared data memory access request is received, suspend continued execution of the instruction until a reply is received, and continue execution of the instruction after receiving the reply so that the delay associated with the shared data memory access is dynamically determined by the actual required shared data memory access latency.
