专利检索 ap:("Prasoonkumar Surti" OR "Narayan Srinivasa" OR "Feng Chen" OR "Joydeep Ray" OR "Ben J. Ashbaugh" OR "Nicolas C. Galoppo Von Borries" OR "Eriko Nurvitadhi" OR "Balaji Vembu" OR "Tsung-Han Lin" OR "Kamal Sinha" OR "Rajkishore Barik" OR "Sara S. Baghsorkhi" OR "Justin E. Gottschlich" OR "Altug Koker" OR "Nadathur Rajagopalan Satish" OR "Farshad Akhbari" OR "Dukhwan Kim" OR "Wenyin Fu" OR "Travis T. Schluessler" OR "Josh B. Mastronarde" OR "Linda L. Hurd" OR "John H. Feit" OR "Jeffery S. Boles" OR "Adam T. Lake" OR "Karthik Vaidyanathan" OR "Devan Burke" OR "Subramaniam Maiyuran" OR "Abhishek R. Appu") AND inv:"Nadathur Rajagopalan Satish" 第 1 页

1.

发明申请
COMPUTE OPTIMIZATION MECHANISM FOR DEEP NEURAL NETWORKS 审中-公开

公开(公告)号：US20180308206A1

公开(公告)日：2018-10-25

申请号：US15698217

申请日：2017-09-07

申请人： Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

发明人： Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

IPC分类号： G06T1/20 , G06T1/60 , G09G5/36 , G06F3/06 , G06N3/08

CPC分类号： G06T1/20 , G06F3/0613 , G06F3/0659 , G06F3/0679 , G06F3/1438 , G06N3/0445 , G06N3/0454 , G06N3/063 , G06N3/08 , G06N3/084 , G06T1/60 , G09G5/001 , G09G5/363 , G09G2352/00 , G09G2360/06 , G09G2360/08 , G09G2360/121 , G09G2360/123 , G09G2370/08

摘要： An apparatus to facilitate compute optimization is disclosed. The apparatus includes a memory device including a first integrated circuit (IC) including a plurality of memory channels and a second IC including a plurality of processing units, each coupled to a memory channel in the plurality of memory channels.

2.

发明申请
COMPUTE OPTIMIZATION MECHANISM FOR DEEP NEURAL NETWORKS 审中-公开

公开(公告)号：US20180308200A1

公开(公告)日：2018-10-25

申请号：US15494886

申请日：2017-04-24

申请人： Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

发明人： Prasoonkumar Surti , Narayan Srinivasa , Feng Chen , Joydeep Ray , Ben J. Ashbaugh , Nicolas C. Galoppo Von Borries , Eriko Nurvitadhi , Balaji Vembu , Tsung-Han Lin , Kamal Sinha , Rajkishore Barik , Sara S. Baghsorkhi , Justin E. Gottschlich , Altug Koker , Nadathur Rajagopalan Satish , Farshad Akhbari , Dukhwan Kim , Wenyin Fu , Travis T. Schluessler , Josh B. Mastronarde , Linda L. Hurd , John H. Feit , Jeffery S. Boles , Adam T. Lake , Karthik Vaidyanathan , Devan Burke , Subramaniam Maiyuran , Abhishek R. Appu

IPC分类号： G06T1/20 , G06F17/16 , G06T1/60

CPC分类号： G06T1/20 , G06F8/41 , G06F9/45533 , G06F9/5061 , G06F9/5094 , G06F2009/45583 , G06N3/0445 , G06N3/0454 , G06N3/063 , G06N3/084

摘要： An apparatus to facilitate compute optimization is disclosed. The apparatus includes a plurality of processing units each comprising a plurality of execution units (EUs), wherein the plurality of EUs comprise a first EU type and a second EU type

3.

发明申请
COMPUTE OPTIMIZATION MECHANISM 审中-公开

公开(公告)号：US20180308201A1

公开(公告)日：2018-10-25

申请号：US15494905

申请日：2017-04-24

申请人： Abhishek R. Appu , Altug Koker , Linda L. Hurd , Dukhwan Kim , Mike B. Macpherson , John C. Weast , Feng Chen , Farshad Akhbari , Narayan Srinivasa , Nadathur Rajagopalan Satish , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Anbang Yao , Tatiana Shpeisman

发明人： Abhishek R. Appu , Altug Koker , Linda L. Hurd , Dukhwan Kim , Mike B. Macpherson , John C. Weast , Feng Chen , Farshad Akhbari , Narayan Srinivasa , Nadathur Rajagopalan Satish , Joydeep Ray , Ping T. Tang , Michael S. Strickland , Xiaoming Chen , Anbang Yao , Tatiana Shpeisman

IPC分类号： G06T1/20

CPC分类号： G06T1/20 , G06F9/3001 , G06F9/3017 , G06F9/3851 , G06F9/3887 , G06F9/3895 , G06F9/46 , G06N3/063 , G06T15/005 , G06T15/04 , G09G5/363

摘要： An apparatus to facilitate compute optimization is disclosed. The apparatus includes sorting logic to sort processing threads into thread groups based on bit depth of floating point thread operations.

4.

发明申请
PARALLEL OPERATION ON B+ TREES 有权
标题翻译：平行操作在B + TREES

公开(公告)号：US20130339395A1

公开(公告)日：2013-12-19

申请号：US13996508

申请日：2011-08-29

申请人： Jason D. Sewall , Changkyu Kim , Jatin Chhugani , Nadathur Rajagopalan Satish

发明人： Jason D. Sewall , Changkyu Kim , Jatin Chhugani , Nadathur Rajagopalan Satish

IPC分类号： G06F17/30

CPC分类号： G06F17/30327 , G06F9/5005 , G06F2209/5018

摘要： Embodiments of techniques and systems for parallel processing of B+ trees are described. A parallel B+ tree processing module with partitioning and redistribution may include a set of threads executing a batch of B+ tree operations on a B+ tree in parallel. The batch of operations may be partitioned amongst the threads. Next, a search may be performed to determine which leaf nodes in the B+ tree are to be affected by which operations. Then, the threads may redistribute operations between each other such that multiple threads will not operate on the same leaf node. The threads may then perform B+ tree operations on the leaf nodes of the B+ tree in parallel. Subsequent modifications to nodes in the B+ may similarly be redistributed and performed in parallel as the threads work up the tree.

摘要翻译： 描述了用于B +树的并行处理的技术和系统的实施例。具有分区和再分配的并行B +树处理模块可以包括一组在B +树上并行执行一批B +树操作的线程。该批操作可以在线程之间划分。接下来，可以执行搜索以确定B +树中的哪些叶节点将受哪些操作影响。然后，线程可以在彼此之间重新分配操作，使得多个线程将不在同一叶节点上操作。然后，线程可以并行地在B +树的叶节点上执行B +树操作。当线程处理树时，对B +中的节点的后续修改可以类似地重新分布并且并行执行。

5.

发明申请
METHOD AND APPARATUS FOR STREAM BUFFER MANAGEMENT INSTRUCTIONS 有权
标题翻译：流缓冲器管理指令的方法和装置

公开(公告)号：US20120137074A1

公开(公告)日：2012-05-31

申请号：US12955763

申请日：2010-11-29

申请人： Daehyun Kim , Changkyu Kim , Victor W. Lee , Jatin Chhugani , Nadathur Rajagopalan Satish

发明人： Daehyun Kim , Changkyu Kim , Victor W. Lee , Jatin Chhugani , Nadathur Rajagopalan Satish

IPC分类号： G06F12/08 , G06F12/02 , G06F12/16

CPC分类号： G06F12/0862 , G06F9/3004 , G06F2212/6022

摘要： A method and system to perform stream buffer management instructions in a processor. The stream buffer management instructions facilitate the creation and usage of a dedicated memory space or stream buffer of the processor in one embodiment of the invention. The dedicated memory space is a contiguous memory space and has a sequential or linear addressing scheme in one embodiment of the invention. The processor has logic to execute a stream buffer management instruction to copy data from a source memory address to a destination memory address that is specified with a desired level of memory hierarchy.

摘要翻译： 一种在处理器中执行流缓冲器管理指令的方法和系统。在本发明的一个实施例中，流缓冲器管理指令有助于创建和使用处理器的专用存储空间或流缓冲器。专用存储器空间是连续存储器空间，并且在本发明的一个实施例中具有顺序或线性寻址方案。处理器具有执行流缓冲器管理指令的逻辑，以将来自源存储器地址的数据复制到用期望的存储器层级指定的目的地存储器地址。

6.

发明授权
Cache and/or socket sensitive multi-processor cores breadth-first traversal 有权
标题翻译：缓存和/或套接字敏感的多处理器核宽度优先遍历

公开(公告)号：US08533432B2

公开(公告)日：2013-09-10

申请号：US13629087

申请日：2012-09-27

申请人： Nadathur Rajagopalan Satish , Changkyu Kim , Jatin Chhugani , Jason D. Sewall

发明人： Nadathur Rajagopalan Satish , Changkyu Kim , Jatin Chhugani , Jason D. Sewall

IPC分类号： G06F9/38 , G06F9/06 , G06F15/80 , G06F13/14

CPC分类号： G06F9/52

摘要： Methods, apparatuses and storage device associated with cache and/or socket sensitive breadth-first iterative traversal of a graph by parallel threads, are described. A vertices visited array (VIS) may be employed to track graph vertices visited. VIS may be partitioned into VIS sub-arrays, taking into consideration cache sizes of LLC, to reduce likelihood of evictions. Potential boundary vertices arrays (PBV) may be employed to store potential boundary vertices for a next iteration, for vertices being visited in a current iteration. The number of PBV generated for each thread may take into consideration a number of sockets, over which the processor cores employed are distributed. The threads may be load balanced; further data locality awareness to reduce inter-socket communication may be considered, and/or lock-and-atomic free update operations may be employed.

摘要翻译： 描述了通过并行线程与缓存和/或套接字敏感的宽度优先遍历遍历图形的方法，装置和存储装置。可以使用顶点访问阵列（VIS）来跟踪所访问的图形顶点。 VIS可以分为VIS子阵列，考虑到LLC的缓存大小，以减少驱逐的可能性。可以使用潜在边界顶点阵列（PBV）来存储用于下一次迭代的潜在边界顶点，用于在当前迭代中被访问的顶点。为每个线程生成的PBV的数量可以考虑多个套接字，所采用的处理器核在其上分布。螺纹可以是负载平衡的; 可以考虑进一步的数据局部性意识以减少套接字间通信，和/或可以采用锁定和无原子的更新操作。

7.

发明申请
CACHE AND/OR SOCKET SENSITIVE MULTI-PROCESSOR CORES BREADTH-FIRST TRAVERSAL 有权
标题翻译：高速缓存和/或插座敏感多处理器最初的第一个TRAVERSAL

公开(公告)号：US20130086354A1

公开(公告)日：2013-04-04

申请号：US13629087

申请日：2012-09-27

申请人： Nadathur Rajagopalan Satish , Changkyu Kim , Jatin Chhuagani , Jason D. Sewall

发明人： Nadathur Rajagopalan Satish , Changkyu Kim , Jatin Chhuagani , Jason D. Sewall

IPC分类号： G06F15/80

CPC分类号： G06F9/52

摘要： Methods, apparatuses and storage device associated with cache and/or socket sensitive breadth-first iterative traversal of a graph by parallel threads, are disclosed. In embodiments, a vertices visited array (VIS) may be employed to track graph vertices visited. VIS may be partitioned into VIS sub-arrays, taking into consideration cache sizes of LLC, to reduce likelihood of evictions. In embodiments, potential boundary vertices arrays (PBV) may be employed to store potential boundary vertices for a next iteration, for vertices being visited in a current iteration. The number of PBV generated for each thread may take into consideration a number of sockets, over which the processor cores employed are distributed. In various embodiments, the threads may be load balanced; further data locality awareness to reduce inter-socket communication may be considered, and/or lock-and-atomic free update operations may be employed. Other embodiments may be disclosed or claimed.

摘要翻译： 公开了通过并行线程与缓存和/或套接字敏感的宽度优先遍历遍历图形的方法，装置和存储装置。在实施例中，可以采用顶点访问阵列（VIS）来跟踪所访问的图形顶点。 VIS可以分为VIS子阵列，考虑到LLC的缓存大小，以减少驱逐的可能性。在实施例中，可以采用潜在边界顶点阵列（PBV）来存储针对当前迭代中被访问的顶点的下一次迭代的潜在边界顶点。为每个线程生成的PBV的数量可以考虑多个套接字，所采用的处理器核在其上分布。在各种实施例中，螺纹可以是负载平衡的; 可以考虑进一步的数据局部性意识以减少套接字间通信，和/或可以采用锁定和无原子的更新操作。可以公开或要求保护其他实施例。

8.

发明授权
Method and apparatus for stream buffer management instructions 有权

公开(公告)号：US09727471B2

公开(公告)日：2017-08-08

申请号：US12955763

申请日：2010-11-29

申请人： Daehyun Kim , Changkyu Kim , Victor W. Lee , Jatin Chhugani , Nadathur Rajagopalan Satish

发明人： Daehyun Kim , Changkyu Kim , Victor W. Lee , Jatin Chhugani , Nadathur Rajagopalan Satish

IPC分类号： G06F12/00 , G06F12/0862 , G06F9/30

CPC分类号： G06F12/0862 , G06F9/3004 , G06F2212/6022

摘要： A method and system to perform stream buffer management instructions in a processor. The stream buffer management instructions facilitate the creation and usage of a dedicated memory space or stream buffer of the processor in one embodiment of the invention. The dedicated memory space is a contiguous memory space and has a sequential or linear addressing scheme in one embodiment of the invention. The processor has logic to execute a stream buffer management instruction to copy data from a source memory address to a destination memory address that is specified with a desired level of memory hierarchy.

9.

发明申请
Time and space efficient sharing of data structures across different phases of a virtual world application 有权
标题翻译：在虚拟世界应用程序的不同阶段，数据结构的时空有效共享

公开(公告)号：US20110238680A1

公开(公告)日：2011-09-29

申请号：US12732392

申请日：2010-03-26

申请人： Jatin Chhugani , Bryan Catanzaro , Sanjeev Kumar , Changkyu Kim , Nadathur Rajagopalan Satish

发明人： Jatin Chhugani , Bryan Catanzaro , Sanjeev Kumar , Changkyu Kim , Nadathur Rajagopalan Satish

IPC分类号： G06F17/30 , G06F17/10

CPC分类号： G06T15/00 , G06T13/00 , G06T17/005 , G06T2200/28

摘要： A method of decreasing a total computation time for a visual simulation loop includes sharing a common data structure across each phase of the visual simulation loop by adapting the common data structure to a requirement for each particular phase prior to performing a computation for that particular phase.

摘要翻译： 减少可视化仿真循环的总计算时间的方法包括通过在对该特定阶段执行计算之前通过将公共数据结构适应于每个特定阶段的需求来共享视觉仿真循环的每个阶段的公共数据结构。

10.

发明申请
APPARATUS AND METHOD FOR ACCELERATING GRAPH ANALYTICS 审中-公开

公开(公告)号：US20170177361A1

公开(公告)日：2017-06-22

申请号：US14978229

申请日：2015-12-22

申请人： Michael Anderson , Sheng Li , Jong Soo Park , MD Mostafa Ali Patwary , Nadathur Rajagopalan Satish , Mikhail Smelyanskiy , Narayanan Sundaram

发明人： Michael Anderson , Sheng Li , Jong Soo Park , MD Mostafa Ali Patwary , Nadathur Rajagopalan Satish , Mikhail Smelyanskiy , Narayanan Sundaram

IPC分类号： G06F9/30 , G06F12/08 , G06F17/30 , G06F9/38

CPC分类号： G06F9/30036 , G06F9/30032 , G06F9/3877 , G06F12/0811 , G06F12/084 , G06F2212/455

摘要： An apparatus and method are described for accelerating graph analytics. For example, one embodiment of a processor comprises: an instruction fetch unit to fetch program code including set intersection and set union operations; a graph accelerator unit (GAU) to execute at least a first portion of the program code related to the set intersection and set union operations and generate results; and an execution unit to execute at least a second portion of the program code using the results provided from the GAU.

搜索结果

国家/区域

专利有效性

申请日

公布(公告)日

申请人

申请人所在国/区域

发明人

IPC

IPC部

IPC大类

IPC小类

IPC大组

IPC小组

外观分类