-
Publication No.: US20230385198A1
Publication Date: 2023-11-30
Application No.: US18448102
Filing Date: 2023-08-10
Applicant: Rebellions Inc.
Inventors: Jinseok Kim, Jinwook Oh, Donghan Kim
IPC Classification: G06F12/084
CPC Classification: G06F12/084, G06F2212/622
Abstract: A neural processing device is provided. The neural processing device comprises: a processing unit configured to perform calculations, an L0 memory configured to receive data from the processing unit and provide data to the processing unit, and an LSU (Load/Store Unit) configured to perform load and store operations of the data, wherein the LSU comprises: a neural core load unit configured to issue a load instruction of the data, a neural core store unit configured to issue a store instruction for transmitting and storing the data, and a sync ID logic configured to provide a sync ID to the neural core load unit and the neural core store unit to thereby cause a synchronization signal to be generated for each sync ID.
-
Publication No.: US20180121359A1
Publication Date: 2018-05-03
Application No.: US15860353
Filing Date: 2018-01-02
Inventors: Ekaterina M. Ambroladze, Deanna P. Berger, Michael F. Fee, Arthur J. O'Neill, Robert J. Sonnelitter, III
IPC Classification: G06F12/0815, G06F12/084, G06F15/173, G06F12/0831, G06F12/0817
CPC Classification: G06F12/0815, G06F12/0817, G06F12/0831, G06F12/084, G06F15/173, G06F2212/1016, G06F2212/1032, G06F2212/314, G06F2212/60, G06F2212/601, G06F2212/622
Abstract: Topology of clusters of processors of a computer configuration, configured to support any of a plurality of cache coherency protocols, is discovered at initialization time to determine which one of the plurality of cache coherency protocols is to be used to handle coherency requests of the configuration.
-
Publication No.: US20170249992A1
Publication Date: 2017-08-31
Application No.: US15461262
Filing Date: 2017-03-16
Applicant: Intel Corporation
IPC Classification: G11C15/00
CPC Classification: G06F12/0811, G06F12/0826, G06F12/0862, G06F2212/283, G06F2212/602, G06F2212/622, G11C15/00
Abstract: Loading data from a computer memory system is disclosed. A memory system is provided, wherein some or all data stored in the memory system is organized as one or more pointer-linked data structures. One or more iterator registers are provided. A first pointer chain is loaded, having two or more pointers leading to a first element of a selected pointer-linked data structure to a selected iterator register. A second pointer chain is loaded, having two or more pointers leading to a second element of the selected pointer-linked data structure to the selected iterator register. The loading of the second pointer chain reuses portions of the first pointer chain that are common with the second pointer chain. Modifying data stored in a computer memory system is disclosed. A memory system is provided. One or more iterator registers are provided, wherein the iterator registers each include two or more pointer fields for storing two or more pointers that form a pointer chain leading to a data element. A local state associated with a selected iterator register is generated by performing one or more register operations relating to the selected iterator register and involving pointers in the pointer fields of the selected iterator register. A pointer-linked data structure is updated in the memory system according to the local state.
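The pointer-chain reuse described in the loading half of this abstract can be illustrated with a minimal Python sketch. It is not the patented hardware design: `Node`, `IteratorRegister`, the `loads` counter, and the use of dictionary keys as "pointers" are all hypothetical names and simplifications chosen for illustration. The register caches the chain of nodes from the root to the last element it loaded, and a later load only dereferences the pointers that differ from the cached chain.

```python
class Node:
    """One element of a pointer-linked structure (hypothetical model)."""

    def __init__(self, value=None):
        self.value = value
        self.children = {}  # key -> Node, standing in for pointer fields


class IteratorRegister:
    """Caches the pointer chain from the root to the last loaded element."""

    def __init__(self, root):
        self.keys = []       # keys followed so far
        self.chain = [root]  # chain[i + 1] is the node reached via keys[i]
        self.loads = 0       # pointer dereferences actually performed

    def load(self, path):
        # Length of the prefix shared by the cached chain and the new path.
        common = 0
        limit = min(len(path), len(self.keys))
        while common < limit and path[common] == self.keys[common]:
            common += 1
        # Drop the part of the cached chain that is not shared.
        del self.keys[common:]
        del self.chain[common + 1:]
        # Dereference only the pointers that are not already cached.
        for key in path[common:]:
            node = self.chain[-1].children[key]
            self.loads += 1
            self.keys.append(key)
            self.chain.append(node)
        return self.chain[-1].value


# Usage: two leaves that share the prefix root -> a -> b.
root = Node()
root.children["a"] = Node()
root.children["a"].children["b"] = Node()
root.children["a"].children["b"].children["x"] = Node(1)
root.children["a"].children["b"].children["y"] = Node(2)

reg = IteratorRegister(root)
first = reg.load(["a", "b", "x"])   # follows 3 pointers
second = reg.load(["a", "b", "y"])  # reuses a -> b, follows 1 more
```

In this sketch the second load costs one dereference instead of three, which is the saving the abstract attributes to reusing the common portion of the first pointer chain.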
-
Publication No.: US20170185518A1
Publication Date: 2017-06-29
Application No.: US14983081
Filing Date: 2015-12-29
Applicant: Francesc Guim Bernet, Karthik Kumar, Robert G. Blankenship, Raj K. Ramanujan, Thomas Willhalm, Narayan Ranganathan
Inventors: Francesc Guim Bernet, Karthik Kumar, Robert G. Blankenship, Raj K. Ramanujan, Thomas Willhalm, Narayan Ranganathan
CPC Classification: G06F12/0833, G06F12/0802, G06F12/0817, G06F12/084, G06F12/0875, G06F13/1663, G06F2212/314, G06F2212/621, G06F2212/622
Abstract: Embodiments of systems, methods, and apparatuses for remote monitoring are described. In some embodiments, an apparatus includes at least one monitoring circuit to monitor for memory accesses to an address space; at least one monitoring table to store an identifier of the address space; and a tag directory per core used by the core to track entities that have access to the address space.
-
Publication No.: US09626321B2
Publication Date: 2017-04-18
Application No.: US14060191
Filing Date: 2013-10-22
Applicant: Intel Corporation
Inventors: Robert J. Safranek, Robert G. Blankenship, Venkatraman Iyer, Jeff Willey, Robert Beers, Darren S. Jue, Arvind A. Kumar, Debendra Das Sharma, Jeffrey C. Swanson, Bahaa Fahim, Vedaraman Geetha, Aaron T. Spink, Fulvio Spagna, Rahul R. Shah, Sitaraman V. Iyer, William Harry Nale, Abhishek Das, Simon P. Johnson, Yuvraj S. Dhillon, Yen-Cheng Liu, Raj K. Ramanujan, Robert A. Maddox, Herbert H. Hum, Ashish Gupta
IPC Classification: G06F13/40, G06F12/0831, G06F13/42, G06F9/30, G06F12/0806, H04L12/933, G06F9/46, G06F12/0813, G06F12/0815, H04L12/741, G06F9/44
CPC Classification: G06F13/22, G06F1/3287, G06F8/71, G06F8/73, G06F8/77, G06F9/30145, G06F9/44505, G06F9/466, G06F11/1004, G06F12/0806, G06F12/0808, G06F12/0813, G06F12/0815, G06F12/0831, G06F12/0833, G06F13/4022, G06F13/4068, G06F13/4221, G06F13/4282, G06F13/4286, G06F13/4291, G06F2212/1016, G06F2212/2542, G06F2212/622, H04L9/0662, H04L12/4641, H04L45/74, H04L49/15, Y02D10/13, Y02D10/14, Y02D10/151, Y02D10/40, Y02D10/44, Y02D30/30
Abstract: A physical layer (PHY) is coupled to a serial, differential link that is to include a number of lanes. The PHY includes a transmitter and a receiver to be coupled to each lane of the number of lanes. The transmitter coupled to each lane is configured to embed a clock with data to be transmitted over the lane, and the PHY periodically issues a blocking link state (BLS) request to cause an agent to enter a BLS to hold off link layer flit transmission for a duration. The PHY utilizes the serial, differential link during the duration for a PHY associated task selected from a group including an in-band reset, an entry into low power state, and an entry into partial width state.
-
Publication No.: US09606927B2
Publication Date: 2017-03-28
Application No.: US15249796
Filing Date: 2016-08-29
IPC Classification: G06F12/08, G06F12/0864
CPC Classification: G06F12/0864, G06F12/0822, G06F12/121, G06F12/122, G06F12/128, G06F2212/1044, G06F2212/601, G06F2212/6032, G06F2212/621, G06F2212/622
Abstract: A system includes a set-associative storage container and a processor configured to generate a vector that is a random number. Two or more residue functions are applied to the vector that each produces a state signal including a different number of states based on the vector. A set status is determined that identifies whether each set of the set-associative storage container is enabled or disabled. One of the state signals is selected that has a same number of states as a number of the sets that are enabled. The selected state signal is mapped to the sets that are enabled to assign each of the states of the selected state signal to a corresponding one of the sets that are enabled. A set selection of the set-associative storage container is output based on the mapping to randomly select one of the sets that are enabled from the set-associative storage container.
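The selection procedure in this abstract can be sketched in a few lines of Python. This is an illustrative software model, not the claimed circuit: the function name `select_enabled_set`, the 16-bit vector width, and the use of plain modulo as the residue function are assumptions made for the example.

```python
import secrets


def select_enabled_set(set_enabled, vector_bits=16, rng=secrets.randbits):
    """Randomly pick the index of one enabled set.

    set_enabled: list of booleans, one per set (True = enabled).
    rng: callable returning a random integer of `vector_bits` bits
         (assumption: any uniform source works for this sketch).
    """
    enabled = [index for index, on in enumerate(set_enabled) if on]
    if not enabled:
        raise ValueError("no enabled sets to select from")

    # Generate the random vector.
    vector = rng(vector_bits)

    # Apply one residue function per possible count of enabled sets;
    # "vector mod n" yields a state signal with exactly n states.
    state_signals = {n: vector % n for n in range(1, len(set_enabled) + 1)}

    # Select the signal whose state count equals the number of enabled sets,
    # then map its state onto the enabled sets in index order.
    state = state_signals[len(enabled)]
    return enabled[state]


# Usage with a fixed "random" vector so the result is reproducible:
# three sets enabled, vector 10, so 10 mod 3 = 1 picks the second enabled set.
choice = select_enabled_set([True, False, True, True], rng=lambda bits: 10)
```

Note that reducing a fixed-width vector modulo a count that is not a power of two gives a slightly non-uniform distribution; whether and how the patent addresses that is not stated in the abstract.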
-
Publication No.: US09495300B2
Publication Date: 2016-11-15
Application No.: US15067305
Filing Date: 2016-03-11
CPC Classification: G06F12/0864, G06F12/0822, G06F12/121, G06F12/122, G06F12/128, G06F2212/1044, G06F2212/601, G06F2212/6032, G06F2212/621, G06F2212/622
Abstract: A computer-implemented method includes generating a vector that is a random number. Two or more residue functions are applied to the vector to produce a state signal including a different number of states. A set status of a set-associative storage container in a computer system is determined. The set status identifies whether each set of the set-associative storage container is enabled or disabled. One of the state signals is selected that has a same number of states as a number of the sets that are enabled. The selected state signal is mapped to the sets that are enabled to assign each of the states of the selected state signal to a corresponding one of the sets that are enabled. A set selection of the set-associative storage container is output based on the mapping to randomly select one of the sets that are enabled from the set-associative storage container.
-
Publication No.: US09495295B1
Publication Date: 2016-11-15
Application No.: US14822778
Filing Date: 2015-08-10
Inventors: Birendra Dutt, Douglas B. Boyle
CPC Classification: G06F12/0842, G06F1/06, G06F1/10, G06F9/44505, G06F12/0246, G06F12/0811, G06F12/0817, G06F12/1027, G06F13/20, G06F13/4022, G06F13/4068, G06F13/4221, G06F13/4282, G06F15/781, G06F15/7825, G06F2212/283, G06F2212/622, G06F2212/681, G06F2212/70, G11C7/1072, G11C14/0018
Abstract: A photonics-optimized multi-processor system may include a plurality of processor chips, each of the processor chips comprising at least one input/output (I/O) component. The multi-processor system may also include first and second photonic components. The at least one I/O component of at least one of the processor chips may be configured to directly drive the first photonic component and receive a signal from the second photonic component. A total latency from any one of the processor chips to data at any global memory location may not be dominated by a round trip speed-of-light propagation delay. A number of the processor chips may be at least 10,000, and the processor chips may be packaged into a total volume of no more than 8 m³. A density of the processor chips may be greater than 1,000 chips per cubic meter.
-
Publication No.: US20160314069A1
Publication Date: 2016-10-27
Application No.: US14691971
Filing Date: 2015-04-21
CPC Classification: G06F12/0811, G06F12/0817, G06F12/0831, G06F12/128, G06F2212/283, G06F2212/622
Abstract: A method and apparatus for performing non-temporal write combining using existing cache resources is disclosed. In one embodiment, a method includes executing a first thread on a processor core, the first thread including a first block initialization store (BIS) instruction. A cache query may be performed responsive to the BIS instruction, and if the query results in a cache miss, a cache line may be installed in a cache in an unordered dirty state in which it is exclusively owned by the first thread. The first BIS instruction and one or more additional BIS instructions may write data from the first processor core into the first cache line. After a cache coherence response is received, the state of the first cache line may be changed to an ordered dirty state in which it is no longer exclusive to the first thread.
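The cache-line state transitions in this abstract can be modeled with a small Python sketch. It is a toy model under stated assumptions, not the patented implementation: the names `LineState`, `CacheLine`, `Cache`, `bis_store`, and `coherence_response` are hypothetical, the 64-byte line size is assumed, and rejecting a store from a non-owning thread is an assumed policy that the abstract does not specify.

```python
from enum import Enum, auto


class LineState(Enum):
    UNORDERED_DIRTY = auto()  # installed on a BIS miss, owned by one thread
    ORDERED_DIRTY = auto()    # coherence response received, no longer exclusive


class CacheLine:
    def __init__(self, size, owner):
        self.state = LineState.UNORDERED_DIRTY
        self.owner = owner
        self.data = bytearray(size)


class Cache:
    def __init__(self, line_size=64):  # assumption: 64-byte lines
        self.line_size = line_size
        self.lines = {}  # line base address -> CacheLine

    def _base(self, addr):
        return addr - addr % self.line_size

    def bis_store(self, thread_id, addr, payload):
        """Model a block initialization store (BIS) from `thread_id`."""
        base = self._base(addr)
        offset = addr - base
        if offset + len(payload) > self.line_size:
            raise ValueError("store crosses a cache-line boundary")
        line = self.lines.get(base)
        if line is None:
            # Cache miss: install the line in the unordered dirty state.
            line = CacheLine(self.line_size, owner=thread_id)
            self.lines[base] = line
        elif line.state is LineState.UNORDERED_DIRTY and line.owner != thread_id:
            # Assumed policy: other threads may not touch an exclusive line.
            raise PermissionError("line is exclusively owned by another thread")
        # Write-combine the payload into the line.
        line.data[offset:offset + len(payload)] = payload
        return line

    def coherence_response(self, addr):
        """Model arrival of the coherence response for the line at `addr`."""
        line = self.lines[self._base(addr)]
        line.state = LineState.ORDERED_DIRTY
        line.owner = None
        return line


# Usage: two BIS stores from thread 1 combine into one line, then the
# coherence response moves the line to the ordered dirty state.
cache = Cache()
line = cache.bis_store(1, 0x100, b"\x01\x02")
cache.bis_store(1, 0x102, b"\x03")
state_before = line.state
cache.coherence_response(0x100)
```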
-
Publication No.: US20160246721A1
Publication Date: 2016-08-25
Application No.: US14626913
Filing Date: 2015-02-19
CPC Classification: G06F12/0828, G06F9/45558, G06F9/466, G06F12/0811, G06F12/0833, G06F2009/45587, G06F2212/1028, G06F2212/152, G06F2212/283, G06F2212/622, Y02D10/13
Abstract: A method for controlling cache snoop and/or invalidate coherence traffic for specific caches based on transaction attributes is described. A memory management unit (MMU) determines one or more transaction attributes for a cache coherence transaction from a requesting processor. A routing module identifies a cachability domain and/or shareability domain based on the transaction attributes and routes the cache coherence transaction to one or more caches in the cachability domain and/or shareability domain. Instead of coherence traffic being routed to all caches on a coherence bus, coherence traffic is selectively routed based on transaction attributes such as an address space identifier (ASID), a virtual machine identifier (VMID), a secure bit (NS), a hypervisor identifier (HYP), etc.
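The contrast between broadcast snooping and attribute-based routing can be sketched in Python as follows. This is a simplified model, not the patented routing module: `TxnAttrs`, `CoherenceRouter`, `join_domain`, `route`, and `broadcast` are hypothetical names, and treating the full attribute tuple as the domain key is an assumption made for illustration.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class TxnAttrs:
    """Transaction attributes as the MMU might report them (assumed fields)."""
    asid: int             # address space identifier
    vmid: int             # virtual machine identifier
    secure: bool = False  # secure bit


class CoherenceRouter:
    def __init__(self, all_caches):
        self.all_caches = set(all_caches)
        # Assumption: a domain is the set of caches registered under
        # exactly the same attribute tuple.
        self.domains = defaultdict(set)

    def join_domain(self, cache, attrs):
        if cache not in self.all_caches:
            raise KeyError(f"unknown cache: {cache}")
        self.domains[attrs].add(cache)

    def broadcast(self, requester):
        """Baseline: snoop every other cache on the coherence bus."""
        return sorted(self.all_caches - {requester})

    def route(self, requester, attrs):
        """Selective routing: snoop only caches in the matching domain."""
        targets = self.domains.get(attrs, set())
        return sorted(targets - {requester})


# Usage: four caches, two of them serving VM 7 and one serving VM 8.
router = CoherenceRouter(["L1-0", "L1-1", "L1-2", "L1-3"])
vm7 = TxnAttrs(asid=1, vmid=7)
vm8 = TxnAttrs(asid=1, vmid=8)
router.join_domain("L1-0", vm7)
router.join_domain("L1-1", vm7)
router.join_domain("L1-2", vm8)

snooped_all = router.broadcast("L1-0")      # three caches receive traffic
snooped_domain = router.route("L1-0", vm7)  # only the cache sharing VM 7
```

In this model a request tagged with VM 7 reaches one cache instead of three, which is the traffic reduction the abstract describes.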