-
Publication No.: US12008149B2
Publication date: 2024-06-11
Application No.: US17123944
Filing date: 2020-12-16
CPC classification: G06F21/74, G06F9/30076, G06F9/30098, G06F9/3818, G06F9/3836, G06F9/3867, G06F9/4812, G06F21/54, H04L9/0643, H04L9/0894
Abstract: A computer system, processor, computer program product, and method for executing instructions in a software application that includes a processor that can be dynamically controlled, in response to a value set in a control register, to operate in either a secure mode or a performance mode. In the secure mode, the processor: upon encountering a secure mode entry instruction, computes an entry hash value using a hash function and stores the entry hash value; and upon encountering a secure mode exit instruction, computes an exit hash value, loads the entry hash value, and determines whether the entry hash value is the same as the exit hash value, and depending upon verification of the hash values can execute the return function or transfer control to the operating system. In the performance mode, the processor: executes both the secure mode entry instruction and the secure mode exit instruction as no-operations.
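The entry/exit check in this abstract can be modeled in a few lines of Python. This is only an illustrative sketch, not the patented design: the abstract does not say what is hashed, so hashing a return address together with a key is an assumption, and `ModelProcessor`, `enter`, `leave`, and the key value are hypothetical names.

```python
import hashlib
import hmac


class ModelProcessor:
    """Toy model of a processor with a secure mode and a performance mode."""

    def __init__(self, secure_mode, key=b"hypothetical-key"):
        self.secure_mode = secure_mode  # stands in for the control register value
        self.key = key                  # assumed hash input, not from the abstract
        self.stored_hashes = []         # entry hashes saved until the matching exit

    def _hash(self, return_address):
        data = self.key + return_address.to_bytes(8, "little")
        return hashlib.sha256(data).digest()

    def enter(self, return_address):
        """Secure mode entry instruction: hash and store, or no-op."""
        if not self.secure_mode:
            return  # performance mode: executed as a no-operation
        self.stored_hashes.append(self._hash(return_address))

    def leave(self, return_address):
        """Secure mode exit instruction: recompute, compare, then decide."""
        if not self.secure_mode:
            return "return"  # performance mode: no check, the return proceeds
        entry_hash = self.stored_hashes.pop()
        exit_hash = self._hash(return_address)
        if hmac.compare_digest(entry_hash, exit_hash):
            return "return"
        return "trap_to_os"
```

In this model a return address that changes between entry and exit yields `"trap_to_os"` in secure mode, while the same sequence in performance mode simply returns.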
-
Publication No.: US20230367597A1
Publication date: 2023-11-16
Application No.: US18227608
Filing date: 2023-07-28
Inventors: Brian W. Thompto, Maarten J. Boersma, Andreas Wagner, Jose E. Moreira, Hung Q. Le, Silvia Melitta Mueller, Dung Q. Nguyen
IPC classification: G06F9/30
CPC classification: G06F9/30098, G06F9/3012, G06F9/3001, G06F9/30036, G06F9/3013, G06F9/30109, G06F9/384
Abstract: A computer system, processor, and method for processing information is disclosed that includes at least one computer processor; a main register file associated with the at least one processor, the main register file having a plurality of entries for storing data, one or more write ports to write data to the main register file entries, and one or more read ports to read data from the main register file entries; one or more execution units including a dense math execution unit; and at least one accumulator register file having a plurality of entries for storing data. The results of the dense math execution unit in an aspect are written to the accumulator register file, preferably to the same accumulator register file entry multiple times, and the data from the accumulator register file is written to the main register file.
-
Publication No.: US11663009B2
Publication date: 2023-05-30
Application No.: US17450987
Filing date: 2021-10-14
CPC classification: G06F9/30145, G06F9/30079, G06F9/30101, G06F9/3877, G06F21/602, H04L9/3236, H04L9/3247
Abstract: A Reduced Instruction Set Computer (“RISC”) supporting large-word operations in a computing environment is disclosed. In one implementation, in response to receiving one or more control signals from a central processing unit (“CPU”), a set of operations are executed on a state of a special purpose execution unit (“SPU”) having a plurality of SPU registers, the SPU being associated with the CPU and the state of the SPU having word widths of one or more of the plurality of registers being greater in size than word widths of a plurality of CPU registers of a computing system and a set of state-master bits to synchronize the state of the SPU and a state of the CPU. The results of the set of operations are stored in the plurality of CPU registers or an alternative set of the plurality of SPU registers.
-
Publication No.: US11182458B2
Publication date: 2021-11-23
Application No.: US16712103
Filing date: 2019-12-12
Abstract: Embodiments of the present invention are directed to a new instruction set extension and a method for providing 3D lane predication for matrix operations. In a non-limiting embodiment of the invention, a first input matrix having m rows and k columns and a second input matrix having k rows and n columns are received by a compute array of a processor. A three-dimensional predicate mask having an M-bit row mask, an N-bit column mask, and a K-bit rank mask is generated. A result matrix of up to m rows, up to n columns, and up to k rank updates is determined based on the first input matrix, the second input matrix, and the predicate mask.
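The effect of a row, column, and rank mask on a matrix product can be sketched in plain Python. This is an illustrative software model under stated assumptions, not the instruction set extension itself: the function name `masked_matmul` is hypothetical, each mask is assumed to be a list of 0/1 flags whose length equals the matching matrix dimension, and disabled lanes are assumed to leave zeros in the result.

```python
def masked_matmul(a, b, row_mask, col_mask, rank_mask):
    """Compute C[i][j] as the sum of A[i][p] * B[p][j] over enabled ranks p.

    a is an m x k matrix and b is a k x n matrix (lists of lists).
    A cleared row or column flag leaves that result row or column at zero;
    a cleared rank flag skips the corresponding rank-1 update.
    """
    m, k, n = len(a), len(b), len(b[0])
    if len(row_mask) != m or len(col_mask) != n or len(rank_mask) != k:
        raise ValueError("mask lengths must match the matrix dimensions")
    c = [[0] * n for _ in range(m)]
    for i in range(m):
        if not row_mask[i]:
            continue
        for j in range(n):
            if not col_mask[j]:
                continue
            c[i][j] = sum(a[i][p] * b[p][j] for p in range(k) if rank_mask[p])
    return c
```

With all flags set the function reduces to an ordinary matrix product, so the three masks can be read as selecting a sub-block of rows, columns, and rank updates to compute.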
-
Publication No.: US20210173662A1
Publication date: 2021-06-10
Application No.: US16703934
Filing date: 2019-12-05
Abstract: A processor unit for multiply and accumulate (“MAC”) operations is provided. The present invention may include the processor unit having a plurality of MAC units for performing a set of MAC operations. The present invention may include each MAC unit having an execution unit and a one-write one-read (“1W/1R”) register file, where the 1W/1R register file may have at least one accumulator. The present invention may include the execution unit of each MAC unit being configured to perform a subset of MAC operations by computing a product of a set of values received from another register file of the processor unit and adding the computed product to the at least one accumulator. The present invention may include each MAC unit being configured to perform the respective subset of MAC operations in a single clock cycle.
-
Publication No.: US10684856B2
Publication date: 2020-06-16
Application No.: US15646219
Filing date: 2017-07-11
IPC classification: G06F9/30
Abstract: Converting program instructions for two-stage processors including receiving, by a preprocessing unit, a group of program instructions; determining, by the preprocessing unit, that at least two of the group of program instructions can be converted into a single combined instruction; converting, by the preprocessing unit, the at least two program instructions into the single combined instruction comprising an extension opcode, wherein the extension opcode indicates, to an execution unit, a format of the single combined instruction; and sending, by the preprocessing unit, the single combined instruction to the execution unit.
-
Publication No.: US20200174965A1
Publication date: 2020-06-04
Application No.: US16205211
Filing date: 2018-11-29
IPC classification: G06F15/80, G06N20/00, G06F12/1072
Abstract: A computing system includes a plurality of functional units, each functional unit having one or more inputs and an output. There is a shared memory block coupled to the inputs and outputs of the plurality of functional units. There is a private memory block assigned to each of the plurality of functional units. An inter functional unit data bypass (IFUDB) block is coupled to the plurality of functional units. The IFUDB is configured to route signals between the one or more functional units without use of the shared memory block.
-
Publication No.: US20180067746A1
Publication date: 2018-03-08
Application No.: US15805267
Filing date: 2017-11-07
Inventors: Sam G. Chu, Markus Kaltenbach, Hung Q. Le, Jentje Leenstra, Jose E. Moreira, Dung Q. Nguyen, Brian W. Thompto
CPC classification: G06F9/3836, G06F9/3012, G06F9/3802, G06F9/3814, G06F9/3851, G06F9/3855, G06F9/3885, G06F9/3891, G06F9/5061, G06F9/5066, G06F2209/5018
Abstract: Embodiments of the present invention provide systems and methods for mapping the architected state of one or more threads to a set of distributed physical register files to enable independent execution of one or more threads in a multiple slice processor. In one embodiment, a system is disclosed including a plurality of dispatch queues which receive instructions from one or more threads and an even number of parallel execution slices, each parallel execution slice containing a register file. A routing network directs an output from the dispatch queues to the parallel execution slices and the parallel execution slices independently execute the one or more threads.
-
Publication No.: US09904551B2
Publication date: 2018-02-27
Application No.: US15342141
Filing date: 2016-11-03
CPC classification: G06F9/3806, G06F9/30058, G06F9/30149, G06F9/3848, G06F9/3861
Abstract: Branch prediction is provided by generating a first index from a previous instruction address and from a first branch history vector having a first length. A second index is generated from the previous instruction address and from a second branch history vector that is longer than the first vector. Using the first index, a first branch prediction is retrieved from a first branch prediction table. Using the second index, a second branch prediction is retrieved from a second branch prediction table. Based upon additional branch history data, the first branch history vector and the second branch history vector are updated. A first hash value is generated from a current instruction address and the updated first branch history vector. A second hash value is generated from the current instruction address and the updated second branch history vector. One of the branch predictions is selected based upon the hash values.
-
Publication No.: US09513805B2
Publication date: 2016-12-06
Application No.: US14253059
Filing date: 2014-04-15
CPC classification: G06F12/0862, G06F3/0604, G06F3/0629, G06F3/0673, G06F12/0882, G06F12/1009, G06F2212/602, G06F2212/65
Abstract: Embodiments relate to a page table including a data fetch width indicator. An aspect includes allocating a memory page in a main memory to an application. Another aspect includes creating a page table entry corresponding to the memory page in the page table. Another aspect includes determining, by a data fetch width indicator determination logic, the data fetch width indicator for the memory page. Another aspect includes sending a notification of the data fetch width indicator from the data fetch width indicator determination logic to supervisory software. Another aspect includes setting the data fetch width indicator in the page table entry by the supervisory software based on the notification. Another aspect includes, based on a cache miss in the cache memory corresponding to an address that is located in the memory page, fetching an amount of data from the memory page based on the data fetch width indicator.
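The final step of this abstract, fetching an amount of data on a cache miss according to a per-page indicator, can be sketched as follows. This is a simplified model under assumptions the abstract does not state: a 4096-byte page, a fetch width in bytes that divides the page size, and fetches aligned to that width. `PageTableEntry`, `fetch_on_miss`, and `PAGE_SIZE` are hypothetical names.

```python
from dataclasses import dataclass

PAGE_SIZE = 4096  # assumed page size in bytes


@dataclass
class PageTableEntry:
    fetch_width: int  # bytes to fetch on a miss; assumed to divide PAGE_SIZE


def fetch_on_miss(memory, page_table, address):
    """Return (start, data) for the block fetched after a miss at `address`.

    The page table entry for the page containing `address` decides how many
    bytes are fetched; the block is aligned to that width, so it never
    crosses a page boundary under the assumptions above.
    """
    entry = page_table[address // PAGE_SIZE]
    start = address - (address % entry.fetch_width)
    return start, memory[start:start + entry.fetch_width]
```

Two pages with different indicators then fetch different amounts for the same kind of miss, which is the behavior the indicator is meant to control.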