专利检索 ap:("International Business Machines Corporation") AND inv:"Jose E. Moreira" 第 8 页

71.

发明申请
BRANCH PREDICTION USING MULTIPLE VERSIONS OF HISTORY DATA 有权
标题翻译：使用多个版本的历史数据进行分支预测

公开(公告)号：US20150331691A1

公开(公告)日：2015-11-19

申请号：US14278000

申请日：2014-05-15

申请人： International Business Machines Corporation

发明人： David S. Levitan , Jose E. Moreira , Mauricio J. Serrano

IPC分类号： G06F9/38 , G06F9/30

CPC分类号： G06F9/3806 , G06F9/30058 , G06F9/30149 , G06F9/3848 , G06F9/3861

摘要： Branch prediction is provided by generating a first index from a previous instruction address and from a first branch history vector having a first length. A second index is generated from the previous instruction address and from a second branch history vector that is longer than the first vector. Using the first index, a first branch prediction is retrieved from a first branch prediction table. Using the second index, a second branch prediction is retrieved from a second branch prediction table. Based upon additional branch history data, the first branch history vector and the second branch history vector are updated. A first hash value is generated from a current instruction address and the updated first branch history vector. A second hash value is generated from the current instruction address and the updated second branch history vector. One of the branch predictions are selected based upon the hash values.

摘要翻译： 通过从先前指令地址和具有第一长度的第一分支历史向量生成第一索引来提供分支预测。从先前的指令地址和长于第一向量的第二分支历史向量生成第二索引。使用第一索引，从第一分支预测表检索第一分支预测。使用第二索引，从第二分支预测表检索第二分支预测。基于附加的分支历史数据，更新第一分支历史矢量和第二分支历史矢量。从当前指令地址和更新的第一分支历史向量生成第一哈希值。从当前指令地址和更新的第二分支历史向量生成第二哈希值。基于散列值来选择分支预测之一。

72.

发明申请
Selective Delaying of Write Requests in Hardware Transactional Memory Systems 审中-公开
标题翻译：在硬件事务内存系统中选择性延迟写请求

公开(公告)号：US20140075121A1

公开(公告)日：2014-03-13

申请号：US13646011

申请日：2012-10-05

申请人： INTERNATIONAL BUSINESS MACHINES CORPORATION

发明人： Colin B. Blundell , Harold W. Cain, III , Jose E. Moreira

IPC分类号： G06F12/08

CPC分类号： G06F9/467

摘要： Techniques for conflict detection in hardware transactional memory (HTM) are provided. In one aspect, a method for detecting conflicts in HTM includes the following steps. Conflict detection is performed eagerly by setting read and write bits in a cache as transactions having read and write requests are made. A given one of the transactions is stalled when a conflict is detected whereby more than one of the transactions are accessing data in the cache in a conflicting way. An address of the conflicting data is placed in a predictor. The predictor is queried whenever the write requests are made to determine whether they correspond to entries in the predictor. A copy of the data corresponding to entries in the predictor is placed in a store buffer. The write bits in the cache are set and the copy of the data in the store buffer is merged in at transaction commit.

摘要翻译： 提供了硬件事务存储器（HTM）中的冲突检测技术。一方面，一种用于检测HTM中的冲突的方法包括以下步骤。通过设置高速缓存中的读取和写入位来进行冲突检测，作为具有读取和写入请求的事务。当检测到冲突时，给定的一个事务被停止，其中多于一个事务以冲突的方式访问缓存中的数据。冲突数据的地址放在预测器中。每当做出写入请求以确定它们是否对应于预测变量中的条目时，就会查询预测变量。与预测器中的条目相对应的数据的副本被放置在存储缓冲器中。设置缓存中的写入位，并在事务提交中合并存储缓冲区中的数据副本。

73.

发明授权
Loosely-coupled slice target file data 有权

公开(公告)号：US11900116B1

公开(公告)日：2024-02-13

申请号：US17489746

申请日：2021-09-29

申请人： International Business Machines Corporation

发明人： Dung Q. Nguyen , Brian W. Thompto , Jose E. Moreira , Jessica Hui-Chun Tseng , Pratap C. Pattnaik , Kattamuri Ekanadham , Manoj Kumar

IPC分类号： G06F9/30 , G06F9/38

CPC分类号： G06F9/30145 , G06F9/30036 , G06F9/30109 , G06F9/3836 , G06F9/3869

摘要： A system may determine that two instructions may be combined based on a processing power of the processor and a size of the instructions, fuse the two instructions into a pair, map the two instructions with a single register tag, write the register tag into a mapper with bits indicating that the register tag is for a first instruction of the two instructions, write the register tag into the mapper with bits indicating that the register tag is for a second instruction of the two instructions, write the fused instruction pair into an issue queue, issue the fused instruction pair to a vector-scalar transformation units (VSU), and execute the two instructions.

74.

发明授权
Compute array of a processor with mixed-precision numerical linear algebra support 有权

公开(公告)号：US11755320B2

公开(公告)日：2023-09-12

申请号：US17480279

申请日：2021-09-21

申请人： International Business Machines Corporation

发明人： Jose E. Moreira , Brett Olsson , Brian W. Thompto , Silvia Melitta Mueller , Andreas Wagner

IPC分类号： G06F9/38 , G06F9/30 , G06F17/16

CPC分类号： G06F9/30014 , G06F9/30145 , G06F9/3855 , G06F9/3893 , G06F17/16

摘要： Aspects include a compute array of a processor with mixed-precision numerical linear algebra support. A first precision and a first shape of a first input matrix and a second precision and a second shape of a second input matrix to the compute array are determined. A plurality of linear algebra operations is repeated in parallel within the compute array to update a result matrix in an accumulator register based on the first input matrix, the second input matrix, and a number of rank updates of the result matrix to store in the accumulator register.

75.

发明申请
ENCRYPTED DATA PROCESSING DESIGN INCLUDING CLEARTEXT REGISTER FILES 有权

公开(公告)号：US20220414270A1

公开(公告)日：2022-12-29

申请号：US17356784

申请日：2021-06-24

申请人： International Business Machines Corporation

发明人： Jessica Hui-Chun Tseng , Jose E. Moreira , Pratap C. Pattnaik , Manoj Kumar , Kattamuri Ekanadham , Gianfranco Bilardi

IPC分类号： G06F21/79 , G06F21/60 , G06F21/74 , G06F21/54

摘要： Aspects of the present disclosure relate to encrypted data processing (EDAP). Encrypted data from a cache to be loaded into a register file can be accessed. The encrypted data can be decrypted to receive cleartext data. The cleartext data can be written to the register file. The cleartext data can be processed using at least one functional unit to receive cleartext computation results. The cleartext computation results can then be written back to the register file.

76.

发明授权
Instruction fusion using dependence analysis 有权

公开(公告)号：US11294685B2

公开(公告)日：2022-04-05

申请号：US16431723

申请日：2019-06-04

申请人： International Business Machines Corporation

发明人： Jessica Hui-Chun Tseng , Manoj Kumar , Kattamuri Ekanadham , Jose E. Moreira , Pratap C. Pattnaik

IPC分类号： G06F9/38 , G06F8/41

摘要： Method and systems for creating a sequence of fused instructions. An instruction stream is obtained, and a window of instructions from the instruction stream is examined and one or more groups of instructions that satisfy one or more fusion rules are identified. One or more of the groups of instructions that satisfy the one or more fusion rules are fused and a maximal length data dependence chain in the instruction stream is analyzed by analyzing every node in a dependence graph in a selected window of instructions. Fusion of an instruction group is prevented based on the maximal length data dependence chain.

77.

发明授权
Half-precision floating-point arrays at low overhead 有权

公开(公告)号：US11281745B2

公开(公告)日：2022-03-22

申请号：US16542447

申请日：2019-08-16

申请人： International Business Machines Corporation

发明人： Bruce Fleischer , Jose E. Moreira , Joel A. Silberman

IPC分类号： G06F17/16 , G06F7/544

摘要： Methods and systems of matrix multiplication are described. In an example, a processor can multiply a first entry of a first vector of a first data array with a second vector of a second data array to generate a third vector of a third data array. The processor can store the third vector of the third data array in the second register file. The processor can multiply a second entry of the first vector with the second vector to generate a fourth vector of the third data array. The processor can store the fourth vector of the third data array in the second register file. The processor can combine vectors of the third data array that are stored in the second register file to produce the third data array.

78.

发明申请
Reformatting Matrices to Improve Computing Efficiency 有权

公开(公告)号：US20220012010A1

公开(公告)日：2022-01-13

申请号：US17485455

申请日：2021-09-26

申请人： INTERNATIONAL BUSINESS MACHINES CORPORATION

发明人： Manoj Kumar , Pratap C. Pattnaik , Kattamuri Ekanadham , Jessica Tseng , Jose E. Moreira

IPC分类号： G06F7/08 , G06F16/22 , G06F17/16 , G06F7/24 , G06F7/78

摘要： A data ordering device includes a plurality of inputs N and a plurality of outputs M. There is a sorting network coupled between the plurality of inputs N and the plurality of outputs M. There are one or more latches comprising a buffer coupled between each input of the plurality of inputs N and a corresponding input of the sorting network. There are one or more latches comprising a buffer coupled between each output of the plurality of outputs M and a corresponding output of the sorting network. There is an input for a control signal operative to initiate a sorting of data between the plurality of inputs N and the plurality of outputs M. The data ordering device is coupled to a core of a central processing unit.

79.

发明申请
COMPUTE ARRAY OF A PROCESSOR WITH MIXED-PRECISION NUMERICAL LINEAR ALGEBRA SUPPORT 有权

公开(公告)号：US20220004386A1

公开(公告)日：2022-01-06

申请号：US17480279

申请日：2021-09-21

申请人： International Business Machines Corporation

发明人： Jose E. Moreira , Brett Olsson , Brian W. Thompto , Silvia Melitta Mueller , Andreas Wagner

IPC分类号： G06F9/30 , G06F9/38 , G06F17/16

摘要： Aspects include a compute array of a processor with mixed-precision numerical linear algebra support. A first precision and a first shape of a first input matrix and a second precision and a second shape of a second input matrix to the compute array are determined. A plurality of linear algebra operations is repeated in parallel within the compute array to update a result matrix in an accumulator register based on the first input matrix, the second input matrix, and a number of rank updates of the result matrix to store in the accumulator register.

80.

发明授权
Independent mapping of threads 有权

公开(公告)号：US11144323B2

公开(公告)日：2021-10-12

申请号：US16676763

申请日：2019-11-07

申请人： International Business Machines Corporation

发明人： Sam G. Chu , Markus Kaltenbach , Hung Q. Le , Jentje Leenstra , Jose E. Moreira , Dung Q. Nguyen , Brian W. Thompto

IPC分类号： G06F9/46 , G06F9/38 , G06F9/30 , G06F9/50

摘要： Embodiments of the present invention provide systems and methods for mapping the architected state of one or more threads to a set of distributed physical register files to enable independent execution of one or more threads in a multiple slice processor. In one embodiment, a system is disclosed including a plurality of dispatch queues which receive instructions from one or more threads and an even number of parallel execution slices, each parallel execution slice containing a register file. A routing network directs an output from the dispatch queues to the parallel execution slices and the parallel execution slices independently execute the one or more threads.

搜索结果

国家/区域

专利有效性

申请日

公布(公告)日

申请人

申请人所在国/区域

发明人

IPC

IPC部

IPC大类

IPC小类

IPC大组

IPC小组

外观分类