Patent search ap:("INTEL CORPORATION") AND inv:"Georgios Tournavitis" Page 1

1.

发明授权
Instruction and logic for optimization level aware branch prediction 有权

公开(公告)号：US10157063B2

公开(公告)日：2018-12-18

申请号：US13631402

申请日：2012-09-28

Applicant: INTEL CORPORATION

Inventor： Polychronis Xekalakis , Pedro Marcuello , Alejandro Vicente Martinez , Christos E. Kotselidis , Grigorios Magklis , Fernando Latorre , Raul Martinez , Josep M. Codina , Enric Gibert Codina , Crispin Gomez Requena , Antonio Gonzelez , Mirem Hyuseinova , Pedro Lopez , Marc Lupon , Carlos Madriles , Daniel Ortega , Demos Pavlou , Kyriakos A. Stavrou , Georgios Tournavitis

IPC: G06F9/38 , G06F9/30

Abstract: A computer-readable storage medium, method and system for optimization-level aware branch prediction is described. A gear level is assigned to a set of application instructions that have been optimized. The gear level is also stored in a register of a branch prediction unit of a processor. Branch prediction is then performed by the processor based upon the gear level.

2.

发明授权
Instruction and logic for bulk register reclamation 有权

公开(公告)号：US10061587B2

公开(公告)日：2018-08-28

申请号：US14496113

申请日：2014-09-25

Applicant: Intel Corporation

Inventor： David Pardo Keppel , Denis M. Khartikov , Fernando LaTorre , Marc Lupon , Grigorios Magklis , Naveen Neelakantam , Georgios Tournavitis , Polychronis Xekalakis

IPC: G06F9/30 , G06F9/38

CPC classification number: G06F9/30185 , G06F9/384 , G06F9/3857

Abstract: A processor includes a front end, a decoder, an allocator, and a retirement unit. The decoder includes logic to identify an end-of-live-range (EOLR) indicator. The EOLR indicator specifies an architectural register and a location in code for which the architectural register is unused. The allocator includes logic to scan for a mapping of the architectural register to a physical register, based upon the EOLR indicator. The allocator also includes logic to generate a request to disassociate the architectural register from the physical register. The retirement unit includes logic to disassociate the architectural register from the physical register.

3.

发明申请
INSTRUCTION AND LOGIC FOR BULK REGISTER RECLAMATION 审中-公开
Title translation: 大容量存储器重新引导的指令和逻辑

公开(公告)号：US20160092222A1

公开(公告)日：2016-03-31

申请号：US14496113

申请日：2014-09-25

Applicant: Intel Corporation

Inventor： David Pardo Keppel , Denis M. Khartikov , Fernando LaTorre , Marc Lupon , Grigorios Magklis , Naveen Neelakantam , Georgios Tournavitis , Polychronis Xekalakis

IPC: G06F9/30

CPC classification number: G06F9/30185 , G06F9/384 , G06F9/3857

Abstract: A processor includes a front end, a decoder, an allocator, and a retirement unit. The decoder includes logic to identify an end-of-live-range (EOLR) indicator. The EOLR indicator specifies an architectural register and a location in code for which the architectural register is unused. The allocator includes logic to scan for a mapping of the architectural register to a physical register, based upon the EOLR indicator. The allocator also includes logic to generate a request to disassociate the architectural register from the physical register. The retirement unit includes logic to disassociate the architectural register from the physical register.

Abstract translation: 处理器包括前端，解码器，分配器和退休单元。解码器包括用于识别终点范围（EOLR）指示符的逻辑。 EOLR指示符指定体系结构寄存器和不使用体系结构寄存器的代码中的位置。分配器包括基于EOLR指示器扫描架构寄存器到物理寄存器的映射的逻辑。分配器还包括生成用于将体系结构寄存器与物理寄存器取消关联的请求的逻辑。退休单位包括将架构寄存器与物理寄存器取消关联的逻辑。

4.

发明申请
PROFILING ASYNCHRONOUS EVENTS RESULTING FROM THE EXECUTION OF SOFTWARE AT CODE REGION GRANULARITY 审中-公开

公开(公告)号：US20190004916A1

公开(公告)日：2019-01-03

申请号：US16026870

申请日：2018-07-03

Applicant: Intel Corporation

Inventor： Raul Martinez , Enric Gibert Codina , Pedro Lopez , Marti Torrents Lapuerta , Polychronis Xekalakis , Georgios Tournavitis , Kyriakos A. Stavrou , Demos Pavlou , Daniel Ortega , Alejandro Martinez Vicente , Pedro Marcuello , Grigorios Magklis , Josep M. Codina , Crispin Gomez Requena , Antonio Gonzalez , Mirem Hyuseinova , Christos Kotselidis , Fernando Latorre , Marc Lupon , Carlos Madriles

IPC: G06F11/30 , G06F12/0862 , G06F11/34

Abstract: A combination of hardware and software collect profile data for asynchronous events, at code region granularity. An exemplary embodiment is directed to collecting metrics for prefetching events, which are asynchronous in nature. Instructions that belong to a code region are identified using one of several alternative techniques, causing a profile bit to be set for the instruction, as a marker. Each line of a data block that is prefetched is similarly marked. Events corresponding to the profile data being collected and resulting from instructions within the code region are then identified. Each time that one of the different types of events is identified, a corresponding counter is incremented. Following execution of the instructions within the code region, the profile data accumulated in the counters are collected, and the counters are reset for use with a new code region.

5.

发明申请
WEIGHT-SHIFTING MECHANISM FOR CONVOLUTIONAL NEURAL NETWORKS 审中-公开
Title translation: 用于交互式神经网络的重量分配机制

公开(公告)号：US20160026912A1

公开(公告)日：2016-01-28

申请号：US14337979

申请日：2014-07-22

Applicant: Intel Corporation

Inventor： Ayose J. Falcon , Marc Lupon , Enric Herrero Abellanas , Pedro Lopez , Fernando Latorre , Frederico C. Pratas , Georgios Tournavitis

IPC: G06N3/06 , G06N3/08

CPC classification number: G06N3/06 , G06N3/0454 , G06N3/063 , G06N3/08

Abstract: A processor includes a processor core and a calculation circuit. The processor core includes logic determine a set of weights for use in a convolutional neural network (CNN) calculation and scale up the weights using a scale value. The calculation circuit includes logic to receive the scale value, the set of weights, and a set of input values, wherein each input value and associated weight of a same fixed size. The calculation circuit also includes logic to determine results from convolutional neural network (CNN) calculations based upon the set of weights applied to the set of input values, scale down the results using the scale value, truncate the scaled down results to the fixed size, and communicatively couple the truncated results to an output for a layer of the CNN.

Abstract translation: 处理器包括处理器核心和计算电路。处理器核心包括确定用于卷积神经网络（CNN）计算的一组权重的逻辑，并使用比例值来放大权重。计算电路包括接收比例值，权重集合和一组输入值的逻辑，其中每个输入值和相同固定大小的相关权重。计算电路还包括基于应用于输入值集合的权重集合来确定卷积神经网络（CNN）计算结果的逻辑，使用比例值缩小结果，将缩小的结果截断为固定大小，并将截断的结果通信地耦合到CNN的层的输出。

6.

发明授权
Method and apparatus for distributed and cooperative computation in artificial neural networks 有权

公开(公告)号：US12032653B2

公开(公告)日：2024-07-09

申请号：US17306877

申请日：2021-05-03

Applicant: Intel Corporation

Inventor： Frederico C. Pratas , Ayose J. Falcon , Marc Lupon , Fernando Latorre , Pedro Lopez , Enric Herrero Abellanas , Georgios Tournavitis

IPC: G06F17/15 , G06F12/0875 , G06N3/04 , G06N3/063

CPC classification number: G06F17/153 , G06F12/0875 , G06N3/04 , G06N3/063 , G06F2212/1024

Abstract: An apparatus and method are described for distributed and cooperative computation in artificial neural networks. For example, one embodiment of an apparatus comprises: an input/output (I/O) interface; a plurality of processing units communicatively coupled to the I/O interface to receive data for input neurons and synaptic weights associated with each of the input neurons, each of the plurality of processing units to process at least a portion of the data for the input neurons and synaptic weights to generate partial results; and an interconnect communicatively coupling the plurality of processing units, each of the processing units to share the partial results with one or more other processing units over the interconnect, the other processing units using the partial results to generate additional partial results or final results. The processing units may share data including input neurons and weights over the shared input bus.

7.

发明授权
Method and apparatus for distributed and cooperative computation in artificial neural networks 有权

公开(公告)号：US10997273B2

公开(公告)日：2021-05-04

申请号：US15521856

申请日：2015-11-19

Applicant: Intel Corporation

Inventor： Frederico C. Pratas , Ayose J. Falcon , Marc Lupon , Fernando Latorre , Pedro Lopez , Enric Herrero Abellanas , Georgios Tournavitis

IPC: G06F17/15 , G06N3/063 , G06F12/0862 , G06F12/0875 , G06N3/04

Abstract: An apparatus and method are described for distributed and cooperative computation in artificial neural networks. For example, one embodiment of an apparatus comprises: an input/output (I/O) interface; a plurality of processing units communicatively coupled to the I/O interface to receive data for input neurons and synaptic weights associated with each of the input neurons, each of the plurality of processing units to process at least a portion of the data for the input neurons and synaptic weights to generate partial results; and an interconnect communicatively coupling the plurality of processing units, each of the processing units to share the partial results with one or more other processing units over the interconnect, the other processing units using the partial results to generate additional partial results or final results. The processing units may share data including input neurons and weights over the shared input bus.

8.

发明授权
Storage device and method for performing convolution operations 有权

公开(公告)号：US09971540B2

公开(公告)日：2018-05-15

申请号：US14861701

申请日：2015-09-22

Applicant: INTEL CORPORATION

Inventor： Enric Herrero Abellanas , Georgios Tournavitis , Frederico C. Pratas , Marc Lupon , Fernando Latorre , Pedro Lopez , Ayose J. Falcon

IPC: G06F3/06 , G06F17/15 , G06T1/60 , G06N3/063

CPC classification number: G06F3/0644 , G06F3/0604 , G06F3/0683 , G06F17/153 , G06N3/063 , G06T1/60

Abstract: A storage device and method are described for performing convolution operations. For example, one embodiment of an apparatus to perform convolution operations comprises a plurality of processing units to execute convolution operations on input data and partial results; a unified scratchpad memory comprising a plurality of memory banks communicatively coupled to the plurality of processing units through a plurality of read/write ports, each of the plurality of memory banks partitioned to store both the input data and partial results; a control unit to allocate the input data and partial results to the memory banks to ensure a minimum quality of service in accordance with the specified number of read/write ports and the specified convolution operation to be performed.

Search Results

Country/Region

Patent validity

Application date

Publication (announcement) day

applicant

The country/region where the applicant is located

Inventor

IPC

IPC Department

IPC class

IPC subclass

IPC group

IPC team

Appearance classification