Deep learning inference efficiency technology with early exit and speculative execution

Invention Grant

US12211260B2 Deep learning inference efficiency technology with early exit and speculative execution (Granted)

Patent Title: Deep learning inference efficiency technology with early exit and speculative execution
Application No.: US18519674

Application Date: 2023-11-27
Publication No.: US12211260B2

Publication Date: 2025-01-28
Inventor: Haim Barad, Barak Hurwitz, Uzi Sarel, Eran Geva, Eli Kfir, Moshe Island
Applicant: Intel Corporation
Applicant Address: US CA Santa Clara
Assignee: Intel Corporation
Current Assignee: Intel Corporation
Current Assignee Address: US CA Santa Clara
Agency: Akona IP PC
Main IPC: G06V10/82
IPC: G06V10/82; G06F30/33; G06N3/04; G06V10/44; G06V10/94

Abstract:

Systems, apparatuses and methods may provide for technology that processes an inference workload in a first subset of layers of a neural network that prevents or inhibits data-dependent branch operations, conducts an exit determination as to whether an output of the first subset of layers satisfies one or more exit criteria, and selectively bypasses processing of the output in a second subset of layers of the neural network based on the exit determination. The technology may also speculatively initiate the processing of the output in the second subset of layers while the exit determination is pending. Additionally, when the inference workload includes a plurality of batches, the technology may mask one or more of the plurality of batches from processing in the second subset of layers.
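
The three ideas in the abstract (early exit, speculative start of the later layers, and batch masking) can be illustrated with a toy sketch. Everything here is an assumption made for illustration and is not taken from the patent: the names first_subset, second_subset, should_exit, infer_speculative and infer_batch, the softmax-confidence exit criterion, the 0.9 threshold, and the use of a thread pool to stand in for hardware that runs the second subset concurrently.

```python
import math
from concurrent.futures import ThreadPoolExecutor

CONFIDENCE_THRESHOLD = 0.9  # hypothetical exit criterion, not from the patent


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def first_subset(sample):
    """Toy stand-in for the early layers; returns class logits."""
    return [2.0 * x for x in sample]


def second_subset(logits):
    """Toy stand-in for the remaining layers; refines the logits."""
    return [3.0 * x for x in logits]


def should_exit(logits):
    """Exit determination: is the top class probability high enough?"""
    return max(softmax(logits)) >= CONFIDENCE_THRESHOLD


def infer_speculative(sample, pool):
    """Start the second subset before the exit determination is known."""
    early = first_subset(sample)
    # Speculative start: the later layers begin while the exit check is pending.
    speculative = pool.submit(second_subset, early)
    if should_exit(early):
        speculative.cancel()  # best effort; any computed result is discarded
        return early, "early"
    return speculative.result(), "full"


def infer_batch(batch):
    """Mask batch entries that already satisfy the exit criterion."""
    early = [first_subset(sample) for sample in batch]
    # True means the entry still needs the second subset of layers.
    needs_more = [not should_exit(logits) for logits in early]
    outputs = [
        second_subset(logits) if keep else logits
        for logits, keep in zip(early, needs_more)
    ]
    return outputs, needs_more
```

Calling infer_speculative([5.0, 0.0], pool) with a ThreadPoolExecutor takes the early route because the first-stage logits are already confident, while infer_speculative([0.1, 0.0], pool) falls through to the full network. The batch version uses an ordinary Python conditional to apply the mask; an implementation aimed at hardware that inhibits data-dependent branches would presumably apply the mask without branching, which this sketch does not attempt to model.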

Published/Granted Documents

US20240104916A1 DEEP LEARNING INFERENCE EFFICIENCY TECHNOLOGY WITH EARLY EXIT AND SPECULATIVE EXECUTION Publication/grant date: 2024-03-28

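IPC classification:

G	Physics
G06	Computing; calculating or counting
G06V	Image or video recognition or understanding
G06V10/00	Arrangements for image or video recognition or understanding (character recognition in images or video G06V30/10)
G06V10/70	.using pattern recognition or machine learning (optical pattern recognition or electronic computing G06V10/88)
G06V10/82	..using neural networks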