Invention Application
- Patent Title: SYSTEMS AND METHODS FOR ROUTING WITHIN MULTITASK MIXTURE-OF-EXPERTS MODELS
- Application No.: US17159437
- Application Date: 2021-01-27
- Publication No.: US20220237435A1
- Publication Date: 2022-07-28
- Inventors: Yanping Huang, Dmitry Lepikhin, Maxim Krikun, Orhan Firat, Ankur Bapna, Thang Luong, Sneha Kudugunta
- Applicant: Google LLC
- Applicant Address: Mountain View, CA, US
- Assignee: Google LLC
- Current Assignee: Google LLC
- Current Assignee Address: Mountain View, CA, US
- Main IPC: G06N3/04
- IPC: G06N3/04; G06N3/08

Abstract:
Systems and methods for routing in mixture-of-experts models. In some aspects of the technology, a transformer may have at least one Mixture-of-Experts ("MoE") layer in each of its encoder and decoder, with the at least one MoE layer of the encoder having a learned gating function configured to route each token of a task to two or more selected expert feed-forward networks, and the at least one MoE layer of the decoder having a learned gating function configured to route each task to two or more selected expert feed-forward networks.
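The sketch below is a minimal, illustrative rendering of the routing distinction described in the abstract: a learned gate selects two experts and mixes their feed-forward outputs, either per token (encoder-style) or once per task (decoder-style). Function names (`top2_gate`, `moe_layer`), the choice of exactly two experts, and the use of the mean token representation as a stand-in task signal are assumptions for illustration, not details taken from the patent claims.

```python
import numpy as np

def top2_gate(h, gate_w):
    """Learned gating sketch: softmax over expert logits, keep the two largest."""
    logits = h @ gate_w                         # [num_experts]
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    top2 = np.argsort(probs)[-2:]               # indices of the two highest-probability experts
    weights = probs[top2] / probs[top2].sum()   # renormalize over the selected pair
    return top2, weights

def moe_layer(x, gate_w, experts, route_by):
    """
    x:        [seq_len, d_model] token representations
    gate_w:   [d_model, num_experts] learned gating weights
    experts:  list of per-expert feed-forward functions
    route_by: 'token' -> each token routed independently (encoder-style in the abstract)
              'task'  -> one routing decision shared by all tokens of the task
                         (decoder-style; here approximated by gating on the mean
                          token representation, an assumption for this sketch)
    """
    out = np.zeros_like(x)
    if route_by == 'task':
        top2, w = top2_gate(x.mean(axis=0), gate_w)
        for t in range(x.shape[0]):
            out[t] = w[0] * experts[top2[0]](x[t]) + w[1] * experts[top2[1]](x[t])
    else:
        for t in range(x.shape[0]):
            top2, w = top2_gate(x[t], gate_w)
            out[t] = w[0] * experts[top2[0]](x[t]) + w[1] * experts[top2[1]](x[t])
    return out

# Example usage with random weights and toy ReLU experts (illustrative only).
rng = np.random.default_rng(0)
d_model, num_experts, seq_len = 8, 4, 5
gate_w = rng.normal(size=(d_model, num_experts))
experts = [(lambda W: (lambda h: np.maximum(h @ W, 0.0)))(rng.normal(size=(d_model, d_model)))
           for _ in range(num_experts)]
x = rng.normal(size=(seq_len, d_model))
encoder_out = moe_layer(x, gate_w, experts, route_by='token')  # per-token routing
decoder_out = moe_layer(x, gate_w, experts, route_by='task')   # per-task routing
```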
Public/Granted literature
- US12242948B2: Systems and methods for routing within multitask mixture-of-experts models (granted 2025-03-04)