Patent search ap:("Google LLC") AND inv:"Wenqian Huang" Page 1

1.

Published invention application
Semantic Segmentation With Language Models For Long-Form Automatic Speech Recognition (under examination, published)

Publication (announcement) number: US20240290320A1

Publication (announcement) date: 2024-08-29

Application number: US18585020

Application date: 2024-02-22

Applicant: Google LLC

Inventor: Wenqian Huang, Hao Zhang, Shankar Kumar, Shuo-yiin Chang, Tara N. Sainath

IPC: G10L15/06, G06F40/30, G10L15/26

CPC classification number: G10L15/063, G06F40/30, G10L15/26

Abstract: A joint segmenting and ASR model includes an encoder to receive a sequence of acoustic frames and generate, at each of a plurality of output steps, a higher order feature representation for a corresponding acoustic frame. The model also includes a decoder to generate, based on the higher order feature representation at each of the plurality of output steps, a probability distribution over possible speech recognition hypotheses and an indication of whether the corresponding output step corresponds to an end of segment (EOS). The model is trained on a set of training samples, each training sample including audio data characterizing multiple segments of long-form speech and a corresponding transcription of the long-form speech. The corresponding transcription is annotated with ground-truth EOS labels obtained via distillation from a language model teacher that receives the corresponding transcription as input and injects the ground-truth EOS labels into the corresponding transcription between semantically complete segments.
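
The label-injection step described in the abstract can be illustrated with a minimal sketch. It is not the patented method: the `<eos>` token string, the function names, and the rule-based `toy_teacher` are all assumptions made for illustration, and the toy rule merely stands in for the language model teacher, which the abstract does not specify in detail.

```python
from typing import Callable, List

# Hypothetical marker; the abstract does not name the EOS token string.
EOS = "<eos>"


def inject_eos(words: List[str],
               teacher: Callable[[List[str], int], bool]) -> List[str]:
    """Return a copy of `words` with EOS inserted after each flagged index."""
    annotated: List[str] = []
    for i, word in enumerate(words):
        annotated.append(word)
        if teacher(words, i):
            annotated.append(EOS)
    return annotated


def toy_teacher(words: List[str], i: int) -> bool:
    """Stand-in for the language model teacher (assumption, not a real LM).

    Flags a boundary at the end of the transcript or just before a small
    set of discourse words that often start a new segment.
    """
    starters = {"then", "so", "next"}
    is_last = i == len(words) - 1
    return is_last or words[i + 1] in starters


transcript = "i went to the store then i came home".split()
annotated = inject_eos(transcript, toy_teacher)
print(" ".join(annotated))
```

In a training pipeline along the lines of the abstract, the annotated transcription would be paired with the audio so that the decoder learns to emit the EOS indication itself; replacing `toy_teacher` with a real language model is left open here.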
