SMART AUDIO SEGMENTATION USING LOOK-AHEAD BASED ACOUSTO-LINGUISTIC FEATURES

Invention Application

US20250054491A1 SMART AUDIO SEGMENTATION USING LOOK-AHEAD BASED ACOUSTO-LINGUISTIC FEATURES (Status: Active)

Patent Title: SMART AUDIO SEGMENTATION USING LOOK-AHEAD BASED ACOUSTO-LINGUISTIC FEATURES
Application No.: US18721121

Application Date: 2021-12-22
Publication No.: US20250054491A1

Publication Date: 2025-02-13
Inventor: Sayan Dev PATHAK, Hosam Adel KHALIL, Naveen PARIHAR, Piyush BEHRE, Shuangyu CHANG, Christopher Hakan BASOGLU, Sharman W TAN, Eva SHARMA, Jian WU, Yang LIU, Edward C LIN, Amit Kumar AGARWAL
Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
Applicant Address: US WA Redmond
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Current Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Current Assignee Address: US WA Redmond
International Application: PCT/CN2021/140296 WO 20211222
Main IPC: G10L15/04
IPC: G10L15/04; G10L15/01

Abstract:

Systems and methods are provided for smart audio segmentation using look-ahead based acousto-linguistic features. For example, systems and methods are provided for obtaining audio, processing the audio, identifying a potential segmentation boundary within the audio, and determining whether to generate a segment break at the potential segmentation boundary. One or more look-ahead words occurring after the potential segmentation boundary are identified, and an acoustic segmentation score and a language segmentation score associated with the potential segmentation boundary and the one or more look-ahead words are generated. Systems then either generate the segment break at the potential segmentation boundary or refrain from generating it, depending on whether the acoustic and/or language segmentation score meets or exceeds a segmentation score threshold.
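
The decision step described in the abstract can be illustrated with a minimal Python sketch. It is not the claimed implementation: the names `BoundaryCandidate` and `should_break`, the 0.5 default threshold, the use of one shared threshold for both scores, and the `require_both` flag (used here to model the "and/or" wording) are all assumptions introduced for illustration. How the two scores are actually computed is not stated in the abstract, so they are taken as given inputs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BoundaryCandidate:
    """A potential segmentation boundary (hypothetical structure)."""
    acoustic_score: float        # assumed to be supplied by an acoustic model
    language_score: float        # assumed to be supplied by a language model
    look_ahead_words: List[str]  # words occurring after the boundary


def should_break(candidate: BoundaryCandidate,
                 threshold: float = 0.5,
                 require_both: bool = False) -> bool:
    """Return True if a segment break should be generated at the boundary.

    A score "meets or exceeds" the threshold when score >= threshold.
    require_both=False models the "or" reading of "and/or";
    require_both=True models the "and" reading.
    """
    meets_acoustic = candidate.acoustic_score >= threshold
    meets_language = candidate.language_score >= threshold
    if require_both:
        return meets_acoustic and meets_language
    return meets_acoustic or meets_language


# Illustrative values only; they do not come from the patent.
candidate = BoundaryCandidate(
    acoustic_score=0.8,
    language_score=0.3,
    look_ahead_words=["and", "then"],
)
print(should_break(candidate))                     # True
print(should_break(candidate, require_both=True))  # False
```

With these sample values only the acoustic score clears the threshold, so a break is generated under the "or" reading and withheld under the "and" reading.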

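IPC Classification:

G	Physics
G10	Musical instruments; acoustics
G10L	Speech analysis or synthesis; speech recognition; speech or voice processing; speech or audio coding or decoding
G10L15/00	Speech recognition (G10L17/00 takes precedence)
G10L15/04	.Segmentation; word boundary detection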