Invention Application
- Patent Title: SMART AUDIO SEGMENTATION USING LOOK-AHEAD BASED ACOUSTO-LINGUISTIC FEATURES
- Application No.: US18721121
- Application Date: 2021-12-22
- Publication No.: US20250054491A1
- Publication Date: 2025-02-13
- Inventor: Sayan Dev PATHAK, Hosam Adel KHALIL, Naveen PARIHAR, Piyush BEHRE, Shuangyu CHANG, Christopher Hakan BASOGLU, Sharman W TAN, Eva SHARMA, Jian WU, Yang LIU, Edward C LIN, Amit Kumar AGARWAL
- Applicant: MICROSOFT TECHNOLOGY LICENSING, LLC
- Applicant Address: US WA Redmond
- Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
- Current Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
- Current Assignee Address: US WA Redmond
- International Application: PCT/CN2021/140296 WO 20211222
- Main IPC: G10L15/04
- IPC: G10L15/04; G10L15/01

Abstract:
Systems and methods are provided for smart audio segmentation using look-ahead based acousto-linguistic features. For example, systems and methods are provided for obtaining audio, processing the audio, identifying a potential segmentation boundary within the audio, and determining whether to generate a segment break at the potential segmentation boundary. One or more look-ahead words occurring after the potential segmentation boundary are identified, wherein an acoustic segmentation score and a language segmentation score associated with the potential segmentation boundary and the one or more look-ahead words are generated. Systems then either refrain from generating a segment break at the potential segmentation boundary or generate the segment break at the potential segmentation boundary based on the acoustic and/or language segmentation score at least meeting or exceeding a segmentation score threshold.
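
The decision described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and is not taken from the application: the `Boundary` fields, the toy scoring functions, the word lists, the equal weighting of the two scores, and the 0.6 threshold are all hypothetical. The sketch only shows the general shape of the logic: score a potential boundary acoustically and linguistically using the look-ahead words, then generate or refrain from generating a segment break depending on whether the score meets a threshold.

```python
from dataclasses import dataclass
from typing import Sequence

# Hypothetical word lists, used only to make the toy language score concrete.
DANGLING_WORDS = {"to", "of", "the", "a", "and", "with"}
CONTINUATION_WORDS = {"and", "or", "but", "of", "because"}


@dataclass
class Boundary:
    """A potential segmentation boundary (all fields are illustrative)."""
    pause_ms: float              # silence observed at the boundary
    words_before: Sequence[str]  # words recognized before the boundary
    look_ahead: Sequence[str]    # words recognized after the boundary


def acoustic_score(b: Boundary, max_pause_ms: float = 500.0) -> float:
    """Toy acoustic score: pause length scaled into [0, 1]."""
    return min(b.pause_ms / max_pause_ms, 1.0)


def language_score(b: Boundary) -> float:
    """Toy language score: low when the text around the boundary suggests
    the sentence is still in progress, high otherwise."""
    ends_dangling = bool(b.words_before) and b.words_before[-1].lower() in DANGLING_WORDS
    continues = bool(b.look_ahead) and b.look_ahead[0].lower() in CONTINUATION_WORDS
    return 0.2 if (ends_dangling or continues) else 0.9


def should_segment(b: Boundary, threshold: float = 0.6, acoustic_weight: float = 0.5) -> bool:
    """Generate a break only if the combined score meets or exceeds the threshold.
    Combining the scores by weighted average is an assumption of this sketch."""
    combined = acoustic_weight * acoustic_score(b) + (1.0 - acoustic_weight) * language_score(b)
    return combined >= threshold


complete = Boundary(400.0, ["see", "you", "tomorrow"], ["please", "call"])
unfinished = Boundary(400.0, ["i", "went", "to"], ["the", "store"])
print(should_segment(complete), should_segment(unfinished))
```

In this sketch both boundaries have the same pause length, so the acoustic score alone cannot tell them apart; the look-ahead words and the preceding word are what cause the second boundary to be rejected.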