-
Publication No.: US10679621B1
Publication Date: 2020-06-09
Application No.: US15927764
Filing Date: 2018-03-21
Applicant: Amazon Technologies, Inc.
Inventor: Shiva Kumar Sundaram, Minhua Wu, Anirudh Raju, Spyridon Matsoukas, Arindam Mandal, Kenichi Kumatani
IPC: G10L15/22, G10L15/187, G10L15/26, G10L15/30, H04R3/00, G10L21/0208, G06F40/40, H04W4/02, G10L21/0216, G10L15/08
Abstract: Systems and methods for utilizing microphone array information for acoustic modeling are disclosed. Audio data may be received from a device having a microphone array configuration. Microphone configuration data may also be received that indicates the configuration of the microphone array. The microphone configuration data may be utilized as an input vector to an acoustic model, along with the audio data, to generate phoneme data. Additionally, the microphone configuration data may be utilized to train and/or generate acoustic models, select an acoustic model to perform speech recognition with, and/or to improve trigger sound detection.
-
Publication No.: US11043214B1
Publication Date: 2021-06-22
Application No.: US16204670
Filing Date: 2018-11-29
Applicant: Amazon Technologies, Inc.
Inventor: Behnam Hedayatnia, Anirudh Raju, Ankur Gandhe, Chandra Prakash Khatri, Ariya Rastrow, Anushree Venkatesh, Arindam Mandal, Raefer Christopher Gabriel, Ahmad Shikib Mehri
Abstract: Described herein is a system for rescoring automatic speech recognition hypotheses for conversational devices that have multi-turn dialogs with a user. The system leverages dialog context by incorporating data related to past user utterances and data related to the system generated response corresponding to the past user utterance. Incorporation of this data improves recognition of a particular user utterance within the dialog.
-
Publication No.: US20210312914A1
Publication Date: 2021-10-07
Application No.: US17340378
Filing Date: 2021-06-07
Applicant: Amazon Technologies, Inc.
Inventor: Behnam Hedayatnia, Anirudh Raju, Ankur Gandhe, Chandra Prakash Khatri, Ariya Rastrow, Anushree Venkatesh, Arindam Mandal, Raefer Christopher Gabriel, Ahmad Shikib Mehri
Abstract: Described herein is a system for rescoring automatic speech recognition hypotheses for conversational devices that have multi-turn dialogs with a user. The system leverages dialog context by incorporating data related to past user utterances and data related to the system generated response corresponding to the past user utterance. Incorporation of this data improves recognition of a particular user utterance within the dialog.
-
Publication No.: US11935525B1
Publication Date: 2024-03-19
Application No.: US16895377
Filing Date: 2020-06-08
Applicant: Amazon Technologies, Inc.
Inventor: Shiva Kumar Sundaram, Minhua Wu, Anirudh Raju, Spyridon Matsoukas, Arindam Mandal, Kenichi Kumatani
IPC: G10L15/22, G06F40/40, G10L15/187, G10L15/26, G10L15/30, G10L21/0208, H04R3/00, G10L15/08, G10L21/0216, H04W4/02
CPC classification number: G10L15/22, G06F40/40, G10L15/187, G10L15/26, G10L15/30, G10L21/0208, H04R3/005, G10L2015/088, G10L2015/223, G10L2021/02166, H04W4/025
Abstract: Systems and methods for utilizing microphone array information for acoustic modeling are disclosed. Audio data may be received from a device having a microphone array configuration. Microphone configuration data may also be received that indicates the configuration of the microphone array. The microphone configuration data may be utilized as an input vector to an acoustic model, along with the audio data, to generate phoneme data. Additionally, the microphone configuration data may be utilized to train and/or generate acoustic models, select an acoustic model to perform speech recognition with, and/or to improve trigger sound detection.