-
Publication No.: US11335346B1
Publication Date: 2022-05-17
Application No.: US16215061
Filing Date: 2018-12-10
Applicant: Amazon Technologies, Inc.
Inventor: Chengwei Su, Spyridon Matsoukas, Sankaranarayanan Ananthakrishnan, Shirin Saleem, Chungnam Chan, Yugang Li, Mallory McManamon, Rahul Gupta, Luca Soldaini
IPC: G10L15/26, G06K9/62, G06N20/10, G06N7/00, G06F40/295
Abstract: Techniques for processing a user input are described. Text data representing a user input is processed with respect to at least one finite state transducer (FST) to generate at least one FST hypothesis. Context information may be required to traverse one or more paths of the at least one FST. The text data is also processed using at least one statistical model (e.g., perform intent classification, named entity recognition, and/or domain classification processing) to generate at least one statistical model hypothesis. The at least one FST hypothesis and the at least one statistical model hypothesis are input to a reranker that determines a most likely interpretation of the user input.
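The flow in this abstract (FST hypotheses gated by context, statistical model hypotheses, then a reranker) can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration and is not taken from the patent: the `FST_PATHS` table stands in for a real finite state transducer, the keyword-overlap scorer stands in for trained statistical models, and the weighted maximum stands in for the reranker.

```python
from dataclasses import dataclass


@dataclass
class Hypothesis:
    source: str   # "fst" or "statistical"
    intent: str
    score: float


# Hypothetical stand-in for an FST: each entry is a token path, the context
# flag required to traverse it (or None), and the intent the path yields.
FST_PATHS = [
    (("play", "the", "next", "one"), "music_playing", "NextTrackIntent"),
    (("what", "time", "is", "it"), None, "GetTimeIntent"),
]

# Hypothetical keyword sets standing in for a trained intent classifier.
INTENT_KEYWORDS = {
    "PlayMusicIntent": {"play", "music", "song"},
    "GetTimeIntent": {"time", "clock"},
}


def fst_hypotheses(text, context):
    """Return FST hypotheses; a path is skipped if its context is missing."""
    tokens = tuple(text.lower().split())
    hyps = []
    for path, required_context, intent in FST_PATHS:
        context_ok = required_context is None or context.get(required_context)
        if tokens == path and context_ok:
            hyps.append(Hypothesis("fst", intent, 1.0))
    return hyps


def statistical_hypotheses(text):
    """Score each intent by the fraction of its keywords found in the text."""
    tokens = set(text.lower().split())
    hyps = []
    for intent, words in INTENT_KEYWORDS.items():
        overlap = len(tokens & words)
        if overlap:
            hyps.append(Hypothesis("statistical", intent, overlap / len(words)))
    return hyps


def rerank(hyps, fst_weight=1.0, stat_weight=0.8):
    """Pick the hypothesis with the highest source-weighted score, or None."""
    weights = {"fst": fst_weight, "statistical": stat_weight}
    return max(hyps, key=lambda h: h.score * weights[h.source], default=None)


def interpret(text, context):
    """Combine both hypothesis sources and return the reranker's choice."""
    return rerank(fst_hypotheses(text, context) + statistical_hypotheses(text))
```

In this sketch, `interpret("play the next one", {"music_playing": True})` follows the FST path, while the same text with an empty context falls back to the statistical hypothesis because the context needed to traverse the path is absent.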
-
Publication No.: US20210358497A1
Publication Date: 2021-11-18
Application No.: US17321999
Filing Date: 2021-05-17
Applicant: Amazon Technologies, Inc.
Inventor: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
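The two post-processing paths named in this abstract (softmax, smoothing, and spike detection for the wakeword model; a sigmoid and a classifier for the acoustic-event model) can be sketched in Python as follows. The abstract does not specify the smoothing method, the spike criterion, or the classifier, so the moving average, the local-maximum rule, the mean-probability threshold, and all default values below are assumptions chosen only to make the idea concrete.

```python
import math


def softmax(logits):
    """Numerically stable softmax over one frame's class logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def moving_average(values, window=3):
    """Causal moving average; early frames use however many values exist."""
    out = []
    for i in range(len(values)):
        chunk = values[max(0, i - window + 1): i + 1]
        out.append(sum(chunk) / len(chunk))
    return out


def detect_spikes(values, threshold=0.6):
    """Return indices that exceed the threshold and are local maxima."""
    spikes = []
    for i in range(1, len(values) - 1):
        above = values[i] >= threshold
        if above and values[i] > values[i - 1] and values[i] >= values[i + 1]:
            spikes.append(i)
    return spikes


def wakeword_frames(frame_logits, wake_index=1):
    """Softmax each frame, smooth the wakeword posterior, find spikes."""
    posteriors = [softmax(logits)[wake_index] for logits in frame_logits]
    return detect_spikes(moving_average(posteriors))


def event_detected(frame_logits, threshold=0.5):
    """Assumed classifier: threshold the mean per-frame sigmoid output."""
    probs = [sigmoid(x) for x in frame_logits]
    return sum(probs) / len(probs) >= threshold
```

Both functions would consume outputs computed from the same acoustic features (such as LFBE); here the model outputs are passed in directly as logits, since the models themselves are outside the scope of the sketch.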
-
Publication No.: US10679621B1
Publication Date: 2020-06-09
Application No.: US15927764
Filing Date: 2018-03-21
Applicant: Amazon Technologies, Inc.
Inventor: Shiva Kumar Sundaram, Minhua Wu, Anirudh Raju, Spyridon Matsoukas, Arindam Mandal, Kenichi Kumatani
IPC: G10L15/22, G10L15/187, G10L15/26, G10L15/30, H04R3/00, G10L21/0208, G06F40/40, H04W4/02, G10L21/0216, G10L15/08
Abstract: Systems and methods for utilizing microphone array information for acoustic modeling are disclosed. Audio data may be received from a device having a microphone array configuration. Microphone configuration data may also be received that indicates the configuration of the microphone array. The microphone configuration data may be utilized as an input vector to an acoustic model, along with the audio data, to generate phoneme data. Additionally, the microphone configuration data may be utilized to train and/or generate acoustic models, select an acoustic model to perform speech recognition with, and/or to improve trigger sound detection.
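One way to picture microphone configuration data being used "as an input vector to an acoustic model, along with the audio data" is to encode the configuration as a fixed-length vector and append it to each frame of audio features. The Python sketch below does only that; the chosen fields (microphone count, geometry, spacing), the `GEOMETRIES` categories, and the scaling are hypothetical and are not described in the abstract.

```python
# Hypothetical geometry categories used for a one-hot encoding.
GEOMETRIES = ("linear", "circular", "planar")


def encode_mic_config(num_mics, geometry, spacing_mm):
    """Encode an assumed microphone-array description as a flat vector."""
    if geometry not in GEOMETRIES:
        raise ValueError(f"unknown geometry: {geometry!r}")
    one_hot = [1.0 if geometry == g else 0.0 for g in GEOMETRIES]
    # Spacing is scaled by an arbitrary constant to keep values small.
    return [float(num_mics)] + one_hot + [spacing_mm / 100.0]


def build_model_inputs(audio_frames, config_vector):
    """Append the same configuration vector to every audio feature frame."""
    return [list(frame) + list(config_vector) for frame in audio_frames]
```

For example, `build_model_inputs(frames, encode_mic_config(7, "circular", 40.0))` yields one input row per frame, each ending with the same five configuration values, which an acoustic model could then consume when generating phoneme data.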
-
-