-
Publication No.: US11823659B2
Publication Date: 2023-11-21
Application No.: US16711046
Application Date: 2019-12-11
Applicant: Amazon Technologies, Inc.
Inventor: Julia Reinspach, Oleg Rokhlenko, Ramakanthachary Gottumukkala, Giovanni Clemente, Ankit Agrawal, Swayam Bhardwaj, Guy Michaeli, Vaidyanathan Puthucode Krishnamoorthy, Costantino Vlachos, Nalledath P. Vinodkrishnan, Shaun M. Vickers, Sethuraman Ramachandran, Charles C. Moore
IPC: G10L15/06, G10L15/01, G10L15/02, G10L15/187, G10L15/22
CPC classification number: G10L15/063, G10L15/01, G10L15/02, G10L15/187, G10L15/22, G10L2015/025, G10L2015/0635, G10L2015/223
Abstract: A request including audio data is received from a voice-enabled device. A string of phonemes present in the utterance is determined through speech recognition. At a later time, a subsequent user input corresponding to the request may be received, in which the user input is associated with one or more text keywords. The subsequent user input may be obtained in response to an active request. Alternatively, feedback may not be actively elicited, but rather collected passively. However it is obtained, the one or more keywords associated with the subsequent user input may be associated with the string of phonemes to indicate that the user is saying or means those words when they produce that string of phonemes. A user-specific speech recognition key for the user account is then updated to associate the string of phonemes with these words. A general speech recognition model can also be trained using the association.
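The association step described in this abstract can be sketched roughly as follows. This is a minimal illustration only, not the patented implementation: the class name `UserSpeechKey`, its methods, the ARPAbet-style phoneme symbols, and the choice to rank keywords by how often they were confirmed are all assumptions introduced here.

```python
from collections import Counter, defaultdict


class UserSpeechKey:
    """Hypothetical per-account store linking phoneme strings to keywords."""

    def __init__(self):
        # Maps a phoneme tuple to a Counter of keyword tuples seen with it.
        self._associations = defaultdict(Counter)

    def associate(self, phonemes, keywords):
        """Record that later user input tied these keywords to the phonemes."""
        self._associations[tuple(phonemes)][tuple(keywords)] += 1

    def lookup(self, phonemes):
        """Return the most frequently confirmed keywords, or None if unseen."""
        counts = self._associations.get(tuple(phonemes))
        if not counts:
            return None
        best_keywords, _ = counts.most_common(1)[0]
        return list(best_keywords)

    def training_pairs(self):
        """Yield (phonemes, keywords, count) rows that a general model
        could consume as additional training data (assumed format)."""
        for phonemes, counts in self._associations.items():
            for keywords, count in counts.items():
                yield list(phonemes), list(keywords), count


# Usage: the user said something recognized as S-AA-K-S, then later
# selected or typed "socks" (actively requested or passively observed).
key = UserSpeechKey()
key.associate(["S", "AA", "K", "S"], ["socks"])
key.associate(["S", "AA", "K", "S"], ["socks"])
key.associate(["S", "AA", "K", "S"], ["sox"])
```

Counting confirmations rather than overwriting the entry lets a single mistaken correction be outvoted by later feedback; whether the patent weighs feedback this way is not stated in the abstract.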
-
Publication No.: US11694682B1
Publication Date: 2023-07-04
Application No.: US16711043
Application Date: 2019-12-11
Applicant: Amazon Technologies, Inc.
Inventor: Julia Reinspach, Oleg Rokhlenko, Ramakanthachary Gottumukkala, Giovanni Clemente, Ankit Agrawal, Swayam Bhardwaj, Guy Michaeli, Vaidyanathan Puthucode Krishnamoorthy, Costantino Vlachos, Nalledath P. Vinodkrishnan, Shaun M. Vickers, Sethuraman Ramachandran, Charles C. Moore
IPC: G10L15/22, G10L15/30, G06F3/16, G06F3/0484, G06Q30/06, G06F3/0481, G10L15/19, G10L15/18, G06Q30/0601
CPC classification number: G10L15/22, G06F3/0481, G06F3/0484, G06F3/167, G06Q30/0625, G10L15/1815, G10L15/19, G10L15/222, G10L15/30, G10L2015/223
Abstract: In various embodiments, a voice command is associated with a plurality of processing steps to be performed. The plurality of processing steps may include analysis of audio data using automatic speech recognition, generating and selecting a search query from the utterance text, and conducting a search of a database of items using the search query. The plurality of processing steps may include additional or different steps, depending on the type of the request. In performing one or more of these processing steps, an error or ambiguity may be detected. An error or ambiguity may either halt the processing step or create more than one path of actions. A model may be used to determine if and how to request additional user input to attempt to resolve the error or ambiguity. The voice-enabled device or a second client device is then caused to output a request for the additional user input.
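The flow in this abstract (run the processing steps, detect an error or ambiguity, decide whether to ask the user) can be sketched as below. Everything named here is hypothetical: `StepResult`, `Outcome`, `should_ask`, `run_pipeline`, and the 0.2 score margin are illustrative assumptions, and the simple threshold rule stands in for the model that the abstract mentions but does not specify.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class StepResult:
    candidates: List[str]                       # possible outputs of one step
    scores: List[float] = field(default_factory=list)


@dataclass
class Outcome:
    status: str                                 # "ok" or "ask_user"
    value: Optional[str] = None
    prompt: Optional[str] = None


def should_ask(result: StepResult, margin: float = 0.2) -> bool:
    """Stand-in for the decision model: ask on error (no candidates) or
    on ambiguity (several candidates with close or missing scores)."""
    if not result.candidates:
        return True
    if len(result.candidates) == 1:
        return False
    if len(result.scores) < 2:
        return True
    top, second = sorted(result.scores, reverse=True)[:2]
    return (top - second) < margin


def run_pipeline(audio, steps: List[Tuple[str, Callable]]) -> Outcome:
    """Run each (name, step) in order, stopping to request user input."""
    value = audio
    for name, step in steps:
        result = step(value)
        if should_ask(result):
            if result.candidates:
                prompt = "Did you mean: " + " or ".join(result.candidates) + "?"
            else:
                prompt = f"The {name} step failed. Could you repeat that?"
            return Outcome("ask_user", prompt=prompt)
        if result.scores:
            best = max(range(len(result.scores)), key=result.scores.__getitem__)
            value = result.candidates[best]
        else:
            value = result.candidates[0]
    return Outcome("ok", value=value)


# Usage with toy steps: speech recognition is confident, search is ambiguous.
steps = [
    ("speech recognition", lambda audio: StepResult(["paper towels"], [0.9])),
    ("search", lambda query: StepResult(["item-1", "item-2"], [0.55, 0.50])),
]
outcome = run_pipeline(b"", steps)
```

In this sketch the returned prompt would be handed to whichever device is chosen to present it; routing between the voice-enabled device and a second client device is left out.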
-
Publication No.: US20210183366A1
Publication Date: 2021-06-17
Application No.: US16711046
Application Date: 2019-12-11
Applicant: Amazon Technologies, Inc.
Inventor: Julia Reinspach, Oleg Rokhlenko, Ramakanthachary Gottumukkala, Giovanni Clemente, Ankit Agrawal, Swayam Bhardwaj, Guy Michaeli, Vaidyanathan Puthucode Krishnamoorthy, Costantino Vlachos, Nalledath P. Vinodkrishnan, Shaun M. Vickers, Sethuraman Ramachandran, Charles C. Moore
IPC: G10L15/06, G10L15/22, G10L15/02, G10L15/01, G10L15/187
Abstract: A request including audio data is received from a voice-enabled device. A string of phonemes present in the utterance is determined through speech recognition. At a later time, a subsequent user input corresponding to the request may be received, in which the user input is associated with one or more text keywords. The subsequent user input may be obtained in response to an active request. Alternatively, feedback may not be actively elicited, but rather collected passively. However it is obtained, the one or more keywords associated with the subsequent user input may be associated with the string of phonemes to indicate that the user is saying or means those words when they produce that string of phonemes. A user-specific speech recognition key for the user account is then updated to associate the string of phonemes with these words. A general speech recognition model can also be trained using the association.
-