-
Publication number: US20160180853A1
Publication date: 2016-06-23
Application number: US14578056
Application date: 2014-12-19
Applicant: Amazon Technologies, Inc.
Inventor: Peter Spalding VanLund, Kurt Wesley Piersol, James David Meyers, Jacob Michael Simpson, Vikram Kumar Gundeti, David Robert Thomas, Andrew Christopher Miles
IPC: G10L17/22
CPC classification number: G10L17/22, G06F9/5011, G10L15/22, G10L2015/223, G10L2015/228
Abstract: A speech-based system includes an audio device in a user premises and a network-based service that supports use of the audio device by multiple applications. The audio device may be directed to play audio content such as music, audio books, etc. The audio device may also be directed to interact with a user through speech. The network-based service monitors event messages received from the audio device to determine which of the multiple applications currently has speech focus. When receiving speech from a user, the service first offers the corresponding meaning to the application, if any, that currently has primary speech focus. If there is no application that currently has primary speech focus, or if the application having primary speech focus is not able to respond to the meaning, the service then offers the user meaning to the application that currently has secondary speech focus.
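The routing order described in this abstract (primary speech focus first, then secondary) can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the patent: the `App` and `FocusRouter` classes, the event names (`speech_started`, `speech_finished`, `playback_started`, `playback_stopped`), and the idea that an application exposes a `can_handle` predicate.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class App:
    """Hypothetical application registered with the network-based service."""
    name: str
    can_handle: Callable[[str], bool]  # True if the app can respond to a meaning


class FocusRouter:
    """Tracks speech focus from device event messages and routes meanings."""

    def __init__(self) -> None:
        self.primary: Optional[App] = None
        self.secondary: Optional[App] = None

    def on_event(self, app: App, event: str) -> None:
        # Event names are illustrative assumptions, not taken from the patent.
        if event == "speech_started":
            self.primary = app
        elif event == "speech_finished" and self.primary is app:
            self.primary = None
        elif event == "playback_started":
            self.secondary = app
        elif event == "playback_stopped" and self.secondary is app:
            self.secondary = None

    def route(self, meaning: str) -> Optional[str]:
        """Offer the meaning to the primary-focus app, then the secondary."""
        for app in (self.primary, self.secondary):
            if app is not None and app.can_handle(meaning):
                return app.name
        return None  # no application with focus could respond
```

In this sketch, an application that is speaking to the user holds primary focus and one that is playing audio content holds secondary focus; that mapping is one plausible reading of the abstract rather than a statement of the claimed method.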
-
Publication number: US09240187B2
Publication date: 2016-01-19
Application number: US14642365
Application date: 2015-03-09
Applicant: Amazon Technologies, Inc.
Inventor: Fred Torok, Frédéric Johan Georges Deramat, Vikram Kumar Gundeti
CPC classification number: G10L15/26, G06F17/30684, G06F17/3074, G06F17/30746, G06F17/30778, G10L15/08, G10L15/222, G10L15/30
Abstract: Features are disclosed for generating markers for elements or other portions of an audio presentation so that a speech processing system may determine which portion of the audio presentation a user utterance refers to. For example, an utterance may include a pronoun with no explicit antecedent. The marker may be used to associate the utterance with the corresponding content portion for processing. The markers can be provided to a client device with a text-to-speech (“TTS”) presentation. The markers may then be provided to a speech processing system along with a user utterance captured by the client device. The speech processing system, which may include automatic speech recognition (“ASR”) modules and/or natural language understanding (“NLU”) modules, can generate hints based on the marker. The hints can be provided to the ASR and/or NLU modules in order to aid in processing the meaning or intent of a user utterance.
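As a rough illustration of how a marker could tie an utterance to a portion of an audio presentation, the Python sketch below attaches a marker to each element, looks up the marker that was active when the user spoke, and uses it as a hint to resolve a pronoun. All names (`MarkedElement`, `build_presentation`, `marker_at`, `resolve_with_hint`), the fixed per-element duration, and the simple pronoun substitution are assumptions made for illustration; the abstract only states that hints are passed to ASR and/or NLU modules and does not specify how they are used.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class MarkedElement:
    marker: str     # identifier sent to the client alongside the TTS audio
    text: str       # content portion the marker refers to
    start_s: float  # playback offset at which this element begins


def build_presentation(items: List[str], seconds_per_item: float) -> List[MarkedElement]:
    """Assign a marker to each element (assumes a fixed duration per element)."""
    return [
        MarkedElement(marker=f"m{i}", text=text, start_s=i * seconds_per_item)
        for i, text in enumerate(items)
    ]


def marker_at(elements: List[MarkedElement], t: float) -> Optional[str]:
    """Return the marker of the element playing at time t, if any."""
    current = None
    for element in elements:
        if element.start_s <= t:
            current = element.marker
    return current


def resolve_with_hint(utterance: str, marker: Optional[str],
                      elements: List[MarkedElement]) -> str:
    """Stand-in for NLU: replace a bare pronoun with the marked content."""
    by_marker = {e.marker: e.text for e in elements}
    hint = by_marker.get(marker) if marker is not None else None
    if hint is None:
        return utterance
    pronouns = {"that", "it", "this"}
    return " ".join(hint if w.lower() in pronouns else w for w in utterance.split())
```

For example, if a two-item list is being read aloud and the user says "delete that" during the second item, the client would send the second item's marker with the utterance, and the hint would let downstream processing bind "that" to the second item.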
-
Publication number: US20150179175A1
Publication date: 2015-06-25
Application number: US14642365
Application date: 2015-03-09
Applicant: Amazon Technologies, Inc.
Inventor: Fred Torok, Frédéric Johan Georges Deramat, Vikram Kumar Gundeti
CPC classification number: G10L15/26, G06F17/30684, G06F17/3074, G06F17/30746, G06F17/30778, G10L15/08, G10L15/222, G10L15/30
Abstract: Features are disclosed for generating markers for elements or other portions of an audio presentation so that a speech processing system may determine which portion of the audio presentation a user utterance refers to. For example, an utterance may include a pronoun with no explicit antecedent. The marker may be used to associate the utterance with the corresponding content portion for processing. The markers can be provided to a client device with a text-to-speech (“TTS”) presentation. The markers may then be provided to a speech processing system along with a user utterance captured by the client device. The speech processing system, which may include automatic speech recognition (“ASR”) modules and/or natural language understanding (“NLU”) modules, can generate hints based on the marker. The hints can be provided to the ASR and/or NLU modules in order to aid in processing the meaning or intent of a user utterance.
-
Publication number: US08977555B2
Publication date: 2015-03-10
Application number: US13723026
Application date: 2012-12-20
Applicant: Amazon Technologies, Inc.
Inventor: Fred Torok, Frédéric Johan Georges Deramat, Vikram Kumar Gundeti
CPC classification number: G10L15/26, G06F17/30684, G06F17/3074, G06F17/30746, G06F17/30778, G10L15/08, G10L15/222, G10L15/30
Abstract: Features are disclosed for generating markers for elements or other portions of an audio presentation so that a speech processing system may determine which portion of the audio presentation a user utterance refers to. For example, an utterance may include a pronoun with no explicit antecedent. The marker may be used to associate the utterance with the corresponding content portion for processing. The markers can be provided to a client device with a text-to-speech (“TTS”) presentation. The markers may then be provided to a speech processing system along with a user utterance captured by the client device. The speech processing system, which may include automatic speech recognition (“ASR”) modules and/or natural language understanding (“NLU”) modules, can generate hints based on the marker. The hints can be provided to the ASR and/or NLU modules in order to aid in processing the meaning or intent of a user utterance.
-