Accelerating agent performance in a natural language processing system

    Publication number: US11314942B1

    Publication date: 2022-04-26

    Application number: US16825856

    Filing date: 2020-03-20

    Abstract: A computer-implemented method for providing agent assisted transcriptions of user utterances. A user utterance is received in response to a prompt provided to the user at a remote client device. An automatic transcription is generated from the utterance using a language model based upon an application or context, and presented to a human agent. The agent reviews the transcription and may replace at least a portion of the transcription with a corrected transcription. As the agent inputs the corrected transcription, accelerants are presented to the user comprising suggested text to be input. The accelerants may be determined based upon an agent input, an application or context of the transcription, the portion of the transcription being replaced, or any combination thereof. In some cases, the user provides textual input, and the agent transcribes an intent associated with that input with the aid of one or more accelerants.
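
    The accelerant step can be illustrated with a minimal Python sketch. It assumes suggestions are drawn from a per-context phrase list and matched by prefix against what the agent has typed so far. The abstract does not say how suggestions are stored or ranked, so `CONTEXT_PHRASES` and `suggest_accelerants` are hypothetical names used only for illustration.

```python
from typing import Dict, List

# Hypothetical per-context phrase lists. The abstract does not specify how
# candidate suggestions are stored or ranked; this is an assumption.
CONTEXT_PHRASES: Dict[str, List[str]] = {
    "banking": ["check my balance", "check recent transactions", "transfer funds"],
    "travel": ["change my flight", "check in online", "cancel my booking"],
}


def suggest_accelerants(agent_input: str, context: str, limit: int = 3) -> List[str]:
    """Return up to `limit` phrases for `context` that start with the agent's input."""
    prefix = agent_input.strip().lower()
    if not prefix:
        # Nothing typed yet, so there is nothing to complete.
        return []
    candidates = CONTEXT_PHRASES.get(context, [])
    return [phrase for phrase in candidates if phrase.startswith(prefix)][:limit]
```

    Calling `suggest_accelerants("check", "banking")` returns the two banking phrases that begin with "check". A production system would more plausibly rank candidates with the language model mentioned in the abstract rather than by list order.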

    System and method for audibly presenting selected text
    Type: Granted invention patent
    Legal status: In force

    Publication number: US09117445B2

    Publication date: 2015-08-25

    Application number: US13943242

    Filing date: 2013-07-16

    Abstract: Disclosed herein are methods for presenting speech from a selected text that is on a computing device. This method includes presenting text on a touch-sensitive display and having that text size within a threshold level so that the computing device can accurately determine the intent of the user when the user touches the touch screen. Once the user touch has been received, the computing device identifies and interprets the portion of text that is to be selected, and subsequently presents the text audibly to the user.

    System and method for recognizing speech with dialect grammars
    Type: Granted invention patent
    Legal status: In force

    Publication number: US09082405B2

    Publication date: 2015-07-14

    Application number: US14554164

    Filing date: 2014-11-26

    CPC classification number: G10L15/19 G10L15/005 G10L15/1822 G10L15/183

    Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable media for recognizing speech. The method includes receiving speech from a user, perceiving at least one speech dialect in the received speech, selecting at least one grammar from a plurality of optimized dialect grammars based on at least one score associated with the perceived speech dialect and the perceived at least one speech dialect, and recognizing the received speech with the selected at least one grammar. Selecting at least one grammar can be further based on a user profile. Multiple grammars can be blended. Predefined parameters can include pronunciation differences, vocabulary, and sentence structure. Optimized dialect grammars can be domain specific. The method can further include recognizing initial received speech with a generic grammar until an optimized dialect grammar is selected. Selecting at least one grammar from a plurality of optimized dialect grammars can be based on a certainty threshold.
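
    The selection logic in this abstract (score-based choice, a certainty threshold, and a generic grammar used until a dialect grammar is selected) can be sketched in Python as follows. The threshold value, the grammar identifiers, and the function name `select_grammar` are assumptions made for illustration and are not taken from the patent.

```python
from typing import Dict

# Placeholder identifier for the generic grammar used before a dialect
# grammar has been selected (hypothetical name).
GENERIC_GRAMMAR = "generic"


def select_grammar(
    dialect_scores: Dict[str, float],
    dialect_grammars: Dict[str, str],
    certainty_threshold: float = 0.7,  # assumed value, not from the patent
) -> str:
    """Pick the grammar of the best-scoring dialect if its score is high enough."""
    if not dialect_scores:
        return GENERIC_GRAMMAR
    best_dialect, best_score = max(dialect_scores.items(), key=lambda item: item[1])
    if best_score >= certainty_threshold and best_dialect in dialect_grammars:
        return dialect_grammars[best_dialect]
    # Not certain enough yet: keep recognizing with the generic grammar.
    return GENERIC_GRAMMAR
```

    This sketch returns a single grammar. The abstract also allows blending multiple grammars and using a user profile, neither of which is modeled here.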

    System and method for standardized speech recognition infrastructure
    Type: Granted invention patent
    Legal status: In force

    Publication number: US09053704B2

    Publication date: 2015-06-09

    Application number: US14330739

    Filing date: 2014-07-14

    CPC classification number: G10L15/075 G10L15/063 G10L15/065 G10L15/07 G10L15/08

    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for selecting a speech recognition model in a standardized speech recognition infrastructure. The system receives speech from a user, and if a user-specific supervised speech model associated with the user is available, retrieves the supervised speech model. If the user-specific supervised speech model is unavailable and if an unsupervised speech model is available, the system retrieves the unsupervised speech model. If the user-specific supervised speech model and the unsupervised speech model are unavailable, the system retrieves a generic speech model associated with the user. Next the system recognizes the received speech from the user with the retrieved model. In one embodiment, the system trains a speech recognition model in a standardized speech recognition infrastructure. In another embodiment, the system handshakes with a remote application in a standardized speech recognition infrastructure.
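
    The fallback order described above (user-specific supervised model, then unsupervised model, then generic model) can be sketched in Python. Storing models in dictionaries keyed by user ID, and keying the unsupervised models per user, are assumptions for illustration. The function name `retrieve_speech_model` is hypothetical.

```python
from typing import Dict


def retrieve_speech_model(
    user_id: str,
    supervised_models: Dict[str, str],
    unsupervised_models: Dict[str, str],
    generic_model: str,
) -> str:
    """Return the most specific speech model available for `user_id`."""
    # 1. Prefer the user-specific supervised model.
    if user_id in supervised_models:
        return supervised_models[user_id]
    # 2. Otherwise fall back to an unsupervised model, if one exists.
    if user_id in unsupervised_models:
        return unsupervised_models[user_id]
    # 3. Otherwise use the generic model.
    return generic_model
```

    The received speech would then be recognized with whichever model this function returns.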

    Apparatus and method for processing service interactions
    Type: Granted invention patent
    Legal status: In force

    Publication number: US08332231B2

    Publication date: 2012-12-11

    Application number: US12551864

    Filing date: 2009-09-01

    Abstract: An interactive voice and data response system that directs input to a voice, text, and web-capable software-based router, which is able to intelligently respond to the input by drawing on a combination of human agents, advanced speech recognition and expert systems, connected to the router via a TCP/IP network. The digitized input is broken down into components so that the customer interaction is managed as a series of small tasks performed by a pool of human agents, rather than one ongoing conversation between the customer and a single agent. The router manages the interactions and keeps pace with a real-time conversation. The system utilizes both speech recognition and human intelligence for purposes of interpreting customer utterances or customer text, where the role of the human agent(s) is to input the intent of caller utterances, and where the computer system—not the human agent—determines which response to provide given the customer's stated intent (as interpreted/captured by the human agents). The system may use more than one human agent, or both human agents and speech recognition software, to interpret simultaneously the same component for error-checking and interpretation accuracy.
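
    The error-checking idea in the last sentence, where several human agents or speech recognizers interpret the same component, can be sketched in Python as a simple agreement check. The abstract does not say how differing interpretations are reconciled, so the majority rule, the `min_agreement` parameter, and the name `reconcile_intents` are all assumptions.

```python
from collections import Counter
from typing import List, Optional


def reconcile_intents(interpretations: List[str], min_agreement: int = 2) -> Optional[str]:
    """Return the most common intent if enough interpreters agree, else None."""
    if not interpretations:
        return None
    intent, count = Counter(interpretations).most_common(1)[0]
    # Without sufficient agreement, return None so the caller can
    # route the component to another interpreter.
    return intent if count >= min_agreement else None
```

    A result of `None` signals that the interpreters disagreed. In the architecture described above, the router, not a human agent, would then decide what to do next.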

    Hierarchical speech recognition decoder

    Publication number: US10482876B2

    Publication date: 2019-11-19

    Application number: US16148884

    Filing date: 2018-10-01

    Abstract: A speech interpretation module interprets the audio of user utterances as sequences of words. To do so, the speech interpretation module parameterizes a literal corpus of expressions by identifying portions of the expressions that correspond to known concepts, and generates a parameterized statistical model from the resulting parameterized corpus. When speech is received the speech interpretation module uses a hierarchical speech recognition decoder that uses both the parameterized statistical model and language sub-models that specify how to recognize a sequence of words. The separation of the language sub-models from the statistical model beneficially reduces the size of the literal corpus needed for training, reduces the size of the resulting model, provides more fine-grained interpretation of concepts, and improves computational efficiency by allowing run-time incorporation of the language sub-models.
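
    The corpus parameterization step can be illustrated with a small Python sketch that replaces spans matching known concepts with placeholders. The regular expressions, the placeholder names `<DAY>` and `<NUMBER>`, and the function `parameterize` are assumptions for illustration. The patent's actual concept detection is not described in the abstract.

```python
import re
from typing import Dict

# Hypothetical concept patterns. A real system would use richer language
# sub-models rather than regular expressions.
CONCEPT_PATTERNS: Dict[str, str] = {
    "<DAY>": r"\b(?:monday|tuesday|wednesday|thursday|friday|saturday|sunday)\b",
    "<NUMBER>": r"\b\d+\b",
}


def parameterize(expression: str) -> str:
    """Replace spans that match a known concept with that concept's placeholder."""
    result = expression.lower()
    for placeholder, pattern in CONCEPT_PATTERNS.items():
        result = re.sub(pattern, placeholder, result)
    return result
```

    Under these assumptions, "Book 2 tickets for Friday" and "Book 4 tickets for Monday" both become `book <NUMBER> tickets for <DAY>`. This shows why a parameterized corpus can be smaller than a literal one: many literal expressions collapse into one template, and the concept slots are left to the sub-models.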
