System and Method for Talking Avatar

    Publication No.: US20210082452A1

    Publication Date: 2021-03-18

    Application No.: US17015902

    Filing Date: 2020-09-09

    Abstract: Aspects of this disclosure provide techniques for generating a viseme and corresponding intensity pair. In some embodiments, the method includes generating, by a server, a viseme and corresponding intensity pair based on at least one of a clean vocal track or a corresponding transcription. The method may include generating, by the server, a compressed audio file based on at least one of the viseme, the corresponding intensity, music, or a visual offset. The method may further include generating, by the server or a client-end application, a buffer of raw pulse-code modulated (PCM) data based on decoding at least a part of the compressed audio file, where the viseme is scheduled to align with a corresponding phoneme.
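    The abstract stops at the claim language, but the first step, pairing each viseme with an intensity drawn from the clean vocal track, can be sketched roughly as follows. This is a minimal illustration, not the patented method: the phoneme-to-viseme table, the VisemeEvent fields, and the loudness-based intensity are all assumptions.

        from dataclasses import dataclass

        # Hypothetical phoneme-to-viseme table; the disclosure does not publish one here.
        PHONEME_TO_VISEME = {"AA": "aa", "B": "p_b_m", "F": "f_v", "S": "s_z", "SIL": "rest"}

        @dataclass
        class VisemeEvent:
            viseme: str       # mouth-shape identifier
            intensity: float  # 0.0 (rest) .. 1.0 (fully articulated)
            start: float      # seconds into the vocal track

        def viseme_intensity_pairs(aligned_phonemes, rms_envelope):
            """aligned_phonemes: [(phoneme, start_sec)]; rms_envelope: sec -> loudness in 0..1."""
            events = []
            for phoneme, start in aligned_phonemes:
                viseme = PHONEME_TO_VISEME.get(phoneme, "rest")
                # Drive the intensity from the clean vocal track's loudness at that instant.
                intensity = 0.0 if viseme == "rest" else min(1.0, rms_envelope(start))
                events.append(VisemeEvent(viseme, intensity, start))
            return events

        print(viseme_intensity_pairs([("SIL", 0.0), ("B", 0.12), ("AA", 0.2)],
                                     lambda t: 0.4 + 0.5 * t))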

    Method and system for reading fluency training

    Publication No.: US10210769B2

    Publication Date: 2019-02-19

    Application No.: US15243479

    Filing Date: 2016-08-22

    Abstract: A non-transitory processor-readable medium stores code representing instructions to be executed by a processor. The code causes the processor to receive a request from a user of a client device to initiate a speech recognition engine for a web page displayed at the client device. In response to the request, the code causes the processor to (1) download, from a server associated with a first party, the speech recognition engine into the client device; and then (2) analyze, using the speech recognition engine, content of the web page including text in an identified language to produce analyzed content based on the identified language, where the content of the web page is received from a server associated with a second party. The code further causes the processor to send a signal to cause the client device to present the analyzed content to the user at the client device.
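    As a rough sketch of the claimed sequence only (the URL, function names, and the stubbed analysis are placeholders, not the actual implementation), the on-demand flow could look like this:

        import urllib.request

        # Placeholder URL for the first-party server that hosts the recognition engine.
        FIRST_PARTY_ENGINE_URL = "https://first-party.example/speech-engine.pkg"

        def handle_fluency_request(page_text: str, language: str) -> dict:
            """Mirror the claimed order: (1) download the engine from the first-party
            server when the user asks, then (2) analyze the second-party page content
            in the identified language."""
            engine_blob = urllib.request.urlopen(FIRST_PARTY_ENGINE_URL).read()  # step (1)
            return analyze_with_engine(engine_blob, page_text, language)          # step (2)

        def analyze_with_engine(engine_blob: bytes, text: str, language: str) -> dict:
            # Stand-in for the downloaded engine's analysis of the page text.
            return {"language": language, "characters": len(text), "engine_bytes": len(engine_blob)}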

    PRODUCING CONTROLLED VARIATIONS IN AUTOMATED TEACHING SYSTEM INTERACTIONS
    Status: Pending (published application)

    Publication No.: US20140170629A1

    Publication Date: 2014-06-19

    Application No.: US14101073

    Filing Date: 2013-12-09

    IPC Classification: G09B19/06

    CPC Classification: G09B19/06 G06F17/279 G09B7/02

    Abstract: The content of an instructor-student interaction set in an automated teaching system is represented in a graph-based format. In a graph-based representation, variations can not only branch away from each other at a node (branching point), as in a tree-based representation, but also merge back together. This not only makes the structure more compact, but also increases the number of variations that can be represented in the content while eliminating the need to author each variation individually.
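    A small worked example of why merging matters; the node names and graph below are invented for illustration and are not from the application. Two openings and two wrap-ups sharing a single hint node yield four distinct interaction paths from only six authored nodes, whereas a tree would need the shared continuation duplicated under each branch.

        # Directed acyclic graph of instructor prompts: variations branch at a node
        # and can merge back, so shared continuations are authored only once.
        GRAPH = {
            "greet":           ["ask_formal", "ask_casual"],  # branching point
            "ask_formal":      ["give_hint"],                 # both branches merge here
            "ask_casual":      ["give_hint"],
            "give_hint":       ["wrap_up_praise", "wrap_up_neutral"],
            "wrap_up_praise":  [],
            "wrap_up_neutral": [],
        }

        def count_variations(node: str) -> int:
            """Number of distinct start-to-finish interaction paths from `node`."""
            children = GRAPH[node]
            if not children:
                return 1
            return sum(count_variations(child) for child in children)

        print(count_variations("greet"))  # -> 4 variations from 6 authored nodes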

    METHOD AND SYSTEM FOR READING FLUENCY TRAINING
    Status: Granted

    Publication No.: US20140067367A1

    Publication Date: 2014-03-06

    Application No.: US14020385

    Filing Date: 2013-09-06

    IPC Classification: G10L15/00

    Abstract: A non-transitory processor-readable medium stores code representing instructions to be executed by a processor. The code causes the processor to receive a request from a user of a client device to initiate a speech recognition engine for a web page displayed at the client device. In response to the request, the code causes the processor to (1) download, from a server associated with a first party, the speech recognition engine into the client device; and then (2) analyze, using the speech recognition engine, content of the web page including text in an identified language to produce analyzed content based on the identified language, where the content of the web page is received from a server associated with a second party. The code further causes the processor to send a signal to cause the client device to present the analyzed content to the user at the client device.
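    The analysis step itself might be sketched as below, complementing the download-flow sketch under the granted patent above; the tokenization and the output fields are assumptions for illustration, not the claimed engine.

        import re

        def analyze_content(page_text: str, language: str) -> list:
            """Segment second-party page text into trackable word tokens that the
            recognizer can match against the reader's speech, one word at a time."""
            words = re.findall(r"\w+", page_text)  # naive; a real engine is language-aware
            return [{"index": i, "word": w, "language": language, "read_aloud": False}
                    for i, w in enumerate(words)]

        print(analyze_content("The quick brown fox jumps.", "en")[:2])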

    SYSTEMS AND METHODS FOR MODELING L1-SPECIFIC PHONOLOGICAL ERRORS IN COMPUTER-ASSISTED PRONUNCIATION TRAINING SYSTEM
    Status: Pending (published application)

    Publication No.: US20140006029A1

    Publication Date: 2014-01-02

    Application No.: US13932506

    Filing Date: 2013-07-01

    IPC Classification: G10L15/19

    Abstract: A non-transitory processor-readable medium storing code representing instructions to be executed by a processor includes code to cause the processor to receive acoustic data representing an utterance spoken by a language learner in a non-native language in response to prompting the language learner to recite a word in the non-native language, and to receive a pronunciation lexicon of the word in the non-native language. The pronunciation lexicon includes at least one alternative pronunciation of the word based on a pronunciation lexicon of a native language of the language learner. The code causes the processor to generate an acoustic model of the at least one alternative pronunciation in the non-native language and identify a mispronunciation of the word in the utterance based on a comparison of the acoustic data with the acoustic model. The code causes the processor to send feedback related to the mispronunciation of the word to the language learner.
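    One way the alternative pronunciations could be derived from native-language (L1) phonology is sketched below; the substitution rules and phone symbols are illustrative assumptions rather than the rules the system actually uses.

        # Hypothetical L1-specific substitutions: e.g. a learner whose native language
        # lacks /TH/ may realize it as /S/ or /T/.
        L1_SUBSTITUTIONS = {"TH": ["S", "T"], "V": ["W"], "R": ["L"]}

        def expand_lexicon(canonical: list) -> list:
            """Return the canonical pronunciation plus alternatives derived from the
            learner's native-language phonology (one substitution at a time)."""
            variants = [canonical]
            for i, phone in enumerate(canonical):
                for sub in L1_SUBSTITUTIONS.get(phone, []):
                    variants.append(canonical[:i] + [sub] + canonical[i + 1:])
            return variants

        # "three" -> TH R IY: canonical plus the TH->S, TH->T, and R->L alternatives.
        print(expand_lexicon(["TH", "R", "IY"]))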

    System and method for talking avatar

    Publication No.: US11600290B2

    Publication Date: 2023-03-07

    Application No.: US17015902

    Filing Date: 2020-09-09

    Abstract: Aspects of this disclosure provide techniques for generating a viseme and corresponding intensity pair. In some embodiments, the method includes generating, by a server, a viseme and corresponding intensity pair based on at least one of a clean vocal track or a corresponding transcription. The method may include generating, by the server, a compressed audio file based on at least one of the viseme, the corresponding intensity, music, or a visual offset. The method may further include generating, by the server or a client-end application, a buffer of raw pulse-code modulated (PCM) data based on decoding at least a part of the compressed audio file, where the viseme is scheduled to align with a corresponding phoneme.
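    For the playback side, the decode-and-align step might look roughly like the sketch below, complementing the viseme-generation sketch under the related published application above; the sample rate, sample format, and function names are assumptions, not details from the patent.

        SAMPLE_RATE = 44_100   # assumed playback rate
        CHANNELS = 1
        BYTES_PER_SAMPLE = 2   # 16-bit PCM

        def pcm_offset_for_time(t_seconds: float) -> int:
            """Byte offset into the raw PCM buffer at which a viseme scheduled for
            t_seconds should fire, so the mouth shape lands on its phoneme."""
            sample_index = round(t_seconds * SAMPLE_RATE)
            return sample_index * CHANNELS * BYTES_PER_SAMPLE

        def schedule(viseme_events, visual_offset_s: float = 0.0):
            """viseme_events: [(viseme, intensity, start_seconds)]; visual_offset_s
            stands in for the abstract's 'visual offset' input."""
            return [(v, i, pcm_offset_for_time(t + visual_offset_s))
                    for v, i, t in viseme_events]

        print(schedule([("p_b_m", 0.8, 0.12), ("aa", 1.0, 0.2)], visual_offset_s=0.03))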

    Generating acoustic models of alternative pronunciations for utterances spoken by a language learner in a non-native language

    Publication No.: US10068569B2

    Publication Date: 2018-09-04

    Application No.: US13932506

    Filing Date: 2013-07-01

    Abstract: A non-transitory processor-readable medium storing code representing instructions to be executed by a processor includes code to cause the processor to receive acoustic data representing an utterance spoken by a language learner in a non-native language in response to prompting the language learner to recite a word in the non-native language, and to receive a pronunciation lexicon of the word in the non-native language. The pronunciation lexicon includes at least one alternative pronunciation of the word based on a pronunciation lexicon of a native language of the language learner. The code causes the processor to generate an acoustic model of the at least one alternative pronunciation in the non-native language and identify a mispronunciation of the word in the utterance based on a comparison of the acoustic data with the acoustic model. The code causes the processor to send feedback related to the mispronunciation of the word to the language learner.
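    The comparison and feedback step might be sketched as below; the likelihood scores and the feedback wording are invented stand-ins for the patent's actual scoring and messaging.

        def detect_mispronunciation(scores: dict):
            """scores: likelihood of the utterance under the acoustic model for the
            canonical pronunciation and for each L1-derived alternative. If an
            alternative scores best, report it as the likely mispronunciation."""
            best = max(scores, key=scores.get)
            return None if best == "canonical" else best

        scores = {"canonical": -42.7, "TH->S": -39.1, "TH->T": -44.0}
        mistake = detect_mispronunciation(scores)
        if mistake:
            # Feedback sent back to the language learner.
            print(f"It sounds like you substituted '{mistake.split('->')[1]}'; try the target sound again.")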