Sequence classification for machine translation
    1.
    Invention application
    Status: Lapsed

    Publication No.: US20080162111A1

    Publication date: 2008-07-03

    Application No.: US11647080

    Filing date: 2006-12-28

    CPC classification number: G06F17/2818

    Abstract: Classification of sequences, such as the translation of natural language sentences, is carried out using an independence assumption. The independence assumption is an assumption that the probability of a correct translation of a source sentence word into a particular target sentence word is independent of the translation of other words in the sentence. Although this assumption is not a correct one, a high level of word translation accuracy is nonetheless achieved. In particular, discriminative training is used to develop models for each target vocabulary word based on a set of features of the corresponding source word in training sentences, with at least one of those features relating to the context of the source word. Each model comprises a weight vector for the corresponding target vocabulary word. The weights comprising the vectors are associated with respective ones of the features; each weight is a measure of the extent to which the presence of that feature for the source word makes it more probable that the target word in question is the correct one.

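The classification step in the abstract can be illustrated with a minimal sketch. It assumes a linear scoring rule in which each target vocabulary word has a weight vector, and every source position is classified on its own (the independence assumption). The feature set, the toy weights, and the names `extract_features`, `score`, `classify_words` and `MODELS` are hypothetical and not taken from the patent.

```python
def extract_features(source, i):
    """Hypothetical feature set: the source word plus its immediate neighbours (context)."""
    prev_w = source[i - 1] if i > 0 else "<s>"
    next_w = source[i + 1] if i + 1 < len(source) else "</s>"
    return [f"word={source[i]}", f"prev={prev_w}", f"next={next_w}"]


def score(weight_vector, feats):
    """Sum the weights of the features that are present for this source word."""
    return sum(weight_vector.get(f, 0.0) for f in feats)


def classify_words(source, models):
    """Choose a target word for each source position independently of the others."""
    result = []
    for i in range(len(source)):
        feats = extract_features(source, i)
        result.append(max(models, key=lambda t: score(models[t], feats)))
    return result


# Toy weight vectors, invented for illustration (one per target vocabulary word).
MODELS = {
    "the": {"word=el": 2.0},
    "bank": {"word=banco": 1.0, "next=central": 1.5},
    "bench": {"word=banco": 1.0, "next=del": 1.5},
    "central": {"word=central": 2.0},
}

print(classify_words(["el", "banco", "central"], MODELS))
```

In this toy setup the context feature `next=central` is what tips "banco" towards "bank" rather than "bench", which is the role the abstract assigns to context-related features.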

    METHOD AND SYSTEM FOR PROVIDING AN AUTOMATED WEB TRANSCRIPTION SERVICE
    2.
    Invention application
    Status: In force

    Publication No.: US20080059173A1

    Publication date: 2008-03-06

    Application No.: US11469016

    Filing date: 2006-08-31

    CPC classification number: G10L15/26 G06F17/30893 G10L15/265

    Abstract: A system, method and computer readable medium that provides an automated web transcription service is disclosed. The method may include receiving input speech from a user using a communications network, recognizing the received input speech, understanding the recognized speech, transcribing the understood speech to text, storing the transcribed text in a database, receiving a request via a web page to display the transcribed text, retrieving transcribed text from the database, and displaying the transcribed text to the requester using the web page.

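The sequence of steps in the abstract (receive speech, recognize, understand, transcribe, store, then retrieve and display on request) can be sketched as a small pipeline. This is only an illustration under stated assumptions: `recognize` and `understand` are placeholders standing in for real speech components, the database is an in-memory SQLite table, and `TranscriptionStore`, `handle_speech` and `render_page` are hypothetical names rather than anything named in the patent.

```python
import html
import sqlite3


def recognize(audio: bytes) -> str:
    """Placeholder recognizer: pretends the audio bytes already contain the words."""
    return audio.decode("utf-8")


def understand(text: str) -> str:
    """Placeholder understanding step: only normalizes whitespace."""
    return " ".join(text.split())


class TranscriptionStore:
    """Stores transcribed text in an in-memory SQLite database."""

    def __init__(self):
        self.db = sqlite3.connect(":memory:")
        self.db.execute(
            "CREATE TABLE transcripts (id INTEGER PRIMARY KEY, user TEXT, text TEXT)"
        )

    def save(self, user, text):
        cur = self.db.execute(
            "INSERT INTO transcripts (user, text) VALUES (?, ?)", (user, text)
        )
        self.db.commit()
        return cur.lastrowid

    def fetch(self, transcript_id):
        row = self.db.execute(
            "SELECT text FROM transcripts WHERE id = ?", (transcript_id,)
        ).fetchone()
        return row[0] if row else None


def handle_speech(store, user, audio):
    """Receive speech, recognize and 'understand' it, then store the transcript."""
    return store.save(user, understand(recognize(audio)))


def render_page(store, transcript_id):
    """Answer a web request by retrieving the transcript and returning escaped HTML."""
    text = store.fetch(transcript_id)
    if text is None:
        return "<p>Not found</p>"
    return f"<p>{html.escape(text)}</p>"
```

A caller would invoke `handle_speech` when audio arrives and `render_page` when the web page requests the stored transcript.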

    Efficient incremental modification of optimized finite-state transducers (FSTs) for use in speech applications

    Publication No.: US09837073B2

    Publication date: 2017-12-05

    Application No.: US14346331

    Filing date: 2011-09-21

    CPC classification number: G10L15/08 G10L15/083

    Abstract: Methods of incrementally modifying a word-level finite state transducer (FST) are described for adding and removing sentences. A prefix subset of states and arcs in the FST is determined that matches a prefix portion of the sentence. A suffix subset of states and arcs in the FST is determined that matches a suffix portion of the sentence. A new sentence can then be added to the FST by appending a new sequence of states and arcs to the FST corresponding to a remainder of the sentence between the prefix and suffix. An existing sentence can be removed from the FST by removing any arcs and states between the prefix subset and the suffix subset. The resulting modified FST is locally efficient but does not satisfy global optimization criteria such as minimization.
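The sentence-addition step described above can be sketched as follows. The sketch makes several simplifying assumptions that are not from the patent: the FST is modelled as an acyclic acceptor (input and output labels identical) with one start state and one final state, prefix states are shared only when they have a single incoming arc, and suffix states are shared only when they have a single outgoing arc, so that no unintended sentences are introduced. The class `WordFST` and its methods are hypothetical names. Sentence removal is not shown.

```python
class WordFST:
    """Acyclic word-level acceptor: state 0 is the start, state 1 the only final state."""

    START, FINAL = 0, 1

    def __init__(self):
        self.arcs = []       # (source state, word, destination state) triples
        self.next_state = 2

    def _in_degree(self, state):
        return sum(1 for _, _, dst in self.arcs if dst == state)

    def _out_degree(self, state):
        return sum(1 for src, _, _ in self.arcs if src == state)

    def accepts(self, words):
        """Simulate all paths at once; the machine may be non-deterministic."""
        current = {self.START}
        for w in words:
            current = {d for s, label, d in self.arcs if s in current and label == w}
        return self.FINAL in current

    def add_sentence(self, words):
        words = list(words)
        if not words or self.accepts(words):
            return
        # Prefix subset: follow matching arcs from the start state, entering only
        # states with one incoming arc so no other path gains the new continuation.
        prefix = [self.START]
        k = 0
        while k < len(words) - 1:
            nxt = [d for s, label, d in self.arcs
                   if s == prefix[-1] and label == words[k]
                   and d != self.FINAL and self._in_degree(d) == 1]
            if not nxt:
                break
            prefix.append(nxt[0])
            k += 1
        # Suffix subset: walk backwards from the final state through states with
        # one outgoing arc, skipping prefix states so that no cycle can form.
        join, end = self.FINAL, len(words)
        while end - 1 > k:
            prev = [s for s, label, d in self.arcs
                    if d == join and label == words[end - 1]
                    and s not in prefix and self._out_degree(s) == 1]
            if not prev:
                break
            join = prev[0]
            end -= 1
        # Remainder: append new states and arcs between the prefix and the suffix.
        state = prefix[-1]
        for w in words[k:end - 1]:
            new = self.next_state
            self.next_state += 1
            self.arcs.append((state, w, new))
            state = new
        self.arcs.append((state, words[end - 1], join))
```

Adding "play some jazz music" and then "play some rock music" to an empty `WordFST` leaves five arcs in total, because the second sentence reuses the "play some" prefix and the "music" suffix. The result can be non-deterministic and non-minimal, in line with the abstract's remark that the modified FST is locally efficient but not globally optimized.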

    Discriminative training of models for sequence classification
    5.
    Invention application
    Status: Under examination (published)

    Publication No.: US20080162117A1

    Publication date: 2008-07-03

    Application No.: US11646983

    Filing date: 2006-12-28

    CPC classification number: G06F17/2818

    Abstract: Classification of sequences, such as the translation of natural language sentences, is carried out using an independence assumption. The independence assumption is an assumption that the probability of a correct translation of a source sentence word into a particular target sentence word is independent of the translation of other words in the sentence. Although this assumption is not a correct one, a high level of word translation accuracy is nonetheless achieved. In particular, discriminative training is used to develop models for each target vocabulary word based on a set of features of the corresponding source word in training sentences, with at least one of those features relating to the context of the source word. Each model comprises a weight vector for the corresponding target vocabulary word. The weights comprising the vectors are associated with respective ones of the features; each weight is a measure of the extent to which the presence of that feature for the source word makes it more probable that the target word in question is the correct one.

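The abstract does not name a specific training algorithm. As one possible illustration of discriminative training, the sketch below trains a separate binary perceptron for each target vocabulary word (one versus the rest), so that each word ends up with its own weight vector over source-word features. The perceptron update, the word-aligned toy data, and the names `features`, `train` and `predict` are assumptions made for this example only.

```python
from collections import defaultdict


def features(source, i):
    """Hypothetical features of source word i, including its left and right context."""
    prev_w = source[i - 1] if i > 0 else "<s>"
    next_w = source[i + 1] if i + 1 < len(source) else "</s>"
    return [f"word={source[i]}", f"prev={prev_w}", f"next={next_w}"]


def train(examples, vocab, epochs=5):
    """Train one weight vector per target word with perceptron updates (an assumption)."""
    weights = {t: defaultdict(float) for t in vocab}
    for _ in range(epochs):
        for source, targets in examples:
            for i, gold in enumerate(targets):
                feats = features(source, i)
                for t in vocab:
                    fires = sum(weights[t][f] for f in feats) > 0
                    if fires != (t == gold):
                        # Raise weights if t should have fired, lower them otherwise.
                        delta = 1.0 if t == gold else -1.0
                        for f in feats:
                            weights[t][f] += delta
    return weights


def predict(weights, source):
    """Pick the highest-scoring target word for each source position independently."""
    out = []
    for i in range(len(source)):
        feats = features(source, i)
        out.append(max(weights, key=lambda t: sum(weights[t].get(f, 0.0) for f in feats)))
    return out


# Toy word-aligned training pairs, invented for illustration.
EXAMPLES = [
    (["banco", "central"], ["bank", "central"]),
    (["banco", "del", "parque"], ["bench", "of", "park"]),
]
VOCAB = ["bank", "central", "bench", "of", "park"]
```

After `train(EXAMPLES, VOCAB)`, a positive weight on a feature such as `next=central` in the vector for "bank" reflects the idea in the abstract that the presence of the feature makes that target word more probable.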

    Efficient Incremental Modification of Optimized Finite-State Transducers (FSTs) for Use in Speech Applications
    6.
    Invention application
    Status: In force

    Publication No.: US20140229177A1

    Publication date: 2014-08-14

    Application No.: US14346331

    Filing date: 2011-09-21

    CPC classification number: G10L15/08 G10L15/083

    Abstract: Methods of incrementally modifying a word-level finite state transducer (FST) are described for adding and removing sentences. A prefix subset of states and arcs in the FST is determined that matches a prefix portion of the sentence. A suffix subset of states and arcs in the FST is determined that matches a suffix portion of the sentence. A new sentence can then be added to the FST by appending a new sequence of states and arcs to the FST corresponding to a remainder of the sentence between the prefix and suffix. An existing sentence can be removed from the FST by removing any arcs and states between the prefix subset and the suffix subset. The resulting modified FST is locally efficient but does not satisfy global optimization criteria such as minimization.


    Sequence classification for machine translation
    8.
    Granted invention patent
    Status: Lapsed

    Publication No.: US07783473B2

    Publication date: 2010-08-24

    Application No.: US11647080

    Filing date: 2006-12-28

    CPC classification number: G06F17/2818

    Abstract: Classification of sequences, such as the translation of natural language sentences, is carried out using an independence assumption. The independence assumption is an assumption that the probability of a correct translation of a source sentence word into a particular target sentence word is independent of the translation of other words in the sentence. Although this assumption is not a correct one, a high level of word translation accuracy is nonetheless achieved. In particular, discriminative training is used to develop models for each target vocabulary word based on a set of features of the corresponding source word in training sentences, with at least one of those features relating to the context of the source word. Each model comprises a weight vector for the corresponding target vocabulary word. The weights comprising the vectors are associated with respective ones of the features; each weight is a measure of the extent to which the presence of that feature for the source word makes it more probable that the target word in question is the correct one.

