Methods for using manual phrase alignment data to generate translation models for statistical machine translation
    1.
    Granted patent
    Methods for using manual phrase alignment data to generate translation models for statistical machine translation (status: in force)

    Publication No.: US08229728B2

    Publication date: 2012-07-24

    Application No.: US11969518

    Filing date: 2008-01-04

    IPC classification: G06F17/20 G06F17/21 G06F17/28

    CPC classification: G06F17/2818 G06F17/2827

    Abstract: The present invention adopts the fundamental architecture of a statistical machine translation system, which uses statistical models learned from training data and does not require the expert knowledge that rule-based machine translation systems depend on. From the parallel training data, a certain number of sentence pairs are selected for manual alignment. These sentences are aligned at the phrase level instead of at the word level. Depending on the size of the training data, the optimal amount for manual alignment may vary. The alignment is done using an alignment tool with a graphical user interface that is convenient and intuitive for users. Manually aligned data are then used to improve the automatic word alignment component. Model combination methods are also introduced to improve the accuracy and coverage of the statistical models for the task of statistical machine translation.
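The abstract does not say which model combination method is used. As a rough illustration only, the sketch below assumes the simplest option, linear interpolation of two phrase tables: one estimated from manually aligned data and one from automatically aligned data. The function name `combine_phrase_tables`, the `weight` parameter, and the toy probabilities are all hypothetical and are not taken from the patent.

```python
def combine_phrase_tables(manual, automatic, weight=0.7):
    """Linearly interpolate two phrase tables (assumed combination method).

    Each table maps a (source_phrase, target_phrase) pair to a translation
    probability. `weight` is the share given to the manually aligned model.
    """
    combined = {}
    # Union of keys: pairs seen in only one table are kept, which is how
    # combination can widen coverage beyond either table alone.
    for pair in manual.keys() | automatic.keys():
        p_manual = manual.get(pair, 0.0)
        p_auto = automatic.get(pair, 0.0)
        combined[pair] = weight * p_manual + (1.0 - weight) * p_auto
    return combined


# Toy data, invented for illustration.
manual_table = {("maison", "house"): 1.0}
automatic_table = {("maison", "house"): 0.6, ("maison", "home"): 0.4}
combined_table = combine_phrase_tables(manual_table, automatic_table, weight=0.5)
```

With equal weights, the pair found only in the automatic table survives with a reduced probability, while the pair confirmed by manual alignment is boosted.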

    Robust information extraction from utterances
    2.
    Granted patent
    Robust information extraction from utterances (status: in force)

    Publication No.: US08583416B2

    Publication date: 2013-11-12

    Application No.: US11965711

    Filing date: 2007-12-27

    IPC classification: G06F17/28 G10L15/00 G10L21/00

    CPC classification: G10L15/1822 G10L15/1815

    Abstract: The performance of traditional speech recognition systems (as applied to information extraction or translation) decreases significantly with larger domain size and scarce training data, as well as under noisy environmental conditions. This invention mitigates these problems through the introduction of a novel predictive feature extraction method which combines linguistic and statistical information for representation of information embedded in a noisy source language. The predictive features are combined with text classifiers to map the noisy text to one of the semantically or functionally similar groups. The features used by the classifier can be syntactic, semantic, and statistical.

    Methods for Using Manual Phrase Alignment Data to Generate Translation Models for Statistical Machine Translation
    3.
    Patent application
    Methods for Using Manual Phrase Alignment Data to Generate Translation Models for Statistical Machine Translation (status: in force)

    Publication No.: US20090177460A1

    Publication date: 2009-07-09

    Application No.: US11969518

    Filing date: 2008-01-04

    IPC classification: G06F17/28

    CPC classification: G06F17/2818 G06F17/2827

    Abstract: The present invention adopts the fundamental architecture of a statistical machine translation system, which uses statistical models learned from training data and does not require the expert knowledge that rule-based machine translation systems depend on. From the parallel training data, a certain number of sentence pairs are selected for manual alignment. These sentences are aligned at the phrase level instead of at the word level. Depending on the size of the training data, the optimal amount for manual alignment may vary. The alignment is done using an alignment tool with a graphical user interface that is convenient and intuitive for users. Manually aligned data are then used to improve the automatic word alignment component. Model combination methods are also introduced to improve the accuracy and coverage of the statistical models for the task of statistical machine translation.

    Robust Information Extraction from Utterances
    4.
    Patent application
    Robust Information Extraction from Utterances (status: in force)

    Publication No.: US20090171662A1

    Publication date: 2009-07-02

    Application No.: US11965711

    Filing date: 2007-12-27

    IPC classification: G10L15/00

    CPC classification: G10L15/1822 G10L15/1815

    Abstract: The performance of traditional speech recognition systems (as applied to information extraction or translation) decreases significantly with larger domain size and scarce training data, as well as under noisy environmental conditions. This invention mitigates these problems through the introduction of a novel predictive feature extraction method which combines linguistic and statistical information for representation of information embedded in a noisy source language. The predictive features are combined with text classifiers to map the noisy text to one of the semantically or functionally similar groups. The features used by the classifier can be syntactic, semantic, and statistical.

    Chunk-based statistical machine translation system
    5.
    Patent application
    Chunk-based statistical machine translation system (status: under examination, published)

    Publication No.: US20080154577A1

    Publication date: 2008-06-26

    Application No.: US11645926

    Filing date: 2006-12-26

    IPC classification: G06F17/28

    CPC classification: G06F17/2827 G06F17/2775

    Abstract: Traditional statistical machine translation systems learn all information from a sentence-aligned parallel text and are known to have problems translating between structurally diverse languages. To overcome this limitation, the present invention introduces two-level training, which incorporates syntactic chunking into statistical translation. A chunk-alignment step is inserted between the sentence-level and word-level training, which allows differing training for these two sources of information in order to learn lexical properties from the aligned chunks and structural properties from chunk sequences. The system consists of a linguistic processing step, two-level training, and a decoding step which combines chunk translations of multiple sources and multiple language models.

    Display method, display controller and display terminal
    9.
    Granted patent
    Display method, display controller and display terminal (status: in force)

    Publication No.: US08477155B2

    Publication date: 2013-07-02

    Application No.: US12798392

    Filing date: 2010-04-02

    Applicant: Jun Huang Yuan Ji

    Inventor: Jun Huang Yuan Ji

    IPC classification: G09G5/00

    Abstract: A display method and apparatus are disclosed. The method includes: when a video layer needs to scale a video image, judging whether a preset policy is met; if so, using the offline mode; otherwise using the online mode; processing the video image in online mode or offline mode, and outputting the processed video image, where the online mode is a mode in which the video image frame is scaled in real time, and the offline mode is a mode in which the video image frame is scaled asynchronously. With the display method and apparatus, the source video image can be scaled in any ratio by selecting the online mode or offline mode, thus reducing the display power consumption.
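The mode selection described in this abstract can be sketched as follows. The abstract does not state what the preset policy is, so the check used here (a scaling ratio outside a range that a real-time scaler is assumed to handle) is purely an assumption for illustration. The names `ScaleRequest`, `policy_met`, `select_mode`, and the `max_online_ratio` parameter are hypothetical.

```python
from dataclasses import dataclass
from enum import Enum


class ScalingMode(Enum):
    ONLINE = "online"    # frame is scaled in real time
    OFFLINE = "offline"  # frame is scaled asynchronously


@dataclass
class ScaleRequest:
    src_width: int
    src_height: int
    dst_width: int
    dst_height: int


def policy_met(req: ScaleRequest, max_online_ratio: float = 4.0) -> bool:
    """Hypothetical preset policy, not taken from the patent.

    Returns True when either axis must be scaled by a factor outside
    [1 / max_online_ratio, max_online_ratio].
    """
    ratios = (req.dst_width / req.src_width, req.dst_height / req.src_height)
    low, high = 1.0 / max_online_ratio, max_online_ratio
    return any(r < low or r > high for r in ratios)


def select_mode(req: ScaleRequest) -> ScalingMode:
    # As in the abstract: policy met -> offline mode, otherwise online mode.
    return ScalingMode.OFFLINE if policy_met(req) else ScalingMode.ONLINE


half_size = select_mode(ScaleRequest(1920, 1080, 960, 540))    # ratio 0.5
thumbnail = select_mode(ScaleRequest(1920, 1080, 192, 108))    # ratio 0.1
```

Under this assumed policy, a half-size downscale stays in the online mode, while a tenfold reduction falls back to the offline mode.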

    Method and Apparatus for Spectrum Monitoring
    10.
    Patent application
    Method and Apparatus for Spectrum Monitoring (status: in force)

    Publication No.: US20130063608A1

    Publication date: 2013-03-14

    Application No.: US13607916

    Filing date: 2012-09-10

    Abstract: A system, such as a satellite reception assembly or customer premises gateway, may comprise an analog-to-digital converter operable to digitize a signal spanning an entire television spectrum (e.g., cable television spectrum or satellite television spectrum) comprising a plurality of television channels. The system may comprise a signal monitor operable to analyze a signal to determine a characteristic of the signal. The system may comprise a data processor operable to process a television channel to recover content carried on the television channel. The system may comprise a channelizer operable to select first and second portions of the signal, and concurrently output the first portion to the signal monitor and the second portion to the data processor.
