Weakly supervised part-of-speech tagging with coupled token and type constraints
    Type: granted invention patent
    Legal status: in force

    Publication number: US09311299B1

    Publication date: 2016-04-12

    Application number: US13955491

    Filing date: 2013-07-31

    Applicant: Google Inc.

    CPC classification number: G06F17/28 G06F17/271 G06F17/2785 G06F17/2827

    Abstract: A method and system are provided for a part-of-speech tagger that may be particularly useful for resource-poor languages. Manually constructed tag dictionaries, obtained from dictionaries via bitext, can be used as type constraints to overcome the scarcity of annotated data in some instances. Additional token constraints can be projected from a resource-rich source language via word-aligned bitext. Several example models are provided to demonstrate this, such as a partially observed conditional random field model, in which coupled token and type constraints may provide a partial signal for training. The disclosed method achieves a significant relative error reduction over the prior state of the art.
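
The abstract does not spell out how token and type constraints are coupled, so the following is only an illustrative sketch under stated assumptions. It assumes that a projected tag is kept when the tag dictionary permits it (or when the word has no dictionary entry), and that the dictionary's tag set is used otherwise. The function name `coupled_constraints`, the toy tag set `ALL_TAGS`, and the sample data are hypothetical and do not come from the patent.

```python
from typing import Dict, List, Optional, Set

# Hypothetical toy tag inventory, for illustration only.
ALL_TAGS: Set[str] = {"NOUN", "VERB", "ADJ", "DET"}


def coupled_constraints(
    tokens: List[str],
    projected: List[Optional[str]],
    tag_dict: Dict[str, Set[str]],
) -> List[Set[str]]:
    """Return, for each token, the set of tags a tagger may assign.

    tokens    -- words of one target-language sentence
    projected -- token constraints: a tag projected over word-aligned
                 bitext, or None when the token is unaligned
    tag_dict  -- type constraints: word type -> permitted tags
    """
    allowed: List[Set[str]] = []
    for word, proj in zip(tokens, projected):
        type_tags = tag_dict.get(word.lower())
        if type_tags is None:
            # Assumed rule: no dictionary entry, so rely on the
            # projected tag if present, else leave the token open.
            allowed.append({proj} if proj is not None else set(ALL_TAGS))
        elif proj is not None and proj in type_tags:
            # Assumed rule: projection agrees with the dictionary.
            allowed.append({proj})
        else:
            # Assumed rule: projection missing or in conflict, so
            # fall back to the dictionary's tag set.
            allowed.append(set(type_tags))
    return allowed


example = coupled_constraints(
    ["The", "run", "ended", "quickly"],
    ["DET", "ADJ", "VERB", None],
    {"the": {"DET"}, "run": {"NOUN", "VERB"}},
)
```

The resulting per-token tag sets are the kind of partial signal the abstract mentions: a partially observed model can be trained to place its probability mass on tag sequences that stay inside these sets, without requiring a single gold tag per token.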

