System and method for learning latent representations for natural language tasks

    Publication No.: US09720907B2

    Publication date: 2017-08-01

    Application No.: US14853053

    Filing date: 2015-09-14

    IPC classification: G06F17/28

    CPC classification: G06F17/28

    Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for learning latent representations for natural language tasks. A system configured to practice the method analyzes, for a first natural language processing task, a first natural language corpus to generate a latent representation for words in the first corpus. Then the system analyzes, for a second natural language processing task, a second natural language corpus having a target word, and predicts a label for the target word based on the latent representation. In one variation, the target word is one or more words, such as a rare word and/or a word not encountered in the first natural language corpus. The system can optionally assign the label to the target word. The system can operate according to a connectionist model that includes a learnable linear mapping that maps each word in the first corpus to a low-dimensional latent space.
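
The flow described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names (`build_vocab`, `init_mapping`, `embed`, `context_features`, `predict_label`), the averaging of context-word vectors, the toy corpora, and the label weights are all assumptions made for illustration. The linear mapping is shown as a matrix whose rows would be learned on the first task; here they are only randomly initialized.

```python
import random


def build_vocab(corpus):
    """Assign an integer index to each distinct word in the first corpus."""
    vocab = {}
    for sentence in corpus:
        for word in sentence:
            vocab.setdefault(word, len(vocab))
    return vocab


def init_mapping(vocab_size, dim, seed=0):
    """Create a vocab_size x dim matrix (hypothetical; training would update it)."""
    rng = random.Random(seed)
    return [[rng.uniform(-0.1, 0.1) for _ in range(dim)] for _ in range(vocab_size)]


def embed(word, vocab, mapping):
    """Apply the linear mapping to a one-hot word vector, i.e. look up one row.

    Returns None for a word that was not seen in the first corpus.
    """
    idx = vocab.get(word)
    return None if idx is None else mapping[idx]


def context_features(sentence, position, vocab, mapping, dim):
    """Average the latent vectors of the known words around the target word.

    Assumption: an unseen target is described by its context only.
    """
    total = [0.0] * dim
    count = 0
    for i, word in enumerate(sentence):
        vec = None if i == position else embed(word, vocab, mapping)
        if vec is not None:
            total = [t + v for t, v in zip(total, vec)]
            count += 1
    return [t / count for t in total] if count else total


def predict_label(features, label_weights):
    """Score each label with a linear layer and return the highest-scoring label."""
    scores = {
        label: sum(w * f for w, f in zip(weights, features))
        for label, weights in label_weights.items()
    }
    return max(scores, key=scores.get)


# First task: build latent representations from the first corpus.
first_corpus = [["the", "dog", "barks"], ["the", "cat", "sleeps"]]
vocab = build_vocab(first_corpus)
mapping = init_mapping(len(vocab), dim=3)

# Second task: label a target word ("axolotl") that the first corpus never saw.
second_sentence = ["the", "axolotl", "sleeps"]
features = context_features(second_sentence, 1, vocab, mapping, dim=3)
label_weights = {"NOUN": [0.5, -0.2, 0.1], "VERB": [-0.3, 0.4, 0.0]}  # untrained placeholders
label = predict_label(features, label_weights)
```

Because neither the mapping nor the label weights are trained here, the predicted label carries no meaning; the sketch only shows how a representation learned on one corpus could be reused to label a rare or unseen word in another.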