Parsimonious handling of word inflection via categorical stem + suffix N-gram language models

    Publication number: US09886432B2

    Publication date: 2018-02-06

    Application number: US14839806

    Filing date: 2015-08-28

    Applicant: Apple Inc.

    CPC classification number: G06F17/276 G10L15/197

    Abstract: Systems and processes are disclosed for predicting words using a categorical stem and suffix word n-gram language model. A word prediction includes determining a stem probability using a stem language model. The word prediction also includes determining a suffix probability using a suffix language model decoupled from the stem model, in view of one or more stem categories. The word prediction also includes determining a probability of the stem belonging to the stem category. A joint probability is determined based on the foregoing, and one or more word predictions having sufficient likelihood are determined. In this way, the categorical stem and suffix language model constrains predicted suffixes to those that would be grammatically valid with predicted stems, thereby producing word predictions with grammatically valid stem and suffix combinations.
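
The factorization described in this abstract can be illustrated with a minimal Python sketch. It assumes the joint probability is the product of the stem probability, the probability of the stem belonging to a category, and the suffix probability given that category; the abstract does not state the exact combination rule, so the product is an assumption. All table names, stems, suffixes, and probability values below are hypothetical toy data, and the conditioning on the preceding word history is omitted for brevity.

```python
from typing import Dict, List, Tuple

# Hypothetical toy tables (made-up values, not from the patent).
# P(stem | history), with the history left implicit.
STEM_PROB: Dict[str, float] = {"walk": 0.6, "happi": 0.4}
# P(category | stem)
CATEGORY_GIVEN_STEM: Dict[str, Dict[str, float]] = {
    "walk": {"verb": 1.0},
    "happi": {"adj": 1.0},
}
# P(suffix | category): the suffix model is decoupled from the stem model
# and only sees the stem's category.
SUFFIX_GIVEN_CATEGORY: Dict[str, Dict[str, float]] = {
    "verb": {"ed": 0.5, "ing": 0.5},
    "adj": {"ness": 0.5, "ly": 0.5},
}


def predict_words(threshold: float) -> List[Tuple[str, float]]:
    """Return (word, joint probability) pairs whose joint probability
    meets the threshold, most likely first."""
    results: List[Tuple[str, float]] = []
    for stem, p_stem in STEM_PROB.items():
        for category, p_cat in CATEGORY_GIVEN_STEM[stem].items():
            # Only suffixes licensed by the stem's category are considered,
            # so a combination such as "walk" + "ness" is never produced.
            for suffix, p_suffix in SUFFIX_GIVEN_CATEGORY[category].items():
                joint = p_stem * p_cat * p_suffix  # assumed product rule
                if joint >= threshold:
                    results.append((stem + suffix, joint))
    results.sort(key=lambda pair: pair[1], reverse=True)
    return results
```

Calling `predict_words(0.25)` keeps only the two "walk" forms, while `predict_words(0.0)` returns all four category-consistent combinations. Routing the suffix model through the category, rather than through each individual stem, is what keeps the number of parameters small in this sketch.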

    Systems and methods for structured stem and suffix language models

    Publication number: US09899019B2

    Publication date: 2018-02-20

    Application number: US14841047

    Filing date: 2015-08-31

    Applicant: Apple Inc.

    CPC classification number: G10L15/063 G06F3/023 G06F17/276 G10L15/197

    Abstract: Systems and methods are disclosed for predicting words using a structured stem and suffix n-gram language model. The systems and methods include determining, using a first n-gram word language model, a first probability of a stem based on a first portion of a previously-input word in the received input. Using a second n-gram language model, a second probability of a first suffix may be determined based at least on a second portion of the previously-input word in the received input. Further, a third probability of a second suffix different from the first suffix may be determined using a third n-gram language model based at least on a third portion of the previously-input word in the received input. A fourth probability of a predicted word may be determined based on the first, second and third probabilities. One or more predicted words may be determined and provided as an output to the user.
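
The three-model structure in this abstract can be sketched as follows. This is a minimal illustration, assuming each model is a bigram table keyed on the corresponding portion of the previously input word, and assuming the fourth probability is the product of the first three; the abstract only says it is "based on" them. The segmentations, table names, and probability values are hypothetical.

```python
from typing import Dict, Tuple

# A word is represented by its three portions: (stem, first suffix, second suffix).
Segmented = Tuple[str, str, str]
# A bigram table maps (previous portion, candidate portion) to a probability.
Bigram = Dict[Tuple[str, str], float]

# Hypothetical toy tables (made-up values, not from the patent).
STEM_LM: Bigram = {("hope", "care"): 0.5}       # first model: stems
SUFFIX1_LM: Bigram = {("less", "less"): 0.5}    # second model: first suffix
SUFFIX2_LM: Bigram = {("ly", "ness"): 0.25}     # third model: second suffix


def word_probability(previous: Segmented, candidate: Segmented) -> float:
    """Combine the three per-portion probabilities into one word score."""
    p_stem = STEM_LM.get((previous[0], candidate[0]), 0.0)        # first probability
    p_suffix1 = SUFFIX1_LM.get((previous[1], candidate[1]), 0.0)  # second probability
    p_suffix2 = SUFFIX2_LM.get((previous[2], candidate[2]), 0.0)  # third probability
    # Fourth probability: an assumed product of the three.
    return p_stem * p_suffix1 * p_suffix2
```

With the toy tables above, `word_probability(("hope", "less", "ly"), ("care", "less", "ness"))` evaluates to 0.0625, and any candidate with an unseen portion scores 0.0 because the sketch applies no smoothing.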
