GROUPING SIMILAR WORDS IN A LANGUAGE MODEL
    Invention publication

    Publication number: US20240062752A1

    Publication date: 2024-02-22

    Application number: US17821431

    Filing date: 2022-08-22

    Applicant: Snap Inc.

    CPC classification number: G10L15/197

    Abstract: Systems and methods are provided for performing automated speech recognition. The systems and methods access a language model (LM) that includes a plurality of n-grams, each comprising a respective sequence of words and a corresponding LM score, and receive a list of words associated with a group classification, each word in the list being associated with a respective weight. The systems and methods compute, based on the LM scores of the plurality of n-grams, a probability that a given word in the list appears in an n-gram in the LM comprising an individual sequence of words. They then add to the LM one or more new n-grams, each comprising one or more words in the list in combination with the individual sequence of words and associated with a particular LM score based on the computed probability.
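The abstract does not give a formula, so the following is only a minimal sketch of one way the described steps could fit together. It assumes the LM is a plain dictionary from word tuples to log10 scores (ARPA-style), that the "individual sequence of words" is a shared context such as ("call",), and that the group's existing probability mass is split among missing words in proportion to their weights. The function name `add_group_ngrams` and the sample data are hypothetical.

```python
import math


def add_group_ngrams(lm, context, group_weights):
    """Return a copy of `lm` with context + (word,) added for missing group words.

    Assumptions (not from the abstract): scores are log10 probabilities, and a
    new n-gram receives a weight-proportional share of the probability mass
    that the group already has after `context`.
    """
    # Probability that some word of the group follows the context,
    # computed from the LM scores of n-grams that already exist.
    group_prob = sum(
        10 ** lm[context + (word,)]
        for word in group_weights
        if context + (word,) in lm
    )
    updated = dict(lm)
    if group_prob == 0.0:
        return updated  # no group word seen after this context; nothing to share

    total_weight = sum(group_weights.values())
    for word, weight in group_weights.items():
        ngram = context + (word,)
        if ngram in updated or weight <= 0:
            continue  # keep existing scores; skip words with no usable weight
        updated[ngram] = math.log10(group_prob * weight / total_weight)
    return updated


# Hypothetical bigram LM and a "contact names" group.
lm = {
    ("call", "alice"): math.log10(0.02),
    ("call", "bob"): math.log10(0.01),
    ("call", "home"): math.log10(0.10),
}
names = {"alice": 1.0, "bob": 1.0, "carol": 2.0}
new_lm = add_group_ngrams(lm, ("call",), names)
```

In this sketch "carol" never followed "call" in the original LM, yet it gains an n-gram whose score is derived from the scores of "call alice" and "call bob". The sketch does not renormalize the distribution after the context; a production LM would likely need to.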

    GROUPING SIMILAR WORDS IN A LANGUAGE MODEL

    Publication number: US12236946B2

    Publication date: 2025-02-25

    Application number: US17821431

    Filing date: 2022-08-22

    Applicant: Snap Inc.

    Abstract: Systems and methods are provided for performing automated speech recognition. The systems and methods access a language model (LM) that includes a plurality of n-grams, each comprising a respective sequence of words and a corresponding LM score, and receive a list of words associated with a group classification, each word in the list being associated with a respective weight. The systems and methods compute, based on the LM scores of the plurality of n-grams, a probability that a given word in the list appears in an n-gram in the LM comprising an individual sequence of words. They then add to the LM one or more new n-grams, each comprising one or more words in the list in combination with the individual sequence of words and associated with a particular LM score based on the computed probability.

    BOOSTING WORDS IN AUTOMATED SPEECH RECOGNITION

    Publication number: US20240021195A1

    Publication date: 2024-01-18

    Application number: US17864937

    Filing date: 2022-07-14

    Applicant: Snap Inc.

    CPC classification number: G10L15/197 G10L15/22 G10L15/187 G10L15/10

    Abstract: Systems and methods are provided for performing automated speech recognition. The systems and methods perform operations comprising: accessing a language model (LM) that includes a plurality of n-grams, each comprising a respective sequence of words and a corresponding LM score; selecting a target word to boost in the language model; receiving a boosting factor for the target word; identifying a target n-gram in the language model that includes the target word; identifying a subset of n-grams of the plurality of n-grams that include words in a portion of the target n-gram; and adjusting the LM score of the target n-gram based on the LM scores of the subset of n-grams and the boosting factor.
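The operations listed in the abstract can be illustrated with a short sketch. The adjustment rule is not specified in the abstract, so everything below is an assumption: scores are taken to be log10 probabilities, the "subset of n-grams" is taken to be the n-grams sharing the target n-gram's history (all words but the last), and the boosted probability is renormalized against that subset so it can never exceed the subset's total mass. The names `boost_ngram` and `boost_word` are hypothetical.

```python
import math


def boost_ngram(lm, target_ngram, boost):
    """Return the adjusted log10 score for `target_ngram`.

    Assumed rule (not from the abstract): multiply the target probability by
    `boost`, then rescale using the mass of n-grams with the same history.
    """
    if boost <= 0:
        raise ValueError("boost must be positive")
    history = target_ngram[:-1]
    # Subset of n-grams that share a portion (the history) of the target n-gram.
    subset = [
        ng for ng in lm
        if len(ng) == len(target_ngram) and ng[:-1] == history
    ]
    total = sum(10 ** lm[ng] for ng in subset)
    p = 10 ** lm[target_ngram]
    boosted = boost * p
    # boost == 1 leaves p unchanged; a very large boost approaches `total`.
    new_p = total * boosted / (boosted + (total - p))
    return math.log10(new_p)


def boost_word(lm, target_word, boost):
    """Return a copy of `lm` with every n-gram ending in `target_word` boosted."""
    updated = dict(lm)
    for ngram in lm:
        if ngram[-1] == target_word:
            updated[ngram] = boost_ngram(lm, ngram, boost)
    return updated


# Hypothetical bigram LM; "snap" is the target word, 5.0 the boosting factor.
lm = {
    ("play", "snap"): math.log10(0.01),
    ("play", "music"): math.log10(0.20),
    ("play", "games"): math.log10(0.09),
    ("open", "snap"): math.log10(0.05),
}
boosted_lm = boost_word(lm, "snap", 5.0)
```

Only the target n-grams change in this sketch; the other n-grams in the subset keep their scores, which follows the wording of the abstract but means the distribution after a given history no longer sums to its original mass. Whether the other scores should be rescaled as well is a design choice the abstract leaves open.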
