1. Audio locale mismatch detection
    Granted invention patent

    Publication number: US10860648B1

    Publication date: 2020-12-08

    Application number: US16129567

    Filing date: 2018-09-12

    Abstract: Systems, methods, and computer-readable media are disclosed for detecting a mismatch between the language actually spoken in an audio file and the language tagged as the spoken language in the audio file metadata. Example methods may include receiving a media file including spoken language metadata. Certain methods include generating an audio sample from the media file. Certain methods include generating a text translation of the audio sample based on the spoken language metadata. Certain methods include determining, based on the text translation, that the spoken language metadata does not match the spoken language in the audio sample. Certain methods include sending an indication that the spoken language metadata does not match the spoken language.
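The flow described in this abstract can be sketched roughly as follows. This is a minimal illustration, not the patented method: the abstract does not say how the mismatch is determined from the text, so the vocabulary hit ratio used here is an assumption, as are the names `VOCAB`, `vocabulary_hit_ratio`, `detect_locale_mismatch`, and the `transcribe` callable, which stands in for whatever speech-to-text engine would produce the text from the tagged language.

```python
from typing import Callable, Dict, Set

# Hypothetical, deliberately tiny vocabularies. A real system would use
# full dictionaries or a language model for each supported language.
VOCAB: Dict[str, Set[str]] = {
    "en": {"the", "and", "is", "to", "of", "you", "hello", "world"},
    "es": {"el", "la", "y", "es", "de", "que", "hola", "mundo"},
}


def vocabulary_hit_ratio(text: str, language: str) -> float:
    """Fraction of words in `text` found in the vocabulary of `language`."""
    words = [w.strip(".,!?;:").lower() for w in text.split()]
    words = [w for w in words if w]
    if not words:
        return 0.0
    known = VOCAB[language]
    return sum(w in known for w in words) / len(words)


def detect_locale_mismatch(
    audio_sample: bytes,
    tagged_language: str,
    transcribe: Callable[[bytes, str], str],
    min_ratio: float = 0.5,  # assumed threshold, not from the abstract
) -> bool:
    """Return True when the tagged language appears NOT to match the audio.

    The sample is converted to text under the assumption that the metadata
    is correct. If the tag is wrong, the resulting text tends to contain few
    real words of the tagged language, which this heuristic measures.
    """
    transcript = transcribe(audio_sample, tagged_language)
    return vocabulary_hit_ratio(transcript, tagged_language) < min_ratio


# Stand-in speech-to-text engine for demonstration purposes only.
def fake_transcribe(audio: bytes, language: str) -> str:
    if audio == b"english-speech":
        return "Hello world, the world is you."
    return "ola moondo kay ess"  # out-of-vocabulary output for a wrong tag


matches = detect_locale_mismatch(b"english-speech", "en", fake_transcribe)
mismatch = detect_locale_mismatch(b"spanish-speech", "en", fake_transcribe)
```

Passing the transcriber in as a parameter keeps the sketch independent of any particular speech recognition library; the returned boolean corresponds to the "indication" that the abstract says is sent when the metadata does not match.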

2. Text encoding issue detection
    Granted invention patent

    Publication number: US11423208B1

    Publication date: 2022-08-23

    Application number: US15826379

    Filing date: 2017-11-29

    Abstract: Method and apparatus for detecting text encoding errors caused by previously encoding an electronic document in multiple encoding formats. Non-word portions are removed from the electronic document. Embodiments determine whether words in the electronic document are likely to contain one or more text encoding errors, by dividing a first word into a plurality of n-grams of length 2 or more. For each of the plurality of n-grams, a database is queried to determine a respective probability of the n-gram appearing in each of a plurality of recognized languages, and upon determining that the determined probabilities of two consecutive n-grams are each less than a predefined threshold probability, the first word is added to a list of words that likely contain text encoding errors. A confidence level that the first word includes the one or more text encoding errors is calculated, based on a lowest determined probability for the n-grams of the first word.
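The n-gram check described in this abstract can be sketched as follows. This is only an illustration under stated assumptions: the in-memory `NGRAM_DB` and its probabilities are invented stand-ins for the database, the function names are hypothetical, an n-gram's probability is taken as its highest value across the recognized languages, and the confidence is computed as one minus the lowest n-gram probability. The abstract specifies none of these details.

```python
import string
from typing import Dict, List, Tuple

# Hypothetical bigram probabilities per recognized language (made-up values).
NGRAM_DB: Dict[str, Dict[str, float]] = {
    "en": {"ca": 0.02, "af": 0.01, "na": 0.02, "iv": 0.01, "ve": 0.03},
    "fr": {"ca": 0.02, "af": 0.01, "fé": 0.02, "na": 0.015,
           "aï": 0.006, "ïv": 0.005, "ve": 0.02},
}


def ngrams(word: str, n: int = 2) -> List[str]:
    """Split a word into overlapping n-grams of length n."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_probability(gram: str) -> float:
    """Best probability of the n-gram across all recognized languages."""
    return max(table.get(gram, 0.0) for table in NGRAM_DB.values())


def find_suspect_words(text: str, threshold: float = 0.001) -> List[Tuple[str, float]]:
    """Return (word, confidence) pairs for words likely to be mis-encoded."""
    suspects = []
    for token in text.split():
        word = token.strip(string.punctuation)
        # Drop non-word portions such as numbers and bare punctuation.
        if not any(ch.isalpha() for ch in word):
            continue
        probs = [ngram_probability(g) for g in ngrams(word.lower())]
        # Flag the word when two consecutive n-grams are both improbable.
        if any(a < threshold and b < threshold for a, b in zip(probs, probs[1:])):
            suspects.append((word, 1.0 - min(probs)))
    return suspects


# "cafÃ©" is "café" encoded as UTF-8 and then wrongly decoded as Latin-1.
result = find_suspect_words("a naïve café, cafÃ© 42")
print(result)
```

Requiring two consecutive improbable n-grams, rather than one, keeps a single rare but legitimate letter pair from flagging a word, while a mis-decoded multi-byte character usually corrupts at least two neighbouring n-grams.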
