Patent application
US20050203739A1 Generating large units of graphonemes with mutual information criterion for letter to sound conversion (lapsed)

  • Title: Generating large units of graphonemes with mutual information criterion for letter to sound conversion
  • Application number: US10797358
    Filing date: 2004-03-10
  • Publication number: US20050203739A1
    Publication date: 2005-09-15
  • Inventors: Mei-Yuh Hwang, Li Jiang
  • Applicants: Mei-Yuh Hwang, Li Jiang
  • Applicant address: Redmond, WA, US
  • Assignee: Microsoft Corporation
  • Current assignee: Microsoft Corporation
  • Current assignee address: Redmond, WA, US
  • Main classification: G10L13/06
  • IPC classifications: G10L13/06, G10L13/08, G06K9/00
Abstract:
A method and apparatus are provided for segmenting words into component parts. Under the invention, mutual information scores for pairs of graphoneme units found in a set of words are determined. Each graphoneme unit includes at least one letter. The graphoneme units of one pair of graphoneme units are combined based on the mutual information score. This forms a new graphoneme unit. Under one aspect of the invention, a syllable n-gram model is trained based on words that have been segmented into syllables using mutual information. The syllable n-gram model is used to segment a phonetic representation of a new word into syllables. Similarly, an inventory of morphemes is formed using mutual information and a morpheme n-gram is trained that can be used to segment a new word into a sequence of morphemes.
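The merging step described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes each word is already given as a list of graphoneme units, represented here as hypothetical `(letters, phones)` tuples, and it assumes a score of the form P(a,b) · log(P(a,b) / (P(a) · P(b))) over adjacent units, since the abstract does not state the exact formula. The function names `pair_scores`, `merge_pair`, and `merge_step` are illustrative.

```python
import math
from collections import Counter


def pair_scores(words):
    """Score every adjacent pair of units.

    Assumed score (not taken from the patent text):
    P(a, b) * log(P(a, b) / (P(a) * P(b))), estimated from raw counts.
    """
    unit_counts = Counter(u for w in words for u in w)
    pair_counts = Counter(p for w in words for p in zip(w, w[1:]))
    n_units = sum(unit_counts.values())
    n_pairs = sum(pair_counts.values())
    scores = {}
    for (a, b), count in pair_counts.items():
        p_ab = count / n_pairs
        p_a = unit_counts[a] / n_units
        p_b = unit_counts[b] / n_units
        scores[(a, b)] = p_ab * math.log(p_ab / (p_a * p_b))
    return scores


def merge_pair(word, pair):
    """Replace each adjacent occurrence of `pair` in `word` with one combined unit."""
    a, b = pair
    merged, i = [], 0
    while i < len(word):
        if i + 1 < len(word) and word[i] == a and word[i + 1] == b:
            # Concatenate the letters (str) and the phones (tuple) of both units.
            merged.append((a[0] + b[0], a[1] + b[1]))
            i += 2
        else:
            merged.append(word[i])
            i += 1
    return merged


def merge_step(words):
    """Combine the highest-scoring pair across all words; return (new_words, pair)."""
    scores = pair_scores(words)
    if not scores:
        return words, None
    best = max(scores, key=scores.get)
    return [merge_pair(w, best) for w in words], best


if __name__ == "__main__":
    # Toy data: hypothetical one-letter/one-phone units for "cat", "cab", "can".
    c, a = ("c", ("k",)), ("a", ("ae",))
    toy = [
        [c, a, ("t", ("t",))],
        [c, a, ("b", ("b",))],
        [c, a, ("n", ("n",))],
    ]
    new_words, best = merge_step(toy)
    print(best)
    print(new_words)
```

Calling `merge_step` repeatedly grows progressively larger units, which is one plausible reading of how an inventory of large graphonemes could be built before an n-gram model is trained on the segmented words.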