Publication No.: US09852123B1
Publication Date: 2017-12-26
Application No.: US15165328
Application Date: 2016-05-26
Applicant: Google Inc.
Inventor: Richard Sproat, Ke Wu, Kyle Gorman
CPC classification number: G06F17/274, G06F17/2264, G06F17/277, G06F17/2881, G10L13/00
Abstract: A language processing system for text normalization of an input string of a semiotic class. In an aspect, a method includes receiving an input string; accessing, for a semiotic class of non-standard words, a language universal covering grammar for a plurality of languages that generates, for each language of the plurality of languages, one or more sequences of word-level components for each instance of the semiotic class in the language; for each of the plurality of languages, accessing a lexical map that is specific to the language and that maps each sequence of word-level components for each instance of the semiotic class in the language to verbalizations in the language; generating, from the language universal grammar and the lexical maps, a lattice of possible verbalizations of the input string; and selecting one of the possible verbalizations as a selected verbalization for the input string.
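
The steps named in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. It assumes the semiotic class is cardinal numbers from 1 to 99, and every name in it (`covering_grammar`, `LEXICAL_MAPS`, `build_lattice`, `verbalize`) is hypothetical. The toy grammar proposes decimal and base-20 factorizations for any language, each language's lexical map decides which of them can be spelled out, and the selection rule (fewest components) is an assumed stand-in for whatever selection the actual system performs.

```python
from typing import Dict, List, Tuple

Components = Tuple[str, ...]

# Hypothetical per-language lexical maps: word-level component -> word.
# Deliberately tiny; French has no single word for 80 or 90.
LEXICAL_MAPS: Dict[str, Dict[str, str]] = {
    "en": {"3": "three", "4": "four", "10": "ten", "20": "twenty",
           "80": "eighty", "90": "ninety"},
    "fr": {"3": "trois", "4": "quatre", "10": "dix", "20": "vingt"},
}


def covering_grammar(digits: str) -> List[Components]:
    """Toy language-universal grammar: propose component sequences for 1..99."""
    n = int(digits)
    if not 0 < n < 100:
        raise ValueError("toy grammar only covers 1..99")
    candidates: List[Components] = [(str(n),)]  # a single lexical item
    tens, units = divmod(n, 10)
    if tens and units:
        # Decimal factorization, e.g. 23 -> 20 + 3.
        candidates.append((str(tens * 10), str(units)))
    scores, rest = divmod(n, 20)
    if scores >= 2:
        # Base-20 factorization, e.g. 90 -> 4 * 20 + 10.
        candidates.append((str(scores), "20") + ((str(rest),) if rest else ()))
    return candidates


def build_lattice(digits: str, lang: str) -> List[Tuple[Components, str]]:
    """Keep every candidate whose components all appear in the lexical map."""
    lexicon = LEXICAL_MAPS[lang]
    lattice = []
    for comps in covering_grammar(digits):
        if all(c in lexicon for c in comps):
            lattice.append((comps, " ".join(lexicon[c] for c in comps)))
    return lattice


def verbalize(digits: str, lang: str) -> str:
    """Select one verbalization; assumed rule: fewest components wins."""
    lattice = build_lattice(digits, lang)
    if not lattice:
        raise ValueError(f"no verbalization for {digits!r} in {lang!r}")
    return min(lattice, key=lambda entry: len(entry[0]))[1]


if __name__ == "__main__":
    for lang in ("en", "fr"):
        print(lang, verbalize("80", lang), "|", verbalize("23", lang))
```

In this sketch the same grammar serves both languages, and only the lexical map differs. The English lattice for "80" holds two candidates while the French lattice holds one, which is why a selection step is still needed after the lattice is built.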