Abstract:
A computerized method and apparatus for reducing the size of a dictionary used in a text-to-speech synthesis system are provided. In an initial phase, the method and apparatus determine whether each entry in the dictionary, which contains a grapheme string and a corresponding phoneme string, can be fully matched using at least one rule set that synthesizes phonemic data from words. If the entry can be fully matched using rule processing alone, the entry is marked for deletion from the dictionary. In a second phase, the method and apparatus determine whether the entry, considered as a root word entry, is required in the dictionary to support phoneme synthesis of other entries that contain the root word entry; if so, the root word entry is marked to be saved in the dictionary. If correct phonemic data for the other entries containing the root word entry can be generated by combining the root word entry's phonemic data with phonemes generated by rule set processing, those other entries are marked for deletion from the dictionary. After all entries have been processed by the initial phase, the second phase, or both, the entries marked to be saved are aggregated to form a reduced dictionary.
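The two phases can be illustrated with a minimal Python sketch. It is not the claimed implementation: the per-letter `RULES` table, the helper names `apply_rules` and `reduce_dictionary`, the sample entries, and the assumption that a root word appears as a prefix of the words containing it are all hypothetical and chosen only to keep the example small. A real rule set would be context-sensitive.

```python
# Hypothetical toy letter-to-sound rules (one phoneme per letter).
RULES = {"a": "ae", "c": "k", "e": "eh", "n": "n", "o": "aa", "s": "s", "t": "t"}


def apply_rules(graphemes):
    """Return the phoneme string produced by rules alone, or None if a letter has no rule."""
    phonemes = []
    for letter in graphemes:
        if letter not in RULES:
            return None
        phonemes.append(RULES[letter])
    return " ".join(phonemes)


def reduce_dictionary(dictionary):
    """Return the entries that must be kept after both phases."""
    # Phase one: drop entries whose phonemes the rules reproduce exactly.
    remaining = {
        word: phonemes
        for word, phonemes in dictionary.items()
        if apply_rules(word) != phonemes
    }

    # Phase two: drop entries that equal a kept root's phonemes plus
    # rule-generated phonemes for the rest of the word (prefix roots only).
    deleted = set()
    for root in sorted(remaining, key=len):
        if root in deleted:
            continue  # an entry already marked for deletion cannot serve as a root
        for word, phonemes in remaining.items():
            if word == root or not word.startswith(root):
                continue
            suffix_phonemes = apply_rules(word[len(root):])
            if suffix_phonemes is not None and f"{remaining[root]} {suffix_phonemes}" == phonemes:
                deleted.add(word)

    # Aggregate the saved entries into the reduced dictionary.
    return {word: p for word, p in remaining.items() if word not in deleted}


example = {
    "cat": "k ae t",     # reproduced by the rules alone: removed in phase one
    "one": "w ah n",     # irregular root: kept
    "ones": "w ah n s",  # root "one" plus rule output for "s": removed in phase two
    "two": "t uw",       # irregular, no supporting root: kept
}
reduced = reduce_dictionary(example)
```

In this sketch, roots are visited shortest first so that a derived entry already marked for deletion is never relied on as a root for a longer word.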