Abstract:
An automatic speech segmentation and verification system and method are disclosed. The system takes a known text script and a recorded speech corpus corresponding to that script. A speech unit segmentor segments the recorded speech corpus into N test speech unit segments by referring to the phonetic information of the known text script. A segmental verifier then computes a confidence measure of syllable segmentation, which verifies the correctness of the cutting points of the test speech unit segments. A phonetic verifier uses verification models to compute a confidence measure of syllable verification, which verifies whether the recorded speech corpus was correctly recorded. Finally, a speech unit inspector integrates the confidence measure of syllable segmentation and the confidence measure of syllable verification to determine whether each test speech unit segment is accepted.
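The final integration step can be illustrated with a minimal sketch. The abstract does not state how the speech unit inspector combines the two confidence measures, so the rule below (each measure must reach its own threshold) is only an assumption, and the names `UnitScores`, `accept_unit`, `seg_threshold`, and `ver_threshold` are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class UnitScores:
    """Confidence measures for one test speech unit segment (hypothetical container)."""
    segmentation_confidence: float  # produced by the segmental verifier
    verification_confidence: float  # produced by the phonetic verifier


def accept_unit(scores: UnitScores,
                seg_threshold: float = 0.5,
                ver_threshold: float = 0.5) -> bool:
    """Accept a segment only if both confidence measures reach their thresholds.

    Assumption: the abstract only says the two measures are "integrated";
    a joint threshold test is one simple way to do that, not the patented rule.
    """
    return (scores.segmentation_confidence >= seg_threshold
            and scores.verification_confidence >= ver_threshold)


if __name__ == "__main__":
    print(accept_unit(UnitScores(0.9, 0.8)))
    print(accept_unit(UnitScores(0.9, 0.2)))
```

A weighted sum of the two measures compared against a single threshold would be an equally plausible reading; the joint test is used here only because it keeps a segment with good boundaries but a misread syllable from being accepted.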
Abstract:
A method of pitch mark determination for speech includes the following steps. First, a fundamental frequency and fundamental-frequency passband signals are acquired by using an adaptable filter. Then, a number of zero-crossing positions of the fundamental-frequency passband signals are detected. After that, at least one candidate set of pitch marks is generated from the zero-crossing positions. Lastly, the candidate sets of pitch marks are evaluated to select the best set of pitch marks.
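The zero-crossing and candidate-selection steps can be sketched as follows. This is an illustration under stated assumptions, not the patented procedure: the passband signal is taken as already available (the adaptable filter is not modeled), the two candidate sets are assumed to be the rising and the falling zero crossings, and the scoring rule (mean absolute speech amplitude at the marks) is hypothetical. All function names are illustrative.

```python
import math


def zero_crossings(signal):
    """Return (index, direction) pairs; direction is +1 for rising, -1 for falling."""
    crossings = []
    for i in range(1, len(signal)):
        prev, cur = signal[i - 1], signal[i]
        if prev < 0 <= cur:
            crossings.append((i, +1))
        elif prev >= 0 > cur:
            crossings.append((i, -1))
    return crossings


def candidate_sets(crossings):
    """Assumption: one candidate set per crossing direction."""
    rising = [i for i, d in crossings if d > 0]
    falling = [i for i, d in crossings if d < 0]
    return [rising, falling]


def best_candidate(candidates, speech):
    """Pick the candidate set with the highest score.

    Hypothetical criterion: mean absolute amplitude of the speech
    waveform at the candidate pitch marks.
    """
    def score(marks):
        if not marks:
            return 0.0
        return sum(abs(speech[m]) for m in marks) / len(marks)
    return max(candidates, key=score)


if __name__ == "__main__":
    # Synthetic passband signal: a sinusoid with a 20-sample period.
    passband = [math.sin(2 * math.pi * (i + 0.5) / 20) for i in range(100)]
    rising, falling = candidate_sets(zero_crossings(passband))
    # Synthetic "speech": impulses placed at the falling crossings.
    speech = [0.0] * 100
    for m in falling:
        speech[m] = 1.0
    print(best_candidate([rising, falling], speech))
```

The half-sample phase offset in the synthetic sinusoid keeps every sample away from exactly zero, so the detected crossing indices do not depend on floating-point rounding.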