-
Publication number: US20110320197A1
Publication date: 2011-12-29
Application number: US13095388
Filing date: 2011-04-27
Applicant: David CONEJERO, Helenca DUXANS, Gregorio ESCALADA
Inventor: David CONEJERO, Helenca DUXANS, Gregorio ESCALADA
IPC: G10L15/26
CPC classification number: G06F17/30746, G06F17/3002, G06F17/30026, G06F17/30778, G10L15/26
Abstract: It comprises analyzing audio content of multimedia files and performing a speech-to-text transcription thereof automatically by means of an ASR process, and selecting acoustic and language models adapted for the ASR process at least before the latter processes the multimedia file, i.e. “a priori”. The method is particularly applicable to the automatic indexing, aggregation and clustering of news from different sources and from different types of files, including text, audio and audiovisual documents, without any manual annotation.
-
Publication number: US08812324B2
Publication date: 2014-08-19
Application number: US13254479
Filing date: 2010-12-21
Applicant: Miguel Angel Rodriguez Crespo, Jose Gregorio Escalada Sardina, Ana Armenta Lopez de Vicuna
Inventor: Miguel Angel Rodriguez Crespo, Jose Gregorio Escalada Sardina, Ana Armenta Lopez de Vicuna
IPC: G10L13/00
CPC classification number: G10L13/033, G10L13/06, G10L19/093
Abstract: The invention relates to a method for speech signal analysis, modification and synthesis comprising a phase for the location of analysis windows by means of an iterative process for the determination of the phase of the first sinusoidal component and comparison between the phase value of said component and a predetermined value, a phase for the selection of analysis frames corresponding to an allophone and readjustment of the duration and the fundamental frequency according to certain thresholds and a phase for the generation of synthetic speech from synthesis frames taking the information of the closest analysis frame as spectral information of the synthesis frame and taking as many synthesis frames as periods that the synthetic signal has. The method allows a coherent location of the analysis windows within the periods of the signal and the exact generation of the synthesis instants in a manner synchronous with the fundamental period.
-
Publication number: US08775174B2
Publication date: 2014-07-08
Application number: US13095388
Filing date: 2011-04-27
Applicant: David Conejero, Helenca Duxans, Gregorio Escalada
Inventor: David Conejero, Helenca Duxans, Gregorio Escalada
IPC: G10L15/26
CPC classification number: G06F17/30746, G06F17/3002, G06F17/30026, G06F17/30778, G10L15/26
Abstract: It comprises analyzing audio content of multimedia files and performing a speech-to-text transcription thereof automatically by means of an ASR process, and selecting acoustic and language models adapted for the ASR process at least before the latter processes the multimedia file, i.e. “a priori”. The method is particularly applicable to the automatic indexing, aggregation and clustering of news from different sources and from different types of files, including text, audio and audiovisual documents, without any manual annotation.
-
Publication number: US20110320207A1
Publication date: 2011-12-29
Application number: US13254479
Filing date: 2010-12-21
Applicant: Miguel Angel Rodriguez Crespo, Jose Gregorio Escalada Sardina, Ana Armenta Lopez Vicuna
CPC classification number: G10L13/033, G10L13/06, G10L19/093
Abstract: The invention relates to a method for speech signal analysis, modification and synthesis comprising a phase for the location of analysis windows by means of an iterative process for the determination of the phase of the first sinusoidal component and comparison between the phase value of said component and a predetermined value, a phase for the selection of analysis frames corresponding to an allophone and readjustment of the duration and the fundamental frequency according to certain thresholds and a phase for the generation of synthetic speech from synthesis frames taking the information of the closest analysis frame as spectral information of the synthesis frame and taking as many synthesis frames as periods that the synthetic signal has. The method allows a coherent location of the analysis windows within the periods of the signal and the exact generation of the synthesis instants in a manner synchronous with the fundamental period.
-