Estimating pitch by modeling audio as a weighted mixture of tone models for harmonic structures
    42.
    Granted invention patent
    Legal status: In force

    Publication No.: US08543387B2

    Publication date: 2013-09-24

    Application No.: US11849217

    Filing date: 2007-08-31

    IPC classification: G10L11/04 G10L19/14 G10L19/00

    Abstract: Disclosed herein are a pitch estimation apparatus and associated methods for estimating a fundamental frequency of an audio signal from a fundamental frequency probability density function. The audio signal is modeled as a weighted mixture of a plurality of tone models, each corresponding to the harmonic structure of an individual fundamental frequency, so that the fundamental frequency probability density function of the audio signal is given as the distribution of the respective weights of the tone models.
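
The weighted-mixture idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: the Gaussian tone-model shape, the 1/h harmonic amplitudes, the candidate grid, and the use of EM to fit the weights are all assumptions made here for illustration, and the names `tone_model` and `estimate_f0_weights` are hypothetical.

```python
import math

def tone_model(f0, freqs, n_harmonics=4, sigma=3.0):
    """Hypothetical tone model: Gaussian bumps at integer multiples of f0,
    with amplitude 1/h for harmonic h, normalised to sum to 1 over freqs."""
    shape = [
        sum(math.exp(-0.5 * ((x - h * f0) / sigma) ** 2) / h
            for h in range(1, n_harmonics + 1))
        for x in freqs
    ]
    total = sum(shape)
    return [v / total for v in shape]

def estimate_f0_weights(spectrum, freqs, candidates, n_iter=50):
    """Fit mixture weights over candidate F0s with EM (an assumed fitting
    procedure). The returned weights act as a discrete F0 probability
    distribution over the candidates."""
    total = sum(spectrum)
    obs = [v / total for v in spectrum]            # spectrum as a distribution
    models = [tone_model(f0, freqs) for f0 in candidates]
    weights = [1.0 / len(candidates)] * len(candidates)
    for _ in range(n_iter):
        new_weights = [0.0] * len(candidates)
        for i, p_obs in enumerate(obs):
            mix = sum(w * m[i] for w, m in zip(weights, models))
            if mix <= 0.0:
                continue                           # bin explained by no model
            for k, (w, m) in enumerate(zip(weights, models)):
                # E-step responsibility, accumulated for the M-step
                new_weights[k] += p_obs * w * m[i] / mix
        s = sum(new_weights)
        weights = [w / s for w in new_weights]
    return weights

# Synthetic check: a spectrum generated from the 220 Hz tone model itself.
freqs = list(range(50, 1001, 5))
candidates = [100.0, 150.0, 220.0, 330.0]
spectrum = tone_model(220.0, freqs)
weights = estimate_f0_weights(spectrum, freqs, candidates)
best_f0 = candidates[max(range(len(candidates)), key=weights.__getitem__)]
```

In this sketch the fundamental frequency is read off as the candidate with the largest weight; for polyphonic input, several candidates could carry substantial weight at once.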

    Speech recognition system and program therefor
    44.
    Granted invention patent
    Legal status: In force

    Publication No.: US08401847B2

    Publication date: 2013-03-19

    Application No.: US12516888

    Filing date: 2007-11-30

    IPC classification: G10L15/26

    CPC classification: G10L15/04 G10L15/065

    Abstract: An unknown word is additionally registered in a speech recognition dictionary by utilizing a correction result, and a new pronunciation of a word that has already been registered in the speech recognition dictionary is additionally registered in the dictionary, thereby increasing the accuracy of speech recognition. The start time and finish time of each phoneme unit in speech data corresponding to each phoneme included in a phoneme sequence acquired by a phoneme sequence converting section 13 are added to the phoneme sequence. A phoneme sequence extracting section 15 extracts from the phoneme sequence a phoneme sequence portion composed of phonemes existing in a segment corresponding to the period from the start time to the finish time of the word segment of the word corrected by a word correcting section 9, and the extracted phoneme sequence portion is determined as the pronunciation of the corrected word. An additional registration section 17 combines the corrected word with the pronunciation determined by a pronunciation determining section 16 and additionally registers the combination as new word pronunciation data in the speech recognition dictionary 5 if it is determined that the word obtained after correction has not been registered in the speech recognition dictionary 5. The additional registration section 17 additionally registers the pronunciation determined by the pronunciation determining section 16 as another pronunciation of the corrected word if it is determined that the corrected word has already been registered.
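
The registration logic described in this abstract can be sketched roughly as follows. The data layout (phonemes as `(symbol, start, end)` tuples, the dictionary as a mapping from word to a set of pronunciations) and the function names are assumptions for illustration only; the patent's sections 13, 15, 16 and 17 are not reproduced here.

```python
from typing import Dict, List, Set, Tuple

Phoneme = Tuple[str, float, float]        # (symbol, start_time, finish_time)
Pronunciation = Tuple[str, ...]

def extract_pronunciation(phonemes: List[Phoneme],
                          seg_start: float, seg_end: float) -> Pronunciation:
    """Keep the phonemes that lie entirely inside the corrected word's segment."""
    return tuple(sym for sym, start, end in phonemes
                 if start >= seg_start and end <= seg_end)

def register_correction(dictionary: Dict[str, Set[Pronunciation]],
                        word: str, phonemes: List[Phoneme],
                        seg_start: float, seg_end: float) -> bool:
    """Register the corrected word with the pronunciation found in its segment.

    Returns True if the word was unknown (new entry), False if the
    pronunciation was added as another pronunciation of a known word.
    """
    pron = extract_pronunciation(phonemes, seg_start, seg_end)
    is_new_word = word not in dictionary
    dictionary.setdefault(word, set()).add(pron)
    return is_new_word

# Time-stamped phoneme sequence for a short utterance (made-up values).
phonemes = [("g", 0.00, 0.08), ("o", 0.08, 0.20), ("t", 0.20, 0.28),
            ("o", 0.28, 0.40), ("s", 0.40, 0.50)]
dictionary: Dict[str, Set[Pronunciation]] = {"goat": {("g", "o", "t")}}

# The user corrects the word spanning 0.00 to 0.40 s to "goto" (unknown word).
first = register_correction(dictionary, "goto", phonemes, 0.00, 0.40)
# A later correction of a known word adds an alternative pronunciation.
second = register_correction(dictionary, "goat", phonemes, 0.00, 0.40)
```

Storing pronunciations in a set means that registering the same pronunciation twice has no effect, which is one simple way to avoid duplicate entries.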

    Music information retrieval system
    46.
    Granted invention patent
    Legal status: In force

    Publication No.: US08271112B2

    Publication date: 2012-09-18

    Application No.: US12183432

    Filing date: 2008-07-31

    IPC classification: G06F17/00

    Abstract: A music information retrieval system of the present invention can retrieve unknown songs that include singing voices with similar voice timbres. Voice timbre features of the songs and identifiers for the respective songs are stored in a voice timbre feature storage section 2. When one of the songs is selected, a similarity calculation section 3 calculates voice timbre similarities between the selected song and each of the remaining songs, based on the voice timbre features of the selected song and the other songs. A similar song retrieval and display section 5 displays on a display 10 a plurality of identifiers for songs that are similar to the selected song in voice timbre. A song data reproduction section 6 reproduces song data corresponding to one or more identifiers selected from among the plurality of identifiers displayed on the display 10.
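
The retrieval step described in this abstract can be sketched as a ranking over stored feature vectors. The abstract does not state which similarity measure is used, so cosine similarity is an assumption here, as are the toy feature vectors, the song identifiers, and the function names.

```python
import math
from typing import Dict, List

def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine similarity between two feature vectors (assumed measure)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def similar_songs(features: Dict[str, List[float]],
                  selected_id: str, top_n: int = 3) -> List[str]:
    """Rank all other songs by timbre similarity to the selected song."""
    query = features[selected_id]
    scored = [(cosine_similarity(query, vec), song_id)
              for song_id, vec in features.items() if song_id != selected_id]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [song_id for _, song_id in scored[:top_n]]

# Hypothetical store: song identifier -> voice timbre feature vector.
features = {
    "song_a": [0.90, 0.10, 0.30],
    "song_b": [0.80, 0.20, 0.30],
    "song_c": [0.10, 0.90, 0.50],
    "song_d": [0.85, 0.15, 0.35],
}
result = similar_songs(features, "song_a", top_n=2)
```

With these toy vectors, `result` lists `song_d` first and `song_b` second, since their vectors point in nearly the same direction as that of `song_a`.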

    Singing synthesis parameter data estimation system
    47.
    Granted invention patent
    Legal status: In force

    Publication No.: US08244546B2

    Publication date: 2012-08-14

    Application No.: US12470086

    Filing date: 2009-05-21

    IPC classification: G10L19/00

    Abstract: There is provided a singing synthesis parameter data estimation system that automatically estimates singing synthesis parameter data for synthesizing a human-like singing voice from an audio signal of an input singing voice. A pitch parameter estimating section 9 estimates a pitch parameter by which the pitch feature of an audio signal of a synthesized singing voice is brought closer to the pitch feature of the audio signal of the input singing voice, based on at least both the pitch feature and lyric data with specified syllable boundaries of the audio signal of the input singing voice. A dynamics parameter estimating section 11 converts the dynamics feature of the audio signal of the input singing voice to a relative value with respect to the dynamics feature of the audio signal of the synthesized singing voice, and estimates a dynamics parameter by which the dynamics feature of the audio signal of the synthesized singing voice is brought close to the dynamics feature of the audio signal of the input singing voice that has been converted to the relative value.

    Music artist retrieval system and method of retrieving music artist
    49.
    Granted invention patent
    Legal status: In force

    Publication No.: US08117214B2

    Publication date: 2012-02-14

    Application No.: US12444258

    Filing date: 2007-10-05

    IPC classification: G06F17/30

    CPC classification: G06F17/30758 G06F17/30749

    Abstract: The present invention provides a music artist retrieval system that makes it possible for users to automatically retrieve an unknown music artist similar to the user's favorite artist while actually reproducing and confirming a piece of music by the unknown artist. A music artist similarity map storing section (13) computes a plurality of similarities for a plurality of music artists, makes a music artist similarity map for the plurality of music artists based on the plurality of similarities, and then stores the music artist similarity map. Here, the similarities are computed between one of the plurality of music artists and the other music artists based on features of the respective music artists. A similar artists selecting and displaying section (17) displays on a display a plurality of indications related to one music artist and two or more music artists who are close in similarity to the one music artist, based on the music artist similarity map. A music data playing section (19) reproduces music data of a music artist related to a selected artist indication when a play command is inputted.

    Automatic system for temporal alignment of music audio signal with lyrics
    50.
    Granted invention patent
    Legal status: In force

    Publication No.: US08005666B2

    Publication date: 2011-08-23

    Application No.: US11834778

    Filing date: 2007-08-07

    CPC classification: G10L15/26 G10L15/187

    Abstract: An automatic system for temporal alignment between a music audio signal and lyrics is provided. The automatic system can prevent the accuracy of temporal alignment from being lowered by the influence of non-vocal sections. Alignment means of the system is provided with a phone model for singing voice that estimates phonemes corresponding to temporal-alignment features, that is, features available for temporal alignment. The alignment means receives temporal-alignment features outputted from temporal-alignment feature extraction means, information on the vocal and non-vocal sections outputted from vocal section estimation means, and a phoneme network, and performs an alignment operation on condition that no phoneme exists at least in non-vocal sections.
