Abstract:
The detecting unit detects a disconnection of communications that have been established with a display apparatus. When the disconnection of the communications is detected, the message generating unit generates a confirmation message that confirms whether the communications should be reestablished. The transmitting unit transmits the confirmation message to the display apparatus. The receiving unit receives a reply message that indicates whether the communications should be reestablished from the display apparatus. The main power supply controlling unit shuts down the main power supply when the reply message indicates that the communications should not be reestablished, or when no reply message is received.
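The shutdown rule in this abstract can be illustrated with a minimal Python sketch. The names `Reply` and `should_shut_down_main_power` are hypothetical and do not come from the source. Representing "no reply message is received" as `None` is an assumption, since the abstract does not say how a missing reply is detected (for example, by a timeout).

```python
from enum import Enum
from typing import Optional


class Reply(Enum):
    """Hypothetical reply values sent back by the display apparatus."""
    REESTABLISH = "reestablish"
    DO_NOT_REESTABLISH = "do_not_reestablish"


def should_shut_down_main_power(disconnected: bool, reply: Optional[Reply]) -> bool:
    """Decide whether the main power supply should be shut down.

    `reply` is None when no reply message was received (assumed here to
    mean that a waiting period elapsed without an answer).
    """
    if not disconnected:
        # No disconnection detected, so no confirmation message was sent.
        return False
    # Shut down on an explicit refusal or on a missing reply.
    return reply is None or reply is Reply.DO_NOT_REESTABLISH
```

Under these assumptions, only an explicit `Reply.REESTABLISH` keeps the main power supply on after a disconnection.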
Abstract:
Disclosed herein is a pitch estimation apparatus and associated methods for estimating a fundamental frequency of an audio signal from a fundamental frequency probability density function by modeling the audio signal as a weighted mixture of a plurality of tone models corresponding respectively to harmonic structures of individual fundamental frequencies, so that the fundamental frequency probability density function of the audio signal is given as a distribution of respective weights of the plurality of the tone models.
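To make the weighted-mixture idea concrete, the following Python sketch estimates the weights of fixed tone models with a simple expectation-maximization update. This is an assumption, not the disclosed method: the abstract does not say how the weights are estimated. The helper names, the discrete frequency bins, and the 1/h harmonic amplitudes are all hypothetical.

```python
def harmonic_tone_model(f0_bin, n_bins, n_harmonics=3):
    """Hypothetical tone model: mass at integer multiples of f0_bin, decaying as 1/h."""
    model = [0.0] * n_bins
    for h in range(1, n_harmonics + 1):
        b = f0_bin * h
        if b < n_bins:
            model[b] += 1.0 / h
    total = sum(model)
    return [v / total for v in model]


def estimate_f0_weights(spectrum, tone_models, iterations=50):
    """Estimate mixture weights over fundamental frequencies.

    spectrum: nonnegative values over frequency bins, summing to 1.
    tone_models: dict mapping an F0 bin to a distribution over the same bins.
    Returns a dict F0 -> weight; the weights sum to 1 and play the role of
    the fundamental frequency probability density function.
    """
    f0s = list(tone_models)
    weights = {f0: 1.0 / len(f0s) for f0 in f0s}
    for _ in range(iterations):
        new_weights = {f0: 0.0 for f0 in f0s}
        for x, p_x in enumerate(spectrum):
            mix = sum(weights[f0] * tone_models[f0][x] for f0 in f0s)
            if p_x == 0.0 or mix == 0.0:
                continue
            for f0 in f0s:
                # Share of the observed mass at bin x explained by this tone model.
                new_weights[f0] += p_x * weights[f0] * tone_models[f0][x] / mix
        total = sum(new_weights.values())
        if total == 0.0:
            break  # No bin is explained by any model; keep the current weights.
        weights = {f0: w / total for f0, w in new_weights.items()}
    return weights
```

The estimated fundamental frequency can then be read off as the F0 with the largest weight, for example `max(weights, key=weights.get)`.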
Abstract:
A processor transitions from a first mode, in which a guest operating system operates, to a second mode, in which a virtual machine monitor managing the guest operating system operates, when a previously set transition condition is satisfied. When the processor transitions to the second mode, a determining unit determines a cause of the transition. When the cause is determined to be the execution of a process related to the completion of writing image information to an image storage unit on the guest operating system, a detecting unit detects an updated portion, that is, the portion of the image information that differs between before and after the writing.
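The detection of an updated portion can be sketched in Python as a comparison of the image contents before and after writing. The function name `updated_region` is hypothetical. Representing images as lists of pixel rows and reporting the result as a bounding box are assumptions; the abstract does not specify either.

```python
def updated_region(before, after):
    """Return the bounding box of pixels that differ between two images.

    before, after: lists of rows of pixel values with identical dimensions.
    Returns (top, left, bottom, right) with inclusive indices, or None when
    the two images match everywhere.
    """
    # Rows that contain at least one changed pixel.
    rows = [r for r, (b_row, a_row) in enumerate(zip(before, after)) if b_row != a_row]
    if not rows:
        return None
    # Column indices of every changed pixel within those rows.
    cols = [
        c
        for r in rows
        for c, (b, a) in enumerate(zip(before[r], after[r]))
        if b != a
    ]
    return (min(rows), min(cols), max(rows), max(cols))
```

A bounding box is only one way to describe the unmatched portion. A list of changed tiles or pixels would serve equally well under the same assumptions.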
Abstract:
An unknown word is additionally registered in a speech recognition dictionary by utilizing a correction result, and a new pronunciation of the word that has been registered in a speech recognition dictionary is additionally registered in the speech recognition dictionary, thereby increasing the accuracy of speech recognition. The start time and finish time of each phoneme unit in speech data corresponding to each phoneme included in a phoneme sequence acquired by a phoneme sequence converting section 13 are added to the phoneme sequence. A phoneme sequence extracting section 15 extracts from the phoneme sequence a phoneme sequence portion composed of phonemes existing in a segment corresponding to the period from the start time to the finish time of the word segment of the word corrected by a word correcting section 9 and the extracted phoneme sequence portion is determined as the pronunciation of the corrected word. An additional registration section 17 combines the corrected word with the pronunciation determined by a pronunciation determining section 16 and additionally registers the combination as new word pronunciation data in the speech recognition dictionary 5 if it is determined that a word obtained after correction has not been registered in the speech recognition dictionary 5. The additional registration section 17 additionally registers the pronunciation determined by the pronunciation determining section 16 as another pronunciation of the corrected word if it is determined that the corrected word has been registered.
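The extraction and registration steps described above can be sketched in Python as follows. The names `TimedPhoneme`, `extract_pronunciation`, and `register_pronunciation` are hypothetical. The dictionary layout (a word mapped to a list of pronunciations) is an assumption, as is the choice to keep only phonemes lying entirely inside the word segment.

```python
from dataclasses import dataclass


@dataclass
class TimedPhoneme:
    """A phoneme with the start and finish times added to the phoneme sequence."""
    symbol: str
    start: float
    end: float


def extract_pronunciation(phonemes, seg_start, seg_end):
    """Return the symbols of phonemes that fall inside the corrected word segment."""
    return [p.symbol for p in phonemes if p.start >= seg_start and p.end <= seg_end]


def register_pronunciation(dictionary, word, pronunciation):
    """Register a new word, or another pronunciation of an already registered word.

    dictionary: dict mapping a word to a list of pronunciations (tuples of symbols).
    Returns True if the dictionary changed.
    """
    pron = tuple(pronunciation)
    entries = dictionary.setdefault(word, [])  # Creates the entry for an unknown word.
    if pron in entries:
        return False
    entries.append(pron)
    return True
```

In this sketch one function covers both cases from the abstract: an unregistered word gets a new entry, and a registered word gains an additional pronunciation.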
Abstract:
A signaling unit of a display terminal establishes communication with a content server, video communication terminal, and PC server through a communication unit, and transmits/receives image data compressed by MPEG2, MPEG4, and JPEG. A media control unit switches compression schemes for decoding in a media processing unit on the basis of the received image data. The media processing unit performs decoding processing including inverse orthogonal transformation processing and dequantization processing in accordance with the switched compression scheme. At this time, a single processing circuit performs inverse orthogonal transformation processing and dequantization processing.
Abstract:
A music information retrieval system of the present invention can retrieve unknown songs including singing voices having similar voice timbres. Voice timbre features of the songs and identifiers for the respective songs are stored in voice timbre feature storage section 2. When one of the songs is selected, similarity calculation section 3 calculates voice timbre similarities between the selected song and the respective remaining songs, based on voice timbre features of the selected song and the other songs. Similar song retrieval and display section 5 displays on a display 10 a plurality of identifiers for songs which are similar to the selected song in voice timbre. Song data reproduction section 6 reproduces song data corresponding to one or more identifiers selected from among the plurality of identifiers displayed on the display 10.
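The similarity calculation and retrieval steps can be illustrated with a short Python sketch. Cosine similarity over fixed-length feature vectors is an assumption chosen for illustration; the abstract does not say which similarity measure is used. The function names and the mapping from song identifier to feature vector are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_songs(features, selected_id, top_n=3):
    """Return identifiers of the songs most similar to the selected song.

    features: dict mapping a song identifier to its voice timbre feature vector.
    """
    selected = features[selected_id]
    scored = [
        (song_id, cosine_similarity(selected, vec))
        for song_id, vec in features.items()
        if song_id != selected_id  # Compare only against the remaining songs.
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return [song_id for song_id, _ in scored[:top_n]]
```

The returned identifiers correspond to what the retrieval and display section would show; reproduction of the selected song data is outside this sketch.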
Abstract:
There is provided a singing synthesis parameter data estimation system that automatically estimates singing synthesis parameter data for synthesizing a human-like singing voice from an audio signal of input singing voice. A pitch parameter estimating section 9 estimates a pitch parameter by which the pitch feature of an audio signal of synthesized singing voice is brought closer to the pitch feature of the audio signal of input singing voice, based on at least the pitch feature of the audio signal of input singing voice and lyric data with specified syllable boundaries. A dynamics parameter estimating section 11 converts the dynamics feature of the audio signal of input singing voice to a relative value with respect to the dynamics feature of the audio signal of synthesized singing voice, and estimates a dynamics parameter by which the dynamics feature of the audio signal of synthesized singing voice is brought closer to the dynamics feature of the audio signal of input singing voice that has been converted to the relative value.
Abstract:
The present invention provides a music artist retrieval system which makes it possible for users to automatically retrieve an unknown music artist similar to the user's favorite artist while actually reproducing and confirming a piece of music of the unknown artist. A music artist similarity map storing section (13) computes a plurality of similarities for a plurality of music artists, makes a music artist similarity map for the plurality of music artists based on the plurality of similarities, and then stores the music artist similarity map. Here, the similarities are computed between one of the plurality of music artists and the other music artists based on features of the respective music artists. A similar artists selecting and displaying section (17) displays on a display a plurality of indications related to one music artist and two or more music artists whose similarities are close to the one music artist, based on the music artist similarity map. A music data playing section (19) reproduces music data of a music artist related to a selected artist indication when a play command is input.
Abstract:
An automatic system for temporal alignment between a music audio signal and lyrics is provided. The automatic system can prevent accuracy for temporal alignment from being lowered due to the influence of non-vocal sections. Alignment means of the system is provided with a phone model for singing voice that estimates phonemes corresponding to temporal-alignment features or features available for temporal alignment. The alignment means receives temporal-alignment features outputted from temporal-alignment feature extraction means, information on the vocal and non-vocal sections outputted from vocal section estimation means, and a phoneme network, and performs an alignment operation on condition that no phoneme exists at least in non-vocal sections.