Abstract:
An audio feature is extracted from audio signal data for each analysis frame and stored in a storage unit. The audio feature is then read from the storage unit, and the probability that it corresponds to a given emotional state (the emotional state probability) is calculated using one or more statistical models constructed from previously input training audio signal data. Based on the calculated emotional state probability, the emotional state of a section that includes the analysis frame is determined.
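
The flow described above can be illustrated with a minimal Python sketch. Everything specific in it is an assumption made for illustration and is not taken from the abstract: the two features (RMS energy and zero-crossing rate), the choice of one diagonal Gaussian per emotional state as the statistical model, the averaging of per-frame probabilities to decide a section, the state names "calm" and "excited", and the synthetic tones used as stand-in learning data. An in-memory list stands in for the storage.

```python
import math

def extract_features(signal, frame_len):
    """Cut the signal into non-overlapping analysis frames and return
    [rms_energy, zero_crossing_rate] for each frame (assumed features)."""
    feats = []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        frame = signal[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / frame_len)
        zcr = sum(1 for a, b in zip(frame, frame[1:]) if a * b < 0) / (frame_len - 1)
        feats.append([rms, zcr])
    return feats

def fit_gaussian(vectors, floor=1e-4):
    """Fit a diagonal Gaussian (mean, variance per dimension) to feature vectors.
    The variance floor avoids division by zero on very uniform data."""
    n, dim = len(vectors), len(vectors[0])
    mean = [sum(v[d] for v in vectors) / n for d in range(dim)]
    var = [max(sum((v[d] - mean[d]) ** 2 for v in vectors) / n, floor)
           for d in range(dim)]
    return mean, var

def log_likelihood(x, model):
    """Log density of feature vector x under a diagonal Gaussian model."""
    mean, var = model
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))

def state_probabilities(x, models):
    """Per-frame emotional state probabilities, assuming equal priors."""
    logs = {s: log_likelihood(x, m) for s, m in models.items()}
    top = max(logs.values())  # subtract the maximum for numerical stability
    weights = {s: math.exp(l - top) for s, l in logs.items()}
    total = sum(weights.values())
    return {s: w / total for s, w in weights.items()}

def classify_section(frame_features, models):
    """Pick the state with the highest summed per-frame probability,
    which is equivalent to the highest average over the section."""
    totals = {s: 0.0 for s in models}
    for x in frame_features:
        for s, p in state_probabilities(x, models).items():
            totals[s] += p
    return max(totals, key=totals.get)

def tone(amp, period, n):
    """Hypothetical stand-in for recorded audio: a pure sine tone."""
    return [amp * math.sin(2 * math.pi * i / period) for i in range(n)]

FRAME = 80  # samples per analysis frame (assumed)

# One model per emotional state, built from "learning" data.
models = {
    "calm": fit_gaussian(extract_features(tone(0.1, 40, 800), FRAME)),
    "excited": fit_gaussian(extract_features(tone(0.9, 8, 800), FRAME)),
}

# Features of the section to be judged, kept in a list acting as the storage.
stored_features = extract_features(tone(0.85, 8, 400), FRAME)
print(classify_section(stored_features, models))
```

Averaging the per-frame probabilities is only one way to move from frame-level probabilities to a section-level decision; the abstract does not say which rule is used.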