Abstract:
An audio signal encoding apparatus includes a processor to quantize an audio signal, to obtain a characteristic of reverberation masking that is exerted on a sound represented by the audio signal by reverberation of the sound generated in a reproduction environment by reproducing the sound, and to control, based on the characteristic of the reverberation masking, the quantization step size used to quantize the audio signal.
Abstract:
An audio processing device includes a reverb property estimating unit that estimates a reverb property at each frequency on the basis of a first audio signal and a second audio signal representing sounds of the first audio signal output by an audio output unit and collected by an audio input unit, a gain calculating unit that determines an attenuating ratio for a component of the first audio signal at each frequency such that the larger the reverb property at the frequency is, the larger the attenuating ratio for the component at the frequency becomes, and a correcting unit that attenuates the first audio signal at each frequency in accordance with the attenuating ratio determined for that frequency.
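The gain rule in this abstract can be sketched as below. This is only an illustration under stated assumptions, not the claimed implementation: the function names, the linear mapping, and the `max_ratio` cap are hypothetical. The abstract only requires that the attenuating ratio grow with the reverb property at each frequency.

```python
def attenuation_ratios(reverb, max_ratio=0.8):
    """Map per-frequency reverb property values to attenuation ratios.

    Assumption: a linear mapping normalised by the largest reverb value,
    so the ratio is monotonically non-decreasing in the reverb property
    and never exceeds max_ratio.
    """
    if not reverb:
        return []
    peak = max(reverb)
    if peak <= 0:
        # No measurable reverb at any frequency: attenuate nothing.
        return [0.0 for _ in reverb]
    return [max_ratio * max(r, 0.0) / peak for r in reverb]


def attenuate(spectrum, ratios):
    """Scale each frequency component by (1 - attenuation ratio)."""
    return [s * (1.0 - a) for s, a in zip(spectrum, ratios)]
```

With reverb values `[0.0, 1.0, 2.0]`, the component with the largest reverb property receives the strongest attenuation, and the reverb-free component passes through unchanged.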
Abstract:
A confused state determination device that includes: an audio receiver that receives input of call audio; a memory; and a processor that is connected to the memory and that is configured to detect a questioning utterance in a call-hold duration of the call audio, compute a frequency of the questioning utterance detected in the call-hold duration, and determine a user to be in a confused state in a case in which the computed questioning utterance frequency is a first threshold value or greater.
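The decision rule in this abstract can be sketched as below. This is a minimal sketch that assumes questioning utterances have already been detected upstream and are given as timestamps. The function name, the per-minute definition of frequency, and the default threshold are hypothetical and do not come from the abstract.

```python
def is_confused(question_times, hold_start, hold_end, threshold_per_min=2.0):
    """Return True when the rate of questioning utterances during the
    call-hold duration is at or above the threshold.

    question_times: start times in seconds of utterances already
    classified as questions (the classifier itself is out of scope here).
    """
    duration = hold_end - hold_start
    if duration <= 0:
        return False
    # Count only the questions that fall inside the call-hold duration.
    count = sum(1 for t in question_times if hold_start <= t < hold_end)
    frequency = count / (duration / 60.0)  # questions per minute (assumed unit)
    return frequency >= threshold_per_min
```

The comparison uses `>=` because the abstract treats a frequency equal to the first threshold value as a confused state.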
Abstract:
A speech processing device includes a processor; and a memory which stores a plurality of instructions, which when executed by the processor, cause the processor to execute: obtaining input speech, detecting a vowel segment contained in the input speech, estimating an accent segment contained in the input speech, calculating a first vowel segment length containing the accent segment and a second vowel segment length excluding the accent segment, and controlling at least one of the first vowel segment length and the second vowel segment length.
Abstract:
A voice processing device includes: a processor; and a memory which stores a plurality of instructions, which when executed by the processor, cause the processor to execute: receiving a first signal including a plurality of voice segments; controlling such that a non-voice segment with a length equal to or greater than a predetermined first threshold value exists between at least two of the plurality of voice segments; and outputting a second signal including the plurality of voice segments and the controlled non-voice segment.
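One way to picture the pause control is sketched below. It assumes the voice segments are already available as sorted, non-overlapping `(start, end)` times in seconds, and it enforces the minimum pause at every gap, which is stronger than the abstract requires. The function name and this representation are hypothetical.

```python
def enforce_min_pause(segments, min_pause):
    """Shift voice segments later so that every gap between neighbouring
    segments is at least min_pause seconds long.

    segments: sorted, non-overlapping (start, end) pairs in seconds.
    Returns a new list; each segment keeps its original duration.
    """
    result = []
    shift = 0.0       # total delay accumulated from earlier stretched gaps
    prev_end = None
    for start, end in segments:
        start += shift
        end += shift
        if prev_end is not None and start - prev_end < min_pause:
            extra = min_pause - (start - prev_end)
            start += extra
            end += extra
            shift += extra
        result.append((start, end))
        prev_end = end
    return result
```

Gaps that already meet the threshold are left at their original length; only short gaps are stretched, and all later segments are delayed by the same amount.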
Abstract:
An audio processing device includes a setting section that sets a reproduction sampling frequency Fplay and a recording sampling frequency Frec higher than Fplay, a digital-to-analogue converter that based on Fplay converts a sound source signal that is a digital signal into a reproduction signal that is an analogue signal, an analogue-to-digital converter that based on Frec converts a recording signal that is an analogue signal into an input signal that is a digital signal, a signal separator that separates the input signal into a low region signal contained in a band of less than Fplay and a high region signal contained in a band of Fplay and higher, and a breakup detector that detects whether or not breakup is occurring in the reproduced sound based on power of the high region signal.
Abstract:
A signal processing device includes a processor; and a memory which stores a plurality of instructions, which when executed by the processor, cause the processor to execute: receiving speech of a speaker as a first signal; detecting an expiration period included in the first signal; extracting a number of phonemes included in the expiration period; and controlling a second signal, which is an output to the speaker, on the basis of the number of phonemes and a length of the expiration period.
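The two quantities named in this abstract combine naturally into a speech rate, sketched below. The abstract does not say how the second signal is controlled, so the playback-speed rule here is purely an illustrative assumption, as are the function names, the reference rate, and the clamping limits.

```python
def speech_rate(num_phonemes, expiration_seconds):
    """Phonemes per second within one expiration period."""
    if expiration_seconds <= 0:
        raise ValueError("expiration period must have positive length")
    return num_phonemes / expiration_seconds


def playback_speed(num_phonemes, expiration_seconds,
                   reference_rate=8.0, min_speed=0.5, max_speed=1.5):
    """Hypothetical control rule: make the output speed follow the
    speaker's own speech rate relative to reference_rate, clamped to
    the range [min_speed, max_speed].
    """
    factor = speech_rate(num_phonemes, expiration_seconds) / reference_rate
    return max(min_speed, min(max_speed, factor))
```

For example, 12 phonemes in a 2-second expiration period give a rate of 6 phonemes per second; under the assumed reference rate of 8, the output would be played at 0.75 times normal speed.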
Abstract:
A computer-readable recording medium has stored therein an information processing program that causes a computer to execute a process including: obtaining a video in which the inside of a store is captured; analyzing the obtained video; identifying, based on a result of the analyzing, a first-type region that covers a product placed inside the store captured in the video, a second-type region that covers a person targeted for selling the product inside the store captured in the video, and a relationship that recognizes interaction between the first-type region and the second-type region; and associating the identified relationship with the product covered in the first-type region.
Abstract:
A sound processing device includes a processor configured to generate a first frequency spectrum of a first sound signal corresponding to a first sound received at a first input device and a second frequency spectrum of a second sound signal corresponding to the first sound received at a second input device, calculate a transfer characteristic based on a first difference between an intensity of the first frequency spectrum and an intensity of the second frequency spectrum, generate a third frequency spectrum of a third sound signal transmitted from the first input device and a fourth frequency spectrum of a fourth sound signal transmitted from the second input device, and specify a suppression level of an intensity of the fourth frequency spectrum based on a second difference between an intensity of the third frequency spectrum and an intensity of the fourth frequency spectrum.